
Presentation1group b


Page 1: Presentation1group b

Biological variation in large groups is common, e.g. blood pressure (BP) and weight (wt).

What is normal variation, and how is it measured?

A measure of dispersion shows how individual observations are dispersed around the central tendency of a large series.

Deviation = Observation - Mean

04/12/23 1STATISTICS

Page 2: Presentation1group b

Range

Quartile deviation

Mean deviation

Standard deviation

Variance

Coefficient of variation: indicates relative variability, (SD / Mean) × 100

Page 3: Presentation1group b

Range: the difference between the highest and the lowest value.

Problem: The systolic and diastolic pressures of 10 medical students are as follows:

140/70, 120/88, 160/90, 140/80, 110/70, 90/60, 124/64, 100/62, 110/70 and 154/90. Find the range of systolic and diastolic blood pressure.

Solution:
Range of systolic blood pressure of medical students: 90 to 160, i.e. 160 - 90 = 70
Range of diastolic blood pressure of medical students: 60 to 90, i.e. 90 - 60 = 30
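The range calculation can be checked with a short Python sketch. The readings are copied from the problem; the helper name `value_range` is only an illustrative choice.

```python
# Blood pressure readings from the problem, written as "systolic/diastolic".
readings = ["140/70", "120/88", "160/90", "140/80", "110/70",
            "90/60", "124/64", "100/62", "110/70", "154/90"]


def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


# Split each reading into its two components.
systolic = [int(r.split("/")[0]) for r in readings]
diastolic = [int(r.split("/")[1]) for r in readings]

systolic_range = value_range(systolic)
diastolic_range = value_range(diastolic)
print(systolic_range, diastolic_range)  # 70 30
```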

Mean deviation: the average of the deviations of the observations from the mean value, ignoring signs.

$\text{Mean deviation (M.D.)} = \dfrac{\sum \lvert X - \bar{X} \rvert}{n}$

where $X$ = observation, $\bar{X}$ = mean and $n$ = number of observations.

Page 4: Presentation1group b

Problem: Find the mean deviation of the incubation period of measles in 7 children, which are as follows: 10, 9, 11, 7, 8, 9, 9.

Solution:

$\bar{X} = \sum X / n = 63 / 7 = 9$

| Observation ($X$) | Mean ($\bar{X}$) | Deviation ($X - \bar{X}$) |
| --- | --- | --- |
| 10 | 9 | 1 |
| 9 | 9 | 0 |
| 11 | 9 | 2 |
| 7 | 9 | -2 |
| 8 | 9 | -1 |
| 9 | 9 | 0 |
| 9 | 9 | 0 |
| $\sum X = 63$ | | $\sum \lvert X - \bar{X} \rvert = 6$, ignoring + or - signs |

$\text{Mean deviation (MD)} = \dfrac{\sum \lvert X - \bar{X} \rvert}{n} = 6 / 7 \approx 0.86$
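As a quick check of the measles example, the mean deviation can be computed in a few lines of Python. This is a minimal sketch; the function name `mean_deviation` is an assumption made for illustration.

```python
from statistics import mean


def mean_deviation(values):
    """Average of the absolute deviations from the mean (signs ignored)."""
    centre = mean(values)
    return sum(abs(x - centre) for x in values) / len(values)


# Incubation periods (days) of the 7 children in the problem.
incubation_days = [10, 9, 11, 7, 8, 9, 9]

md = mean_deviation(incubation_days)
print(round(md, 2))  # 0.86
```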

Page 5: Presentation1group b

Standard deviation (S.D.) is the most frequently used measure of dispersion.

S.D. is the root-mean-square deviation.

S.D. is denoted by σ or S.D.

$\text{S.D.}\ (\sigma) = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$

Page 6: Presentation1group b

Calculate the mean
↓
Calculate the difference between each observation and the mean
↓
Square the differences
↓
Sum the squared values
↓
Divide the sum of squares by the number of observations (n) to get the 'mean square deviation' or variance (σ²). [For sample size < 30, divide by (n - 1) instead.]
↓
Find the square root of the variance to get the root-mean-square deviation or S.D. (σ)

Page 7: Presentation1group b

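$\bar{X} = \sum X / n = 984 / 12 = 82$

| Observation ($X$) | Mean ($\bar{X}$) | Deviation ($X - \bar{X}$) | $(X - \bar{X})^2$ |
| --- | --- | --- | --- |
| 58 | 82 | -24 | 576 |
| 66 | 82 | -16 | 256 |
| 70 | 82 | -12 | 144 |
| 74 | 82 | -8 | 64 |
| 80 | 82 | -2 | 4 |
| 86 | 82 | 4 | 16 |
| 90 | 82 | 8 | 64 |
| 100 | 82 | 18 | 324 |
| 79 | 82 | -3 | 9 |
| 96 | 82 | 14 | 196 |
| 88 | 82 | 6 | 36 |
| 97 | 82 | 15 | 225 |
| $\sum X = 984$ | | | $\sum (X - \bar{X})^2 = 1914$ |

$\text{S.D.}\ (\sigma) = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\dfrac{1914}{12 - 1}} = \sqrt{174} \approx 13.2$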
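The worked example can be reproduced with the following Python sketch, which follows the same steps (mean, deviations, squares, sum, divide, square root). The function name and the `sample` flag are illustrative assumptions; the flag switches between dividing by n - 1 and dividing by n. The coefficient of variation, (SD / Mean) × 100, is computed at the end for the same data.

```python
import math
from statistics import mean

# The 12 observations from the worked example.
observations = [58, 66, 70, 74, 80, 86, 90, 100, 79, 96, 88, 97]


def standard_deviation(values, sample=True):
    """Root-mean-square deviation; divides by n - 1 when sample=True, else by n."""
    centre = mean(values)
    sum_of_squares = sum((x - centre) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    variance = sum_of_squares / divisor
    return math.sqrt(variance)


sd = standard_deviation(observations)   # sqrt(1914 / 11) = sqrt(174)
cv = sd / mean(observations) * 100      # coefficient of variation, in percent

print(round(sd, 1))  # 13.2
print(round(cv, 1))  # 16.1
```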

Page 8: Presentation1group b

[Figure 2-15: The Empirical Rule (applies to bell-shaped distributions); bell-shaped curve centred on the mean x̄]

Page 9: Presentation1group b

[Figure 2-15: The Empirical Rule (applies to bell-shaped distributions). About 68% of values lie within 1 standard deviation of the mean, between x̄ - s and x̄ + s: 34% on each side of the mean.]

Page 10: Presentation1group b

[Figure 2-15: The Empirical Rule (applies to bell-shaped distributions). About 68% of values lie within 1 standard deviation of the mean (34% on each side) and about 95% within 2 standard deviations, between x̄ - 2s and x̄ + 2s (a further 13.5% on each side).]

Page 11: Presentation1group b

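[Figure 2-15: The Empirical Rule (applies to bell-shaped distributions)]

- 68% of data are within 1 standard deviation of the mean (x̄ - s to x̄ + s): 34% on each side
- 95% of data are within 2 standard deviations of the mean (x̄ - 2s to x̄ + 2s): a further 13.5% on each side
- 99.7% of data are within 3 standard deviations of the mean (x̄ - 3s to x̄ + 3s): a further 2.4% on each side
- 0.1% of data lie beyond 3 standard deviations on each side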
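The percentages in the Empirical Rule can be recovered from the standard normal distribution. The sketch below is one way to do it using only the Python standard library; `normal_cdf` and `area_within` are illustrative names, and the exact areas differ slightly from the rounded labels on the figure.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return normal_cdf(k) - normal_cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k) * 100:.2f}%")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```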

Page 12: Presentation1group b

Other names: frequency distribution curve, normal curve, Gaussian curve, etc.

Most continuous biological variables follow a normal distribution.

Applicable to quantitative data (when the number of observations is large).

Quantitative data are represented by a histogram; joining the midpoints of each rectangle in the histogram gives a frequency polygon.

When the number of observations becomes very large and the class interval is very much reduced, the frequency polygon loses its angulations and gives rise to a smooth curve known as the frequency curve.

Page 13: Presentation1group b

Mean ± 1 SD limits include 68.27% of all observations

Mean ± 1.96 SD limits include 95% of all observations

Mean ± 2 SD limits include 95.45% of all observations

Mean ± 2.58 SD limits include 99% of all observations

Mean ± 3 SD limits include 99.73% of all observations
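The multipliers 1.96 and 2.58 are the points of the standard normal curve that leave 2.5% and 0.5% in each tail. A minimal sketch using Python's `statistics.NormalDist` (available from Python 3.8) recovers them; the helper name `z_multiplier` is an assumption for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def z_multiplier(coverage):
    """Number of SDs either side of the mean that encloses `coverage` of the area."""
    tail = (1.0 - coverage) / 2.0          # area left in each tail
    return standard_normal.inv_cdf(1.0 - tail)


z95 = z_multiplier(0.95)
z99 = z_multiplier(0.99)
print(round(z95, 2), round(z99, 2))  # 1.96 2.58
```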

Page 14: Presentation1group b

Observations of a continuous variable that are normally distributed in a population, when plotted as a frequency curve, give rise to the normal curve.

The characteristics of the normal curve:

- It is a smooth, bell-shaped, symmetrical curve.
- The area under the curve is 1 or 100%.
- Mean, median and mode are identical (at the same point).
- The curve never touches the baseline.
- The limit on either side is called the 'confidence limit'.
- The curve tells the probability of occurrence by chance (sample variability), or how many times an observation can occur normally in the population.
- The distribution of observations under the normal curve follows the same pattern as the normal distribution.

Page 15: Presentation1group b

Each observation under a normal curve has a 'Z' value.

Z (standard normal variate, relative deviate or critical ratio) is the measure of the distance of an observation from the mean in terms of standard deviations.

$Z = \dfrac{\text{Observation} - \text{Mean}}{\text{S.D.}} = \dfrac{X - \bar{X}}{\text{S.D.}}$

So, if the Z score is -2, the observation is 2 S.D. away from the mean on the left-hand side. Similarly, if Z is +2, the observation is 2 S.D. away from the mean on the right-hand side.

When the Z score is expressed as an absolute value, say 2, the observation is 2 S.D. away from the mean irrespective of direction.

If all observations of normal curves are replaced by their Z scores, virtually all curves become the same. This standardized curve is known as the STANDARD NORMAL CURVE.
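A Z score is a one-line calculation. The sketch below uses a distribution with mean 70 and S.D. 3; the two observations (64 and 76) are hypothetical values chosen so that they fall exactly 2 S.D. on either side of the mean.

```python
def z_score(observation, mean, sd):
    """Distance of an observation from the mean, in units of standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (observation - mean) / sd


# Hypothetical observations from a distribution with mean 70 and S.D. 3.
z_left = z_score(64, mean=70, sd=3)    # 2 S.D. to the left of the mean
z_right = z_score(76, mean=70, sd=3)   # 2 S.D. to the right of the mean
print(z_left, z_right)  # -2.0 2.0
```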

Page 16: Presentation1group b

Properties:

- All the properties of the normal curve
- Area under the curve is 1
- Mean, median and mode coincide, and they are 0
- Standard deviation is 1

[Figure: The standard normal curve and areas within 1, 2 and 3 SDs of the mean]

Page 17: Presentation1group b

[Figure: Areas within 1 and 2 S.D.s of the mean, for (mean = 36, SD = 8) and (mean = 70, SD = 3)]

Page 18: Presentation1group b

The confidence level, or reliability, is the expected percentage of times that the actual values will fall within the stated precision limits.

Thus a 95% CI means that there are 95 chances in 100 (or 0.95 in 1) that the sample results represent the true condition of the population within a specified precision range, against 5 chances in 100 (0.05 in 1) that they do not.

Precision is the range within which the answer may vary and still be accepted.

The CI indicates the chance that the answer will fall within that range, and the significance level indicates the likelihood that the answer will fall outside that range.

If the confidence level is 95%, the significance level is (100 - 95), i.e. 5%; if the confidence level is 99%, the significance level is (100 - 99), i.e. 1%.

The area of the normal curve within the precision limits for the specified CI constitutes the acceptance zone, and the area of the curve outside these limits in either direction constitutes the rejection zone.

Page 19: Presentation1group b

$\text{CI} = \text{Mean} \pm Z \times \text{SE (Mean)} = \bar{X} \pm Z \times \text{SE}(\bar{X})$

$95\%\ \text{CI} = \bar{X} \pm 1.96\ \text{SE}(\bar{X})$

$99\%\ \text{CI} = \bar{X} \pm 2.58\ \text{SE}(\bar{X})$
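The formulas above can be sketched in Python as follows. Two assumptions are made here that the slides do not state: the standard error of the mean is taken as SD / √n (the usual definition), and the inputs (mean 82, S.D. 13.2, n = 12) are borrowed from the earlier S.D. example purely for illustration, even though that sample is small.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) limits of mean ± z × SE, with SE assumed to be sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# Illustrative inputs taken from the S.D. example: mean 82, S.D. 13.2, n = 12.
low95, high95 = confidence_interval(82, 13.2, 12, z=1.96)
low99, high99 = confidence_interval(82, 13.2, 12, z=2.58)

print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval because a larger multiple of the standard error is needed to reach the higher confidence level.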

Page 20: Presentation1group b

Large sample: sample size > 30
Small sample: sample size < 30

Hypothesis:

a. Null (H0): assumes that there is no difference between two values such as population means or proportions.
H0: Mean of population A = Mean of population B, i.e. µ1 = µ2 or P1 = P2

b. Alternative (H1): the hypothesis that differs from H0.
H1: µ1 ≠ µ2 or µ1 > µ2 or µ1 < µ2

Sampling errors: a. Type 1 error b. Type 2 error

Page 21: Presentation1group b

1. State the null hypothesis
2. State the alternative hypothesis
3. Decide whether to use a 1-tailed or 2-tailed test
4. Specify the level of significance (5% or 1%)
5. Select the appropriate test and follow the calculation based on the type of test
6. Compare the calculated value with the theoretical value
7. If the calculated value > the theoretical value, reject the null hypothesis; if it is smaller, accept it
8. Draw a conclusion on the basis of the above

Page 22: Presentation1group b

Tests of Significance

DATA

- Discrete (qualitative): non-parametric tests (Chi-square, Fisher's exact, sign, Mann-Whitney)
- Continuous: parametric tests (Z-test, t-test, ANOVA test)

Page 23: Presentation1group b

Conditions to apply the χ² test:

- Applicable to qualitative data obtained from a random sample.
- Based on frequencies, not on parameters like percentages, rates, ratios, mean or S.D.
- Observed frequency not less than 5.

Applications of the χ² test:

- Comparison of proportions of two or more samples
- Comparison of an observed proportion with a hypothesized one (goodness of fit)
- Comparison of paired observations (McNemar χ² test)
- Trend χ² test

N.B.: Yates' correction: when the expected frequency in any cell of the (2 × 2) table is less than 5, Yates' correction (correction for continuity) is applied.

Page 24: Presentation1group b

Step 1: Write down the null hypothesis.

Step 2: Make a contingency table and calculate the expected frequencies.
Expected frequency = (Row total × Column total) / Grand total

Step 3: Compute the value of χ².

$\chi^2 = \sum \dfrac{(O - E)^2}{E}$, where $O$ = observed value and $E$ = expected value

Step 4: Find the degrees of freedom: d.f. = (r - 1)(c - 1)

Step 5: Obtain the tabulated value under the column p = 0.05 or p = 0.01 of the χ² table.

Step 6: Compare the calculated χ² with the table value. If the calculated value of χ² is greater than the table value, reject the null hypothesis; otherwise accept it.

Step 7: Write down the conclusion.

Page 25: Presentation1group b

Problem: The cure rates of treatments A and B are 90% out of 100 patients and 70% out of 150 patients, respectively. Are treatments A and B equally effective?

1. H0: No difference in cure rate between treatments A and B
2. 2 × 2 contingency table
3. Computation of the value of χ²

Observed values:

| Treatment | Cured | Not cured | Total |
| --- | --- | --- | --- |
| A | 90 | 10 | 100 |
| B | 105 | 45 | 150 |
| Total | 195 | 55 | 250 |

Page 26: Presentation1group b

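Expected values:

| Treatment | Cured | Not cured | Total |
| --- | --- | --- | --- |
| A | 78 | 22 | 100 |
| B | 117 | 33 | 150 |
| Total | 195 | 55 | 250 |

$\chi^2 = \sum \dfrac{(O - E)^2}{E} = \dfrac{(90 - 78)^2}{78} + \dfrac{(10 - 22)^2}{22} + \dfrac{(105 - 117)^2}{117} + \dfrac{(45 - 33)^2}{33} = 13.99$

Calculated value 13.99 > tabulated value 3.84, so the null hypothesis is rejected.

Conclusion: Treatment A is more effective than Treatment B.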
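The treatment example can be reproduced with a short Python sketch that follows the same steps: expected frequencies from row and column totals, the χ² sum, the degrees of freedom and the comparison with the tabulated value. The function names are illustrative assumptions, and the critical value 3.84 is the one quoted in the example rather than computed.

```python
def expected_frequencies(observed):
    """Expected count per cell = (row total × column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Return the chi-square statistic and its degrees of freedom (r - 1)(c - 1)."""
    expected = expected_frequencies(observed)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Rows: treatment A, treatment B. Columns: cured, not cured.
observed = [[90, 10], [105, 45]]

statistic, dof = chi_square(observed)
critical_value = 3.84                      # tabulated value quoted in the example (p = 0.05, d.f. = 1)
reject_null = statistic > critical_value

print(round(statistic, 2), dof, reject_null)  # 13.99 1 True
```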

Page 27: Presentation1group b

A pharmaceutical company claimed that its new product can cure 80% of patients. On trial, however, 56 out of 80 patients (70%) were cured. Do you agree with the company that the cure rate is 80%?

| Outcome with new drug | Cured | Not cured | Total |
| --- | --- | --- | --- |
| Observed value | 56 | 24 | 80 |
| Hypothetical value | 64 | 16 | 80 |
| Total | 120 | 40 | 160 |

χ² = 5

It is > 3.84, so reject H0: the efficacy is not 80%.
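The goodness-of-fit value of 5 comes from comparing the observed counts with the counts expected under the claimed 80% cure rate. A minimal Python sketch, with `goodness_of_fit` as an illustrative function name:

```python
def goodness_of_fit(observed, hypothesized_proportions):
    """Chi-square statistic comparing observed counts with hypothesized proportions."""
    total = sum(observed)
    expected = [total * p for p in hypothesized_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed: 56 cured, 24 not cured. Claimed: 80% cured, 20% not cured.
chi2 = goodness_of_fit([56, 24], [0.8, 0.2])
print(round(chi2, 2))  # 5.0
```

The expected counts are 64 and 16, so the statistic is (56 - 64)² / 64 + (24 - 16)² / 16 = 1 + 4 = 5.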

Page 28: Presentation1group b

Comparison of:

i. Proportions of ≥ 2 samples
ii. An observed proportion with a hypothesized one (goodness of fit)
iii. Paired observations (McNemar test)

LIMITATIONS:

A. Yates' correction is required if the expected value in any cell is < 5:

$\chi^2 = \sum \dfrac{\left(\lvert O - E \rvert - \tfrac{1}{2}\right)^2}{E}$

or, for a 2 × 2 table with cells a, b, c, d and grand total N:

$\chi^2 = \dfrac{\left(\lvert ad - bc \rvert - N/2\right)^2 \times N}{(a + b)(c + d)(a + c)(b + d)}$

B. In tables larger than 2 × 2, Yates' correction is not applicable.
C. It does not measure the strength of an association; it only tells of its presence or absence.
D. A statistical finding of a relationship does not indicate cause and effect.
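The 2 × 2 shortcut formula with Yates' correction can be sketched as below. The cell values reuse the treatment A/B table (a = 90, b = 10, c = 105, d = 45) only to show the arithmetic; every expected count in that table is well above 5, so the correction is not actually required there.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for a 2 × 2 table [[a, b], [c, d]] with Yates' continuity correction."""
    n = a + b + c + d
    numerator = (abs(a * d - b * c) - n / 2) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Illustration only: cells taken from the treatment A/B table.
corrected = yates_chi_square(90, 10, 105, 45)
print(round(corrected, 2))  # 12.84
```

The corrected value is slightly smaller than the uncorrected 13.99, which is the usual effect of the continuity correction.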

Page 29: Presentation1group b

Identify your objective

Collect sample data

Use a random procedure that avoids bias

Analyze the data and form conclusions

Page 30: Presentation1group b

Convenience Sampling - use results that are readily available


Page 31: Presentation1group b

Random Sampling - selection so that each member of the population has an equal chance of being selected

Page 32: Presentation1group b

Systematic Sampling - select some starting point and then select every k-th element in the population

Page 33: Presentation1group b

Stratified Sampling - subdivide the population into subgroups that share the same characteristic, then draw a sample from each stratum


Page 34: Presentation1group b

Cluster Sampling - divide the population into sections (or clusters); randomly select some of those clusters; choose all members from selected clusters
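The random, systematic, stratified and cluster schemes can be sketched with Python's standard `random` module. Everything in this example is hypothetical: the population of 20 numbered subjects, the two strata, the four "wards" used as clusters and the function names are all invented for illustration.

```python
import random


def simple_random_sample(population, size, rng):
    """Each member has an equal chance of being selected."""
    return rng.sample(population, size)


def systematic_sample(population, k, start=0):
    """Begin at `start`, then take every k-th element."""
    return population[start::k]


def stratified_sample(strata, per_stratum, rng):
    """Draw a random sample from every stratum."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(1, 21))             # hypothetical subjects numbered 1..20
strata = {"low": population[:10], "high": population[10:]}
clusters = {
    "ward_a": population[0:5], "ward_b": population[5:10],
    "ward_c": population[10:15], "ward_d": population[15:20],
}

random_pick = simple_random_sample(population, 4, rng)
systematic_pick = systematic_sample(population, k=5, start=2)   # [3, 8, 13, 18]
stratified_pick = stratified_sample(strata, 2, rng)
cluster_pick = cluster_sample(clusters, 2, rng)
```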


Page 35: Presentation1group b

Definitions

Sampling error: the difference between a sample result and the true population result; such an error results from chance sample fluctuations.

Nonsampling error: an error that arises when sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective instrument, or copying the data incorrectly).

Page 36: Presentation1group b


Page 37: Presentation1group b

When the null hypothesis is true but is still rejected, it is a Type 1 (α) error.

When the null hypothesis is false but is still accepted, it is a Type 2 (β) error.

Level of significance: the probability of committing a Type 1 error.

Power of test: the ability of the test to correctly reject H0 in favour of H1 when H0 is false. It equals 1 - β, where β is the probability of committing a Type 2 error.

Page 38: Presentation1group b

SAMPLING ERRORS

| Population | Conclusion based on sample: null hypothesis rejected | Conclusion based on sample: null hypothesis accepted |
| --- | --- | --- |
| Null hypothesis true | Type 1 error | Correct decision |
| Null hypothesis false | Correct decision | Type 2 error |