Parametric tests seminar

1

PARAMETRIC TESTS

DR DEEPIKA G

1ST YEAR PG

DEPT OF PHARMACOLOGY

2

CONTENTS:

INTRODUCTION

STATISTICAL DEFINITIONS

MEASURES OF CENTRAL TENDENCY AND DISPERSION

DISTRIBUTION AND HYPOTHESIS

PARAMETRIC TESTS

REFERENCES

3

INTRODUCTION

• Statistics: the science of data; the study of uncertainty

• Biostatistics: statistics applied to data from medicine and the biological sciences (statistics is also applied in business, education, psychology, agriculture, economics...)

• Types: Descriptive statistics

Inferential statistics

4

1. Descriptive Statistics - overview of the attributes of a data set. These include measurements of central tendency (frequency histograms, mean, median, & mode) and dispersion (range, variance & standard deviation)

2. Inferential Statistics - provides measures of how well the data support a hypothesis and whether the data are generalizable beyond what was tested (significance tests)

5

Data: Observations recorded during research

Types of data:

1. Nominal data: synonymous with categorical data; names or categories are assigned on the basis of characteristics, without any ranking between categories.

ex. male/female, yes/no, death/survival

6

2. Ordinal data: ordered or graded data, expressed as scores or ranks

ex. pain graded as mild, moderate and severe

3. Interval data: there is an equal and definite interval between two measurements; the data can be continuous or discrete

ex. weight expressed as 20, 21, 22, 23, 24; the interval between 20 and 21 is the same as that between 23 and 24

7

Measures of Central Tendencies:

• In a normal distribution, the mean and median are the same

• If the median and mean differ markedly, this indicates that the data are not normally distributed

• The mode is of little, if any, practical use

8

MEASURES OF VARIABILITY

Range: It is the interval between the highest and lowest observations.

• Ex. Diastolic BP of 5 individuals is 90, 80, 78, 84, 98.

Highest observation is 98

Lowest observation is 78

Range is 98-78= 20.

9

Standard deviation (SD): it is defined as the positive square root of the arithmetic mean of the squares of the deviations taken from the arithmetic mean.

• It describes the variability of the observations about the mean.

Variance: average squared deviation around the mean.

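$$\text{variance} = \frac{\sum (X - \bar{X})^2}{n} \quad \text{or} \quad \frac{\sum (X - \bar{X})^2}{n - 1}$$

$$SD = \sqrt{\frac{\text{Sum of (Individual value} - \text{Mean value)}^2}{\text{Number of values}}}$$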
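As an illustrative sketch (not part of the original slides), the range, variance and SD of the diastolic BP example can be computed with Python's standard `statistics` module. Here `pvariance` divides by n and `variance` divides by n - 1; the variable names are assumptions chosen for this example.

```python
import statistics

# Diastolic BP of 5 individuals (values from the range example)
bp = [90, 80, 78, 84, 98]

bp_range = max(bp) - min(bp)               # highest minus lowest observation
mean_bp = statistics.mean(bp)              # arithmetic mean
var_n = statistics.pvariance(bp)           # sum of squared deviations / n
var_n_minus_1 = statistics.variance(bp)    # sum of squared deviations / (n - 1)
sd = statistics.stdev(bp)                  # square root of the (n - 1) variance

print(bp_range, mean_bp, var_n, var_n_minus_1, round(sd, 2))
```

The n - 1 form is the usual choice when the data are a sample rather than the whole population.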

10

Coefficient of Variation (CV):

It is the standard deviation (SD) expressed as a percentage of the mean.

CV = (SD / mean) × 100

• It is dimensionless (independent of any unit of

measurement)
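A minimal sketch of the CV formula, reusing the diastolic BP values from the range example. The function name `coefficient_of_variation` is an assumption for illustration, and the sample (n - 1) SD is used.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample SD expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

bp = [90, 80, 78, 84, 98]   # diastolic BP values from the range example
cv = coefficient_of_variation(bp)
print(f"CV = {cv:.1f}%")
```

Because the units cancel, the same function can compare the variability of quantities measured on different scales.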

11

Correlation coefficient:

It measures the strength and direction of the linear relationship between two variables.

It is denoted by ‘r’ and is a unitless quantity (a pure number).

Its values lie between -1 and +1.

If the variables are not linearly correlated, r will be zero.
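The sketch below computes Pearson's r directly from its definition, as the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. The data and the name `pearson_r` are hypothetical, chosen only to illustrate the calculation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

dose = [1, 2, 3, 4, 5]         # hypothetical data
response = [2, 4, 5, 4, 5]     # hypothetical data
r = pearson_r(dose, response)                    # positive correlation
r_negative = pearson_r(dose, [10, 8, 6, 4, 2])   # perfectly inverse relationship
print(round(r, 3), round(r_negative, 3))
```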

12

PROBABILITY DISTRIBUTIONS

1. Binomial Distribution:

The conditions to be fulfilled:

i. There is a fixed number (n) of trials;

ii. Only two outcomes, ‘success’ and ‘failure’, are possible at each trial;

iii. The trials are independent,

iv. There is a constant probability (p) of success at each trial;

v. The variable is the total number of successes in n trials.
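Under the five conditions above, the probability of exactly k successes in n trials is C(n, k) p^k (1 - p)^(n - k). A minimal sketch follows, with hypothetical values of n and p and an assumed function name `binomial_pmf`.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5   # hypothetical: 10 trials, probability of success 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(probs[5])      # probability of exactly 5 successes
print(sum(probs))    # the probabilities over k = 0..n add up to 1
```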

13

2. Poisson Distribution:

• There are situations in which number of times an

event occurs is meaningful and can be counted

but the number of times the event did not occur is

meaningless or can not be counted.

• It is discrete and has an infinite number of

possible values.

• It has a single parameter (λ), the mean number of occurrences.
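As a sketch, the Poisson probability of observing k events is e^(-λ) λ^k / k!, where λ (written `lam` in the code) is the distribution's single parameter. The value λ = 2 and the function name are assumptions for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(exactly k events) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 2.0   # hypothetical mean number of events per interval
p_zero = poisson_pmf(0, lam)                                # no events
p_at_most_3 = sum(poisson_pmf(k, lam) for k in range(4))    # 0, 1, 2 or 3 events

print(round(p_zero, 4), round(p_at_most_3, 4))
```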

14

3.Gaussian or Normal Distribution:

Important characteristics are:

i. The shape of the distribution resembles a bell

and is symmetric around the midpoint;

ii. At the centre of distribution which is peaked,

mean median and mode coincide;

15

iii. The area under the curve between any two

points which correspond to the proportion of

observations between any two values of the

variate can be found out in terms of a

relationship between the mean and the

standard deviation.

iv. Parameters used: mean (μ) and SD (σ)

16

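• Standard Error Of Mean: The square root of the variance of the sample means, i.e. the standard deviation of the sampling distribution of the mean

SE of sample mean = $SD / \sqrt{n}$

SE of sample proportion = $\sqrt{pq / n}$, where q = 1 - p

• Applications of SEM: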

i. To determine whether a sample is drawn from the same population or not when its mean is known.

ii. To work out the limits of desired confidence within which the population mean should lie.

17

Confidence Interval Or Fiducial Limits:

• Confidence limits are the two extremes within which the population mean is expected to lie with a stated level of confidence (usually 95%).

Lower confidence limit = mean – (t0.05 × SEM)

Upper confidence limit = mean + (t0.05 × SEM)

• The important difference between the ‘p’ value and the confidence interval is that the confidence interval shows the size and precision of the estimated effect, which helps in judging clinical significance, whereas the ‘p’ value indicates only statistical significance.
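A minimal sketch of the SEM and the 95% confidence limits for the diastolic BP values from the range example. The standard library has no t distribution, so the critical value is hard-coded as an assumption: 2.776 is the two-sided 5% value of t for 4 degrees of freedom, read from a standard t table.

```python
import math
import statistics

bp = [90, 80, 78, 84, 98]            # diastolic BP values from the range example
n = len(bp)
mean_bp = statistics.mean(bp)
sem = statistics.stdev(bp) / math.sqrt(n)   # SE of sample mean = SD / sqrt(n)

# Assumption: two-sided 5% critical value of t for n - 1 = 4 degrees of
# freedom, taken from a standard t table rather than computed here.
T_CRIT_DF4 = 2.776

lower = mean_bp - T_CRIT_DF4 * sem   # lower confidence limit
upper = mean_bp + T_CRIT_DF4 * sem   # upper confidence limit
print(round(sem, 2), round(lower, 1), round(upper, 1))
```

For another sample size the critical value would have to be looked up again for the new degrees of freedom.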

18

Standard Normal Distribution

Mean +/- 1 SD encompasses about 68% of observations

Mean +/- 2 SD encompasses about 95% of observations

Mean +/- 3 SD encompasses about 99.7% of observations
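These proportions can be checked, as a sketch, with `statistics.NormalDist` from the Python standard library, by taking the area under the standard normal curve between -k and +k SD. The helper name `within` is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()   # standard normal: mean 0, SD 1

def within(k):
    """Proportion of observations lying within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

coverage = {k: within(k) for k in (1, 2, 3)}
for k, proportion in coverage.items():
    print(f"mean +/- {k} SD: {proportion:.4f}")
```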

19

Statistical Hypothesis:

• These are hypotheses stated in such a way that they may be evaluated by appropriate statistical techniques.

• There are two types of hypothesis:

• Null hypothesis H0: It is the hypothesis which assumes that there is no difference between two values. H0: μ1 = μ2

• Alternative hypothesis HA: It is the hypothesis that differs from the null hypothesis. HA: μ1 ≠ μ2

20

Hypothesis Errors:

Type-I Error:

• It is the error of finding a difference when no such difference actually exists.

• Ex. acceptance of an inactive compound as active

• It is also known as α error / false positive

21

Type-II Error:

• It is the error of failing to detect a difference when such a difference actually exists, thus resulting in rejection of an active compound as inactive.

• It is also called β error / false negative.

22

Level of significance (l.o.s):

• The probability of committing a type I error

• Denoted by α

• An l.o.s of 0.05 (5%) means the risk of making a wrong decision is only 5 out of 100 cases, i.e. 95% confidence

Power of the test:

• The probability of committing a type II error is denoted by β

• 1 - β is the power of the test

• Power is the probability of rejecting H0 when H0 is false, i.e. making the correct decision.
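To show how α, β and power relate, here is a hedged sketch for a one-tailed z-test with a known SD. It uses the textbook expression power = 1 - Φ(z₁₋α - δ√n/σ). The function name and all numbers are hypothetical.

```python
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a one-tailed z-test to detect a true mean shift `delta`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cut-off set by the level of significance
    shift = delta * n ** 0.5 / sigma         # true shift in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

power = z_test_power(delta=5, sigma=10, n=25)   # hypothetical study
beta = 1 - power                                # probability of a type II error
print(round(power, 3), round(beta, 3))
```

When the true shift is zero the function returns α itself, which is the type I error rate.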

23

• The p-value is defined as the smallest

value of α for which the null hypothesis can

be rejected.

• If the p-value is less than α, we reject the null hypothesis (p < α)

• If the p-value is greater than α, we do not reject the null hypothesis (p > α)

24

Critical Region

One tailed test:

• The rejection region lies in one or the other tail of the distribution

• The difference could only be there in one direction/possibility

• Ex. English men are taller than Indian men.

25

Two Tailed Test:

• The rejection region is split between the two sides or tails of the distribution

• The difference could be in either direction/possibility

• Ex. Comparative study of drug ‘X’ with atenolol for antihypertensive property
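The following sketch shows how the same test statistic can lead to different decisions in one tailed and two tailed tests. The statistic z = 1.8 and α = 0.05 are hypothetical. The one-tailed p-value assumes the observed difference lies in the hypothesised direction.

```python
from statistics import NormalDist

def p_values(z_stat):
    """Return (one-tailed, two-tailed) p-values for a z statistic."""
    std_normal = NormalDist()
    one_tailed = 1 - std_normal.cdf(abs(z_stat))   # area in one tail
    two_tailed = 2 * one_tailed                    # area split across both tails
    return one_tailed, two_tailed

alpha = 0.05
p_one, p_two = p_values(1.8)          # hypothetical test statistic

reject_one_tailed = p_one < alpha     # reject H0 if p < alpha
reject_two_tailed = p_two < alpha
print(round(p_one, 4), reject_one_tailed)
print(round(p_two, 4), reject_two_tailed)
```

In this example H0 is rejected by the one tailed test but not by the two tailed test, which is why the choice of tails should be made before the data are examined.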

26

27

SAMPLE SIZE:

• Large sample: sample size is more than 30

• Small sample: sample size is less than or equal to 30

• Many statistical tests are based upon the assumption that the data are sampled from a Gaussian distribution.

• Procedures for testing hypotheses about parameters in a population described by a specified distributional form (such as the normal distribution) are called parametric tests.

28

Types of Parametric tests

1. Large sample tests

Z-test

2. Small sample tests

t-test

* Independent/ unpaired t-test

* Paired t-test

ANOVA (Analysis of variance)

* One way ANOVA

* Two way ANOVA

29

Z- Test:

• A z-test is used for testing the mean of a population against a standard, or for comparing the means of two populations, with large (n > 30) samples, whether or not the population standard deviation is known.

30

• It is also used for testing the proportion of some

characteristic versus a standard proportion, or

comparing the proportions of two populations.

Ex. Comparing the average engineering salaries

of men versus women.

Ex. Comparing the fraction defectives from two

production lines.
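A hedged sketch of the second example, comparing the fraction of defectives from two production lines with a two-proportion z-test. It assumes a pooled estimate of the common proportion under H0. The counts and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)    # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 30 defectives out of 500 on line 1, 50 out of 500 on line 2
z_stat, p_value = two_proportion_z_test(30, 500, 50, 500)
print(round(z_stat, 3), round(p_value, 4))
```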

31

T-test: Derived by W S Gosset in 1908.

• Properties of t distribution:

i. It has mean 0

ii. It has variance greater than one

iii. It is bell shaped symmetrical distribution about mean

• Assumption for t test:

i. Sample must be random, observations independent

ii. Population standard deviation is not known

iii. Normal distribution of population

32

Uses of t test:

i. To test the mean of a sample against a known or hypothesized value

ii. To test the difference between the means of two samples

iii. To test the significance of a correlation coefficient

Types of t test:

a. Paired t test

b. Unpaired t test

33

Paired t test:

• Consists of a sample of matched pairs of similar 

units, or one group of units that has been tested

twice (a "repeated measures" t-test).

• Ex. where subjects are tested prior to a

treatment, say for high blood pressure, and the

same subjects are tested again after treatment

with a blood-pressure lowering medication.
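A minimal sketch of the paired t statistic for the before-and-after design described above: t is the mean of the paired differences divided by its standard error, with n - 1 degrees of freedom. The blood pressure readings and the function name are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [b - a for b, a in zip(before, after)]   # within-subject differences
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)      # SE of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Hypothetical systolic BP of the same 5 subjects before and after treatment
before = [150, 160, 145, 155, 165]
after = [140, 150, 140, 150, 150]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```

The statistic is then compared with the tabulated t value for the chosen α and these degrees of freedom.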

34

Unpaired t test:

• When two separate sets of 

independent and identically distributed samples are

obtained, one from each of the two populations being

compared.

• Ex: 1. compare the height of girls and boys.

2. compare 2 stress reduction interventions

when one group practiced mindfulness meditation

while the other learned progressive muscle

relaxation. 
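For two independent groups, a sketch of the unpaired t statistic using the pooled-variance form, which assumes the two populations have equal variances. The heights and the function name are hypothetical.

```python
import math
import statistics

def unpaired_t(a, b):
    """Return (t statistic, degrees of freedom) using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))   # SE of the difference in means
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2

# Hypothetical heights (cm) of two independent groups
girls = [150, 152, 148, 155, 150]
boys = [158, 160, 155, 162, 165]
t_stat, df = unpaired_t(girls, boys)
print(round(t_stat, 3), df)
```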

35

ANALYSIS OF VARIANCE(ANOVA):

• Analysis of variance (ANOVA) is a collection of statistical models, and their associated procedures (such as partitioning the "variation" among and between groups), used to analyze the differences between group means.

• Compares multiple groups at one time

• Developed by R.A. Fisher.

• Two types: i. One way ANOVA

ii. Two way ANOVA

36

One Way ANOVA:

It compares three or more unmatched groups when data are categorized in one way

Ex.

1. Compare control group with three different doses of aspirin in rats

2. Effect of supplementation of vit C in each subject before, during and after the treatment (here the same subjects are measured repeatedly, so the groups are matched and the repeated-measures form of one way ANOVA applies).
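A hedged sketch of the one way ANOVA F ratio for unmatched groups: the between-group mean square divided by the within-group mean square. The four groups mirror the control plus three doses example, but every number and the function name are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical responses: control and three doses, three animals each
groups = [[10, 12, 11], [13, 14, 15], [16, 17, 18], [19, 20, 21]]
f_ratio, df_b, df_w = one_way_anova_f(groups)
print(f_ratio, df_b, df_w)
```

The F ratio is then compared with the tabulated F value for (df_between, df_within) degrees of freedom.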

37

Two way ANOVA:

• Used to determine the effect of two nominal

predictor variables on a continuous outcome

variable.

• A two-way ANOVA test analyzes the effect of the

independent variables on the expected outcome

along with their relationship to the outcome itself.

38

Difference between one & two way ANOVA

• An example of when a one-way ANOVA could be used is if we want to determine whether there is a difference in the mean height of stalks grown from three different types of seeds. Since there are more than two means to compare and only one factor (type of seed) that could be making the heights different, we can use a one-way ANOVA.

39

• Now, if we take these three different types of seeds, and then add the possibility that three different types of fertilizer are used, then we would want to use a two-way ANOVA.

• The mean height of the stalks could be different

for a combination of several reasons: 

40

• The types of seed could cause the change, 

the types of fertilizer could cause the change,

and/or there is an interaction between the type of

seed and the type of fertilizer. 

• There are two factors here (type of seed and type

of fertilizer), so, if the assumptions hold, then we

can use a two-way ANOVA.
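A minimal sketch of the two-way calculation for the seed and fertilizer example. It assumes a balanced design with the same number of replicates in every cell. For brevity it uses two seed types and two fertilizers instead of three of each. All heights, labels and the function name are hypothetical.

```python
from statistics import mean

def two_way_anova_f(cells):
    """F ratios for factor A, factor B and their interaction (balanced design).

    `cells` maps (level_a, level_b) to a list of replicate observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    r = len(next(iter(cells.values())))   # replicates per cell
    grand = mean(x for obs in cells.values() for x in obs)
    mean_a = {a: mean(x for b in levels_b for x in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in cells[(a, b)]) for b in levels_b}
    mean_cell = {key: mean(obs) for key, obs in cells.items()}

    ss_a = len(levels_b) * r * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = len(levels_a) * r * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = r * sum(
        (mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
        for a in levels_a for b in levels_b
    )
    ss_error = sum((x - mean_cell[key]) ** 2 for key, obs in cells.items() for x in obs)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_error = len(levels_a) * len(levels_b) * (r - 1)
    ms_error = ss_error / df_error
    return {
        "seed": (ss_a / df_a) / ms_error,
        "fertilizer": (ss_b / df_b) / ms_error,
        "interaction": (ss_ab / (df_a * df_b)) / ms_error,
    }

# Hypothetical stalk heights: (seed type, fertilizer type) -> two replicates
heights = {
    ("A", "F1"): [10, 12], ("A", "F2"): [14, 16],
    ("B", "F1"): [11, 13], ("B", "F2"): [21, 23],
}
f_ratios = two_way_anova_f(heights)
print(f_ratios)
```

The three F ratios correspond to the three possible reasons listed above: the type of seed, the type of fertilizer, and their interaction.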

41

Summary of parametric tests applied for different types of data

Sl no | Type of group | Parametric test
1. | Comparison of two paired groups | Paired ‘t’ test
2. | Comparison of two unpaired groups | Unpaired ‘t’ test
3. | Comparison of three or more matched groups | Two way ANOVA
4. | Comparison of three or more unmatched groups | One way ANOVA
5. | Correlation between two variables | Pearson correlation

42

References:

1. Dr J V Dixit’s Principles and practice of biostatistics, 5th edition.

2. Rao & Murthy’s Applied statistics in health sciences, 2nd edition.

3. Sarmukaddam’s Fundamentals of biostatistics, 1st edition.

4. Internet sources…

43
