Upload
fatima-beena
View
222
Download
0
Embed Size (px)
Citation preview
8/6/2019 LLJ110705
1/71
Overview of HypothesisTesting
Laura Lee Johnson, Ph.D.StatisticianNational Center for Complementary
and Alternative Medicine
Monday, November 7, 2005
8/6/2019 LLJ110705
2/71
Objectives
Discuss commonly used terms
P-value
Power
Type I and Type II errors
Present a few commonly usedstatistical tests for comparing two
groups
8/6/2019 LLJ110705
3/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
4/71
Statistical Inference
Inferences about a population are
made on the basis of results
obtained from a sample drawnfrom that population
Want to talk about the larger
population from which the subjectsare drawn, not the particular
subjects!
8/6/2019 LLJ110705
5/71
What Do We Test
Effect or Difference we are interested in Difference in Means or Proportions
Odds Ratio (OR)
Relative Risk (RR) Correlation Coefficient
Clinically important difference Smallest difference considered
biologically or clinically relevant Medicine: usually 2 group comparison
of population means
8/6/2019 LLJ110705
6/71
Vocabulary (1)
Statistic
Compute from sample
Sampling Distribution All possible values that statistic can have
Compute from samples of a given size
randomly drawn from the same population Parameter
Compute from population
8/6/2019 LLJ110705
7/71
Estimation: From the Sample
Point estimation
Mean
Median
Change in mean/median
Interval estimation95% Confidence interval
Variation
8/6/2019 LLJ110705
8/71
Parameters and
Reference Distributions Continuous outcome data
Normal distribution: N( ,2)
t distribution: t[ ([ = degrees of freedom) Mean = (sample mean)
Variance = s2 (sample variance)
Binary outcome data Binomial distribution: B (n, p)
Mean = np, Variance = np(1-p)
X
8/6/2019 LLJ110705
9/71
Hypothesis Testing
Null hypothesis
Alternative hypothesis
Is a value of the parameterconsistent with sample data
8/6/2019 LLJ110705
10/71
Hypotheses and Probability
Specify structure
Build mathematical model and
specify assumptions Specify parameter values
What do you expect to happen?
8/6/2019 LLJ110705
11/71
Null Hypothesis
Usually that there is no effect
Mean = 0
OR = 1RR = 1
Correlation Coefficient = 0
Generally fixed value: mean = 2 If an equivalence trial, look at NEJM
paper or other specific resources
8/6/2019 LLJ110705
12/71
Alternative Hypothesis
Contradicts the null
There is an effect
What you want to prove
If equivalence trial, special way to
do this
8/6/2019 LLJ110705
13/71
Example Hypotheses
H0: 1 = 2
HA: 1 2
Two-sided test
HA: 1 > 2
One-sided test
8/6/2019 LLJ110705
14/71
1 vs. 2 Sided Tests
Two-sided test
No a priorireason 1 group shouldhave stronger effect
Used for most tests
One-sided test
Specific interest in only one direction
Not scientifically relevant/interestingif reverse situation true
8/6/2019 LLJ110705
15/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
16/71
Experiment
Develop hypotheses
Collect sample/Conduct
experiment Calculate test statistic
Compare test statistic with what is
expected when H0 is true Reference distribution
Assumptions about distribution ofoutcome variable
8/6/2019 LLJ110705
17/71
Example:
Hypertension/Cholesterol Mean cholesterol hypertensive men
Mean cholesterol in male general
population (20-74 years old) In the 20-74 year old male
population the mean serum
cholesterol is 211 mg/ml with astandard deviation of 46 mg/ml
8/6/2019 LLJ110705
18/71
Cholesterol Hypotheses
H0: 1 = 2 H0: = 211 mg/ml
= population mean serumcholesterol for male hypertensives
Mean cholesterol for hypertensivemen = mean for general male
population HA: 1 2 HA: 211 mg/ml
8/6/2019 LLJ110705
19/71
Cholesterol Sample Data
25 hypertensive men
Mean serum cholesterol level is
220mg/ml ( = 220 mg/ml)Point estimate of the mean
Sample standard deviation: s =
38.6 mg/ml
Point estimate of the variance = s2
X
8/6/2019 LLJ110705
20/71
Experiment
Develop hypotheses
Collect sample/Conduct experiment
Calculate test statistic Compare test statistic with what is
expected when H0 is true
Reference distribution
Assumptions about distribution ofoutcome variable
8/6/2019 LLJ110705
21/71
Test Statistic
Basic test statistic for a mean
= standard deviation
For 2-sided test: Reject H0
when thetest statistic is in the upper or lower100*/2% of the reference distribution
What is ?
point estimate of
point estimate of - target value oftest statistic =
Q
Q Q
W
8/6/2019 LLJ110705
22/71
Vocabulary (2)
Types of errors
Type I ()
Type II ()
Related words
Significance Level: levelPower: 1-
8/6/2019 LLJ110705
23/71
8/6/2019 LLJ110705
24/71
Type I Error
= P( reject H0 | H0 true)
Probability reject the null hypothesis
given the null is true False positive
Probability reject that hypertensives
=211mg/ml when in truth the meancholesterol for hypertensives is 211
8/6/2019 LLJ110705
25/71
Type II Error (or, 1-Power)
= P( do not reject H0 | H1 true )
False Negative
Power = 1- = P( reject H0 | H1 true )
Everyone wants high power, and
therefore low Type II error
8/6/2019 LLJ110705
26/71
Z Test Statistic
Want to test continuous outcome
Known variance
Under H0
Therefore,0
0
0 0 0
Reject H if 1.96 (gives a 2-sided =0.05 test)/
Reject H if X > 1.96 or X < 1.96
X
n
n n
QE
W
W WQ Q
"
0 ~ (0,1)/
X
Nn
Q
W
8/6/2019 LLJ110705
27/71
Z or Standard Normal
Distribution
8/6/2019 LLJ110705
28/71
General Formula (1-)%
Rejection Region for Mean Point
Estimate
Note that +Z(/2) = - Z(1-/2) 90% CI : Z = 1.645 95% CI :Z = 1.96
99% CI : Z = 2.58
1 / 2 1 / 2,Z Z
n n
E EW W
Q Q
8/6/2019 LLJ110705
29/71
Do Not Reject H0
0 220 211 0.98 1.96
/ 46 / 25
XZ
n
Q
W
! ! !
0
0
46220=X > 1.96 =211+1.96* 228.03 NO!
2546
220=X < 1.96 211-1.96* 192.97 NO!25
n
n
WQ
WQ
!
! !
8/6/2019 LLJ110705
30/71
P-value
Smallest the observed sample would
reject H0
Given H0 is true, probability ofobtaining a result as extreme or more
extreme than the actual sample
MUST be based on a model
Normal, t, binomial, etc.
8/6/2019 LLJ110705
31/71
Cholesterol Example
P-value for two sided test
= 220 mg/ml, = 46 mg/ml
n = 25
H0: = 211 mg/ml
HA: 211 mg/ml
2* [ 220] 0.33P X " !
X
8/6/2019 LLJ110705
32/71
Determining Statistical
Significance: Critical Value Method Compute the test statistic Z (0.98)
Compare to the critical value
Standard Normal value at -level (1.96)
If |test statistic| > critical value
Reject H0 Results are statisticallysignificant
If |test statistic| < critical value Do not reject H0 Results are notstatisticallysignificant
8/6/2019 LLJ110705
33/71
Determining Statistical
Significance: P-Value Method
Compute the exact p-value (0.33)
Compare to the predetermined -level(0.05)
If p-value < predetermined -level
Reject H0 Results are statisticallysignificant
If p-value > predetermined -level Do not reject H0 Results are notstatisticallysignificant
8/6/2019 LLJ110705
34/71
Hypothesis Testing
and Confidence Intervals
Hypothesis testing focuses on
where the sample mean is located
Confidence intervals focus onplausible values for the population
mean
8/6/2019 LLJ110705
35/71
General Formula (1-)% CI for
Construct an interval around the point
estimate Look to see if the population/null mean
is inside
1 / 2 1 / 2,Z Z
X X
n n
E EW W
8/6/2019 LLJ110705
36/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
37/71
T-Test Statistic
Want to test continuous outcome
Unknown variance
Under H0
Critical values: statistics books or
computer t-distribution approximately normal for
degrees of freedom (df) >30
0( 1)~
/n
Xt
s nQ
8/6/2019 LLJ110705
38/71
Cholesterol: t-statistic
Using data
For = 0.05, two-sided test from t(24)distribution the critical value = 2.064
| T | = 1.17 < 2.064
The difference is not statisticallysignificant at the = 0.05 level
Fail to reject H0
0 220 211 1.17/ 38.6 / 25
XT
s n
Q ! ! !
8/6/2019 LLJ110705
39/71
CI for the Mean, Unknown
Variance
Pretty common
Uses the t distribution
Degrees of freedom1,1 / 2 1,1 / 2
,
2.064*38.6 2.064*38.6220 , 22025 25
(204.06, 235.93)
n nt s t s
X X
n n
E E
!
!
8/6/2019 LLJ110705
40/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
41/71
Paired Tests: Difference
Two Continuous Outcomes
Exact same idea
Known variance: Z test statistic
Unknown variance: t test statistic
H0: d = 0 vs. HA: d 0
Paired Z-test or Paired t-test
/ /
d d Z or T
n s nW! !
8/6/2019 LLJ110705
42/71
Unpaired Tests: Common
Variance
Same idea
Known variance: Z test statistic
Unknown variance: t test statistic
H0: 1 = 2 vs. HA: 1 2
Assume common variance
1/ 1/ 1/ 1/
x y x y Z or T
n m s n mW
! !
8/6/2019 LLJ110705
43/71
Unpaired Tests:
Not Common Variance
Same idea
Known variance: Z test statistic
Unknown variance: t test statistic
H0: 1 = 2 vs. HA: 1 2
2 2 2 2
1 2 1 2/ / / /
x y x y Z or T
n m s n s mW W ! !
8/6/2019 LLJ110705
44/71
Binary Outcomes
Exact same idea
For large samples
Use Z test statistic
Now set up in terms of
proportions, not means
0
0 0
(1 ) /
p pZ
p p n
!
8/6/2019 LLJ110705
45/71
Two Population Proportions
Exact same idea
For large samples use Z test
statistic
1 2
1 1 2 2
(1 ) (1 )
p pZ
p p p p
n m
!
8/6/2019 LLJ110705
46/71
Vocabulary (3)
Null Hypothesis: H0 Alternative Hypothesis: H1 or Ha or HA Significance Level: level
Acceptance/Rejection Region
Statistically Significant
Test Statistic
Critical Value
P-value
8/6/2019 LLJ110705
47/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
48/71
Linear regression Model for simple linear regression
Yi= 0+ 1x1i + i0= intercept
1
= slope
Assumptions
Observations are independent
Normally distributed with constant
variance Hypothesis testing
H0:1 = 0 vs. HA:1 0
8/6/2019 LLJ110705
49/71
Confidence Interval Note
Cannot determine if a particular intervaldoes/does not contain true mean effect
Can say in the long run
Take many samples
Same sample size
From the same population
95% of similarly constructedconfidence intervals will contain truemean effect
Interpret a 95% Confidence
8/6/2019 LLJ110705
50/71
Interpret a 95% Confidence
Interval (CI) for the population
mean, If we were to find many such
intervals, each from a different
random sample but in exactly thesame fashion, then, in the long run,
about 95% of our intervals would
include the population mean, ,and 5% would not.
8/6/2019 LLJ110705
51/71
How NOT to interpret a 95%
CI There is a 95% probability that the true
mean lies between the two confidence
values we obtain from a particular
sample, but we can say that we are 95%
confident that it does lie between these
two values.
Overlapping CIs do NOT imply non-significance
8/6/2019 LLJ110705
52/71
But I Have All Zeros!
Calculate upper bound
Known # of trials without an event (2.11van Belle 2002, Louis 1981)
Given no observed events in n trials,
95% upper bound on rate of occurrenceis 3 / (n + 1)
No fatal outcomes in 20 operations
95% upper bound on rate of
occurrence = 3/ (20 + 1) = 0.143, sothe rate of occurrence of fatalitiescould be as high as 14.3%
8/6/2019 LLJ110705
53/71
Analysis Follows Design
Questions Hypotheses
Experimental Design Samples
Data Analyses Conclusions
8/6/2019 LLJ110705
54/71
Which is more important, or
?
Depends on the question
Most will say protect against Type I
error Need to think about individual and
population health implications and
costs
8/6/2019 LLJ110705
55/71
Affy Gene Chip
False negative
Miss what could be important
Are these samples going to belooked at again?
False positive
Waste resources following dead
ends
8/6/2019 LLJ110705
56/71
HIV Screening
False positive
Needless worry
Stigma False negative
Thinks everything is ok
Continues to spread disease
For cholesterol example?
8/6/2019 LLJ110705
57/71
What do you need to think
about?
Is it worse to treat those who truly
are not ill or to not treat those who
are ill? That answer will help guide you as
to what amount of error you are
willing to tolerate in your trialdesign.
8/6/2019 LLJ110705
58/71
Little Diagnostic Testing
Lingo False Positive/False Negative
Positive Predictive Value
Probability diseased given POSITIVEtest result
Negative Predictive Value
Probability NOT diseased given
NEGATIVE test result Predictive values depend on disease
prevalence
8/6/2019 LLJ110705
59/71
Outline
Estimation and Hypotheses
Continuous Outcome/Known Variance
Test statistic, tests, p-values,confidence intervals
Unknown Variance
Different Outcomes/Similar TestStatistics
Additional Information
8/6/2019 LLJ110705
60/71
What you should know: CI
Meaning/interpretation of the CI
How to compute a CI for the true
mean when variance is known(normal model)
How to compute a CI for the true
mean when the variance is NOTknown (t distribution)
8/6/2019 LLJ110705
61/71
You Need to Know
How to turn a question into hypotheses
Failing to reject the null hypothesis
DOES NOT mean that the null is true
Every test has assumptions
A statistician can check all the
assumptions
If the data does not meet the assumptions
there are non-parametric versions of the
tests (see text)
8/6/2019 LLJ110705
62/71
P-value Interpretation Reminders
Measure of the strength of
evidence in the data that the null is
not true
A random variable whose value
lies between 0 and 1
NOT the probability that the null
hypothesis is true.
8/6/2019 LLJ110705
63/71
Avoid Common Mistakes:
Hypothesis Testing
If you have paired data, use a
paired test
If you dont then you can lose power If you do NOT have paired data, do
NOT use a paired test
You can have the wrong inference
8/6/2019 LLJ110705
64/71
Common Mistakes:
Hypothesis Testing
These tests have assumptions ofindependence
Taking multiple samples per subject ?
Statistician MUST know Different statistical analyses MUST be
used and they can be difficult!
Distribution of the observations
Histogram of the observations
Highly skewed data - t test - incorrectresults
8/6/2019 LLJ110705
65/71
Common Mistakes:
Hypothesis Testing
Assume equal variances and the
variances are not equal
Did not show variance test
Not that good of a test
ALWAYS graph your data first to
assess symmetry and variance
Not talking to a statistician
8/6/2019 LLJ110705
66/71
Misconceptions
Smaller p-value larger effect
Effect size is determined by the
difference in the sample mean orproportion between 2 groups
P-value = inferential tool
Helps demonstrate that populationmeans in two groups are not equal
8/6/2019 LLJ110705
67/71
Misconceptions
A small p-value means the difference is
statistically significant, not that the
difference is clinically significant
A large sample size can help get a
small p-value
Failing to reject H0 There is not enough evidence to reject H0
Does NOT mean H0 is true
8/6/2019 LLJ110705
68/71
Analysis Follows Design
Questions Hypotheses
Experimental Design Samples
Data Analyses Conclusions
8/6/2019 LLJ110705
69/71
Normal/Large Sample Data?
Binomial?
Independent? Nonparametric test
Expected 5
2 sample Z test for
proportions or
contingency table
McNemars test
Fishers Exact
test
No
No
No
No
Yes
Yes
Yes
N l/ S l D ?
8/6/2019 LLJ110705
70/71
Normal/Large Sample Data?
Inference on means?
Independent? Inference on variance?
Variance
known?
Paired t
Z test
Variances equal?
T test w/
pooled
variance
T test w/
unequal
variance
F test for
variances
Yes
Yes
Yes Yes
Yes
Yes
No
No
No
No
8/6/2019 LLJ110705
71/71
Questions?