How big should my study be?
The science and art of choosing your sample size
Mark Pletcher
Designing Clinical Research
Summer 2013
Choosing sample size
• A fundamental decision
  – A critical determinant of statistical power
  – A critical determinant of feasibility
Choosing sample size
• “Nothing focuses the mind like a sample size calculation”
  – Mike Kohn
Choosing sample size
• Ingredients for a sample size calculation
  – “Focusing the mind” on measurements, etc
• Tools for making the calculation
  – Tables in the book, Stata, online calculators
• Examples
  – What drives sample size?
  – Modifying study design to reduce sample size
• Getting to a final answer for your study
  – Round peg/square hole? MAKE IT FIT!
  – Unknown assumptions? GUESS!
  – Persuasive writing and justification
Example 1
• Alcohol and atrial fibrillation incidence
As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.
Example 1
(boiled down…)
  – If… [assumptions]
  – Then… a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result
What are the key assumptions?
Key assumptions
• Assumptions (aka “ingredients”)
  – Testable hypothesis
    • Clear measurements
    • Usually phrased as a “null” hypothesis
  – Planned statistical test
  – Assumption about variability of measurements
  – An effect size
  – “Alpha” error (1-sided or 2-sided) threshold
Key assumptions
• Assumptions (aka “ingredients”)
  – Testable hypothesis
“Does alcohol cause atrial fibrillation?”
Too vague!
“Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?”
Better, but not phrased as a “null” hypothesis
“H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”
The Null Hypothesis…
• Why do we need a NULL hypothesis?
  – Theoretically speaking, we can only DISPROVE something (or say it’s unlikely); we can never PROVE something*
  – So we state a NULL hypothesis, and then say that it is very unlikely to be true
“H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”
*Karl Popper, The Logic of Scientific Discovery, 1934
Key assumptions
• Assumptions (aka “ingredients”)
  – Planned statistical test

                     PREDICTOR
OUTCOME        Dichotomous    Continuous
Dichotomous    chi-squared    t-test
Continuous     t-test         correlation

Need to know your variable types!
Dichotomous variables have only 2 values.
Male vs. female
Dead vs. alive
Hypertension vs. no hypertension
Smoker or non-smoker
Continuous variables have many values
Blood pressure
Age
Quality of life
Waist circumference
What kind of variable is alcohol use?
Drinks/day (not normally distributed?)
Drinker vs. non-drinker
Heavy (2+) vs. light drinker (<2 drinks/day)
Non-drinker vs. occasional vs. regular vs. heavy (4-level categorical variable?)
For the purposes of sample size calculation, you may want to dichotomize… Easy!
What kind of variable is atrial fibrillation?
Person with vs. without afib
Frequency of episodes (normally distributed?)
Beats/minute (normally distributed?)
Years to onset of afib (“time to event”): “survival analysis”
Proportion onset of afib at 5 years: dichotomous (easy)
“H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”
Both the predictor and the outcome are dichotomous, so the planned statistical test is chi-squared.
Key assumptions
• Assumptions (aka “ingredients”)
  – Variability and effect size for chi-squared test
Probability of outcome in each predictor group
P1 = 10% (prob afib at 5 years if <2 drinks)
P2 = 15% (prob afib at 5 years if 2+ drinks)
Effect size clearly delineated: risk difference = 5%; relative risk = 1.5
Variability is “embedded”… varies with P1…
Bottom line: Giving both probabilities is clear and unambiguous (…wait for counter-examples)
Key assumptions
• Assumptions (aka “ingredients”)
  – “Alpha” error (1-sided or 2-sided) threshold
Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”)
Standard choice: 2-sided test
Choose 2-sided unless you are truly uninterested in a large effect in the opposite direction from the one you expect; it is almost always the clear, safe choice
Power = 1 - “beta” error (so 90% power = 10% beta error)
Example 1
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65
• 2 dichotomous variables → chi-squared test
• P1 = 10%
• P2 = 15%
• 2-sided alpha = 0.05, beta = .10
Go to page 75 of DCR (4th edition)…
Sample size = 958 PER GROUP = 1916 total
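The table lookup for these inputs (P1 = 10%, P2 = 15%, 2-sided alpha = 0.05, beta = .10) can be approximated with the usual normal-approximation formula for two proportions. The sketch below is an illustration only: the function name `n_per_group` is hypothetical, and the continuity correction shown is an assumed form, so the result should be close to the tabled value but is not guaranteed to match it exactly.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90, continuity=True):
    """Per-group sample size for comparing two proportions (equal groups)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    if continuity:
        # Assumed form of the continuity correction for equal-sized groups.
        n = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return ceil(n)


print(n_per_group(0.10, 0.15, continuity=False))  # uncorrected estimate
print(n_per_group(0.10, 0.15))                    # with continuity correction
```

The uncorrected figure is noticeably smaller than the corrected one, which is one reason different tools can report different sample sizes for the same assumptions.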
Example 1
Go to page 86 of DCR (3rd edition)…
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65
• 2 dichotomous variables → chi-squared test
• 2-sided alpha = 0.05, beta = .10

P1     P2     Effect size       Sample size
15%    20%    Risk diff = 5%    1252 x 2 = 2504 total
20%    25%    Risk diff = 5%    1504 x 2 = 3008 total
20%    30%    RR = 1.5          412 x 2 = 824 total

Not enough to specify an effect size of “5%” or “RR = 1.5”: need to give both probabilities
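To see why a bare “5%” or “RR = 1.5” is ambiguous, the sketch below runs the same uncorrected normal-approximation formula over the four pairs of probabilities used in Example 1. The helper name `n_per_group` is hypothetical, and because no continuity correction is applied the numbers will be somewhat smaller than the table values; the point is the pattern, not the exact figures.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Uncorrected per-group sample size for two proportions, equal groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


pairs = [(0.10, 0.15), (0.15, 0.20), (0.20, 0.25), (0.20, 0.30)]
results = {pair: n_per_group(*pair) for pair in pairs}
for (p1, p2), n in results.items():
    print(f"P1={p1:.2f} P2={p2:.2f} diff={p2 - p1:.2f} RR={p2 / p1:.2f} n/group={n}")
```

The first three pairs share a risk difference of 5% yet need increasingly large samples as the baseline risk rises, while 10% vs. 15% and 20% vs. 30% share RR = 1.5 and differ by more than a factor of two.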
Back to our paragraph…
As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.
Unequal sample sizes!! What do we do?
Tools for making the calculation…
• Options for getting the final answer:
  – Look at a table in the book (DCR)
  – Try an online calculator, like at:
    • http://www.stat.ubc.ca/~rollin/stats/ssize/
    • http://www.dcr-4.net (this one)
  – Fancy program (need to download): PSpower
    • http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize
  – Use Stata (sampsi, launch dialog box)
Only some of these allow you to estimate sample size with unequal groups
Example 1
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65
• 2 dichotomous variables → chi-squared test
• P1 = 10%
• P2 = 15%
• 2-sided alpha = 0.05, beta = .10
• Proportion with P2 = 20% (ratio = 1:4)
Using the http://www.dcr-4.net calculator…
Sample size = 584 + 2336 = 2920
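For readers without access to a calculator that handles unequal groups, the sketch below shows one way the 1:4 case might be computed. It is an assumption-laden illustration: the function and argument names are hypothetical, and the continuity correction is an assumed form that may differ from what any particular calculator uses. With the inputs above it should land on or very near the 584 + 2336 split.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_unequal(p_unexposed, p_exposed, frac_exposed,
                        alpha=0.05, power=0.90):
    """Group sizes for two proportions when the groups are unequal.

    frac_exposed is the fraction of the whole sample in the exposed group.
    Returns (n_exposed, n_unexposed).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    q_exp, q_unexp = frac_exposed, 1 - frac_exposed
    diff = abs(p_exposed - p_unexposed)
    p_bar = q_unexp * p_unexposed + q_exp * p_exposed  # weighted average risk
    total = (z_alpha * sqrt(p_bar * (1 - p_bar) * (1 / q_exp + 1 / q_unexp))
             + z_beta * sqrt(p_exposed * (1 - p_exposed) / q_exp
                             + p_unexposed * (1 - p_unexposed) / q_unexp)) ** 2
    total /= diff ** 2
    ratio = q_unexp / q_exp              # unexposed per exposed participant
    n_exposed = total * q_exp            # uncorrected size of exposed group
    # Assumed continuity correction, applied to the exposed group.
    n_exposed = n_exposed / 4 * (
        1 + sqrt(1 + 2 * (ratio + 1) / (n_exposed * ratio * diff))) ** 2
    n_exposed = ceil(n_exposed)
    return n_exposed, round(ratio * n_exposed)


n_exposed, n_unexposed = sample_size_unequal(0.10, 0.15, frac_exposed=0.20)
print(n_exposed, n_unexposed, n_exposed + n_unexposed)
```

Note that the total is larger than the equal-groups total for the same P1 and P2: an unbalanced design costs participants.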
Example 2
• Does alcohol consumption cause high blood pressure?
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and prevalent hypertension in middle-aged women
• 2 dichotomous variables → chi-squared test
• P1 = 20%
• P2 = 27%
• 2-sided alpha = 0.05, beta = .10
• Proportion with P2 = 20% (ratio = 1:4)
Using dcr-4.net…
Sample size = 503 + 2014 = 2518
• I only have enough money to study 1000 people!
(Stata DEMO: sampsi command)
Using Stata sampsi command (dialog box)…
If sample size = 200 + 800 = 1000, power = .54
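Working backwards from a fixed sample size to power can be sketched with the same normal approximation. This is not the Stata computation: the function name is hypothetical and no continuity correction is applied, so the result comes out a little higher than the .54 reported above.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)                # pooled risk under H0
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return NormalDist().cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)


# 800 women drinking <2/day (P1 = 20%), 200 drinking 2+/day (P2 = 27%)
power = power_two_proportions(0.20, 800, 0.27, 200)
print(round(power, 2))
```

Either way, the study has not much better than a coin-flip chance of reaching statistical significance under these assumptions.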
“Underpowered” studies…
• Is it unethical to conduct a study with only 54% power?
  – Traditional view: <80% power is unethical
  – Newer thinking: Not necessarily! (Bacchetti, BMC Medicine 2010, 8:17)
  – But still, you might want to give yourself a better chance at statistical significance…
Example 2, redesigned
Another option: Redesign with a continuous outcome
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
  – Assumption about variability of measurements
  – An effect size
For continuous outcome, specify mean + standard deviation for each group
• Mean1 = 111 +/- 15 (from CARDIA)
• Mean2 = 116 +/- 15 (guess at effect size?)
How do I find standard deviation?
• Search for a published study
  – Be careful you get SD not SE…
• Your own pilot study
• Take reasonable range, divide by 4
• Guess!
• Beware: SD of SBP is NOT equal to SD of change in SBP
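Two of the shortcuts above are simple arithmetic, sketched here with hypothetical helper names and made-up input numbers. The first converts a published standard error of a mean back to a standard deviation (SD = SE x square root of n); the second applies the "reasonable range divided by 4" rule of thumb.

```python
from math import sqrt


def sd_from_se(se, n):
    """Recover the SD from the standard error of a mean and its sample size."""
    return se * sqrt(n)


def sd_from_range(low, high):
    """Rule of thumb: a 'reasonable range' spans roughly 4 standard deviations."""
    return (high - low) / 4


# Hypothetical inputs, chosen only to illustrate the arithmetic.
print(sd_from_se(1.5, 100))    # a paper reporting SE = 1.5 with n = 100
print(sd_from_range(81, 141))  # most SBP values assumed to fall in 81 to 141
```

Mistaking an SE for an SD makes the variability look far too small, and the resulting sample size far too optimistic.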
Example 2, redesigned
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15 (from CARDIA)
• Mean2 = 116 +/- 15 (guess at effect size?)
• 2-sided alpha = 0.05, beta = .10
Go to page 73 of DCR (4th edition)…
We need E/S!!!
Standardized effect size…
• E/S = “Standardized effect size”
= Effect size, in terms of variability
= Difference in means / SD
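As a rough cross-check on a table lookup, E/S can be plugged into the normal-approximation formula n per group = 2(z_alpha + z_beta)^2 / (E/S)^2. The function names below are hypothetical, and because this uses z rather than t quantiles it will tend to run slightly below t-based table values.

```python
from math import ceil
from statistics import NormalDist


def standardized_effect(mean1, mean2, sd):
    """E/S: difference in means expressed in units of the standard deviation."""
    return abs(mean2 - mean1) / sd


def n_per_group_ttest(effect, alpha=0.05, power=0.90):
    """Approximate per-group sample size for comparing two means, equal groups."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * z ** 2 / effect ** 2)


effect = standardized_effect(111, 116, 15)  # SBP means and SD from the example
print(round(effect, 2), n_per_group_ttest(effect))
```

Because E/S is squared in the denominator, halving the effect size roughly quadruples the required sample.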
Example 2, redesigned
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15
• Mean2 = 116 +/- 15, so E/S = (116-111)/15 = .33
• 2-sided alpha = 0.05, beta = .10
Go to page 73 of DCR (4th edition)…
Between 133 and 235 per group, so roughly 400 total… if the groups are equal in size!
Example 2, redesigned
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
• 2-sided alpha = 0.05
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)
Stata sampsi command…

Mean1         Mean2         E/S    beta   Sample size
111 +/- 15    116 +/- 15    .33    .10    118 + 473 = 591 total
111 +/- 15    116 +/- 15    .33    .20    88 + 353 = 441 total
111 +/- 15    121 +/- 15    .67    .20    22 + 88 = 110 total
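The three Stata results above can be approximated by hand. The sketch below assumes a normal approximation with a common SD in both groups and a 1:4 allocation; the function name is hypothetical, and this is not the sampsi implementation. Rounding to the nearest whole participant appears to reproduce the figures on the slides.

```python
from statistics import NormalDist


def n_unequal_ttest(mean1, mean2, sd, ratio, alpha=0.05, power=0.90):
    """Unrounded group sizes for comparing two means with unequal groups.

    ratio is the number of participants in the larger group per participant
    in the smaller group (4 for a 1:4 split).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n_small = (1 + 1 / ratio) * (z * sd / (mean2 - mean1)) ** 2
    return n_small, ratio * n_small


scenarios = [
    (111, 116, 15, 0.90),  # E/S = .33, beta = .10
    (111, 116, 15, 0.80),  # E/S = .33, beta = .20
    (111, 121, 15, 0.80),  # E/S = .67, beta = .20
]
for mean1, mean2, sd, power in scenarios:
    n_small, n_large = n_unequal_ttest(mean1, mean2, sd, ratio=4, power=power)
    print(round(n_small), round(n_large), round(n_small) + round(n_large))
```

Each change of assumption (accepting more beta error, or doubling the effect size) cuts the required sample, which is exactly why each assumption has to be justified.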
How to pick your effect size…
• If you knew the answer, you wouldn’t need to do the study!
• No right answer! No right method!*
  – Lowest possible interesting result?
  – Highest that you can justify as being possible?
  – Lowest that you can “afford”? (with fixed sample size)
• Clinical/scientific significance is key
  – Should be interesting, important and realistic
• After study completed, only the actual estimate and confidence interval are relevant*
* - Bacchetti, BMC Medicine 2010, 8:17
Example 2, redesigned
• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 10
• Mean2 = 121 +/- 10, so E/S = (121-111)/10 = 1
• 2-sided alpha = 0.05, beta = .20
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)
Stata sampsi command…
10 + 39 = 49 total
How can you reduce variability?...
• Variability derives from:
  – Actual population variation
  – Measurement error
• Reduce variability by reducing measurement error
• Consider alternate designs:
  – CHANGE over time within a person is often less variable than differences between people
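One way to see why change scores can be less variable: if the same SD applies at both time points and the two measurements on a person have correlation rho, the SD of the change is sqrt(2 x SD^2 x (1 - rho)). The sketch below assumes exactly that; the function name and the correlation values are hypothetical, not taken from the lecture.

```python
from math import sqrt


def sd_of_change(sd, correlation):
    """SD of (follow-up minus baseline), assuming equal SD at both visits."""
    return sqrt(2 * sd ** 2 * (1 - correlation))


# Hypothetical within-person correlations for repeated SBP measurements.
for rho in (0.5, 0.8):
    print(rho, round(sd_of_change(15, rho), 1))
```

Under this assumption the change score only beats the single measurement when the within-person correlation exceeds 0.5, so the gain depends on how stable the measurement is.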
Redesign and changing assumptions reduced our sample size from 2455 to 49!
Which to choose?
• What is feasible?
  – What can you afford?
  – How many patients do you actually have access to?
• Can you convince yourself or a reader that:
  – Heavy alcohol might really increase SBP by 10 mmHg?
  – You can SUBSTANTIALLY reduce measurement error such that SD goes from 15 to 10?
  – Or that being under-powered is actually OK? (Cite Bacchetti!*)
* - Bacchetti, BMC Medicine 2010, 8:17
No right answers! This is an art, not (just) a science
What we haven’t addressed…
• 2 continuous variables – “correlation”
• Descriptive studies (including estimating sensitivity and specificity)
• Categorical variables with 3+ levels
• Non-normally distributed continuous variables
• Survival analysis
• Loss to follow-up
• Regression and adjustment for other variables
Critical advice
• Fit your study into the mold!
  – Dichotomize any variable that “doesn’t fit”
  – Guess when you need to
  – Show results of alternate guesses (“sensitivity analyses”)
• It’s often OK to work backwards
• It is really important for you to get all the way through a power calculation
Other key points
• Sample size calculations help you clarify your thinking about measurements
• Present effect size unambiguously
  – Give BOTH %’s or means, etc
• Watch out for unequal group sizes
• “Always” choose 2-sided alpha
• More power with continuous variables
• More power with better measurement (with less error, less noise)
Acknowledgements
• Mike Kohn – advice, tools
• Steve Cummings – very nice lecture!
Thanks!
• Questions?