How big should my study be? The science and art of choosing your sample size

Mark Pletcher
Designing Clinical Research
Summer 2013

Choosing sample size

• A fundamental decision
  – A critical determinant of statistical power
  – A critical determinant of feasibility

• “Nothing focuses the mind like a sample size calculation” (Mike Kohn)

Choosing sample size

• Ingredients for a sample size calculation
  – “Focusing the mind” on measurements, etc.
• Tools for making the calculation
  – Tables in the book, Stata, online calculators
• Examples
  – What drives sample size?
  – Modifying study design to reduce sample size
• Getting to a final answer for your study
  – Round peg/square hole? MAKE IT FIT!
  – Unknown assumptions? GUESS!
  – Persuasive writing and justification

Example 1

• Alcohol and atrial fibrillation incidence

As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.

(boiled down…)

– If … [assumptions]
– Then … a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result

What are the key assumptions?

Key assumptions

• Assumptions (aka “ingredients”)
  – Testable hypothesis
    • Clear measurements
    • Usually phrased as a “null” hypothesis
  – Planned statistical test
  – Assumption about variability of measurements
  – An effect size
  – “Alpha” error (1-sided or 2-sided) threshold

Key assumptions

• Assumptions (aka “ingredients”)
  – Testable hypothesis

“Does alcohol cause atrial fibrillation?”

Too vague!

“Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?”

Better, but not phrased as a “null” hypothesis

“H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”

The Null Hypothesis…

• Why do we need a NULL hypothesis?

  – Theoretically speaking, we can only DISPROVE something (or say it’s unlikely); we can never PROVE something*
  – So we state a NULL hypothesis, and then say that it is very unlikely to be true

“H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”

*Karl Popper, The Logic of Scientific Discovery, 1934

Key assumptions

• Assumptions (aka “ingredients”)
  – Planned statistical test

                     OUTCOME
PREDICTOR        Dichotomous    Continuous
Dichotomous      chi-squared    t-test
Continuous       t-test         correlation

Need to know your variable types!

Dichotomous variables have only 2 values:

Male vs. female
Dead vs. alive
Hypertension vs. no hypertension
Smoker or non-smoker

Continuous variables have many values:

Blood pressure
Age
Quality of life
Waist circumference

What kind of variable is alcohol use?

Drinks/day
Drinker vs. non-drinker
Heavy (2+) vs. light drinker (<2 drinks/day)
Non-drinker vs. occasional vs. regular vs. heavy

Not normally distributed?

4-level categorical variable?

For the purposes of sample size calculation, you may want to dichotomize…

Easy!

What kind of variable is atrial fibrillation?

Person with vs. without afib
Frequency of episodes
Beats/minute
Years to onset of afib (“time to event”)
Proportion with onset of afib at 5 years

Normally distributed?

“Survival analysis”

Dichotomous (easy)

Key assumptions

• Assumptions (aka “ingredients”)
  – Variability and effect size for chi-squared test

Probability of outcome in each predictor group

P1 = 10% (prob afib at 5 years if <2 drinks)

P2 = 15% (prob afib at 5 years if 2+ drinks)

Effect size clearly delineated:

Risk difference = 5%; relative risk = 1.5

Variability is “embedded”…varies with P1…
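As a brief aside on why variability is “embedded”: for a dichotomous outcome, the variance of an observed proportion is fixed by the underlying probability itself, so no separate standard deviation needs to be supplied. A sketch, with \hat{p} and n as assumed notation for the observed proportion and the group size:

```latex
\operatorname{Var}(\hat{p}) = \frac{p\,(1 - p)}{n},
\qquad
p(1-p) =
\begin{cases}
0.0900 & p = 0.10 \\
0.1275 & p = 0.15 \\
0.1600 & p = 0.20
\end{cases}
```

Because p(1 − p) grows as p moves toward 50%, the same 5% risk difference is harder to detect when P1 is larger.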

Bottom line: Giving both probabilities is clear and unambiguous (…wait for counter-examples)

Key assumptions

• Assumptions (aka “ingredients”)
  – “Alpha” error (1-sided or 2-sided) threshold

Standard p-value threshold: 0.05

(“Type I error” rate = “alpha”)

Standard choice: 2-sided test

Unless you are uninterested in a large effect in the opposite direction from the one you expect, choose 2-sided: it is almost always the clear, safe choice

Power = 1 − “beta” error (so 90% power = 10% beta error)

Example 1

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65
• 2 dichotomous variables → chi-squared test
• P1 = 10%
• P2 = 15%
• 2-sided alpha = 0.05, beta = .10

Go to page 75 of DCR (4th edition)…

Sample size = 958 PER GROUP = 1916 total

Example 1

Go to page 86 of DCR (3rd edition)…

Same H0, chi-squared test, 2-sided alpha = 0.05, beta = .10:

P1     P2     Effect size       Sample size
15%    20%    Risk diff = 5%    1252 x 2 = 2504 total
20%    25%    Risk diff = 5%    1504 x 2 = 3008 total
20%    30%    RR = 1.5          412 x 2 = 824 total

It is not enough to specify an effect size of “5%” or “RR = 1.5”; you need to give both probabilities
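To see why both probabilities are needed, the table lookups can be approximated in a few lines. This is a minimal sketch, assuming the usual normal-approximation formula for two proportions with a continuity correction; the function name `n_per_group` is made up for illustration, and this is not necessarily the method behind the DCR tables.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Per-group sample size for comparing two proportions, equal groups.

    Assumed method: normal approximation plus a continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 2-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    p_bar = (p1 + p2) / 2                          # average probability
    diff = abs(p2 - p1)                            # risk difference
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction, then round up to whole participants
    return ceil(n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2)


# The four (P1, P2) pairs from the slides
for p1, p2 in [(0.10, 0.15), (0.15, 0.20), (0.20, 0.25), (0.20, 0.30)]:
    print(f"P1={p1:.2f} P2={p2:.2f} -> {n_per_group(p1, p2)} per group")
```

Under these assumptions the results should land at or within about one participant of the table values above. The same risk difference (5%) or the same relative risk (1.5) gives very different answers depending on the baseline probability.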

Back to our paragraph…

As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.

Unequal sample sizes!! What do we do?

Tools for making the calculation…

• Options for getting the final answer:
  – Look at a table in the book (DCR)
  – Try an online calculator, like at:
    • http://www.stat.ubc.ca/~rollin/stats/ssize/
    • http://www.dcr-4.net (this one)
  – Fancy program (need to download): PSpower
    • http://biostat.mc.vanderbilt.edu/wiki/Main/PowerSampleSize
  – Use Stata (sampsi, launch dialog box)

Only some of these allow you to estimate sample size with unequal groups

Example 1

Using the http://www.dcr-4.net calculator…

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65
• 2 dichotomous variables → chi-squared test
• P1 = 10%
• P2 = 15%
• 2-sided alpha = 0.05, beta = .10
• Proportion with P2 = 20% (ratio = 1:4)

Sample size = 584 + 2336 = 2920
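For readers without access to a calculator that handles unequal groups, the calculation can be sketched as follows. This assumes a normal-approximation formula with a continuity correction extended to a fixed allocation ratio; the function and parameter names are hypothetical, and the online calculator may use a different method.

```python
from math import ceil, sqrt
from statistics import NormalDist


def unequal_group_sizes(p_exposed, p_unexposed, ratio, alpha=0.05, power=0.90):
    """Return (n_exposed, n_unexposed) with n_unexposed = ratio * n_exposed.

    Assumed method: normal approximation with continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p_exposed - p_unexposed)
    # Pooled probability, weighted by the allocation ratio
    p_bar = (p_exposed + ratio * p_unexposed) / (1 + ratio)
    n = (z_alpha * sqrt(p_bar * (1 - p_bar) * (1 + 1 / ratio))
         + z_beta * sqrt(p_exposed * (1 - p_exposed)
                         + p_unexposed * (1 - p_unexposed) / ratio)) ** 2 / diff ** 2
    # Continuity correction for unequal groups, rounded up
    n_exposed = ceil(
        n / 4 * (1 + sqrt(1 + 2 * (ratio + 1) / (n * ratio * diff))) ** 2
    )
    return n_exposed, ratio * n_exposed


# Example 1: 20% of the cohort drinks 2+/day, so the ratio is 1:4
heavy, lighter = unequal_group_sizes(p_exposed=0.15, p_unexposed=0.10, ratio=4)
print(heavy, lighter, heavy + lighter)
```

With the Example 1 inputs this approach gives the same split as the slide. It is offered only as an illustration of how the allocation ratio enters the formula, not as a replacement for a validated tool.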

Example 2

• Does alcohol consumption cause high blood pressure?

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and prevalent hypertension in middle-aged women
• 2 dichotomous variables → chi-squared test
• P1 = 20%
• P2 = 27%
• 2-sided alpha = 0.05, beta = .10
• Proportion with P2 = 20% (ratio = 1:4)

Using dcr-4.net…

Sample size = 503 + 2014 ≈ 2518

• I only have enough money to study 1000 people!

(Stata DEMO: sampsi command)

Using Stata sampsi command (dialog box)…

If sample size = 200 + 800 = 1000, power = .54
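When the sample size is fixed by budget, the calculation runs in reverse: plug in the group sizes and solve for power. A rough sketch under a plain normal approximation (no continuity correction); `approx_power` is a hypothetical helper, not the Stata routine.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p1, n1, p2, n2, alpha=0.05):
    """Approximate power of a 2-sided test comparing two proportions.

    Assumption: normal approximation without continuity correction,
    ignoring the negligible rejection probability in the wrong tail.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 * n1 + p2 * n2) / (n1 + n2)                    # pooled under H0
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))    # SE if H0 is true
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)     # SE if H0 is false
    return NormalDist().cdf((abs(p2 - p1) - z_alpha * se_null) / se_alt)


# Example 2 with 1000 people: 800 drinking <2/day, 200 drinking 2+/day
power = approx_power(p1=0.20, n1=800, p2=0.27, n2=200)
print(round(power, 2))
```

This uncorrected approximation comes out a little higher than the .54 on the slide, most likely because sampsi applies a continuity correction by default. Either way, the study is well short of 80% power.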

“Underpowered” studies…

• Is it unethical to conduct a study with only 54% power?
  – Traditional view: <80% power is unethical
  – Newer thinking: Not necessarily! (Bacchetti, BMC Medicine 2010, 8:17)
  – But still, you might want to give yourself a better chance at statistical significance…

Example 2, redesigned

Another option: Redesign with a continuous outcome

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
  – Assumption about variability of measurements
  – An effect size

For continuous outcome, specify mean and standard deviation for each group

• Mean1 = 111 +/- 15 (from CARDIA)
• Mean2 = 116 +/- 15 (guess at effect size?)

How do I find standard deviation?

• Search for a published study
  – Be careful you get SD not SE…
• Your own pilot study
• Take reasonable range, divide by 4
• Guess!
• Beware: SD of SBP is NOT equal to SD of change in SBP
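The three cautions in this list can be made concrete with a small sketch. All numbers below are hypothetical and chosen only to illustrate the arithmetic; the helper names are made up, and the change-score formula assumes equal SDs at both time points with correlation `rho` between them.

```python
from math import sqrt


def sd_from_se(se, n):
    """Recover the SD when a paper reports the standard error of a mean."""
    return se * sqrt(n)


def sd_from_range(low, high):
    """Rule of thumb from the slide: reasonable range divided by 4."""
    return (high - low) / 4


def sd_of_change(sd, rho):
    """SD of a within-person change, assuming equal SDs and correlation rho."""
    return sd * sqrt(2 * (1 - rho))


print(sd_from_se(se=1.5, n=100))     # hypothetical paper: SE 1.5 in 100 people
print(sd_from_range(85, 145))        # hypothetical "reasonable range" of SBP
print(sd_of_change(sd=15, rho=0.8))  # hypothetical correlation of 0.8
```

Mistaking an SE for an SD makes the variability look far too small, and so does borrowing the SD of a change score when the outcome is a single measurement (or the reverse).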

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women
• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15 (from CARDIA)
• Mean2 = 116 +/- 15 (guess at effect size?)
• 2-sided alpha = 0.05, beta = .10

Go to page 73 of DCR (4th edition)…

We need E/S!!!

Page 79:

Standardized effect size…

• E/S = “Standardized effect size”
  = Effect size, in terms of variability
  = Difference in means / SD
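
As a rough cross-check on a table lookup, the standardized effect size can be fed into the usual normal-approximation formula for two equal groups, n per group = 2 (z_alpha + z_beta)^2 / (E/S)^2. The sketch below is an illustration only: the function name is hypothetical, and t-based tables will give slightly larger numbers than this approximation.

```python
from statistics import NormalDist

def n_per_group_equal(mean1, mean2, sd, alpha=0.05, beta=0.10):
    """Return (E/S, approximate n per group) for comparing two means with
    equal group sizes, using the normal approximation (an assumption here;
    published tables are usually t-based)."""
    es = abs(mean2 - mean1) / sd                      # standardized effect size
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided alpha
    z_beta = NormalDist().inv_cdf(1 - beta)           # power = 1 - beta
    return es, 2 * (z_alpha + z_beta) ** 2 / es ** 2

es, n = n_per_group_equal(111, 116, 15)
print(f"E/S = {es:.2f}, n per group ~ {n:.0f}")       # E/S = 0.33, n per group ~ 189
```

Because n scales with 1/(E/S)^2, halving the standardized effect size roughly quadruples the required sample.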

Page 82:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15
• Mean2 = 116 +/- 15
• E/S = (116-111)/15 = 0.33
• 2-sided alpha = 0.05, beta = 0.10

Go to page 73 of DCR (4th edition)…

133-235 per group = ~400 total…if equal size!

Page 83:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15
• Mean2 = 116 +/- 15
• E/S = (116-111)/15 = 0.33
• 2-sided alpha = 0.05, beta = 0.10
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)

Stata sampsi command…

118 + 473 = 591 total
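
The effect of unequal group sizes can be approximated without Stata. The sketch below does not call or reimplement sampsi; it uses the normal approximation with a common SD, and the function name is hypothetical. Rounding to the nearest whole participant happens to match the figures on the slide; rounding up is the more conservative habit in practice.

```python
from statistics import NormalDist

def n_unequal(delta, sd, ratio, alpha=0.05, beta=0.10):
    """Normal-approximation sample sizes for comparing two means when
    group 2 is `ratio` times the size of group 1 (common SD assumed).
    Returns unrounded (n1, n2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    n1 = (1 + 1 / ratio) * (z * sd / delta) ** 2
    return n1, ratio * n1

# 20% heavy drinkers -> 1 exposed : 4 unexposed
n1, n2 = n_unequal(delta=5, sd=15, ratio=4)
print(round(n1), round(n2), round(n1) + round(n2))    # 118 473 591

# Same assumptions with equal groups, for comparison
n_eq, _ = n_unequal(delta=5, sd=15, ratio=1)
print(round(2 * n_eq))                                # 378
```

The comparison shows the price of imbalance: the 1:4 split needs roughly 590 participants where two equal groups would need roughly 380 under the same approximation.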

Page 84:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15
• Mean2 = 116 +/- 15
• E/S = (116-111)/15 = 0.33
• 2-sided alpha = 0.05, beta = 0.20
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)

Stata sampsi command…

88 + 353 = 441 total

Page 85:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 15
• Mean2 = 121 +/- 15
• E/S = (121-111)/15 = 0.67
• 2-sided alpha = 0.05, beta = 0.20
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)

Stata sampsi command…

22 + 88 = 110 total

Page 87:

How to pick your effect size…

• If you knew the answer, you wouldn’t need to do the study!

• No right answer! No right method!*
  – Lowest possible interesting result?
  – Highest that you can justify as being possible?
  – Lowest that you can “afford”? (with fixed sample size)

• Clinical/scientific significance is key
  – Should be interesting, important and realistic

After study completed, only the actual estimate and confidence interval are relevant*

* - Bacchetti, BMC Medicine 2010, 8:17

Page 88:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 10
• Mean2 = 121 +/- 10
• E/S = (121-111)/10 = 1
• 2-sided alpha = 0.05, beta = 0.20
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)

Stata sampsi command…

10 + 39 = 49 total

Page 89:

How can you reduce variability?...

• Variability derives from:
  – Actual population variation
  – Measurement error

• Reduce variability by reducing measurement error
• Consider alternate designs:
  – CHANGE over time within-people is often less variable than between-person differences

Page 90:

Example 2, redesigned

• H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and systolic blood pressure in middle-aged women

• 1 dichotomous, 1 continuous → t-test
• Mean1 = 111 +/- 10
• Mean2 = 121 +/- 10
• E/S = (121-111)/10 = 1
• 2-sided alpha = 0.05, beta = 0.20
• Proportion with 2+ drinks/day = 20% (ratio = 1:4)

Stata sampsi command…

10 + 39 = 49 total

Redesign and changing assumptions reduced our sample size from 2455 to 49!
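
The successive versions of the continuous-outcome design can be laid side by side to see which assumption drives the total. The sketch below is illustrative only: it uses the normal approximation rather than Stata, the function name and scenario labels are hypothetical, and it covers only the t-test versions of Example 2 (not the earlier dichotomous-outcome design behind the 2455 figure).

```python
from statistics import NormalDist

def total_n(delta, sd, beta, ratio=4, alpha=0.05):
    """Total sample size (both groups) for comparing two means with a 1:ratio
    allocation, normal approximation, each group rounded to the nearest whole
    participant (an assumption made here for illustration)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    n1 = (1 + 1 / ratio) * (z * sd / delta) ** 2
    return round(n1) + round(ratio * n1)

# (label, difference in mmHg, SD, beta)
scenarios = [
    ("90% power",       5, 15, 0.10),
    ("80% power",       5, 15, 0.20),
    ("larger effect",  10, 15, 0.20),
    ("smaller SD",     10, 10, 0.20),
]
for label, delta, sd, beta in scenarios:
    n = total_n(delta, sd, beta)
    print(f"{label:14s} diff={delta:2d} SD={sd} beta={beta:.2f} -> total N = {n}")
```

Under these assumptions the totals come out to 591, 441, 110 and 49, which shows that doubling the assumed effect size does far more than relaxing power from 90% to 80%.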

Page 92:

Which to choose?

• What is feasible?
  – What can you afford?
  – How many patients do you actually have access to?

• Can you convince yourself or a reader that:
  – Heavy alcohol might really increase SBP by 10 mmHg?
  – You can SUBSTANTIALLY reduce measurement error such that SD goes from 15 to 10?
  – Or that being under-powered is actually OK? (Cite Bacchetti!*)

* - Bacchetti, BMC Medicine 2010, 8:17

No right answers! This is an art, not (just) a science

Page 93:

What we haven’t addressed…

• 2 continuous variables – “correlation”
• Descriptive studies (including estimating sensitivity and specificity)
• 3+ categorical variables
• Non-normally distributed continuous variables
• Survival analysis
• Loss to follow-up
• Regression and adjustment for other variables

Page 95:

Critical advice

• Fit your study into the mold!
  – Dichotomize any variable that “doesn’t fit”
  – Guess when you need to
  – Show results of alternate guesses (“sensitivity analyses”)

• It’s often OK to work backwards

• It is really important for you to get all the way through a power calculation
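
"Working backwards" usually means fixing the sample size you can actually get and solving for the smallest difference you could detect. A minimal sketch under the same normal approximation follows; the function name is hypothetical, and the 200-participant cohort is an invented number for illustration, not a figure from the lecture.

```python
import math
from statistics import NormalDist

def detectable_difference(n1, n2, sd, alpha=0.05, beta=0.20):
    """Smallest difference in means detectable with power 1 - beta, given
    fixed group sizes and a common SD (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(1 - beta)
    return z * sd * math.sqrt(1 / n1 + 1 / n2)

# Hypothetical: only 200 women available, 20% of whom drink 2+/day.
for sd in (10, 15):
    d = detectable_difference(n1=40, n2=160, sd=sd)
    print(f"SD = {sd}: detectable difference ~ {d:.1f} mmHg")
```

Running the loop over two SD guesses doubles as a small sensitivity analysis: with these inputs the detectable difference is about 5 mmHg if the SD is 10 and about 7.4 mmHg if it is 15.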

Page 96:

Other key points

• Sample size calculations help you clarify your thinking about measurements

• Present effect size unambiguously
  – Give BOTH %’s or means, etc

• Watch out for unequal group sizes
• “Always” choose 2-sided alpha
• More power with continuous variables
• More power with better measurement (with less error, less noise)

Page 97:

Acknowledgements

• Mike Kohn – advice, tools

• Steve Cummings – very nice lecture!

Page 98:

Thanks!

• Questions?