Erika Franklin Fowler - Getting a Grip on Statistics

Getting a Grip on Statistics:

What’s Right & Wrong with Numbers in the News

California Endowment Health Journalism Fellowships
Journalism Seminar, Los Angeles, CA

Saturday, October 23

Self-exams effective?

Self-exams, Take II

Self-exams, Take III

Self-exams, Take IV (in print)

Study: Breast self-exams may not matter
One study, and peace of mind is of no use
Self-exams don’t cut breast-cancer death risk

Self-exams, Take V

Breast self-exam headlines

Monthly self-breast exams still essential
Women wasting their time
More confusing information tonight
Breast self-exams may not matter
Self-exams don’t cut breast-cancer death risk
Everything we’ve heard about breast cancer prevention is upside down [may be wrong]

Coverage of Statistics Matters!

What’s at stake?
◦ More Americans get their news from local sources
◦ Media coverage can and does shape public agendas, public opinion, and ultimately behavior
◦ The words you choose and the numbers you present matter!

“The Certainty of Uncertainty”

“Scientists keep changing their minds…. They tell us that coffee does or doesn’t cause various medical problems, time after time offering different advice. They tell us that a drug works fine, then take it off the market because it’s too risky to use…. To some people, this switching gives science a bad name.

Actually it’s science working just as it’s supposed to work.”

- Victor Cohn & Lewis Cope (2001), pp. 9-10

Nature of Medical/Social Science

A Note of Caution!

Just because the subject is science and the researchers are medical professionals does not mean usual skepticism should be suspended

Be wary of:
◦ Anecdotal evidence
◦ ‘Expert’ opinion
◦ Scientific studies
◦ Conflicts of interest

Ask for numbers and evaluate the evidence

Overview

1. Alternative explanations
2. Data, distributions, and variability
3. Probability and significance
4. Sampling
5. Power
6. Types of studies
7. Questions to ask

Cause and Effect? (Bivariate Relationships)

Crime rates increase with ice cream sales
People who live together before marriage are more likely to get divorced
The longer patients wait for surgery, the larger their chances of survival
The Denver Broncos lose more often when I don’t watch the game

Correlation ≠ Causation

Confounding Factors – The Importance of Multiple Controls!

Alternative explanations?
◦ Crime rates increase with ice cream sales
◦ People who live together before marriage are more likely to get divorced
◦ The longer patients wait for surgery, the better their chances of survival
◦ The Denver Broncos lose more often when I don’t watch the game
Be wary of spurious relationships
Does the association persist after controlling for other factors?

Data & Distributions

Measures of central tendency (what they can and cannot tell you)
◦ Mean – arithmetic average
◦ Median – midpoint
◦ Mode – most common

Mean, Median, & Mode

http://www.brighton-webs.co.uk/statistics/images/central_tendency.gif
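The three measures can disagree sharply when the data are skewed. As a minimal sketch, assuming a made-up list of hospital stays (the numbers are hypothetical and not from the talk), Python's standard `statistics` module shows how one outlier pulls the mean away from the median and mode:

```python
from statistics import mean, median, mode

# Hypothetical lengths of hospital stay in days (illustrative only).
# A single 48-day stay skews the distribution to the right.
stays = [2, 3, 3, 3, 4, 5, 6, 7, 48]

print("mean:  ", mean(stays))    # arithmetic average, pulled up by the outlier
print("median:", median(stays))  # midpoint of the sorted values
print("mode:  ", mode(stays))    # most common value
```

Here the "average stay" is more than twice the stay of a typical patient, which is why it helps to ask which measure a study is reporting.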

Problem with Central Tendencies

New hypothetical disease POA
Compare ERs of Hospitals AMW and ALA

ER at Hospital ALA

City            Patients   % Surviving
Los Angeles          559          88.9
Phoenix              233          96.8
San Diego            232          91.7
San Francisco        605          83.1
Seattle             2146          85.8

Overall             3775          86.7

ER at Hospital AMW

City            Patients   % Surviving
Los Angeles          811          85.6
Phoenix             5255          92.1
San Diego            448          85.5
San Francisco        449          71.3
Seattle              262          76.7

Overall             7225          89.1

POA Survival Rate Comparison

Treatment of POA

Hospital   Number of patients   Percent surviving
ALA                      3775                86.7
AMW                      7225                89.1

POA Survival Comparison

Percent Surviving POA
                  ALA    AMW
Overall          86.7   89.1
Los Angeles      88.9   85.6
Phoenix          96.8   92.1
San Diego        91.7   85.5
San Francisco    83.1   71.3
Seattle          85.8   76.7

Averages can be misleading!

Percent On-time Arrivals
                 Alaska Air   AM West
Overall                86.7      89.1
Los Angeles            88.9      85.6
Phoenix                96.8      92.1
San Diego              91.7      85.5
San Francisco          83.1      71.3
Seattle                85.8      76.7
Source: www.cs.cmu.edu/afs/cs/academic/class/15299/handouts/lecture20
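The reversal in the tables above comes from how the overall figure is weighted: each city counts in proportion to its number of patients. The sketch below recomputes the overall rates from the per-city numbers on the slides. The function name `overall_rate` is an illustrative choice, and because the per-city percentages are rounded, the recomputed overall figures may differ from the slide values by a few tenths of a point.

```python
# (patients, percent surviving) per city, copied from the slides above
ala = {
    "Los Angeles": (559, 88.9),
    "Phoenix": (233, 96.8),
    "San Diego": (232, 91.7),
    "San Francisco": (605, 83.1),
    "Seattle": (2146, 85.8),
}
amw = {
    "Los Angeles": (811, 85.6),
    "Phoenix": (5255, 92.1),
    "San Diego": (448, 85.5),
    "San Francisco": (449, 71.3),
    "Seattle": (262, 76.7),
}

def overall_rate(hospital):
    """Patient-weighted survival percentage across all cities."""
    total = sum(n for n, _ in hospital.values())
    survivors = sum(n * pct / 100 for n, pct in hospital.values())
    return 100 * survivors / total

for city in ala:
    print(f"{city:14s} ALA {ala[city][1]:5.1f}   AMW {amw[city][1]:5.1f}")
print(f"{'Overall':14s} ALA {overall_rate(ala):5.1f}   AMW {overall_rate(amw):5.1f}")
```

ALA does better in every single city, yet AMW comes out ahead overall because most of AMW's patients are in Phoenix, where survival is high for both hospitals, while most of ALA's are in Seattle, where it is lower.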

Measures of Dispersion

Given a measure (or measures) of central tendency, we still need to know something about the spread or scatter of the distribution of values
◦ Range (low to high)
◦ Percentiles
◦ Standard deviation
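To see why the center alone is not enough, here is a small sketch with two made-up groups (hypothetical values, not from the talk) that share the same mean and median but differ widely in spread. It uses only the standard `statistics` module.

```python
from statistics import mean, median, quantiles, stdev

# Two hypothetical sets of test scores with identical centers.
group_a = [48, 49, 50, 50, 50, 51, 52]
group_b = [10, 30, 45, 50, 55, 70, 90]

for name, data in (("A", group_a), ("B", group_b)):
    value_range = max(data) - min(data)   # range: low to high
    quartiles = quantiles(data, n=4)      # 25th, 50th, 75th percentiles
    spread = stdev(data)                  # sample standard deviation
    print(f"Group {name}: mean={mean(data)}, median={median(data)}, "
          f"range={value_range}, quartiles={quartiles}, sd={spread:.1f}")
```

Both groups would be reported as "averaging 50", but a value of 30 is ordinary in group B and would be far outside anything seen in group A.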

Probability

Aristotle: the probable “is what usually happens”

Not what always happens
Improbable events can and do occur…and may be more frequent than we realize!

P-values (‘probability’ values)

In a probabilistic world, all results and events can be affected by chance

A p-value is the probability of seeing a result at least as extreme as the one observed if only random variation (chance) were at work, that is, if there were no real effect

The lower the p-value, the harder the result is to explain by chance alone, and the stronger the evidence that the finding is a ‘real’ result

P-values (‘probability’ values)

By convention, a p-value of 0.05 or less is considered statistically significant
◦ p=0.05 means that, if there were no real effect, a result at least this extreme would happen by chance 1 in 20 times (5 percent)
◦ p=0.001 means that, if there were no real effect, a result at least this extreme would happen by chance 1 in 1,000 times (0.1 percent)

Note: this does NOT mean that chance is ruled out!
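A coin-flip example makes the idea concrete. The sketch below asks how often a fair coin would produce a given number of heads or more, which is a one-sided p-value for the suspicion that the coin favors heads. The function name `p_value_at_least` and the two scenarios are illustrative assumptions, not part of the talk.

```python
from math import comb

def p_value_at_least(heads, flips):
    """Chance that a fair coin gives `heads` or more heads in `flips` tosses."""
    favorable = sum(comb(flips, k) for k in range(heads, flips + 1))
    return favorable / 2 ** flips

# 80 percent heads in a small trial vs. 60 percent heads in a larger one
print(f"8 or more heads in 10 flips:   p = {p_value_at_least(8, 10):.4f}")
print(f"60 or more heads in 100 flips: p = {p_value_at_least(60, 100):.4f}")
```

Under these assumptions, the more lopsided-looking result (8 of 10) falls just short of the 0.05 convention, while the milder-looking result from the larger trial (60 of 100) passes it. Neither number rules chance out; each only says how surprising the result would be if the coin were fair.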

Error: Type I & Type II

Finding a result that is not there (Type I Error)
◦ At standard significance levels, about 5 out of 100 studies of a treatment that really has no effect will conclude that it helps

Not finding a result that is there (Type II Error)
◦ A study may simply include too few subjects to detect a real result – sufficient ‘power’ is necessary (more on this in a minute)
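The Type I error rate can be checked by simulation. This sketch runs many imaginary studies of a perfectly fair coin and counts how many would wrongly be declared "significant". The function name, the study size of 100 flips, and the cutoff of 59 heads (roughly the smallest count whose one-sided p-value is under 0.05) are all assumptions chosen for illustration.

```python
import random

def simulate_null_studies(n_studies=2000, flips=100, threshold=59, seed=1):
    """Fraction of fair-coin studies that reach `threshold` or more heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    false_positives = 0
    for _ in range(n_studies):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if heads >= threshold:
            false_positives += 1  # "significant" although nothing is going on
    return false_positives / n_studies

rate = simulate_null_studies()
print(f"Share of no-effect studies declared significant: {rate:.1%}")
```

The share should land near, and a little under, 5 percent: with enough studies of treatments that do nothing, some will look like breakthroughs by chance alone.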

Confidence Intervals

Repeated tests will produce different results…

Confidence Level – the percentage of repeated trials whose confidence intervals should contain the true value

Confidence Interval – the range within which the true value of the result probably lies


Small confidence intervals indicate that the true effect is unlikely to deviate much from the study’s findings

Large confidence intervals mean that the study’s findings are not very precise
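As a rough sketch of how sample size drives the width of the interval, the code below computes an approximate 95 percent confidence interval for a proportion using the common normal approximation. The function name `proportion_ci` and the two survey sizes are hypothetical, and real studies may use other interval methods.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% confidence interval for a proportion (normal approximation)."""
    p = successes / n
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# The same 60 percent finding from a small and a large hypothetical sample
small = proportion_ci(30, 50)
large = proportion_ci(3000, 5000)
print(f"n=50:   {small[0]:.3f} to {small[1]:.3f}")
print(f"n=5000: {large[0]:.3f} to {large[1]:.3f}")
```

With 50 subjects the interval is wide enough to include 50 percent, so the data cannot distinguish the finding from an even split. With 5,000 subjects the same 60 percent is pinned down to within about a point and a half.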


[Figure: confidence intervals on an axis running from negative effect, through no effect, to positive effect]

Statistical Confidence vs. Substantive Size of Effects

Just because a result is “statistically significant” does not necessarily mean the effect is large

In addition to knowing the size of the confidence interval, we also want to know the size of the effect (“substantive significance”)

[Figure: confidence intervals for four results, A through D, on an axis running from negative effect, through no effect, to positive effect]

Statistical vs. Substantive Significance

What does it mean for a result to be important?
◦ Statistically significant results are not always important
◦ The most powerful findings are those that are BOTH statistically and substantively significant

Size of the population (and why it matters!)

Results of new treatment for disease in puppies:
◦ 33.3% survived
◦ 33.3% died during treatment
◦ …and the other one ran away!

What if the third puppy had survived?
◦ The study would have a 66.7% survival rate

Lesson: small changes in small samples can drastically affect the results!

Large Numbers Yield ‘Power’

Sample size increases our confidence
Law of Large Numbers – as the number of cases increases, we can be more confident in the validity (accuracy) and reliability (reproducibility) of the findings

Always ask for the numerator and denominator!

Problems with Small Samples

Bad coins?
Expected number of heads?
Let’s say we conduct coin-flipping trials, with 10 flips per trial


If we repeat our 10 flips per trial a thousand times, how many trials should we expect to get exactly 5 heads?

a) About 500 (50 percent of the trials)
b) About 900 (90 percent of the trials)
c) About 400 (40 percent of the trials)
d) About 250 (25 percent of the trials)



Expected Distribution: 10 flips, 1,000 times
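The expected distribution can be worked out exactly with the binomial formula: the chance of exactly k heads in 10 flips of a fair coin is C(10, k) / 2^10. The sketch below scales that to 1,000 trials; the function name `prob_exact_heads` is an illustrative choice.

```python
from math import comb

def prob_exact_heads(k, flips=10):
    """Probability of exactly k heads in `flips` tosses of a fair coin."""
    return comb(flips, k) / 2 ** flips

# Expected number of trials (out of 1,000) showing each possible head count
expected_trials = [round(1000 * prob_exact_heads(k)) for k in range(11)]
for k, count in enumerate(expected_trials):
    print(f"{k:2d} heads: about {count:3d} trials")
```

Exactly 5 heads is the single most likely outcome, yet it turns up in only about a quarter of the trials (answer d). Three out of four perfectly fair 10-flip trials show something other than an even split.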

Sampling example: M&M’s

M&M Mars produces blue, green, yellow, orange, red, and brown M&M’s according to a specified distribution. Based on your M&M packet, which color do you think they produce most?

Sampling example: M&M’s

Which color do they produce most?
Blue 24%
Orange 20%
Green 16%
Yellow 14%
Red 13%
Brown 13%
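A single packet is a small sample, so it will often point to the wrong color. The sketch below draws many simulated packets from the stated distribution and counts how often blue actually comes out on top. The packet size of 20 candies, the number of packets, and the seed are assumptions made for illustration.

```python
import random
from collections import Counter

colors = ["Blue", "Orange", "Green", "Yellow", "Red", "Brown"]
weights = [24, 20, 16, 14, 13, 13]  # production percentages from the slide

def draw_packet(size, rng):
    """Simulate one packet: `size` candies drawn from the production mix."""
    return Counter(rng.choices(colors, weights=weights, k=size))

rng = random.Random(0)  # fixed seed so the run is repeatable
packets = [draw_packet(20, rng) for _ in range(1000)]

# Count packets where blue is the most common color
# (most_common breaks ties in favor of the color drawn first).
blue_on_top = sum(1 for p in packets if p.most_common(1)[0][0] == "Blue")
print(f"Blue was the top color in {blue_on_top} of {len(packets)} packets")
```

Even though blue is the most produced color, many individual packets will suggest otherwise, which is the same trap as generalizing from one small study.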

Samples & Generalizability

Researchers use samples to represent larger populations

Findings can only be generalized to the population from which the sample is drawn – be wary of unrepresentative samples!
◦ Examples:
  VA hospital results
  Surveys assessing disease rates

Types of Studies (Evidence)

Anecdotes, ideas, opinions
Descriptive reports
◦ Case studies
◦ Cross-sectional study
Analytic studies
◦ Case-controls
◦ Cohort studies
Experimental studies
◦ Randomized controlled trials
◦ Blinded randomized controlled trials

Descriptive Reports (Evidence)

Case studies
• Identifying some unusual or interesting cases that alert physicians to potential relationships
• Helpful when phenomena stand out by themselves
Cross-sectional (prevalence) study
• Wide-angle shot
• Rate of disease in population
• Make observations
• Snapshot in time
• Conclusions may be overstated

Analytic Studies (Evidence)

Case-controls
• Very common in disease outbreaks
• Compare sick people (the “cases”) to well people (the “controls”)
• Additional work may be needed to identify culprit (relish example)
Cohort studies
• ‘Motion picture’ studies
• Follow people over time, comparing individuals to their peers
• Watch for drop-outs

Experimental Trials (Evidence)

Randomized controlled trials
• Common for testing drugs
• Treatment group vs. control group
• Example: China breast self-exam study
Blinded, randomized controlled trials
• Double-blind, triple-blind
• Gold-standard

Gold Standard?

Simply because the randomized controlled trial is the gold standard for medical research does NOT mean the results should be believed:
◦ Did the randomization work?
◦ How large was the sample?
◦ What is the sample and to what population can it be generalized?
◦ Length of the study?
◦ Is the analysis appropriate?

Putting Results in Context: Absolute vs. Relative Risk

A new study reveals a breakthrough treatment that reduces patients’ cancer risk by one-half
◦ Where should the story go?
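Whether "cuts risk by one-half" is front-page news depends on the baseline risk. The sketch below compares two hypothetical scenarios (the baseline risks are invented for illustration) that share the same relative reduction but differ enormously in absolute terms. The function name `risk_summary` is an illustrative choice.

```python
def risk_summary(baseline_risk, treated_risk):
    """Return (relative risk reduction, absolute risk reduction)."""
    absolute = baseline_risk - treated_risk
    relative = absolute / baseline_risk
    return relative, absolute

# Hypothetical scenarios: a rare cancer and a common one
scenarios = {
    "rare (2 in 10,000 falls to 1 in 10,000)": (0.0002, 0.0001),
    "common (20 in 100 falls to 10 in 100)": (0.20, 0.10),
}
for label, (before, after) in scenarios.items():
    relative, absolute = risk_summary(before, after)
    print(f"{label}: relative reduction {relative:.0%}, "
          f"absolute reduction {absolute * 100:.2f} percentage points")
```

Both treatments "halve the risk", but one spares 1 person in 10,000 and the other spares 10 in 100. Reporting the absolute figures lets readers judge which story matters to them.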

Putting Results in Context: Exercise & Cancer?

Putting Results in Context: Are Findings Consistent?

Antidepressants raise risk of suicide
Concern mounts about Prozac, Paxil, Zoloft

Antidepressant-Suicide Link Borne Out in Review of 702 Studies

Study links SSRIs to increased suicide risk

Studies Raise Questions About Antidepressant-Suicide Link

Suicide Risk from Antidepressants Remains Unclear

Study of antidepressants and suicide may expand
Scientists find mixed results after looking at the drugs’ impact on adults

Putting Results in Context: Lessons Learned

Choose absolute over relative risk
Provide known cues (has the study been published?)
Situate the study in the existing body of evidence, especially if recent evidence is mixed

Avoid anecdotes that contradict the evidence (anecdotes illustrating the issue at hand, however, can be helpful)

Breast self-exams: Who is the audience?

Self-exams?
◦ 266,000 women
◦ Randomly assigned to 2 groups
◦ Instruction group taught self exams (and reinforced)
◦ Followed women for 10 years

Findings:
◦ No difference in mortality
◦ Nearly twice as many benign tumors in instruction group as in control

“Overhyped Health Headlines Revealed,” Popular Science, Aug 2009

Watch the words you use – they matter!

Tips to Use for Every Study

Questions to ask and things to consider:
◦ Where was the study published?
◦ What type of study was it?
◦ What was the size?
◦ What was the sample? And to what population can results be generalized?
◦ What’s the size of the effect in absolute terms?
◦ Does the study comport with previous findings?
◦ How soon will the treatment be available?

Under Deadline

Things you can do when you have a small amount of time:
◦ Draw on relationships with trusted sources
◦ Choose your language carefully
◦ Use known cues

Lessons Learned…Part I

1. Qualify the results (e.g. what was the sample?) and use known cues (e.g. has the study been published and reviewed by other experts?)

2. Avoid overreaching statements (e.g. proves, cure, etc.)

3. Choose absolute over relative risk (e.g. report an increase from 1 percent to 2 percent rather than saying risk doubles)

4. State who funded the research
5. Explain medical terminology

Lessons Learned…Part II

6. Provide information on alternative treatments where possible

7. Mention when treatment will be available to the public if applicable

8. Avoid anecdotes contradicting evidence
9. Mention known negatives of products (previous wisdom)
10. Put the results into context and insert public health messages where possible…
11. Provide follow-up resources!
