Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor, Library


Statistics – definition and concepts

Statistics are used to describe something, or to examine differences among groups, or relationships among characteristics

– Descriptive Statistics
• Mean and median
• Standard deviation

– Inferential Statistics
• Statistical significance – p-value
• Confidence intervals
• Odds ratio
• Relative Risk
• Sensitivity/Specificity
• Positive/Negative Predictive Values


Mean and Median

What’s the average cost of a house in this neighborhood?


Mean value: $1,009,000


Median value: $10,000
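A minimal sketch of why the two "averages" can differ so much. The price list is an assumption (nine modest houses and one mansion), chosen only because it reproduces the two figures above; the slides do not give the actual house prices.

```python
from statistics import mean, median

# Hypothetical prices (assumed data, not from the slides):
# nine houses at $10,000 and one mansion at $10,000,000.
prices = [10_000] * 9 + [10_000_000]

mean_price = mean(prices)      # pulled upward by the single mansion
median_price = median(prices)  # middle value, unaffected by the outlier

print(f"Mean value:   ${mean_price:,.0f}")
print(f"Median value: ${median_price:,.0f}")
```

With skewed data like this, the median is usually the better description of a "typical" house.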


Standard Deviation

How spread out is the data from the mean?
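As a rough illustration, two made-up data sets with the same mean can have very different standard deviations. The numbers below are assumptions for demonstration only.

```python
from statistics import mean, stdev

# Two hypothetical samples (assumed data) with the same mean of 50.
tight = [48, 49, 50, 51, 52]   # values cluster near the mean
wide = [10, 30, 50, 70, 90]    # values are scattered far from the mean

for name, data in (("tight", tight), ("wide", wide)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{name}: mean={mean(data)}, sd={stdev(data):.2f}")
```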


The P value

Taking statistics to the next level…

“factors that raise your chance of divorce include living in a red state, having twins, and contracting cervical or testicular cancer…”

differences between groups
relationships between things


Testing for significance

Sample size
Findings
Characteristics of population


p < 0.05
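One way to see how sample size and findings combine into a p-value is a two-proportion z-test. This is a sketch under stated assumptions: the slides do not name a test, the function name `two_proportion_p_value` is hypothetical, and the counts are invented. The same 60% vs. 40% difference is tested with 20 and then 200 people per group.

```python
from math import erfc, sqrt

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal distribution.
    return erfc(abs(z) / sqrt(2))

# Hypothetical counts (assumed data): same proportions, different sample sizes.
p_small = two_proportion_p_value(12, 20, 8, 20)      # 60% vs 40%, n=20 per group
p_large = two_proportion_p_value(120, 200, 80, 200)  # 60% vs 40%, n=200 per group

print(f"n=20 per group:  p = {p_small:.3f}")
print(f"n=200 per group: p = {p_large:.5f}")
```

Only the larger study falls below the conventional 0.05 threshold, even though the observed difference is identical.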


Confidence Intervals: another (and maybe better?) test for statistical significance

Confidence intervals provide information about a range in which the true value lies with a certain degree of probability

Page 13: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Risk Factors for Deep Vein Thrombosis and Pulmonary Embolism (Heit et al., 2000)

Objective: To identify independent risk factors for deep vein thrombosis and pulmonary embolism and to estimate the magnitude of risk for each.

Results: “Independent risk factors for VTE included surgery (odds ratio [OR], 21.7; 95% confidence interval [CI], 9.4-49.9), ….”

Page 14: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Interpreting the Results

What does odds ratio 21.7 (95% CI 9.4-49.9) mean?
– We can be 95% confident that the true odds ratio lies between 9.4 and 49.9
– OR: if we performed the study 100 times and calculated a 95% confidence interval each time, about 95 of those intervals would be expected to contain the true odds ratio
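To show where such an interval can come from, here is a sketch of one common approximation (the log, or Woolf, method). It is not necessarily the method Heit et al. used, and their raw counts are not given in these slides, so the counts below are borrowed from the Titanic table that appears later in this deck. The function name `odds_ratio_ci` is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (log/Woolf method).

    a, b = events / non-events in group 1
    c, d = events / non-events in group 2
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Titanic counts from later in the deck: males 709 dead / 142 alive,
# females 154 dead / 308 alive.
estimate, lower, upper = odds_ratio_ci(709, 142, 154, 308)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Because the whole interval lies above 1, the result would also be called statistically significant at the 0.05 level.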

Page 15: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

P-values vs. Confidence Intervals

P-values
• Clearer than confidence intervals
• Allow for rapid decision as to whether a value is statistically significant (binary response)
• May be overly simplistic (is there really much difference between 0.04 and 0.06?)

Confidence Intervals
• Result given directly at level of data measurement
• Provide info about statistical significance as well as direction and STRENGTH of effect
• Allow for assessment of clinical relevance

Page 16: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Statistical significance and clinical relevance: one and the same?

Page 17: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,
Page 18: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Odds ratio compares whether the odds of a certain event happening are the same for two groups

The odds of an event are found by dividing the probability that the event will happen by the probability that it will not happen
– An odds ratio of 1 implies the event is equally likely in both groups
– An odds ratio > 1 implies the event is more likely in the first group
– An odds ratio < 1 implies that the event is less likely in the first group

Page 19: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Males and Females on the Titanic

          Alive   Dead   Total
Female      308    154     462
Male        142    709     851
Total       450    863    1313

The odds ratio compares the relative odds of death in each group. For females the odds were 154/308=0.5 (or 2 to 1 against dying). For males the odds were almost 5 to 1 in favor of death (709/142=4.993). The odds ratio then is 4.993/0.5=9.986. The odds of death were about 10 times greater for males than for females.
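The arithmetic above can be checked with a few lines of code. This is only a sketch using the counts from the table; the variable names are illustrative.

```python
# Counts from the Titanic table above.
female_dead, female_alive = 154, 308
male_dead, male_alive = 709, 142

# Odds = number with the event / number without the event.
odds_female = female_dead / female_alive
odds_male = male_dead / male_alive

# Odds ratio = odds in one group / odds in the other group.
odds_ratio = odds_male / odds_female

print(f"Odds of death, females: {odds_female:.3f}")
print(f"Odds of death, males:   {odds_male:.3f}")
print(f"Odds ratio (male vs. female): {odds_ratio:.3f}")
```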

Page 20: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Relative Risk (sometimes called the risk ratio) compares the probability of death in each group

          Alive   Dead   Total
Female      308    154     462
Male        142    709     851
Total       450    863    1313

In the case of our Titanic example, the probability of death for females is 154/462=0.3333. For males the probability is 709/851=0.8331. The RR is then 0.8331/0.3333=2.5. The probability of death was 2.5 times greater for males than for females.

Relative Risk comes closer to what most people think of when they compare the relative likelihood of events, but sometimes it is not possible to compute RR in a research design.
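A short sketch of the relative risk calculation, again using the Titanic counts from the table; the variable names are illustrative.

```python
# Counts from the Titanic table above.
female_dead, female_total = 154, 462
male_dead, male_total = 709, 851

# Risk = number with the event / everyone in the group.
risk_female = female_dead / female_total
risk_male = male_dead / male_total

# Relative risk = risk in one group / risk in the other group.
relative_risk = risk_male / risk_female

print(f"Risk of death, females: {risk_female:.4f}")
print(f"Risk of death, males:   {risk_male:.4f}")
print(f"Relative risk (male vs. female): {relative_risk:.2f}")
```

Note that the same table gives a relative risk of about 2.5 but an odds ratio of about 10; the two measures diverge when the event is common.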

Page 21: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Interpreting Relative Risk

Relative risk=1

When the relative risk is one, the risk in the exposed group is the same as the risk in the unexposed group. There is indication of neither benefit nor harm.

Relative risk<1

When the relative risk is less than one, the exposure is associated with a protective effect.

Relative risk>1

When the relative risk is greater than one, the exposed group has a greater risk of contracting the disease, so the exposure is associated with harm.

Page 22: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Huh? Odds and Probability Explained

Example: for every 3 attempts there will be one successful outcome

The language differs:
“one to two” is an odds, expressed as a number: 0.5
“one in three” is a probability, expressed as a fraction: 1/3
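The two ways of expressing the same chance can be converted back and forth. A minimal sketch, with hypothetical helper names `probability_to_odds` and `odds_to_probability`:

```python
from fractions import Fraction

def probability_to_odds(p):
    """Odds = probability of success / probability of failure."""
    return p / (1 - p)

def odds_to_probability(odds):
    """Probability = odds / (1 + odds)."""
    return odds / (1 + odds)

# One success in every three attempts.
p = Fraction(1, 3)
odds = probability_to_odds(p)

print(f"Probability {p} corresponds to odds {odds} ({float(odds)})")
print(f"Odds {odds} correspond to probability {odds_to_probability(odds)}")
```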

Page 23: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Risk Factors for Deep Vein Thrombosis and Pulmonary Embolism (Heit et al., 2000)

Objective: To identify independent risk factors for deep vein thrombosis and pulmonary embolism and to estimate the magnitude of risk for each.

Results: “Independent risk factors for VTE included surgery (odds ratio [OR], 21.7; 95% confidence interval [CI], 9.4-49.9), ….”

Page 24: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Interpreting the Results

What does (OR 21.7, 95% CI 9.4 – 49.9) mean?
– The odds of developing a venous thromboembolism were 21.7 times as high for patients who had surgery as for patients who had not undergone surgery
– We can be 95% confident that the true odds ratio lies between 9.4 and 49.9

Page 25: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Sensitivity and Specificity

• Sensitivity is the proportion of true positives that are correctly identified by a test or measure (e.g., percent of sick people correctly identified as having the condition)
• Ex: If 100 patients known to have a disease were tested, and 43 test positive, then the test has 43% sensitivity.

• Specificity is the proportion of true negatives that are correctly identified by the test (e.g., percent of healthy people correctly identified as not having the condition)
• Ex: If 100 patients with no disease are tested and 96 return a negative result, then the test has 96% specificity.

Page 26: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Relationship between results of liver scan and correct diagnosis: sensitivity/specificity

                        Pathology
Liver Scan     Abnormal (+)   Normal (-)   Total
Abnormal            231           32         263
Normal               27           54          81
Total               258           86         344

How good (sensitive/specific) is the liver scan at diagnosing abnormal pathology?

There are 258 true positives (patients with abnormal pathology) and 86 true negatives (patients with normal pathology). The proportions of these two groups that were correctly diagnosed by the scan were 231/258=0.90 and 54/86=0.63.

We can expect 90% of patients with abnormal pathology to have abnormal (positive) liver scans: 90% sensitivity.

We can expect 63% of patients with normal pathology to have normal (negative) liver scans: 63% specificity.
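The same calculation as a short sketch, using the four cell counts from the liver scan table. The names `true_pos`, `false_neg`, `false_pos` and `true_neg` are illustrative labels for the cells, not terms used in the table.

```python
# Cell counts from the liver scan table above.
true_pos = 231   # abnormal scan, abnormal pathology
false_neg = 27   # normal scan, abnormal pathology
false_pos = 32   # abnormal scan, normal pathology
true_neg = 54    # normal scan, normal pathology

# Sensitivity: of those WITH abnormal pathology, the share with a positive scan.
sensitivity = true_pos / (true_pos + false_neg)

# Specificity: of those with NORMAL pathology, the share with a negative scan.
specificity = true_neg / (true_neg + false_pos)

print(f"Sensitivity: {sensitivity:.2f}")
print(f"Specificity: {specificity:.2f}")
```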

Page 27: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Patients and clinicians have a different question… Positive and Negative Predictive Values

• Positive predictive value is the probability that a patient with a positive test result really does have the condition for which the test was conducted.

• Negative predictive value is the probability that a patient with a negative test result really is free of the condition for which the test was conducted.

• Predictive values give a direct assessment of the usefulness of the test in practice
– influenced by the prevalence of disease in the population that is being tested

Page 28: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Relationship between results of liver scan and correct diagnosis: +/- predictive values

                        Pathology
Liver Scan     Abnormal (+)   Normal (-)   Total
Abnormal            231           32         263
Normal               27           54          81
Total               258           86         344

Of the 263 patients with abnormal liver scans 231 had abnormal pathology, giving the proportion of correct diagnoses as 231/263 = 0.88 (the positive predictive value). Similarly, among the 81 patients with normal liver scans the proportion of correct diagnoses was 54/81 = 0.67 (the negative predictive value).
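A sketch of the predictive value calculation from the same table. Here the denominators are the scan results (row totals) rather than the pathology results; the variable names are illustrative.

```python
# Cell counts from the liver scan table above.
true_pos = 231   # abnormal scan, abnormal pathology
false_pos = 32   # abnormal scan, normal pathology
false_neg = 27   # normal scan, abnormal pathology
true_neg = 54    # normal scan, normal pathology

# Positive predictive value: of those with a POSITIVE scan, the share who have the condition.
ppv = true_pos / (true_pos + false_pos)

# Negative predictive value: of those with a NEGATIVE scan, the share who are free of it.
npv = true_neg / (true_neg + false_neg)

print(f"Positive predictive value: {ppv:.2f}")
print(f"Negative predictive value: {npv:.2f}")
```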

Page 29: Understanding Statistics in Research Articles Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor,

Prevalence, Predictive Values and Sensitivity/Specificity

Analysis of liver scan data with prevalences of abnormality of 0.75 and 0.25

Prevalence                   0.75   0.25
Sensitivity                  0.90   0.90
Specificity                  0.63   0.63
Positive predictive value    0.88   0.45
Negative predictive value    0.67   0.95
Total correct predictions    0.83   0.69
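The effect of prevalence can be reproduced with Bayes' rule. This is a sketch that assumes sensitivity and specificity stay fixed at the liver scan values (231/258 and 54/86) while prevalence changes; the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, share of correct predictions) for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    correct = true_pos + true_neg
    return ppv, npv, correct

# Sensitivity and specificity from the liver scan data.
sens, spec = 231 / 258, 54 / 86

results = {prev: predictive_values(sens, spec, prev) for prev in (0.75, 0.25)}
for prev, (ppv, npv, correct) in results.items():
    print(f"Prevalence {prev}: PPV={ppv:.2f}, NPV={npv:.2f}, correct={correct:.2f}")
```

The test itself does not change, but a positive result is far less convincing when the condition is uncommon in the population being tested.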


Acknowledgements

Dr. Charles Macias, lecture, Evidence-based medicine: why does it matter?
Texas Children’s Hospital Evidence-Based Outcomes Center Evidence-Based Medicine course handouts
Texas Children’s Hospital Lean Six Sigma Green Belt Certification material
Craig Hospital, Those Scary Statistics!