1

Subgroup Reporting in the General Medical Literature: Do Investigators Misinterpret Their Own Findings?

Erik Fernandez y Garcia, MD MPH

University of California, Davis

Co-authors: Hien Nguyen, MD (UCD); Naihua Duan, PhD (Columbia University); Nicole Bloser Gabler, MHA MPH (UCD); Diana Liao, MPH (UCLA); Richard L. Kravitz, MD MSPH (UCD)

Supported by a grant from Pfizer.

2

RATIONALE

• Randomized-controlled trials (RCTs) yield an average treatment effect.

• In a trial, treatments may have different net benefits and harms for different patients (heterogeneity of treatment effects, or HTE).

• Examination of the HTE is important for optimizing treatment for individual patients.

• Especially critical in an era of increasing population diversity and health disparities.

3

RATIONALE

• Subgroup analysis (SGA) in RCTs is one way of investigating HTE.

• The usefulness of SGA is hampered by the problems of

– insufficient power (false negative)

– multiple testing (false positive)
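The multiple-testing problem can be made concrete with a small calculation. The sketch below assumes that the subgroup tests are independent and that each is run at the same significance level; both are simplifying assumptions, and the function name is illustrative rather than taken from the study.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across `num_tests`
    independent tests, each run at significance level `alpha`."""
    return 1.0 - (1.0 - alpha) ** num_tests


# How quickly the false-positive risk grows with the number of subgroup tests.
for k in (1, 5, 10, 20):
    print(f"{k:>2} subgroup tests: {familywise_error_rate(k):.0%}")
```

Under these assumptions, ten subgroup tests at the 0.05 level already carry roughly a 40% chance of at least one spurious "significant" subgroup.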

4

RATIONALE

• Under-use: SGA not performed in studies with sufficient power and theoretical rationale to anticipate helpful results.

• Over-use: SGA performed in studies which were underpowered or lacked theoretical rationale to anticipate helpful results.

• Misuse: SGA was performed in the appropriate setting but with inappropriate methodology and/or interpretation.

5

STUDY QUESTIONS

We sought to specifically investigate the potential misuse of SGA by asking:

1) How often were HTE analyses and corresponding covariates prespecified, and were the reasons (if any) primarily substantive or statistical?

2) What was the objective evidence for or against the presence of HTE?

3) How did authors interpret their own HTE-related findings, and to what extent did their interpretations match the objective evidence?

6

METHODS

• Design: Systematic Review

• Population: JAMA, BMJ, Lancet, NEJM, Annals

• Probability Sample:

– Odd months in 1994, 1999, 2004

– Initial search: N = 4,863 articles

– After additional random sampling and exclusions, N = 319 clinical trials

– Final sample: 87 of 319 trials (27%) reporting test for HTE

7

Coding of Covariates

Covariates Examined in HTE analyses

– Prespecified or Not Prespecified

– If prespecified, Number of Covariates with Rationale: All covariates, Some, or None

– If a rationale was given, Types of Reasons: Substantive or Statistical

8

Defining Clinicostatistical Divergence

Clinicostatistical Divergence: Clinically meaningful and statistically significant differences between subgroup effects and average effect (coded as “none”, “weak”, “moderate”, or “strong”)

– Clinical Divergence (CD): Was the ratio measure of effect in any subgroup at least 25% greater or smaller than in the sample as a whole?

– Statistical Significance (SS): Was a test for interaction associated with a p value of less than or equal to 0.10?
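The two criteria above can be written as simple checks. This is a minimal sketch, not the coding procedure the reviewers actually used: it assumes "25% greater or smaller" means a relative difference of at least 0.25 on the ratio scale (not the log scale), and the function and parameter names are hypothetical.

```python
def clinical_divergence(subgroup_ratios, overall_ratio, threshold=0.25):
    """CD: True if any subgroup's ratio measure differs from the overall
    ratio by at least `threshold` in relative terms (assumed reading)."""
    return any(abs(r / overall_ratio - 1.0) >= threshold for r in subgroup_ratios)


def statistical_significance(interaction_p, cutoff=0.10):
    """SS: True if the interaction test p value is at or below the cutoff."""
    return interaction_p <= cutoff


# Hypothetical trial: overall relative risk 0.80, two subgroup estimates.
print(clinical_divergence([0.55, 0.90], overall_ratio=0.80))
print(statistical_significance(0.04))
```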

9

Coding Clinicostatistical Divergence

Strength of Evidence   CD    SS    Prespecified Covariate?
Strong                 Yes   Yes   Yes
Moderate               Yes   Yes   No
Weak                   Yes   No
Weak                   Yes   *
Weak                   No    Yes
Weak                   *     Yes
None                   No    No
None                   *     No
None                   No    *
Unable to Classify     *     *

* Denotes absence of data
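The coding rules for strength of evidence reduce to a short decision function. The sketch below is one possible encoding of those rules, with missing data (the "*" entries) represented as `None`; the function name and return labels are illustrative assumptions.

```python
from typing import Optional


def strength_of_evidence(cd: Optional[bool], ss: Optional[bool],
                         prespecified: bool) -> str:
    """Classify evidence for HTE from clinical divergence (cd), statistical
    significance (ss), and covariate prespecification. None = no data."""
    if cd is None and ss is None:
        return "unable to classify"
    if cd is True and ss is True:
        # Both criteria met: prespecification separates strong from moderate.
        return "strong" if prespecified else "moderate"
    if cd is True or ss is True:
        # Only one criterion met; the other is negative or missing.
        return "weak"
    # Neither criterion met, with at most one of them missing.
    return "none"


print(strength_of_evidence(True, True, prespecified=False))
print(strength_of_evidence(None, True, prespecified=True))
```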

10

Coding Authors’ Interpretations

• Evidence for HTE sufficient to support different treatment recommendations in one or more subgroups

• Evidence for HTE insufficient to support different treatment recommendations but sufficient to warrant further systematic research

• Evidence for HTE was possibly present but insufficient to warrant further research

• Definite evidence against HTE

• No interpretation of HTE results

11

RESULTS

87 RCTs

– Prespecified: 53 (61%); Not Prespecified: 34 (39%)

– Trials by Number of Covariates with Rationale: All covariates 17 (32%); Some 12 (23%); None 24 (45%)

– Types of Reasons: Substantive 22 (76%); Statistical 7 (24%)

                                   Strength of Evidence
Author’s Interpretation            Unable  None  Weak  Moderate  Strong  Total
Supports Differential Treatment         0     2     5         1       2     10
Warrants Further Research              10     6     9         5       1     31
Possibly Present                        0     0     0         0       1      1
Evidence Against                        6     2     8         2       2     20
No Interpretation                      13     6     3         3       0     25
Total                                  29    16    25        11       6     87
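The marginal totals in this cross-tabulation can be checked with a few lines of code. The sketch below re-enters the cell counts from the table and recomputes row totals, column totals, and the sizes of two subsets (trials whose evidence could be classified, and trials whose authors offered an interpretation); the variable names are illustrative.

```python
COLUMNS = ["Unable", "None", "Weak", "Moderate", "Strong"]
TABLE = {
    "Supports Differential Treatment": [0, 2, 5, 1, 2],
    "Warrants Further Research": [10, 6, 9, 5, 1],
    "Possibly Present": [0, 0, 0, 0, 1],
    "Evidence Against": [6, 2, 8, 2, 2],
    "No Interpretation": [13, 6, 3, 3, 0],
}

# Marginal totals by author interpretation (rows) and evidence strength (columns).
row_totals = {name: sum(counts) for name, counts in TABLE.items()}
col_totals = dict(zip(COLUMNS, (sum(col) for col in zip(*TABLE.values()))))
grand_total = sum(row_totals.values())

# Subsets used as denominators: classifiable evidence, and any interpretation given.
classifiable = grand_total - col_totals["Unable"]
interpreted = grand_total - row_totals["No Interpretation"]
moderate_or_strong = col_totals["Moderate"] + col_totals["Strong"]

print(f"Unable to classify: {col_totals['Unable']}/{grand_total}")
print(f"Moderate or strong: {moderate_or_strong}/{classifiable}")
print(f"No interpretation:  {row_totals['No Interpretation']}/{grand_total}")
print(f"With interpretation: {interpreted}")
```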

Unable to classify strength of evidence: 29/87 = 33%

Moderate or strong evidence among the 58 classifiable trials: 17/58 = 29%

No interpretation of HTE results: 25/87 = 29%

Trials with an author interpretation (No Interpretation excluded): Total 62

31/62 = 50%

10/62 = 16%

31/62 = 50%

Overstated = 27%

Understated = 9%


18

LIMITATIONS

• Limited number of journals, years, trials reviewed

• Data potentially incomplete

– HTE analyses performed but not published

– HTE analyses performed and published in secondary journals

19

CONCLUSIONS

• Analysis and reporting of HTE incomplete

– Prespecification inconsistent

– Rationale incomplete

– Effect measures and p-values (or CIs) incompletely reported

• When reported, objective evidence for clinicostatistical divergence found in approximately 1/3 of trials

• Authors frequently misinterpret own findings (in both directions)

20

IMPLICATIONS

• Researchers:

– Ensure that SGA are prespecified with a priori rationales for covariate inclusion, or clearly labeled as exploratory

– Include a statistical test for interaction or heterogeneity in the analyses

– Report all results from SGA (including p values for HTE tests and effect measures with confidence intervals), even if not significant
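As an illustration of the kind of interaction test recommended above, the sketch below compares two subgroup ratio estimates on the log scale with a z-test. This is one common approach, not necessarily the one used in any of the reviewed trials; it assumes each subgroup reports a ratio measure with a 95% confidence interval, that the two subgroups are independent, and all numbers and names are hypothetical.

```python
import math


def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error of a log ratio recovered from its 95% CI bounds."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def interaction_test(ratio_a, ci_a, ratio_b, ci_b):
    """z statistic and two-sided p value for the difference between two
    independent subgroup effects, compared on the log-ratio scale."""
    diff = math.log(ratio_a) - math.log(ratio_b)
    se = math.hypot(se_from_ci(*ci_a), se_from_ci(*ci_b))
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hypothetical subgroup results: relative risk with 95% CI in each subgroup.
z, p = interaction_test(0.50, (0.35, 0.71), 0.90, (0.70, 1.16))
print(f"z = {z:.2f}, p = {p:.3f}")
```

Reporting the resulting p value alongside each subgroup's effect measure and confidence interval gives readers what they need to judge both clinical divergence and statistical significance.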

21

IMPLICATIONS

• Journal Editors:

– Ensure authors report SGA-associated data when SGA is performed

– Ensure that authors’ discussion includes interpretation of SGA performed and limitations of such analyses

• Readers/Clinicians:

– Weigh the authors’ interpretations and recommendations in light of the objective evidence presented prior to changing practice or implementing recommendations

22

Thank you