07/09/59
1
Critical appraisal: Systematic Review &
Meta-analysis
Atiporn Ingsathit MD.PhD.Section for Clinical Epidemiology and biostatisticsFaculty of Medicine Ramathibodi HospitalMahidol University
What is a review?
A review provides a summary of evidence to answer important practice and policy questions without readers having to spend the time and effort to summarize the evidence themselves.
Type of review
Narrative review (conventional review): review articles, textbook chapters
Systematic review
Problems of conventional reviews
Broad clinical questions
Unsystematic approach to collecting evidence
Unsystematic approach to summarizing evidence
Tendency to be biased by the author's opinions
Load of evidence
Conflicting evidence
Hunink, Glasziou et al, 2001.
What is a systematic review?
A review of a particular subject undertaken in such a systematic way that the risk of bias is reduced.
Systematic reviews have explicit, scientific, and comprehensive descriptions of their objectives and methods.
AIMS
Systematic: to reduce bias
Explicit (precisely and clearly express): to ensure reproducibility
Systematic review
Hunink, Glasziou et al, 2001.
What is a meta-analysis?
The analysis of multiple studies, including statistical techniques for merging and contrasting results across studies.
Synonyms: research synthesis, systematic overview, pooling, and scientific audit.
Focus on contrasting and combining results from different studies in the hopes of identifying patterns among study results.
Quantitative methods applied only after rigorous qualitative selection process.
Estimates treatment effects, reducing the probability of false-negative results (increasing the power of the test)
Potentially leads to more timely introduction of effective treatments.
Meta-analysis
Process of conducting a systematic review and meta-analysis
1. Define the question: PICO
2. Conduct the literature search. Sources: databases, experts, funding agencies, pharmaceutical companies, hand-searching, references
3. Identify titles and abstracts
4. Apply inclusion and exclusion criteria: titles and abstracts → full articles → final eligible articles; assess agreement
5. Create the data abstraction: data abstraction, methodologic quality, agreement on validity
6. Conduct the analysis: determine the method of generating pooled estimates; pool estimates (if appropriate); explore heterogeneity and conduct subgroup analyses; explore publication bias
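The pooling step above can be sketched with a fixed-effect, inverse-variance model. The per-study log risk ratios and standard errors below are hypothetical, for illustration only:

```python
import math

# Hypothetical per-study log risk ratios and standard errors (illustration only)
log_rr = [0.15, -0.05, 0.30, 0.10]
se = [0.20, 0.25, 0.15, 0.30]

# Fixed-effect (inverse-variance) pooling: each study is weighted by 1/SE^2,
# so larger, more precise studies contribute more to the pooled estimate
weights = [1 / s**2 for s in se]
pooled = sum(w * y for w, y in zip(weights, log_rr)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

# 95% confidence interval, back-transformed to the risk-ratio scale
lo = math.exp(pooled - 1.96 * pooled_se)
hi = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled RR = {math.exp(pooled):.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Note that the pooled standard error is smaller than that of any single study, which is the sense in which pooling increases power.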
Example
Users’ guides for how to use review articles
Gordon Guyatt,
Roman Jaeschke, Kameshwar Prasad, and Deborah J Cook
Users’ Guides to Medical Literature: A Manual for Evidence-Based Clinical Practice 2008
1. Assess the validity of the systematic review.
* Did the review explicitly address a sensible clinical question?
* Did the review include explicit and appropriate eligibility criteria?
* Was biased selection and reporting of studies unlikely?
* Was the search for relevant studies detailed and exhaustive?
* Were the primary studies of high methodologic quality?
* Were assessments of studies reproducible?
2. What are the results?
* Were the results similar from study to study?
* What are the overall results of the review?
* How precise were the results?
3. How can I apply the results to patient care?
* Were all patient-important outcomes considered?
* Are any postulated subgroup effects credible?
* What is the overall quality of the evidence?
* Are the benefits worth the costs and potential risks?
Validity criteria
1. Did the Review Explicitly Address a Sensible Clinical Question?
P: Lupus nephritis
I: Mycophenolate mofetil (MMF)
C: Cyclophosphamide (CYC)
O: Complete/partial remission, adverse events
Validity criteria
2. Did the review include explicit and appropriate eligibility criteria?
Range of patients (older/younger, severity)
Range of interventions (dose, route)
Range of outcomes (short/long-term, surrogate/clinical)
Validity criteria
3. Was biased selection and reporting of studies unlikely?
Clear inclusion and exclusion criteria
Guides by topic:
Therapy: Were patients randomized? Was follow-up complete?
Diagnosis: Was the patient sample representative of those with the disorder? Was the diagnosis verified using a gold standard, and independently?
Harm: Did the investigators demonstrate similarity in all known determinants of outcome, or adjust for differences in the analysis? Was follow-up sufficiently complete?
Prognosis: Was there a representative sample of patients? Was follow-up sufficiently complete?
Study Search and Selection
One reviewer (NK) electronically searched the MEDLINE database using PubMed (National Library of Medicine, Bethesda, MD) (1951 to December 2009) and Ovid (Wolters Kluwer, New York, NY) (1966 to December 2009), and the Cochrane Central Register of Randomized Controlled Trials (CENTRAL, The Cochrane Library issue 4, 2009) (United States Cochrane Center, Baltimore, MD).
Search terms used without language restriction were as follows: (mycophenolate mofetil or mycophenolate) and cyclophosphamide and (lupus nephritis or glomerulonephritis), limited to randomized controlled trial.
Two reviewers (NK and AT) independently screened titles and abstracts.
Validity criteria
4. Was the Search for Relevant Studies Detailed and Exhaustive?
Why should effort be exerted to search for both published and unpublished articles?
Which articles tend to be published more: those with positive or negative results?
If positive articles tend to be published more, how will this affect meta-analyses of treatment interventions?
Positive studies are more likely:
to be published
to be published in English
to be cited by other authors
to produce multiple publications
Large studies are more likely to be published even if they have negative results.
Quality of study: lower methodologic quality is associated with larger effects.
Bias arises from the association between treatment effect and study size.
Publication bias
Publication bias assessment: using the Egger test on the 5 trials, we found borderline evidence of bias (coefficient = 2.03, SE = 0.64, p = 0.049) from small-study effects.
Funnel plot for complete remission
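The Egger test mentioned above is essentially a regression of standardized effect against precision, with the intercept as the bias coefficient. A minimal sketch, using hypothetical trial effects and standard errors (not the actual trial data):

```python
# Hypothetical log odds ratios and standard errors for five trials (illustration)
effects = [0.8, 0.5, 0.9, 0.3, 0.2]
ses = [0.45, 0.30, 0.50, 0.20, 0.15]

# Egger's regression: standardized effect (effect/SE) against precision (1/SE);
# an intercept far from zero suggests small-study effects / funnel asymmetry
x = [1 / s for s in ses]                    # precision
y = [e / s for e, s in zip(effects, ses)]   # standardized effect

# Ordinary least squares for a single predictor
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
intercept = my - slope * mx                 # Egger's bias coefficient
print(f"Egger intercept (bias coefficient) = {intercept:.2f}")
```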
Validity criteria
5. Were the Primary Studies of High Methodologic Quality?
Methodologic Quality
PRISMA guidelines
Validity criteria
6. Were Assessments of Studies Reproducible?
Having 2 or more people participate in each decision
Good agreement
Data Extraction and Risk Assessment
Two reviewers (NK and AT) independently performed data extraction.
We extracted trial characteristics (for example, study design, sample size, treatment dosage and duration, WHO classification, renal biopsy information) and definitions (complete remission and complete/partial remission).
Results
1. Were the results similar from study to study?
What does heterogeneity mean?
Explore heterogeneity
What does heterogeneity mean?
The results differ significantly between studies.
The possibility of excess variability between the results of the different trials/studies is examined by a test of heterogeneity.
Explore heterogeneity
Why? Because the studies may not have been conducted according to a common protocol.
Variations in patient groups, clinical setting, concomitant care, and the methods of delivery of the intervention or method of measurement of exposure for observational studies.
How do we detect heterogeneity?
1) Visual interpretation
2) Statistical tests (e.g., Q test, where p < .1 implies heterogeneity, or I² > 0.7)
Visual interpretation
Do statistical tests
Statistical test (1)
Statistical test of heterogeneity (yes/no): Cochran Q. The null hypothesis of the test for heterogeneity is that the underlying effect is the same in each of the studies.
Low P value means that random error is an unlikely explanation of the differences in results from study to study.
High P value increases our confidence that the underlying assumption of pooling holds true.
Statistical test (2)
Magnitude of heterogeneity: the I² statistic provides an estimate of the percentage of variability in results across studies that is likely due to true differences in treatment effect as opposed to chance.
As I² increases, we become progressively less comfortable with a single pooled estimate and need to look for explanations of variability other than chance.
I² < 0.25: small heterogeneity; 0.25-0.5: moderate heterogeneity; > 0.5: large heterogeneity
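Cochran's Q and I² as described above can be computed in a few lines. The study effects and standard errors below are hypothetical, for illustration only:

```python
import math

# Hypothetical study effects (log risk ratios) and standard errors (illustration)
effects = [0.10, 0.35, -0.05, 0.40]
ses = [0.15, 0.20, 0.18, 0.25]

# Cochran's Q: weighted squared deviations of each study from the pooled estimate
w = [1 / s**2 for s in ses]
pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
Q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))
df = len(effects) - 1

# I^2: the proportion of total variability attributable to true between-study
# differences rather than chance (truncated at zero)
I2 = max(0.0, (Q - df) / Q)
print(f"Q = {Q:.2f} on {df} df, I^2 = {I2:.0%}")
```

With these illustrative numbers, I² falls in the "small heterogeneity" range of the scale above.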
Plot study results: forest plot or metaview
What can authors do if there is heterogeneity?
1) Identify the source of heterogeneity
2) Try to group studies into homogeneous categories (sensitivity analysis)
3) No statistical combination (no meta-analysis)
Results
2. What are the overall results of the review?
Confidence intervals: forest plot of risk ratios (axis from 0.6 to 1.6)
3. How can I apply the results to patient care?
* Were all patient-important outcomes considered?
* Are any postulated subgroup effects credible?
* What is the overall quality of the evidence?
* Are the benefits worth the costs and potential risks?
NNT and NNH
Number needed to treat (NNT): the number of patients who need to be treated to prevent one more event.
NNT = 1/(Rc − Rt) = 1/ARR
Number needed to harm (NNH): the number of patients who need to be treated to harm one more of them.
NNH = 1/(Rt − Rc)
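A worked sketch of the NNT and NNH formulas above, using hypothetical event rates:

```python
# Worked NNT/NNH example with hypothetical event rates (illustration only)
def nnt(rate_control: float, rate_treated: float) -> float:
    """Number needed to treat: 1 / absolute risk reduction (Rc - Rt)."""
    arr = rate_control - rate_treated
    return 1 / arr

def nnh(rate_control: float, rate_treated: float) -> float:
    """Number needed to harm: 1 / absolute risk increase (Rt - Rc)."""
    ari = rate_treated - rate_control
    return 1 / ari

# Benefit outcome: events in 20% of controls vs 12% of treated patients
print(f"NNT = {nnt(0.20, 0.12):.1f}")  # ARR = 0.08 -> NNT = 12.5

# Harm outcome: adverse events in 5% of controls vs 9% of treated patients
print(f"NNH = {nnh(0.05, 0.09):.1f}")  # ARI = 0.04 -> NNH = 25.0
```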
Network meta-analysis
Meta-analysis
Traditional meta-analyses address the merits of one intervention vs. another.
Drawback: they evaluate the effect of only 1 intervention vs. 1 comparator.
They do not permit inferences about the relative effectiveness of several interventions.
* For a given medical condition, there is a selection of interventions that have most frequently been compared with placebo and only occasionally with one another.
Network Meta-analysis (NMA) Multiple or mixed treatment comparison meta-analysis
NMA approach provides estimates of effect sizes for all possible pairwise comparisons whether or not they have actually been compared head to head in RCTs.
Network Meta-analysis
A network meta-analysis combines direct and indirect sources of evidence to estimate treatment effects.
Direct evidence on the comparison of two particular treatments is obtained from studies that contain both treatments.
Indirect evidence is obtained through studies that compare each of the two treatments with some common comparator treatment.
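The indirect route through a common comparator can be sketched with the Bucher adjusted indirect comparison, one standard method: the indirect A-vs-B effect is the difference of the A-vs-C and B-vs-C effects, and their variances add. The numbers below are hypothetical:

```python
import math

# Hypothetical direct estimates (log odds ratios) from trials sharing comparator C
d_ac, se_ac = -0.40, 0.15   # A vs C (illustration only)
d_bc, se_bc = -0.10, 0.20   # B vs C (illustration only)

# Bucher adjusted indirect comparison of A vs B through C:
# d_AB = d_AC - d_BC, with the variances of the two estimates adding
d_ab = d_ac - d_bc
se_ab = math.sqrt(se_ac**2 + se_bc**2)

lo = math.exp(d_ab - 1.96 * se_ab)
hi = math.exp(d_ab + 1.96 * se_ab)
print(f"indirect OR(A vs B) = {math.exp(d_ab):.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Note that the indirect standard error (0.25) exceeds both direct standard errors, which is why purely indirect evidence generally warrants less confidence.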
Consideration in NMA
1. Among trials available for pairwise comparisons, are the studies sufficiently homogeneous to combine for each intervention? (An assumption that is also necessary for a conventional meta-analysis)
2. Are the trials in the network sufficiently similar, with the exception of the intervention (eg, in important features, such as populations, design, or outcomes)?
3. Where direct and indirect evidence exist, are the findings sufficiently consistent to allow confident pooling of direct and indirect evidence together?
Users' Guides to the Medical Literature: A Manual for Evidence-Based Clinical Practice, 3rd ed 2015
Gordon Guyatt, Drummond Rennie, Maureen O. Meade, Deborah J. Cook
http://jamaevidence.mhmedical.com/book.aspx?bookID=847
I. How Serious Is the Risk of Bias?
1. Did the Meta-analysis Include Explicit and Appropriate Eligibility Criteria?
PICO
Broader eligibility criteria enhance the generalizability of the results, but if participants are too dissimilar, heterogeneity results.
Diversity of interventions becomes excessive if authors pool results from different doses or even different agents in the same class, based on the assumption that effects are similar.
Reviews may be too broad in their inclusion of different populations, different doses or different agents in the same class, or different outcomes to make comparisons across studies credible.
Research question
We therefore conducted a systematic review and network meta-analysis with the aim of comparing complete recovery rates at 3 and 6 months for corticosteroids, AVT (Acyclovir or Valacyclovir), or the combination of both for treatment of adult Bell’s palsy.
P
I
C
O
Eligibility criteria
Studies were included if they were RCTs and studied subjects aged 18 years or older with sufficient data. Non-English papers were excluded from the review.
2. Was Biased Selection and Reporting of Studies Unlikely?
Include all interventions because data on clearly suboptimal or abandoned interventions may still offer indirect evidence for other comparisons
Apply the search strategies from other systematic reviews only if authors have updated the search to include recently published trials
Some industry-initiated NMAs may choose to consider only a sponsored agent and its direct competitors.
Omitting the optimal agent gives a fragmented picture of the evidence.
Selection of NMA outcomes should not be data driven but based on importance for patients and consider both outcomes of benefit and harm.
Search strategy
One author (NP) located studies in MEDLINE (from 1966 to August 2010) and EMBASE (from 1950 to September 2010) using the PubMed and Ovid search engines.
Search terms used were as follows: (Bell’s palsy or idiopathic facial palsy) and (antiviral agents or acyclovir or valacyclovir), limited to randomized controlled trials.
Selection of study
Where eligible papers had insufficient information, corresponding authors were contacted by e-mail for additional information.
The reference lists of the retrieved papers were also reviewed to identify relevant publications.
Where there were multiple publications from the same study group, the most complete and recent results were used.
Study selection
Outcome
Complete recovery was defined as a score ≤ 2 on the House-Brackmann facial recovery scale, ≥ 8 on the Facial Palsy Recovery Index, > 36 points on the Yanagihara score, or 100 on the Sunnybrook scale.
3. Did the Meta-analysis Address Possible Explanations of Between-Study Differences in Results?
When clinical variability is present, conduct subgroup analyses or meta-regression to explain heterogeneity and to more optimally fit the clinical setting and the characteristics of the patient you are treating.
Multiple control interventions (eg, placebo, no intervention, older standard of care)
It is important to account for potential differences between control groups
Potential placebo effect
Plan for exploring heterogeneity
4. Did the Authors Rate the Confidence in Effect Estimates for Each Paired Comparison?
Ideally, for each paired comparison, authors will present the pooled estimate for the direct comparison (if there is one) and its associated rating of confidence, the indirect comparison(s) that contributed to the pooled estimate from the NMA and its associated rating of confidence, and the NMA estimate and the associated rating of confidence.
We lose confidence in comparisons of treatments when:
the RCTs failed to protect against risk of bias through allocation concealment, blinding, and preventing loss to follow-up;
confidence intervals around pooled estimates are wide (imprecision);
results vary from study to study and we cannot explain the differences (inconsistency);
the population, intervention, or outcome differs from that of primary interest (indirectness).
II. What Are the Results?
1. What Was the Amount of Evidence in the Treatment Network?
Gauge from the number of trials, total sample size, and number of events for each treatment and comparison
Understanding the geometry of the network (nodes and links) will permit clinicians to examine the larger picture and see what is compared to what
The credible intervals around direct, indirect, and NMA estimates provide a helpful index
2. Were the Results Similar From Study to Study?
NMA, with larger numbers of patients and studies, allows a more powerful exploration of explanations of between-study differences.
The search conducted by NMA authors for explanations of heterogeneity may be informative.
NMA remains vulnerable to unexplained differences in results from study to study.
3. Were the Results Consistent in Direct and Indirect Comparisons?
Which is most trustworthy, direct or indirect evidence? This requires assessing whether the direct and indirect estimates are consistent or discrepant.
Inconsistency
When the direct and indirect sources of evidence within a network do not agree, this is known as inconsistency
Example network of treatments A, B, and C with three designs: AB, AC, ABC.
Inconsistency in results between the direct and indirect comparisons decreases confidence in the estimates.
Statistical methods exist for checking this type of inconsistency, typically called a test for incoherence.
Potential reasons for incoherence between the results of direct and indirect comparisons
Chance
Genuine differences in results: differences in enrolled participants, interventions, background management
Bias in head-to-head (direct) comparisons: publication bias; selective reporting of outcomes and analyses; inflated effect sizes in trials stopped early; limitations in allocation concealment, blinding, loss to follow-up, and analysis as randomized
Bias in indirect comparisons: each of the biasing issues above
Test for incoherence: discrepancy of treatment effects between the direct and indirect meta-analysis results was assessed using the standardized normal method (Z), i.e., by dividing the difference by its standard error.
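The standardized normal (Z) method described above can be sketched as follows, with hypothetical direct and indirect estimates:

```python
import math

# Hypothetical direct and indirect estimates (log odds ratios, illustration only)
d_direct, se_direct = -0.35, 0.18
d_indirect, se_indirect = -0.05, 0.25

# Z test for incoherence: difference between the direct and indirect effects,
# divided by the standard error of that difference
diff = d_direct - d_indirect
se_diff = math.sqrt(se_direct**2 + se_indirect**2)
z = diff / se_diff

# Two-sided p-value from the standard normal distribution
p = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, p = {p:.3f}")
```

Here the discrepancy is compatible with chance; a small p-value would instead signal incoherence between the direct and indirect evidence.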
4. How Did Treatments Rank and How Confident Are We in the Ranking?
Besides presenting treatment effects, authors may also present the probability that each treatment is superior to all other treatments, allowing ranking of treatments.
Rankings may be misleading because of:
Fragility in the rankings
Differences among the ranks may be too small to be important
Other limitations in the studies (eg, risk of bias, inconsistency, indirectness).
5. Were the Results Robust to Sensitivity Assumptions and Potential Biases?
Authors may assess the robustness of the study findings by applying sensitivity analyses that reveal how the results change if some criteria or assumptions change.
Sensitivity analyses may include restricting the analyses to trials with low risk of bias only or examining different but related outcomes
III. How Can I Apply the Results to Patient Care?
1. Were All Patient-Important Outcomes Considered?
Many NMAs report only 1 or a few outcomes of interest
Adverse events are infrequently assessed in meta-analysis and in NMAs.
More likely to include multiple outcomes and assessments of harms
2. Were All Potential Treatment Options Considered?
Network meta-analyses may place restrictions on what treatments are examined.
Judging whether important options were omitted requires background knowledge of the field.
3. Are Any Postulated Subgroup Effects Credible?
Criteria exist for determining the credibility of subgroup analyses.
NMA allow a greater number of RCTs to be evaluated and may offer more opportunities for subgroup analysis.
Single common comparator (star network): allows only indirect comparisons, which reduces confidence in the effect estimates.
Networks with closed loops use both direct and indirect evidence, increasing confidence in the estimates of interest.
A mixture of indirect links and closed loops (unbalanced shapes) gives high confidence for some comparisons and low confidence for others.
Hierarchy of evidence
Systematic reviews
Randomized controlled trials
Cohort studies
Case-control studies
Cross-sectional studies
Case reports
Take home messages
A systematic review is secondary research: focused on a research question, it tries to identify, appraise, select, and synthesize all high-quality research evidence relevant to that question.
Meta-analysis is the statistical tool of a systematic review, broadly defined as a quantitative review and synthesis of the results of related but independent studies.
Take home messages
NMA can provide extremely valuable information in choosing among multiple treatments offered for the same condition
It is important to determine the confidence one can place in the estimates of effect of the treatments considered and the extent to which that confidence differs across comparisons.