Critically Evaluating the Evidence: Tools for Appraisal
Elizabeth A. Crabtree, MPH, PhD (c)
Director of Evidence-Based Practice, Quality Management
Assistant Professor, Library & Informatics
Medical University of South Carolina



Steps of EBP:

1) Ask the question
2) Find the best evidence
3) Evaluate the evidence
4) Apply the information
5) Evaluate outcomes

Step 3: Evaluate the Evidence

Systematic, Critical Appraisal

It’s peer-reviewed, therefore it must be OK?

Adapted from: Heneghan, Carl. Introduction, 16th Oxford Workshop on Evidence-Based Practice, September 2010.

What is in “the stack”?

Gold mine Bonfire

Hierarchy of Evidence

CONSORT

• Consolidated Standards of Reporting Trials
• Focus - Randomized Controlled Trials (RCT)
» 2-group, parallel
• Checklist of 25 items
– Title/Abstract
– Introduction
– Methods
– Results
– Discussion
– Other information

The CONSORT Group

STROBE

• Strengthening the Reporting of Observational Studies in Epidemiology
• Focus – Cross-sectional, Case-control, Cohort and Observational Studies
• Checklist of 22 items
– Title/Abstract
– Introduction
– Methods
– Results
– Discussion
– Other information

STROBE Statement

CASP

• Critical Appraisal Skills Programme
• Focus – Systematic Reviews, RCTs, Qualitative Studies, Diagnostic Test Studies, Cohort Studies, Case-control Studies & Economic Evaluation Studies
• 10 - 12 questions per appraisal tool
– Validity
– Results
– Relevance

CASP

Body of Evidence

• All studies relevant to a given PICO question
– Recommend grouping studies by PICO question
• Assess the quality of relevant studies as a group

How is this done???

GRADE Quality Assessment Criteria

What is the GRADE System?

Grading of
Recommendations
Assessment
Development and
Evaluation

• Built on previous systems
• International group of guideline developers

Advantages of GRADE

• Transparent process of moving from evidence to recommendations

• Explicit, comprehensive criteria for downgrading and upgrading quality of evidence ratings

• Explicit evaluation of the importance of outcomes of alternative management strategies

GRADE vs. The Competition

Quality & Recommendations

• Quality of evidence: the extent to which one can be confident that an estimate of effect is adequate to support recommendations

• Strength of recommendation: the extent to which one can be confident that adherence to the recommendation will do more good than harm

Getting Started…

• Must have a clearly defined question
• Patient(s), intervention, comparison, and outcome of interest (PICO)

In adult patients (population), is the use of glucocorticosteroids (intervention) associated with VTE (outcome)?

Chutes & Ladders

Evaluating the evidence can lower or raise its quality rating.

Key Elements-Chutes

• Study design limitations
• Inconsistency
• Indirectness
• Imprecision
• Reporting bias

Study Design Limitations

• Basic study design (randomized trials or observational)
• Study limitations
– Insufficient sample size
– Lack of blinding
– Lack of allocation concealment
– Large losses to follow-up
– Non-adherence to intention-to-treat analysis
– Stopped early for benefit
– Selective reporting of measured outcomes

Inconsistency of Results

• Detailed study methods and execution
– Wide variation of treatment effect across studies
– Populations varied (e.g. sicker, older)
– Interventions varied (e.g. doses)
– Outcomes varied (e.g. diminishing effect over time)
• Increased heterogeneity = ↓ quality (I²: < 0.25 low; 0.25 – 0.5 moderate; > 0.5 high)
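The I² bands above can be sketched in a few lines. This is a minimal illustration, assuming I² is derived from Cochran's Q with the usual formula I² = (Q − df) / Q, where df is the number of studies minus one. The function names are hypothetical and the cut-offs are the ones listed on the slide.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Return I² as a proportion (0 to 1) from Cochran's Q.

    Assumes the usual definition I² = (Q - df) / Q with df = n_studies - 1,
    truncated at zero when Q is smaller than its degrees of freedom.
    """
    if n_studies < 2:
        raise ValueError("heterogeneity needs at least two studies")
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


def heterogeneity_band(i2: float) -> str:
    """Map I² to the bands on the slide: < 0.25 low, 0.25 to 0.5 moderate, > 0.5 high."""
    if i2 < 0.25:
        return "low"
    if i2 <= 0.5:
        return "moderate"
    return "high"


# Hypothetical meta-analysis of 11 studies with Q = 40
print(heterogeneity_band(i_squared(40.0, 11)))
```

A "high" band is a prompt to look for explanations (populations, doses, outcome timing) before downgrading, not an automatic verdict.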

Indirectness of Evidence

• The extent to which the people, interventions, and outcome measures are similar to those of interest
– Indirect comparisons
– Different populations
– Different interventions
– Different outcomes measured
– Comparisons not applicable to question/outcome

Imprecision

• Accuracy of data/results
• Results include just a few events or observations
– Sample size lower than calculated for optimal information (needed for decision-making)
– Confidence intervals are sufficiently wide that an estimate is consistent with either important harms or benefits
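The two imprecision checks above can be written as simple predicates. This is only a sketch. It assumes a ratio measure (RR or OR) where values below 1 favor the intervention. The default thresholds of 0.75 and 1.25 for "important" benefit and harm are illustrative assumptions, not values from the slides, and both function names are hypothetical.

```python
def ci_spans_benefit_and_harm(ci_low: float, ci_high: float,
                              benefit_threshold: float = 0.75,
                              harm_threshold: float = 1.25) -> bool:
    """True when a ratio-measure CI is consistent with both important benefit and harm.

    The thresholds are illustrative assumptions; a review team would set
    them per outcome.
    """
    if not 0 < ci_low <= ci_high:
        raise ValueError("expected 0 < ci_low <= ci_high")
    return ci_low <= benefit_threshold and ci_high >= harm_threshold


def below_optimal_information_size(total_n: int, required_n: int) -> bool:
    """True when the pooled sample is smaller than the calculated required size."""
    return total_n < required_n


# A wide interval that includes both important benefit and important harm
print(ci_spans_benefit_and_harm(0.60, 1.40))
```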

Bias

Key Elements-Ladders

• Effect
• Dose response
• Plausible confounders

Effect

Magnitude of treatment effect

• Strong effect
– e.g., meta-analysis of observational studies found that bicycle helmets reduce the risk of head injuries: RR 0.31 (95% CI, 0.13 to 0.37)
• Very strong effect
– e.g., meta-analysis looking at impact of warfarin prophylaxis in cardiac valve replacement: relative risk for thromboembolism with warfarin was 0.17 (95% CI, 0.13 to 0.24)

Dose Response

Evidence of a dose-response gradient

• The more exposure to an intervention, the greater the harm
– Higher warfarin dose → Higher INR → increased bleeding

Plausible Confounders

• All plausible confounders would have reduced the demonstrated effect
• Or all plausible confounders would have suggested a spurious effect when results show no effect

Evidence of Association

• Strong evidence of association
– Significant relative risk of > 2 (< 0.5) based on consistent evidence from two or more observational studies, with no plausible confounders
• Very strong evidence of association
– Significant relative risk of > 5 (< 0.2) based on direct evidence with no major threats to validity
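The magnitude thresholds above can be expressed as one symmetric check. This is a rough sketch with a hypothetical function name. It looks only at the size of the relative risk. The other conditions on the slide (statistical significance, consistency across studies, no plausible confounders, no major threats to validity) still need a reviewer's judgment.

```python
def association_strength(rr: float) -> str:
    """Classify the magnitude of a relative risk using the slide's thresholds.

    RR > 5 (or < 0.2) is "very strong"; RR > 2 (or < 0.5) is "strong".
    Only magnitude is checked here, not significance or confounding.
    """
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    # Fold protective effects (RR < 1) onto the same scale as harmful ones.
    magnitude = max(rr, 1.0 / rr)
    if magnitude > 5:
        return "very strong"
    if magnitude > 2:
        return "strong"
    return "not large"


# The two examples from the Effect slide
print(association_strength(0.31))  # bicycle helmets
print(association_strength(0.17))  # warfarin prophylaxis
```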

High

• Further research is very unlikely to change confidence in the estimate of effect

• Consistent evidence from well-performed RCTs or exceptionally strong evidence from unbiased observational studies

Moderate

• Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate.

• Evidence from RCTs with important limitations or unusually strong evidence from unbiased observational studies

Low

• Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate

• Evidence for at least 1 critical outcome from observational studies or from RCTs with serious flaws or indirect evidence

Very Low

• Any estimate of effect is very uncertain

• Evidence for at least 1 of the critical outcomes from unsystematic clinical observations or very indirect evidence
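The chutes-and-ladders idea can be sketched as simple arithmetic over the four levels. This sketch assumes the common GRADE starting points, which the slides imply but do not state outright: randomized trials start at high and observational studies start at low. Each serious concern moves the rating down one level and each upgrading factor moves it up one. The function and argument names are hypothetical.

```python
LEVELS = ["very low", "low", "moderate", "high"]

# Assumed starting points by basic study design (common GRADE convention).
START = {"rct": 3, "observational": 1}


def grade_quality(design: str, downgrades: int = 0, upgrades: int = 0) -> str:
    """Return a quality level after applying chutes (downgrades) and ladders (upgrades).

    downgrades: levels lost for design limitations, inconsistency,
                indirectness, imprecision, or reporting bias.
    upgrades:   levels gained for large effect, dose response, or
                plausible confounders that would reduce the effect.
    """
    if design not in START:
        raise ValueError("design must be 'rct' or 'observational'")
    if downgrades < 0 or upgrades < 0:
        raise ValueError("counts must be non-negative")
    score = START[design] - downgrades + upgrades
    # Clamp to the four available levels.
    return LEVELS[max(0, min(score, len(LEVELS) - 1))]


# RCT evidence with one serious concern (e.g. imprecision)
print(grade_quality("rct", downgrades=1))
```

In practice each step is a judgment about one outcome across the body of evidence, so the counts passed in stand for decisions made by reviewers rather than something computed automatically.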

Quality of Supporting Evidence

Outcomes: Critical or Important

Guyatt, G. H., Oxman, A. D., Kunz, R., Vist, G. E., Falck-Ytter, Y. & Schünemann, H. J. (2008). What is “quality of evidence” and why is it important to clinicians? BMJ 336, 995-998.

Strength of Recommendations

VS.

Strong Weak


Strong Recommendation

• Desirable effects clearly outweigh undesirable effects or vice versa

• Certain that benefits do, or do not, outweigh risks & burdens

Weak Recommendation

• Desirable effects closely balanced with undesirable effects

• Benefits, risks & burdens are finely balanced OR appreciable uncertainty exists about the magnitude of benefits & risks

Moving from Strong to Weak

To treat or not to treat…

• Absence of high quality evidence
• Imprecise estimates
• Uncertainty or variation in individuals’ value of the outcomes
• Small net benefits
• Uncertain if net benefits are worth the costs

Strong Recommendations

Strong recommendation, high quality evidence
• Recommendation can apply to most patients.
• Further research is unlikely to change our confidence in the estimate of effect.

Strong recommendation, moderate quality evidence
• Recommendation can apply to most patients.
• Further research (if performed) is likely to have an important effect on our confidence in the estimate of effect and may change the estimate.

Strong recommendation, low quality evidence
• Recommendation may change when higher quality evidence becomes available.
• Further research (if performed) is likely to have an important influence on our confidence in the estimate of effect and is likely to change the estimate.

Strong recommendation, very low quality evidence (very rarely applicable)
• Recommendation may change when higher quality evidence becomes available; any estimate of effect, for at least 1 critical outcome, is uncertain.

Weak Recommendations

Weak recommendation, high quality evidence
• The best action may differ, depending on circumstances or patients or societal values.
• Further research is unlikely to change our confidence in the estimate of effect.

Weak recommendation, moderate quality evidence
• Alternative approaches likely to be better for some patients under some circumstances.
• Further research (if performed) is likely to have an important influence on our confidence in the estimate of effect and may change the estimate.

Weak recommendation, low quality evidence
• Other alternatives may be equally reasonable.
• Further research is likely to have an important influence on our confidence in the estimate of effect and is likely to change the estimate.

Weak recommendation, very low quality evidence
• Other alternatives may be equally reasonable.
• Any estimate of effect, for at least 1 critical outcome, is uncertain.

Guideline Evaluation-AGREE II

• Appraisal of Guidelines for Research and Evaluation
• Focus – evaluation of practice guidelines
• Checklist of 23 questions
• Six domains
– Scope and Purpose
– Stakeholder Involvement
– Rigor of Development
– Clarity of Presentation
– Applicability
– Editorial Independence
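A domain score is usually reported as a scaled percentage. The slides do not cover scoring, so this sketch assumes the method in the AGREE II user's manual: each item is rated 1 to 7 by each appraiser, and the domain score is (obtained − minimum possible) / (maximum possible − minimum possible) × 100. The function name and the sample ratings are illustrative.

```python
from typing import Sequence


def scaled_domain_score(ratings: Sequence[Sequence[int]]) -> float:
    """Return the scaled AGREE II domain score as a percentage.

    ratings[a][i] is appraiser a's rating (1 to 7) for item i of one domain.
    Assumes the AGREE II manual formula:
    (obtained - min possible) / (max possible - min possible) * 100.
    """
    if not ratings or not ratings[0]:
        raise ValueError("need at least one appraiser and one item")
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every appraiser must rate every item")
        if any(r < 1 or r > 7 for r in row):
            raise ValueError("ratings must be between 1 and 7")
    n_appraisers = len(ratings)
    obtained = sum(sum(row) for row in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return 100.0 * (obtained - min_possible) / (max_possible - min_possible)


# Four appraisers rating a three-item domain (illustrative numbers)
scope_and_purpose = [[5, 6, 6], [6, 6, 7], [2, 4, 3], [3, 3, 2]]
print(round(scaled_domain_score(scope_and_purpose)))
```

Scores are calculated per domain and are not meant to be pooled into a single quality score.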