GRADING EVIDENCE AND RECOMMENDATIONS: STARTING WITH GRADE BASICS VS. UTILIZING THE FULL FRAMEWORK
AHRQ Annual Meeting 2010:
“Better Care, Better Health: Delivering on Quality for All Americans”
September 28, 2010
Yngve Falck-Ytter, M.D.
Associate Professor of Medicine
Case Western Reserve University, Cleveland, Ohio
Holger Schünemann, M.D., Ph.D.
Chair, Department of Clinical Epidemiology & Biostatistics
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada
Disclosures
In the past 5 years, Dr. Falck-Ytter received no personal payments for services from industry. His research group received research grants from Three Rivers, Valeant and Roche that were deposited into non-profit research accounts. He is a member of the GRADE working group which has received funding from various governmental entities in the US and Europe, such as the AHRQ. Some of the GRADE work he has done is supported in part by grant # 1 R13 HS016880-01 from the Agency for Healthcare Research and Quality (AHRQ).
Content
Part 1
• A 7 minute version of GRADE
Part 2
• Rapid interactive exchange contrasting GRADE basic vs. the full GRADE approach
• Advantages of a structured approach
• Asking good clinical questions
• Systematic review vs. ad hoc approaches
• Grading the quality of evidence
• How to determine the strength of recommendations
Question to the audience
Decisions in your medical practice are based on:
A. Training, experience and knowledge of respected colleagues
B. Patient preferences
C. Convincing evidence (non experimental) from case reports, case series, disease mechanism
D. RCTs, systematic reviews of RCTs and meta-analyses
E. All of the above
Evidence-based clinical decisions
Research evidence
Patient values and preferences
Clinical circumstances
Expertise
Haynes et al. 2002
A real world example…
P: In patients with acute hepatitis C …
I: Should anti-viral treatment be used …
C: Compared to no treatment …
O: To achieve viral clearance?
Organization: Evidence; Recommendation
AASLD (2009): B; Class I
VA (2006): II-1; “Should be initiated…”
SIGN (2006): 1+; A
AGA (2006): -/-; “Most authorities…”
AWMF (2004): -/-; B “It works…”
Question to the audience
By now…
A. …you are thoroughly confused
B. …you send her to a doctor because treatment is recommended
C. …you send her to a doctor but she can expect that, according to guidelines, she will not be treated
D. …you look at the evidence yourself because past experience tells you that guidelines don’t help
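GRADE is outcome-centric
[Figure: old system with a single label for the body of evidence (I, B, II, V, III) vs. GRADE with a separate quality rating for each outcome (Outcome #1, Outcome #2, Outcome #3; Quality: High, Moderate, Low)]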
GRADE process overview
Systematic review:
• Formulate question (PICO)
• Select outcomes
• Rate importance of each outcome (Critical, Important, Less important)
• Outcomes across studies: create evidence profile with GRADEpro
• Rate quality of evidence for each outcome (High, Moderate, Low, Very low)
  - RCT start high, obs. data start low
  - Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
  - Grade up: 1. Large effect 2. Dose response 3. Confounders
• Summary of findings & estimate of effect for each outcome
Guideline development (panel):
• Rate overall quality of evidence across outcomes based on lowest quality of critical outcomes
• Formulate recommendations:
  - For or against (direction)
  - Strong or weak (strength)
• By considering: quality of evidence, balance of benefits/harms, values and preferences
• Revise if necessary by considering: resource use (cost)
• Wording:
  - “We recommend using…”
  - “We suggest using…”
  - “We recommend against using…”
  - “We suggest against using…”
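The rating rules in the process overview (RCTs start high, observational data start low, grade down or up, overall quality follows the lowest-rated critical outcome) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Outcome` class, the function names, and the example outcomes are hypothetical and not part of GRADEpro or any GRADE tool, and treating each downgrade or upgrade as exactly one level is a simplification.

```python
from dataclasses import dataclass

# Quality levels in ascending order, as named in the process overview.
LEVELS = ["very low", "low", "moderate", "high"]


@dataclass
class Outcome:
    name: str
    importance: str      # "critical", "important" or "less important"
    design: str          # "rct" or "observational"
    downgrades: int = 0  # levels removed (risk of bias, inconsistency, ...)
    upgrades: int = 0    # levels added (large effect, dose response, confounders)


def rate_outcome(outcome: Outcome) -> str:
    """RCTs start high, observational data start low; then move down or up."""
    start = LEVELS.index("high") if outcome.design == "rct" else LEVELS.index("low")
    index = start - outcome.downgrades + outcome.upgrades
    index = max(0, min(index, len(LEVELS) - 1))  # clamp to the four levels
    return LEVELS[index]


def overall_quality(outcomes: list[Outcome]) -> str:
    """Overall quality is the lowest quality among the critical outcomes."""
    critical = [rate_outcome(o) for o in outcomes if o.importance == "critical"]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


# Hypothetical outcomes, invented purely for illustration.
example = [
    Outcome("mortality", "critical", "rct", downgrades=1),
    Outcome("symptomatic VTE", "critical", "rct", downgrades=2),
    Outcome("readmission", "important", "observational"),
]
print([rate_outcome(o) for o in example])
print(overall_quality(example))
```

In this made-up example the "important" outcome does not influence the overall rating; only the two critical outcomes do.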
Question to the audience
Which question follows a well structured clinical PICO format:
A. What is the evidence that food allergens cause eosinophilic esophagitis?
B. Is it known what the evidence is that aspirin can prevent progression of dysplasia to cancer in Barrett’s esophagus?
C. In patients undergoing hip replacement, does warfarin compared to aspirin reduce venous thromboembolism, pulmonary embolism and mortality?
That’s an excellent question
Translating informal clinical questions into specific PICO questions = central to GRADE
Even if an organization has limited resources, taking care of this step actually saves resources:
• Helps limit your scope
• Specifies the search strategy more clearly
• Guides data extraction
• Helps with formulating recommendations
Taking it to the next level
Informal question: Whether to use thromboprophylaxis for VTE prophylaxis (drugs)
PICO question:
• Population: Patients undergoing THR
• Intervention(s): Any drug (ASA, LDUH, LMWH, fondaparinux, direct thrombin inhibitors)
• Comparator(s): No anticoagulation
• Outcome(s): Asymptomatic DVT (surrogate for symptomatic VTE); symptomatic DVT; non-fatal PE; fatal PE; bleeding (operative site vs. non-operative site); readmission; re-operation; total mortality
Method: RCT, obs. studies
Importance of outcomes
P: In patients after hip replacement…
I: Should warfarin rather than…
C: Aspirin be given…
O: To reduce symptomatic venous thromboembolism and mortality?
Deciding on the importance of outcomes on decision making:
1 2 3: Less important
4 5 6: Important
7 8 9: Critically important
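As a small illustration of the 1 to 9 importance scale, the three bands (1 to 3, 4 to 6, 7 to 9) can be written as a lookup. The function name is hypothetical and this is only a sketch of the banding, not a tool used by the presenters.

```python
def importance_category(score: int) -> str:
    """Map a 1-9 importance rating to its band for decision making."""
    if not 1 <= score <= 9:
        raise ValueError("importance ratings run from 1 to 9")
    if score <= 3:
        return "less important"
    if score <= 6:
        return "important"
    return "critically important"


# Print the band for every point on the scale.
for score in range(1, 10):
    print(score, importance_category(score))
```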
Question to the audience
Please rate outcome: Dying from pulmonary embolism
Deciding on the importance of outcomes on decision making (scale 1 to 9):
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
Question to the audience
Asymptomatic deep vein thrombosis in the calf (e.g., as seen on mandatory venography at end of study)
Deciding on the importance of outcomes on decision making (scale 1 to 9):
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
Question to the audience
Stomach ulcer bleeding requiring endoscopy
Deciding on the importance of outcomes on decision making (scale 1 to 9):
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
Question to the audience
Regular blood work and dose adjustments
Deciding on the importance of outcomes on decision making (scale 1 to 9):
A. (1, 2, 3): Less important for decision making
B. (4, 5, 6): Important for decision making
C. (7, 8, 9): Critically important for decision making
Rating the importance of outcomes
Train the content expert to understand that outcomes that are critical for decision making are identified
Rating is done before, during and after the evidence review
The rating may change in light of new information
[GRADE process overview diagram, repeated: formulate the PICO question, select outcomes and rate their importance, rate the quality of evidence for each outcome, rate the overall quality across critical outcomes, formulate recommendations]
Taking it to the next level
• Early involvement of consumers in the guideline development process
• Selecting systematic reviews that are known to make an effort to include consumer views (e.g., Cochrane etc.)
• Can be used to identify research gaps
Evidence review stage
What format of evidence do you use?
• Using mainly systematic reviews (SR)
• Mainly using single study data
[Flowchart; node labels as extracted:]
• Don’t have the resources: Search for SR (ready to use SR / not ready to use SR); Use GRADE without evidence profiles
• Have the resources: Do it in-house or out-source; Utilize the full GRADE framework (± evidence profiles)
• Update SR; Ad hoc reviews
• Cost scale: $$$ to $
Question to the audience
Select the best answer: You can find high quality systematic reviews for “free” here:
A. AHRQ
B. The Cochrane Library
C. Canadian Agency for Drugs and Technologies in Health (CADTH)
D. National Institute for Clinical Excellence (NICE), UK
E. All of the above
Taking it to the next level
• What to look for when selecting evidence review centers
• Commissioning systematic reviews:
  - Making sure the center understands GRADE requirements
  - What SR methodology they use
  - What databases they can search
  - What software they use
  - How they document their work
Question to the audience
GRADE rating evidence: The quality of evidence may need downgrading if:
A. The outcome is reduction of elevated pressure in the eye (IOP) instead of loss of vision
B. There are large losses to follow-up
C. Some trials show benefits, others report harms
D. The confidence interval is wide and there are few events
E. All of the above
Quality of evidence: beyond risk of bias
Definition: The extent to which our confidence in an estimate of the treatment effect is adequate to support a particular recommendation
• Methodological limitations
• Inconsistency of results
• Indirectness of evidence
• Imprecision of results
• Publication bias
Risk of bias:
• Allocation concealment
• Blinding
• Intention-to-treat
• Follow-up
• Stopped early
Sources of indirectness:
• Indirect comparisons
• Patients
• Interventions
• Comparators
• Outcomes
Quality assessment criteria
Study design:
• Randomized trials
• Observational studies
Quality of evidence: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
What can raise the quality of evidence?
Question to the audience
A systematic review of observational studies showed a relationship between front sleeping position (versus back position) and sudden infant death syndrome (SIDS): OR 2.93 (1.15, 7.47). Rate the quality of evidence for the outcome SIDS:
A. High
B. Moderate
C. Low
D. Very low
Question to the audience
You review all colonoscopies for average risk screening in your health system and document the percentage of patients who developed a perforation after the procedure (evidence of free air on imaging). No comparison group without colonoscopy is available. Rate the quality of evidence for the outcome perforation:
A. High
B. Moderate
C. Low
D. Very low
Question to the audience
Several RCTs have shown the effectiveness of natalizumab to induce remission in Crohn’s disease. Study/post-marketing data showed 31 cases of potentially lethal progressive multifocal leukoencephalopathy (PML, JC virus related). Rate the quality of evidence for PML:
A. High
B. Moderate
C. Low
D. Very low
Quality assessment criteria
Study design:
• Randomized trials
• Observational studies
Quality of evidence: High, Moderate, Low, Very low
Lower if…
• Study limitations (design and execution)
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if…
• Large effect (e.g., RR 0.5)
• Very large effect (e.g., RR 0.2)
• Evidence of dose-response gradient
• All plausible confounding would reduce a demonstrated effect
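The "large effect" criterion can be illustrated with a short sketch. It rests on several assumptions that the slide does not state: the example values RR 0.5 and RR 0.2 are used here as cutoffs, ratios above 1 are folded onto the same scale (so RR 2 is treated like RR 0.5), and a large or very large effect is taken to raise quality by one or two levels. The function name is hypothetical.

```python
def effect_size_upgrade(rr: float, large: float = 0.5, very_large: float = 0.2) -> int:
    """Return the assumed number of levels to upgrade for effect size.

    Cutoffs default to the slide's example values (RR 0.5 and RR 0.2);
    using them as strict thresholds is an assumption of this sketch.
    """
    if rr <= 0:
        raise ValueError("a relative risk must be positive")
    folded = min(rr, 1 / rr)  # treat RR 3 the same as RR 1/3
    if folded <= very_large:
        return 2
    if folded <= large:
        return 1
    return 0


# Print the assumed upgrade for a few relative risks on both sides of 1.
for rr in (0.8, 0.4, 0.1, 3.0, 10.0):
    print(rr, effect_size_upgrade(rr))
```

Whether an upgrade actually applies remains a judgment about the whole body of evidence; the sketch covers only the magnitude check.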
“Categories” of quality (1)
High: Further research is very unlikely to change our confidence in the estimate of effect
Moderate: Further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate
Low: Further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate
Very low: Any estimate of effect is very uncertain
Conceptualizing quality (2)
High: We are very confident that the true effect lies close to that of the estimate of the effect.
Moderate: We are moderately confident in the estimate of effect: The true effect is likely to be close to the estimate of effect, but there is a possibility that it is substantially different.
Low: Our confidence in the effect is limited: The true effect may be substantially different from the estimate of the effect.
Very low: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect.
Taking it to the next level
• Advantages of systematically assessing quality of evidence
• Downgrading and upgrading “on-the-fly” can introduce errors
GRADE evidence profile
Study / year: RE-MOBILIZE 2009
Treatment: dabigatran 220 mg QD; dabigatran 150 mg QD; enoxaparin 30 mg BID
Allocation concealment: Yes (IVRS) (blocks of 6)
Blinding: Patients: Y; Caregivers: Y; Data coll: PY; Adjudic: Y; Data analysts: ?
No outcome (%): 269/862 (31.2%); 232/877 (26.5%); 239/876 (27.3%)
Analysis: ITT: no
Comments: Low dose ASA and stocking allowed, but not pneumatic devices
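The "No outcome (%)" figures in the evidence profile can be re-derived from the counts as a quick consistency check. The sketch below assumes that the three count pairs correspond to the three treatment arms in the order listed; that pairing is inferred from the layout, not stated in the source.

```python
# Counts copied from the "No outcome (%)" column of the evidence profile.
# Pairing each count with a treatment arm by listing order is an assumption.
no_outcome = {
    "dabigatran 220 mg QD": (269, 862),
    "dabigatran 150 mg QD": (232, 877),
    "enoxaparin 30 mg BID": (239, 876),
}


def percent(count: int, total: int) -> float:
    """Share of participants without outcome data, to one decimal place."""
    return round(100 * count / total, 1)


for arm, (count, total) in no_outcome.items():
    print(f"{arm}: {count}/{total} = {percent(count, total)}%")
```

The recomputed values agree with the percentages shown in the profile (31.2%, 26.5%, 27.3%).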
Question to the audience
PICO: Should children with otitis media be treated with antibiotics? Rate the overall quality of evidence for this clinical question by evaluating all critical outcomes (use the evidence profile):
A. High
B. Moderate
C. Low
D. Very low
[Figure: GRADE process from clinical question to recommendation]
Clinical question (PICO)
Panel: select outcomes and rate importance (Critical, Important, Critical, Less important, Important)
Outcomes across studies: quality rating for each outcome (High, Moderate, Low, Very low); grade down or up
Overall quality of evidence
Formulate recommendations:
• For or against (direction)
• Strong or weak (strength)
By considering: quality of evidence, balance of benefits/harms, values and preferences
Revise if necessary by considering: resource use (cost)
Question to the audience
PICO: Should children with otitis media be treated with antibiotics?
Rate the overall strength of recommendation:
A. “We recommend early antibiotics in children with acute otitis media”
B. “We suggest early antibiotics…”
C. “We suggest against using antibiotics initially…”
D. “We recommend against using antibiotics initially…”
Strength of recommendation
“The strength of a recommendation reflects the extent to which we can,
across the range of patients for whom the recommendations are intended,
be confident that desirable effects of a management strategy outweigh undesirable effects.”
4 determinants of the strength of recommendation
Factors that can weaken the strength of a recommendation, each with its explanation:
• Lower quality evidence: The higher the quality of evidence, the more likely is a strong recommendation.
• Uncertainty about the balance of benefits versus harms and burdens: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
• Uncertainty or differences in patients’ values: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
• Uncertainty about whether the net benefits are worth the costs: The higher the costs of an intervention – that is, the more resources consumed – the less likely a strong recommendation is warranted.
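Once a panel has settled on direction (for or against) and strength (strong or weak), the four standard wordings used in this deck follow mechanically. A minimal sketch, with a hypothetical function name; it covers only the phrasing, not the judgment behind it.

```python
def recommendation_wording(direction: str, strength: str) -> str:
    """Build the opening phrase for a recommendation.

    direction: "for" or "against"; strength: "strong" or "weak".
    Strong recommendations use "recommend", weak ones use "suggest".
    """
    verbs = {"strong": "We recommend", "weak": "We suggest"}
    tails = {"for": "using…", "against": "against using…"}
    if strength not in verbs or direction not in tails:
        raise ValueError("unknown strength or direction")
    return f"{verbs[strength]} {tails[direction]}"


# Print all four combinations of direction and strength.
for direction in ("for", "against"):
    for strength in ("strong", "weak"):
        print(recommendation_wording(direction, strength))
```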
Implications of a strong recommendation
Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
Clinicians: Most patients should receive the recommended course of action
Policy makers: The recommendation can be adopted as a policy in most situations
Implications of a weak recommendation
Patients: The majority of people in this situation would want the recommended course of action, but many would not
Clinicians: Be prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making
Policy makers: There is a need for substantial debate and involvement of stakeholders
Taking it to the next level
• Explicit separation of quality of evidence from making recommendations
• Correctly balancing the benefits against the undesirable effects
• Special challenges: resource use
• Increasing transparency in the process of making recommendations
Question to the audience
Should patients with chronic hepatitis C be treated with interferon/ribavirin combination? There is high quality evidence for benefits and high quality evidence for harms. Rate the overall strength of recommendation:
A. “We recommend treatment of chronic hepatitis C”
B. “We suggest treatment…”
C. “We suggest against treating patients…”
D. “We recommend against treating patients…”
Patient values & preferences
• In the absence of evidence, guideline panels have to function as surrogates to estimate values and preferences (V&P)
• Consumer involvement can help
• Attaching V&P statements to guideline recommendations increases transparency
Taking it to the next level
• Systematically searching the literature for studies of values and preferences
• Systematic reviews of V&P
• Querying the guideline panel to rate health utilities of outcomes using case scenarios
Question to the audience
Please select the most appropriate answer. The reason you attended this session:
A. Just interested in the topic
B. Have been involved in narrative evidence reviews, but have not used any formal grading system
C. Have used a grading system but not GRADE
D. Using or considered using GRADE
Question to the audience
Please select the most appropriate answer. Selecting a system to rate the quality of evidence and strength of recommendations, such as GRADE:
A. Appears too expensive to implement
B. Appears valuable, but still requires substantial upfront expense
C. Appears to have some upfront cost but long-term savings
D. I use GRADE – it has been paying off for me
Basic dimensions
Guideline work aligns along 3 basic dimensions
• High quality vs. low quality
• Fast vs. slow
• Expensive vs. cheap
Ideal vs. practical ad hoc GRADE approaches
Stage: Ideal
• Elements: Systematic review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Follows highest standards; Methodolog. most rigorous; Easily maintainable; Fully transparent process
• Comment: Access to methodologist; Access to evidence centers; Initially more resource intensive, long-term savings
Stage: Intermediary
• Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Still retaining major advantages of the “ideal approach”
• Comment: Risk of bias higher; Access methodologist rec.; Only minimal addl. cost
Stage: Initiation
• Elements: Ad hoc review; GRADE eTables; Qual. of evidence; Strength of rec.
• Advantage: Option to fully “upgrade” to an “ideal approach”; Foundation of a methodologically sound system
• Comment: Risk of bias higher; Access methodologist prn; No additional cost
Sources of funding
• Funders may have an agenda
• Industry – tricky
• Foundations
• Public – AHRQ, criteria:
  - EHC program fit (3: available, relevance for public payer, priority condition)
  - Importance (7: e.g., public interest etc.)
  - No duplication
  - Feasibility
  - Impact (6: e.g., addresses inequity)
Taking it to the next level
• Long term planning
• Create a high quality guideline product
• Attract high quality guideline panel:
  - Unconflicted methodologist (editor)
  - Content expert (deputy editor)
  - Content expert authors
  - Health economists
Taking it to the next level
GRADE evidence profiles
• Condensed and standardized summary of evidence
• Are increasingly already created as part of a systematic review (e.g., Cochrane reviews)
• Flexible presentation (e.g., as summary of findings tables)
• Initial investment
• Long-term value
• GRADEpro software (tie-in with RevMan)
• Avoids duplication of efforts across the globe
Vision
1. Globalize the evidence, localize recommendations
2. Focus on questions that are important to patients and clinicians
3. Undertake collaborative evidence reviews
4. Use a common metric to assess the quality of evidence and strength of recommendations
5. Examine collaborative models for funding
Schünemann 2009
GRADE uptake
Conclusion
Gaining acceptance as international standard because GRADE adds value:
1. Criteria for evidence assessment across a range of questions and outcomes
2. Sensible, systematic, fostering transparency
3. Balance between simplicity and methodological rigor