Management of Neonatal Hyperbilirubinemia
Methods of the AHRQ Evidence Report
FDA Advisory Committee Meeting, June 11, 2003
Joseph Lau, MD
Tufts-New England Medical Center EPC
INVESTIGATORS
Stanley Ip, MD
Mei Chung, MPH
Stephan Glicken, MD
John Kulig, MD
Rebecca O’Brien, MD
Robert Sege, MD, PhD
Joseph Lau, MD
Evidence report process
• Rigorous, comprehensive syntheses and analyses of relevant scientific literature
• Explicit and detailed documentation of methods, rationale, and assumptions
• Scientific syntheses may include meta-analyses and cost analyses
• Broad range of experts is included in the development process
• Reports do NOT make clinical recommendations
Systematic review process
• Formulate well focused study questions
• Establish evidence review protocol (inclusion and exclusion criteria)
• Perform comprehensive literature search
• Screen abstracts and full articles
• Abstract data and perform critical appraisal
• Perform analyses, summarize and interpret results
Key questions
Association of neonatal hyperbilirubinemia with neurodevelopmental outcomes
1. What is the relationship between peak bilirubin levels and/or duration of hyperbilirubinemia and developmental outcome?
2. What is the evidence for effect modification of the results in question 1, by gestational age, hemolysis, serum albumin, and other factors?
Key questions (cont.)
Treatments for neonatal hyperbilirubinemia
3. What are the quantitative estimates of efficacy of treatment for:
1. reducing peak bilirubin levels (e.g., number-needed-to-treat (NNT) at 20 mg/dl to keep total serum bilirubin (TSB) from rising);
2. reducing the duration of hyperbilirubinemia (e.g., average number of hours by which time TSB greater than 20 mg/dl may be shortened by treatment); and
3. improving neurodevelopmental outcomes.
Key questions (cont.)
Diagnosis of neonatal hyperbilirubinemia
4. What is the efficacy of various strategies for predicting hyperbilirubinemia, including hour-specific bilirubin percentiles?
5. What is the accuracy of transcutaneous bilirubin measurements?
Literature search
• Medline and Premedline databases searched September 2001, yielding 4,325 citations
• Consulted domain experts and reviewed bibliographies of relevant review articles for potential additional studies
• Supplemental search for case reports of kernicterus was also performed
General inclusion criteria
• English language human studies
• Newborns between birth and one month
• Healthy, full-term infants: ≥34 weeks EGA or ≥2,500 grams
• ≥10 subjects per arm (≥5 for Q1 and Q2)
• Additional criteria were applied to specific questions
Literature search results
• Total citations screened = 4,325
• Full articles retrieved = 663
• Studies included in report = 138*
  – Q1/Q2 = 37 + 28 kernicterus case reports
  – Q3 = 21
  – Q4 = 10
  – Q5 = 46
* Total of counts of individual questions exceeds 138 due to overlapping coverage
Summarizing and grading of evidence
Important parameters to sum up
• Methodological quality (internal validity, design, conduct, and reporting of the study)
• Applicability (generalizability, external validity, population, setting)
• Study size (weight, precision)
• Effect (results, associations, test performance)
Methodological quality
Refers to the design, conduct, and reporting of the clinical study. Because studies may use a variety of designs, the following three-level classification of study quality can be applied to each design type.
– Least potential bias (Grade A)
– Susceptible to some bias, but not sufficient to invalidate the results (Grade B)
– Significant bias that may invalidate the results (Grade C)
Applicability
Category 1: Sample is representative of the target population, or results are definitely applicable to the general population irrespective of the study sample.
Category 2: Sample is representative of a relevant subgroup of the target population.
Category 3: Sample is representative of a narrow subgroup of patients only, and not well generalizable to other subgroups.
Quantitative methods used in evidence report
Question 3: NNT
What are the quantitative estimates of efficacy of treatment for: reducing peak bilirubin levels (e.g., number-needed-to-treat (NNT) at 20 mg/dl to keep total serum bilirubin (TSB) from rising)?
Hypothetical example of treating bilirubin at 15 mg/dl to prevent it from rising

                  Treat at 15 mg/dl   Not treat
Rise (pts)               10              20
Not rise (pts)           90              80
Total (pts)             100             100

Risk Difference = 10/100 - 20/100 = -10/100 = -0.1
NNT = 1 / |Risk Difference| = 1 / 0.1 = 10
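The arithmetic in the hypothetical example above can be checked with a minimal sketch. The function names are illustrative and are not taken from the evidence report; the counts are the hypothetical ones on the slide.

```python
def risk_difference(events_treated, n_treated, events_control, n_control):
    """Risk in the treated group minus risk in the untreated group."""
    return events_treated / n_treated - events_control / n_control


def number_needed_to_treat(events_treated, n_treated, events_control, n_control):
    """NNT = 1 / |risk difference|; undefined when the two risks are equal."""
    rd = risk_difference(events_treated, n_treated, events_control, n_control)
    if rd == 0:
        raise ValueError("risk difference is zero; NNT is undefined")
    return 1 / abs(rd)


# Hypothetical slide data: bilirubin rises in 10 of 100 treated
# infants and in 20 of 100 untreated infants.
rd = risk_difference(10, 100, 20, 100)
nnt = number_needed_to_treat(10, 100, 20, 100)
print(f"Risk difference = {rd:.2f}, NNT = {nnt:.0f}")
```

The absolute value is taken because the sign of the risk difference only records which arm had fewer events; the NNT itself is reported as a positive count of patients.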
Methods to assess agreement between two testing methods reported in studies
• Correlation (r value)
  – Meta-analyses performed in evidence report when data available
• Bland and Altman method (difference of results of two testing methods plotted against their mean value)
  – Preferred method
Accuracy of Bilicheck™
Bhutani et al., Pediatrics 2000
Limitations of correlation coefficient to assess agreement
(hypothetical data - all have correlation coefficient of 1)
[Figure: new measuring device (y-axis) plotted against HPLC bilirubin, the reference standard (x-axis); both axes run from 0 to 45]
Limitations of correlation coefficients in assessing agreement between two testing methods
• Correlation coefficient provides a measure of the strength and directionality of the association, but NOT agreement
• Correlation measures ignore bias
• Correlation coefficient does not provide information as to clinical utility of diagnostic test
• Correlation coefficient (r) is dependent on distribution of serum bilirubin
• Measures relative rather than absolute agreement
• High correlation coefficient is a necessary but not a sufficient condition to assess agreement
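The point that correlation ignores bias can be illustrated with a short sketch. The reference values and the three hypothetical devices below are invented for illustration only; they are not data from the report or from Bhutani et al.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_difference(reference, device):
    """Average of (device - reference): the bias that r does not see."""
    return sum(d - r for r, d in zip(reference, device)) / len(reference)


# Hypothetical reference values and three hypothetical devices.
reference = [5.0, 10.0, 15.0, 20.0, 25.0]
devices = {
    "identical": list(reference),
    "constant offset (+5)": [v + 5 for v in reference],
    "proportional (x2)": [2 * v for v in reference],
}

for name, readings in devices.items():
    r = pearson_r(reference, readings)
    bias = mean_difference(reference, readings)
    print(f"{name}: r = {r:.3f}, mean difference = {bias:.1f}")
```

All three devices print r = 1.000, yet their mean differences from the reference are 0, 5, and 15, so only the first one actually agrees with it.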
Bland and Altman method
• True value is unknown
• Takes the average of the paired measurements as the best estimate
• For each pair of measurements, plot the difference in results between devices against the average of the results
• Removes the statistical artifact (built-in correlation) of plotting the difference against either measurement alone
• The magnitude of bias can be estimated as well as the standard deviation of the differences
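The quantities named above can be computed with a minimal sketch. The paired readings are hypothetical, and the 95% limits of agreement (bias ± 1.96 SD) are the conventional Bland and Altman summary, which the slide does not spell out; treat both as assumptions.

```python
import math


def bland_altman(method_a, method_b):
    """Return plot coordinates plus bias, SD, and 95% limits of agreement."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    # x-coordinates of the plot: average of each pair (best estimate of truth)
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    # y-coordinates of the plot: difference between the two methods
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "sd": sd,
        "limits": (bias - 1.96 * sd, bias + 1.96 * sd),
    }


# Hypothetical paired readings in mg/dl (transcutaneous vs. serum).
tcb = [6.1, 9.8, 12.5, 14.9, 18.2]
tsb = [5.8, 10.4, 12.0, 15.6, 19.0]
result = bland_altman(tcb, tsb)
print(f"bias = {result['bias']:.2f}, SD = {result['sd']:.2f}")
print("95% limits of agreement = ({:.2f}, {:.2f})".format(*result["limits"]))
```

Plotting `diffs` against `means` gives the Bland and Altman plot; the bias and limits are usually drawn as horizontal lines.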
Error distribution paired HPLC TSB and TcB
Bhutani et al., Pediatrics 2000
Common methods to summarize diagnostic test performance
• Combining sensitivity and specificity independently
• Combining diagnostic odds ratios across studies
• Summary ROC curve
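The first two approaches can be sketched as follows. The slides do not say how studies were weighted, so two common choices are assumed here: summing 2x2 counts for sensitivity and specificity, and fixed-effect inverse-variance pooling of the log diagnostic odds ratio. The study counts are hypothetical.

```python
import math


def study_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and diagnostic odds ratio for one 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)  # assumes no zero cells
    return sensitivity, specificity, dor


def pool_sens_spec(studies):
    """Pool sensitivity and specificity independently by summing counts."""
    tp = sum(s[0] for s in studies)
    fn = sum(s[1] for s in studies)
    fp = sum(s[2] for s in studies)
    tn = sum(s[3] for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


def pool_dor(studies):
    """Fixed-effect inverse-variance pooled DOR (assumes no zero cells)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for tp, fn, fp, tn in studies:
        log_dor = math.log((tp * tn) / (fp * fn))
        variance = 1 / tp + 1 / fn + 1 / fp + 1 / tn
        weighted_sum += log_dor / variance
        weight_total += 1 / variance
    return math.exp(weighted_sum / weight_total)


# Hypothetical studies as (TP, FN, FP, TN) counts.
studies = [(45, 5, 20, 80), (30, 10, 10, 90), (60, 15, 25, 100)]
sens, spec = pool_sens_spec(studies)
print(f"pooled sensitivity = {sens:.2f}, pooled specificity = {spec:.2f}")
print(f"pooled DOR = {pool_dor(studies):.1f}")
```

Pooling sensitivity and specificity separately ignores that the two trade off as the test threshold changes, which is what the summary ROC approach is meant to address.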
Summary ROC method
Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: Data-analytic approaches and some additional considerations. Stat Med 1993; 12:1293-1316.
• Assumption: study results differ because of different thresholds
• Solution: fit a curve in the ROC space that best describes the data
• Problem: sensitivity and specificity are correlated
• Solution: regress the difference of the logits onto the sum of logits and transform back to ROC space
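The regression step above can be sketched as follows, assuming an unweighted least-squares fit of D = logit(sensitivity) - logit(1 - specificity) on S = logit(sensitivity) + logit(1 - specificity). The study values are hypothetical, and refinements such as continuity corrections or weighting are left out.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def fit_sroc(studies):
    """Fit D = a + b*S by ordinary least squares.

    studies: list of (sensitivity, specificity) pairs, each strictly
    between 0 and 1. Returns the intercept a and slope b.
    """
    s_vals, d_vals = [], []
    for sensitivity, specificity in studies:
        tpr, fpr = sensitivity, 1 - specificity
        d_vals.append(logit(tpr) - logit(fpr))  # difference of the logits
        s_vals.append(logit(tpr) + logit(fpr))  # sum of the logits
    n = len(studies)
    s_mean, d_mean = sum(s_vals) / n, sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    if sxx == 0:
        raise ValueError("all studies share the same S; slope is undefined")
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(fpr, a, b):
    """Transform the fitted line back to ROC space (requires b != 1)."""
    return expit(a / (1 - b) + (1 + b) / (1 - b) * logit(fpr))


# Hypothetical (sensitivity, specificity) pairs from four studies.
studies = [(0.90, 0.70), (0.80, 0.85), (0.95, 0.60), (0.70, 0.92)]
a, b = fit_sroc(studies)
for fpr in (0.1, 0.2, 0.3):
    print(f"1 - specificity = {fpr:.1f} -> sensitivity = {sroc_sensitivity(fpr, a, b):.2f}")
```

The back-transformation follows from solving D = a + bS for logit(sensitivity): logit(sensitivity) = a/(1 - b) + ((1 + b)/(1 - b)) logit(1 - specificity).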
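ROC curve constructed from multiple test thresholds
[Figure: test-result distributions for diseased and not-diseased groups with multiple thresholds (a, b, c, d) evaluated in the test; each threshold corresponds to a point on the ROC curve, plotted as sensitivity (y-axis) against 1 - specificity (x-axis)]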
Examples of SROC curves and pooled sensitivity and specificity