Critical appraisal of a diagnostic paper
Shakila Thangaratinam
Professor of Maternal and Perinatal Health
Women’s Health Research Unit
R & D Director of Women’s Health
Barts Health NHS Trust
Relation of diagnosis and therapy in clinical practice
[Figure: flow diagram. Healthy and diseased population → Diagnosis → Diseased population → Therapy → Clinical outcome]
A hierarchical model for evaluation of tests
• Assessment of the reliability and other technical aspects of a test
• Assessment of the diagnostic accuracy
• Assessment of the diagnostic effectiveness and cost-effectiveness
Fryback et al. Med Decis Making 1991;11:88-94
How does a diagnostic test help?
A test is useful if it changes our ability to predict whether the patient has the disease or not
• The likelihood of disease before the test should be altered by the test result
• The likelihood of disease after the test should differ from the prevalence of disease in the population
Value of a test
Likelihood of disease          Test result   Likelihood of disease
before test                                  after test

Prevalence of disease          +             Should raise the likelihood to ~100%
(e.g. 7-11% of pregnant        -             Should lower the likelihood to ~0%
women have preterm delivery)
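One common way to make the step from pre-test to post-test likelihood concrete is the odds form of Bayes' theorem. The sketch below is illustrative only. The function name `post_test_probability` and the likelihood ratios of 6 and 0.12 are assumptions chosen for the example, not values from any study. The 10% pre-test probability is picked from within the 7-11% prevalence range quoted above.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability into a post-test probability.

    Uses the odds form of Bayes' theorem:
    post-test odds = pre-test odds x likelihood ratio.
    """
    if not 0 < pre_test_prob < 1:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical inputs (assumptions for illustration only).
prevalence = 0.10    # pre-test probability of disease
lr_positive = 6.0    # assumed likelihood ratio for a positive result
lr_negative = 0.12   # assumed likelihood ratio for a negative result

after_positive = post_test_probability(prevalence, lr_positive)
after_negative = post_test_probability(prevalence, lr_negative)
print(f"After a positive result: {after_positive:.1%}")
print(f"After a negative result: {after_negative:.1%}")
```

With these assumed numbers a positive result raises the likelihood well above the prevalence and a negative result lowers it, but neither reaches the ideal of ~100% or ~0%.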
How to do systematic reviews
Formulate clear clinical questions from our knowledge needs identified in patient encounters
Search the literature to identify relevant articles
Critically appraise the evidence for its validity and usefulness
Synthesise the evidence
Implement useful findings in clinical practice
Critical Appraisal of the Medical Literature
• Are the results of the study valid?
• What are the results?
• Will the results help in caring for patients?
Oxman et al. JAMA 1993;270:2093-5
A test accuracy study
[Figure: the study sample undergoes the test. Test-positive and test-negative cases are each verified against the gold standard and classified as disease present or disease absent.]
Are the results valid?
• Was there an independent “blind” comparison with a reference standard?
• Did the patient sample include an appropriate spectrum of patients to whom the test will be applied in clinical practice?
• Did the results of the test influence the decision to perform the reference standard?
Are the results valid?
[Figure: design of a valid test accuracy study]
• Study sample: appropriate spectrum of disease; consecutive or random sample; prospectively recruited
• Test: cases divided into test positive and test negative
• Gold standard: independent, “blind” verification of all test positive cases and of all test negative cases, classifying each as disease present or disease absent
Hierarchy of evidence for test accuracy studies
1 An independent, blind comparison with reference standard among an appropriate population of consecutive patients.
2 An independent, blind comparison with reference standard among an appropriate population of non-consecutive patients or confined to a narrow population of study patients.
3 An independent, blind comparison among an appropriate population of patients, but reference standard not applied to all study patients.
4 Non-blind comparison or reference standard not applied independently
5 Expert opinion with no explicit critical appraisal, based on physiology, bench research or first principles
What are the results?
What is the “benefit” or “added value” of the test as a diagnostic aid? Can we measure this added benefit?
• Are clinically meaningful measures of diagnostic accuracy provided?
• Are the data necessary to calculate the diagnostic accuracy measures provided?
Consequences of testing
[Figure: the study sample undergoes the test, and each result is verified against the gold standard]
• Test positive, disease present: true positive
• Test positive, disease absent: false positive
• Test negative, disease present: false negative
• Test negative, disease absent: true negative
Measures of test accuracy

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN

• Sensitivity and Specificity
• Predictive values
• Likelihood ratios
• Diagnostic odds ratio
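The slides list likelihood ratios and the diagnostic odds ratio without spelling out formulas, so the sketch below uses the standard textbook definitions. The function names and the 2x2 counts (TP=90, FP=30, FN=10, TN=170) are hypothetical and chosen only for illustration.

```python
def likelihood_ratios(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (LR+, LR-) from the four cells of a 2x2 table.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    Assumes no row or column total is zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def diagnostic_odds_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """Return the diagnostic odds ratio, (TP x TN) / (FP x FN).

    Assumes FP and FN are both non-zero.
    """
    return (tp * tn) / (fp * fn)


# Hypothetical study counts (assumptions for illustration only).
tp, fp, fn, tn = 90, 30, 10, 170

lr_pos, lr_neg = likelihood_ratios(tp, fp, fn, tn)
dor = diagnostic_odds_ratio(tp, fp, fn, tn)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

The diagnostic odds ratio equals LR+ divided by LR-, which gives a quick consistency check on the three values.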
Sensitivity
Sensitivity is the proportion of those people who really have the disease (TP+FN) who are correctly identified as such (TP)
• Sensitivity = TP/(TP+FN)

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN

Specificity
Specificity is the proportion of those people who really do not have the disease (TN+FP) who are correctly identified as such (TN)
• Specificity = TN/(TN+FP)

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN

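As a worked illustration of the two definitions above, the following minimal sketch computes sensitivity and specificity from a 2x2 table. The function names and the counts (TP=90, FP=30, FN=10, TN=170) are hypothetical, not data from a real study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of diseased people (TP+FN) who test positive (TP)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of non-diseased people (TN+FP) who test negative (TN)."""
    return tn / (tn + fp)


# Hypothetical study counts (assumptions for illustration only).
tp, fp, fn, tn = 90, 30, 10, 170

sens = sensitivity(tp, fn)   # uses only the disease-present column
spec = specificity(tn, fp)   # uses only the disease-absent column
print(f"Sensitivity = {sens:.2f}, Specificity = {spec:.2f}")
```

Each measure draws on a single column of the table, so neither depends on how many diseased and non-diseased people the study happened to include.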
Specificity
Specificity is the proportion of those people who really do not
have the disease (TN+FP) who are correctly identified as such (TN)
• Specificity = TN/(TN+FP)
TP FP
FN TN
DiseasePresent Absent
Test
Positive
Negative
Predictive Values
PPV
Positive Predictive Value is the proportion of the people who test positive (TP+FP) who truly have the disease (TP)
• Positive predictive value = TP/(TP+FP)

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN

NPV
Negative Predictive Value is the proportion of the people who test negative (TN+FN) who truly do not have the disease (TN)
• Negative predictive value = TN/(TN+FN)

                 Disease present   Disease absent
Test positive    TP                FP
Test negative    FN                TN

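The predictive values read the 2x2 table by rows rather than by columns. The short sketch below applies the two formulas above to hypothetical counts (TP=90, FP=30, FN=10, TN=170). The function names are assumptions made for the example.

```python
def positive_predictive_value(tp: int, fp: int) -> float:
    """Proportion of test-positive people (TP+FP) who have the disease (TP)."""
    return tp / (tp + fp)


def negative_predictive_value(tn: int, fn: int) -> float:
    """Proportion of test-negative people (TN+FN) who are disease-free (TN)."""
    return tn / (tn + fn)


# Hypothetical study counts (assumptions for illustration only).
tp, fp, fn, tn = 90, 30, 10, 170

ppv = positive_predictive_value(tp, fp)   # uses only the test-positive row
npv = negative_predictive_value(tn, fn)   # uses only the test-negative row
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Because each row mixes diseased and non-diseased people, these values change when the proportion of diseased people in the sample changes.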
Problems
• Sensitivity and specificity are characteristics of the test
• Predictive values are dependent on the prevalence of the disease
• Our population is often quite different from the study population
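The dependence of predictive values on prevalence can be seen by holding sensitivity and specificity fixed and varying the prevalence. This is a minimal sketch. The function name `predictive_values`, the sensitivity of 0.90, the specificity of 0.85 and the three prevalence values are all assumptions chosen for illustration.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence.

    Each cell of the 2x2 table is expressed as a proportion of the
    whole population, then the usual row-wise formulas are applied.
    """
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Same hypothetical test, three hypothetical populations.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.85, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumptions the PPV rises and the NPV falls as prevalence increases, which is why predictive values reported in a study may not transfer to a population with a different prevalence.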
The systematic review process
1 Formulate research question
2 Design search strategy
3 Search bibliographic databases
4 Identify possible papers from titles/abstracts
5 Retrieve papers
6 Further selection of primary studies using inclusion criteria
7 Quality appraisal
8 Extract data
9 Synthesis
10 Formulate research / policy conclusions