Another easy-to-digest presentation on epidemiology. This presentation focuses on how tests used in medical research are assessed, especially kits used to screen large populations.
VALIDITY AND RELIABILITY OF SCREENING TESTS
SCREENING OR DIAGNOSTIC TESTS
Epidemiology is about (among other things) determining the prevalence or incidence of disease in populations.
Usually, a population is examined to decide whether a condition is present or not.
A screening procedure aims at early detection of a disease process.
Both the procedure and the examiner must be valid and reliable.
VALIDITY & RELIABILITY:
Validity (accuracy) is the ability to recognize whether a condition is present or absent.
There must also be reliability (consistency).
Reliability is the ability to produce the same finding when the examination is done more than once.
VALIDITY:
Simply the ability of a test to do what it purports to do (to be ACCURATE),
i.e. to correctly categorize those that are +ve and to correctly categorize those that are -ve.
Consider a diagnostic test with dichotomous results (e.g. Dipstix).
                 DISEASE STATUS / DIAGNOSIS (TRUTH)
SCREENING TEST   POSITIVE   NEGATIVE   TOTAL
POSITIVE         a (TP)     b (FP)     a+b
NEGATIVE         c (FN)     d (TN)     c+d
TOTAL            a+c        b+d        a+b+c+d
VALIDITY Cont….:
a = Those with the disease whom the test detects (True positives - TP)
b = Those without the disease whom the test says have it (False positives - FP)
c = Those with the disease whom the test says don't have it (False negatives - FN)
d = Those without the disease whom the test says don't have it (True negatives - TN)
Measuring (Quantifying) validity:
Sensitivity = proportion of positives (those with the disease) that the test is able to detect,
i.e. a / (a + c) (the probability that a +ve will be called +ve).
(Able to give +ve findings when the person has the disease.)
Measures of validity cont…:
Specificity = proportion of those without the disease that the test is able to identify as negative,
i.e. d / (b + d) (the probability that a -ve will be called -ve).
(Able to give -ve findings when the person has no disease.)
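The two formulas above can be checked with a short Python sketch; the cell counts a, b, c, d below are hypothetical, chosen only to illustrate the arithmetic.

```python
# Sensitivity and specificity from a 2x2 screening table.
# Cells follow the table above: a = TP, b = FP, c = FN, d = TN.

def sensitivity(a: int, c: int) -> float:
    """Proportion of diseased persons the test detects: a / (a + c)."""
    return a / (a + c)

def specificity(d: int, b: int) -> float:
    """Proportion of non-diseased persons the test clears: d / (b + d)."""
    return d / (b + d)

# Hypothetical counts for illustration:
a, b, c, d = 90, 40, 10, 160   # TP, FP, FN, TN

print(f"Sensitivity = {sensitivity(a, c):.0%}")   # 90 / (90 + 10)
print(f"Specificity = {specificity(d, b):.0%}")   # 160 / (40 + 160)
```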
Measures of validity cont…:
Accuracy thus combines sensitivity and specificity. As sensitivity ↑, false negatives (FN) ↓. As specificity ↑, false positives (FP) ↓.
Measures of validity cont…: In setting a test's cut-off point
(favouring sensitivity or specificity), one must consider the consequences of missing a positive or a negative. ↑Sensitivity (and ↓specificity) when the disease
is serious and treatment exists, or when it is spreading at a high rate (HIV!).
Measures of validity cont…: It is desirable to have high (100%) sensitivity
and specificity. In real life it isn't so, especially with continuous variables.
Lowering the criterion for +ve means more people with the disease will test +ve (↑sensitivity), but people without the disease will also ↑ among
those testing positive (↓specificity). (Thus the test will be very sensitive but less specific.)
When the criterion is raised, those correctly classed as without the disease will ↑ (↑specificity), but detected cases will ↓. Thus the test will be more specific but less sensitive.
Measuring (Quantifying) validity…:
Predictive Values: The accuracy of a test is alternatively described as
the extent to which being categorized as positive or negative predicts the presence or absence of the disease.
This is given as the positive and negative predictive values.
Measures of validity cont…:
Positive Predictive Value (PV+):
= (predictive value of a positive test) the percentage of persons deemed positive by the new test and confirmed so by the standard.
Measures of validity cont…: Negative Predictive Value (PV-):
= (predictive value of a negative test) the percentage of persons deemed negative by the new test and confirmed so by the standard.
(This is the proportion of people correctly labeled as diseased or not diseased.)
                 GOLD STANDARD (DIAGNOSTIC) TEST
SCREENING TEST   +        -        TOTAL
+                TP (a)   FP (b)   TP+FP
-                FN (c)   TN (d)   FN+TN
TOTAL            TP+FN    FP+TN    TP+FP+FN+TN
Validity…:
Sensitivity = TP / (TP + FN)
Specificity = TN / (FP + TN)
PV+ = TP / (TP + FP)
PV- = TN / (FN + TN)
Measures of validity cont…:
PV+ = a / (a + b) (proportion of +ves by the test who actually have the disease).
PV- = d / (c + d) (proportion of -ves by the test who are actually without the disease).
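The two predictive-value formulas can be sketched the same way; the counts below are hypothetical, for illustration only.

```python
# Positive and negative predictive values from the same 2x2 layout:
# PV+ = a / (a + b), PV- = d / (c + d).

def ppv(a: int, b: int) -> float:
    """Proportion of test positives who truly have the disease."""
    return a / (a + b)

def npv(d: int, c: int) -> float:
    """Proportion of test negatives who are truly disease-free."""
    return d / (c + d)

a, b, c, d = 90, 40, 10, 160   # hypothetical TP, FP, FN, TN

print(f"PV+ = {ppv(a, b):.1%}")   # 90 / 130
print(f"PV- = {npv(d, c):.1%}")   # 160 / 170
```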
Measures of validity cont…:
In a rare disease, PV- is high because most of those tested will be -ve.
Predictive values depend not only on the validity of the test (sensitivity, specificity) but also on the prevalence of the disease.
Measures of validity cont…: A test that is more specific makes a person with a
+ve test more likely to have the disease; thus the greater the PV+ (more accurately spotting the +ves).
A test that is more sensitive makes a person with a -ve test more likely to be free of the disease; thus the greater the PV-.
No matter how specific a test is, the positives in a disease with low prevalence are likely to be false positives.
PREDICTIVE VALUE & SPECIFICITY OF TEST
Specificity is one factor that affects the predictive value of a test.
An increase in specificity results in a much greater ↑ in PV+ than does the same ↑ in sensitivity.
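The dependence of PV+ on prevalence follows directly from the 2×2 cells. A minimal Python sketch, using a hypothetical test with 95% sensitivity and 95% specificity:

```python
# PV+ as a function of prevalence, for fixed sensitivity and specificity:
# PV+ = (sens * prev) / (sens * prev + (1 - spec) * (1 - prev)).

def ppv_at_prevalence(prev: float, sens: float, spec: float) -> float:
    tp = sens * prev                 # true-positive fraction of the population
    fp = (1 - spec) * (1 - prev)     # false-positive fraction
    return tp / (tp + fp)

# Hypothetical test: 95% sensitivity, 95% specificity.
for prev in (0.01, 0.10, 0.40):
    print(f"prevalence {prev:>4.0%}: PV+ = {ppv_at_prevalence(prev, 0.95, 0.95):.1%}")
```

Even this excellent test yields a PV+ of only about 16% at 1% prevalence, illustrating the point above about low-prevalence positives.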
Worked examples (each a 2×2 table of screening test (+/-) against diagnostic test (+/-), N = 1000):
1. Prev. = 50%, Sens. = 50%, Spec. = 50%, PV+ = ??
2. Prev. = 20%, Sens. = 50%, Spec. = 50%, PV+ = ??
3. Prev. = 20%, Sens. = 90%, Spec. = 50%, PV+ = ??
4. Prev. = 20%, Sens. = 50%, Spec. = 90%, PV+ = ??
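The four PV+ values marked "??" can be filled in by building each 2×2 table; a short Python sketch:

```python
# PV+ for the four worked examples above (N = 1000 in each case).

def ppv_from_counts(n: int, prev: float, sens: float, spec: float) -> float:
    diseased = n * prev
    tp = sens * diseased                  # true positives
    fp = (1 - spec) * (n - diseased)      # false positives
    return tp / (tp + fp)

scenarios = [
    (0.50, 0.50, 0.50),   # 1. Prev 50%, Sens 50%, Spec 50%
    (0.20, 0.50, 0.50),   # 2. Prev 20%, Sens 50%, Spec 50%
    (0.20, 0.90, 0.50),   # 3. Prev 20%, Sens 90%, Spec 50%
    (0.20, 0.50, 0.90),   # 4. Prev 20%, Sens 50%, Spec 90%
]
for prev, sens, spec in scenarios:
    pv = ppv_from_counts(1000, prev, sens, spec)
    print(f"Prev {prev:.0%}, Sens {sens:.0%}, Spec {spec:.0%} -> PV+ = {pv:.1%}")
```

Comparing scenarios 3 and 4 shows the point made above: raising specificity from 50% to 90% lifts PV+ far more than the same raise in sensitivity.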
PREDICTIVE VALUE & SPECIFICITY OF TEST
Relationship between disease prevalence and predictive value in a test with 95% sensitivity and 95% specificity:
At 0 prevalence, the chance that a -ve test means no disease is 100% (PV-), and the chance that a +ve test means disease is 0% (PV+).
A rise in prevalence is accompanied by a rise in PV+ and a decrease in PV-. By 40% prevalence, PV+ has risen close to its peak while PV- has declined.
[Figure: Predictive value (%) plotted against prevalence of disease (%), with one curve for the negative test (PV-) and one for the positive test (PV+).]
PREDICTIVE VALUE & SPECIFICITY OF TEST
Most of the gain in PV+ from increasing prevalence occurs at the lowest rates of disease prevalence, i.e.
moving from 1% to 5% prevalence raises the positive predictive value from about 17% to 51%.
(Exercise: Prev. = 20%; Pop. = 1000; Sensitivity = 90%; Specificity = 80% - calculate PV+.)
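Working the exercise just stated, step by step:

```python
# Exercise: Prev = 20%, N = 1000, Sensitivity = 90%, Specificity = 80%.
n, prev, sens, spec = 1000, 0.20, 0.90, 0.80

diseased = n * prev                  # 200 people with the disease
tp = sens * diseased                 # 180 true positives
fp = (1 - spec) * (n - diseased)     # 160 false positives
pv_plus = tp / (tp + fp)

print(f"PV+ = {tp:.0f} / {tp + fp:.0f} = {pv_plus:.1%}")   # 180 / 340 ≈ 52.9%
```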
RELATIONSHIP OF DISEASE PREVALENCE TO PREDICTIVE VALUE
EXAMPLE: SENSITIVITY = 99%, SPECIFICITY = 95%

DISEASE PREV.  TEST RESULT  SICK  NOT SICK  TOTALS  PREDICTIVE VALUE (+VE)
1%             +
               -
               TOTALS                       10,000
5%             +
               -
               TOTALS                       10,000
RELATION BETWEEN SPECIFICITY AND PREDICTIVE VALUE
EXAMPLE: PREVALENCE = 10%, SENSITIVITY = 100%

SPECIFICITY  TEST RESULT  SICK  NOT SICK  TOTALS  PREDICTIVE VALUE (+VE)
70%          +
             -
             TOTALS                       10,000
95%          +
             -
             TOTALS                       10,000
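The blank cells in the two exercise tables above follow directly from the stated prevalence, sensitivity, and specificity; a Python sketch computing them:

```python
# Filling in the two blank exercise tables above (N = 10,000 in each).

def cells(n, prev, sens, spec):
    """Return (TP, FP, FN, TN) for a population of n."""
    sick = n * prev
    tp, fn = sens * sick, (1 - sens) * sick
    fp = (1 - spec) * (n - sick)
    tn = (n - sick) - fp
    return tp, fp, fn, tn

# Table 1: sensitivity 99%, specificity 95%, prevalence 1% vs 5%.
for prev in (0.01, 0.05):
    tp, fp, fn, tn = cells(10_000, prev, 0.99, 0.95)
    print(f"prev {prev:.0%}: TP={tp:.0f} FP={fp:.0f} FN={fn:.0f} TN={tn:.0f}  "
          f"PV+ = {tp / (tp + fp):.1%}")

# Table 2: prevalence 10%, sensitivity 100%, specificity 70% vs 95%.
for spec in (0.70, 0.95):
    tp, fp, fn, tn = cells(10_000, 0.10, 1.00, spec)
    print(f"spec {spec:.0%}: TP={tp:.0f} FP={fp:.0f}  PV+ = {tp / (tp + fp):.1%}")
```

Note how the first table reproduces the roughly 17% → 51% jump in PV+ between 1% and 5% prevalence mentioned earlier.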
Validity Cont…:
Why worry about disease prevalence? The higher the prevalence, the higher the
positive predictive value. A screening test is more efficient if targeted
at a high-risk population. Screening low-prevalence populations can be
wasteful and yields few detected cases for the large effort applied.
SUMMARY Cont…:
Sensitivity: is calculated from the test results of diseased persons.
It is totally independent of the test results of the non-diseased.
SUMMARY Cont….: Specificity:
is calculated from the test results of non-diseased persons. It is totally independent of the test results of the diseased.
Predictive values rely on the results of both the diseased and the non-diseased. A high predictive value is always preferred.
SUMMARY Cont…:
Altering the cut-off point of a diagnostic test may affect sensitivity and specificity, e.g. BP for hypertension.
↑BP is defined as diastolic 90 mmHg or more, but some hypertensives fall between
80 mmHg and 90 mmHg. If the cut-off is reduced to 80 mmHg, i.e. all with
80 mmHg or more are labelled hypertensive,
SUMMARY Cont….:
all the true hypertensives (+ves) will be detected (↑sensitivity),
but false +ves among those without hypertension will also ↑, which is ↓specificity.
So the test will be very sensitive but not specific. When we ↑ the cut-off point to 100 mmHg diastolic, those without hypertension will nearly all be classed negative,
↑ true negatives (↑specificity), but detected cases will ↓ (↓ true positives),
which is ↓ sensitivity. So the test will be very specific but not sensitive.
SUMMARY Cont…: In setting sensitivity or specificity levels, one must
consider the consequences of: missing actual cases (positives, e.g. Ca. cervix); missing actual negatives (HIV).
↑Sensitivity when the disease is serious and treatment exists, or when it is spreading at a high rate and is serious.
↑Specificity (PV+) when the treatment procedure is cumbersome and expensive (e.g. mastectomy).
But when early detection is important for complete cure and treatment is invasive, balance the two.
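The cut-off trade-off described above can be illustrated numerically. The diastolic readings below are hypothetical, invented purely to show how moving the threshold shifts sensitivity against specificity:

```python
# How moving a diagnostic cut-off trades sensitivity against specificity.
# All readings (mmHg diastolic) are hypothetical, for illustration only.

hypertensive     = [82, 88, 92, 95, 101, 104, 110, 118]   # true cases
non_hypertensive = [68, 72, 75, 78, 81, 84, 86, 89]       # true non-cases

def sens_spec(cutoff):
    """Sensitivity and specificity when 'cutoff or above' is called positive."""
    sens = sum(x >= cutoff for x in hypertensive) / len(hypertensive)
    spec = sum(x < cutoff for x in non_hypertensive) / len(non_hypertensive)
    return sens, spec

for cutoff in (80, 90, 100):
    s, p = sens_spec(cutoff)
    print(f"cut-off {cutoff} mmHg: sensitivity {s:.0%}, specificity {p:.0%}")
```

Lowering the cut-off to 80 mmHg catches every case but mislabels half the non-cases; raising it to 100 mmHg does the reverse, exactly as the summary above argues.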
RELIABILITY (REPEATABILITY, PRECISION, REPRODUCIBILITY)
A test gives consistent results when repeated on the same person under the same conditions.
Four sources of variability can affect the reproducibility of a screening test:
Inherent biological variability in the person being tested, e.g. BP varies in individuals under differing circumstances.
Reliability of the instrument or test method, e.g. when temperature ↑ or equipment is tilted.
Intra-observer variability.
Reliability Cont….: Inter-observer variability
- Two observers.
- The extent to which observers agree or disagree can be put in quantitative terms.
Calculating Overall (%) Agreement: X-RAYS

                           RADIOLOGIST (OBSERVER 2)
RADIOLOGIST (OBSERVER 1)   NORMAL   SUSPECT   DOUBTFUL   ABNORMAL
NORMAL                     (A)      B         C          D
SUSPECT                    E        (F)       G          H
DOUBTFUL                   J        K         (L)        M
ABNORMAL                   N        O         P          (Q)
Overall (%) Agreement
Percent Agreement = (A + F + L + Q) / (total readings, i.e. total x-rays read) × 100.
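With concrete counts, the calculation is just the sum of the diagonal cells over the total; the 4×4 matrix below is hypothetical, for illustration only:

```python
# Overall percent agreement between two observers on a 4x4 reading table.
# Hypothetical matrix: rows = observer 1, columns = observer 2,
# categories in order: normal, suspect, doubtful, abnormal.

readings = [
    [60,  4,  2,  1],   # observer 1: normal
    [ 3, 10,  2,  1],   # observer 1: suspect
    [ 1,  2,  5,  2],   # observer 1: doubtful
    [ 0,  1,  1,  5],   # observer 1: abnormal
]

total = sum(sum(row) for row in readings)
agreed = sum(readings[i][i] for i in range(4))    # diagonal cells A, F, L, Q
percent_agreement = agreed / total * 100

print(f"Agreement = {agreed}/{total} = {percent_agreement:.1f}%")
```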
In general, most people who are tested have negative results.
Considerable agreement is therefore found (between two observers) on negative or normal tests, i.e. when there is no disease it is easier for both observers to agree.
% Agreement…: When one calculates percentage agreement over all subjects (the population), the percent agreement may be high because of the high agreement among negative tests. (Those with obvious disease are few; doubtful cases are more difficult and also few.)
             OBSERVER 2
OBSERVER 1   +   -
+            a   b
-            c   d        (can ignore d)
% Agreement…: This high value of percent agreement,
driven by the -ve tests, tends to conceal significant disagreements between the observers in identifying subjects as positive.
Hence a / (a + b + c) addresses percent agreement only in regard to identifying the sick.
Kappa Statistic (coefficient): Agreement between two observers can occur
purely by chance, e.g. if there is no standard or criterion for reading x-rays, agreement in many cases is purely by chance.
The question we ask is: to what extent do their readings agree beyond
what we would expect by chance alone? Or:
to what extent does the agreement between the two observers exceed the level of agreement that would result from chance alone?
Kappa Statistic (coefficient): The Kappa statistic is used to
calculate this extent.
Kappa numerator: percent observed agreement minus percent agreement expected by chance alone (deals with the actual observations).
Denominator: the difference between full (100%) agreement and the percent agreement expected by chance alone.
Thus Kappa quantifies the extent to which observed agreement exceeds that which would be expected by chance alone.
Kappa Statistic (coefficient): To calculate Kappa, first calculate the observed agreement.
A identifies 45 slides, i.e. 60% of the 75 total, as grade II.
B identifies 44, or 58.7%, of all slides as grade II.
To calculate % agreement the formula is: (a + d) / (a + b + c + d) × 100%.
In this case the % observed agreement is: (41 + 27) / 75 × 100 = 90.7%.
                  PATHOLOGIST A
                  GRADE II    GRADE III   TOTAL
PATHOLOGIST B
GRADE II          41 (a)      3 (b)       44 (58.7%)
GRADE III         4 (c)       27 (d)      31 (41.3%)
TOTAL             45 (60%)    30 (40%)    75
Kappa Statistic (coefficient): If the two pathologists used entirely different
sets of criteria, how much agreement would be expected solely on the basis of chance?
A read 60% of all 75 slides as grade II.
If A applied criteria independent of those used by B,
then A would read as grade II 60% of the slides
that B called grade II, and 60% of those that B called grade III would also be read grade II by A.
Thus, of the slides called grade II by B: 60% × 44 = 26.4.
Expected by chance alone:
                  PATHOLOGIST A
                  GRADE II    GRADE III   TOTAL
PATHOLOGIST B
GRADE II          26.4 (a)    17.6 (b)    44 (58.7%)
GRADE III         18.6 (c)    12.4 (d)    31 (41.3%)
TOTAL             45 (60%)    30 (40%)    75
Kappa Statistic (coefficient):
60% of the slides called grade III by B will be read grade II by A: 60% × 31 = 18.6.
Thus the agreement expected by chance alone = (26.4 + 12.4) / 75 × 100 = 51.7%.
Kappa Statistic (coefficient):
Kappa = (% observed agreement − % agreement expected by chance) / (100% − % agreement expected by chance)
      = (90.7% − 51.7%) / (100% − 51.7%) = 39.0 / 48.3 = 0.81.
Kappa Statistic (coefficient):
It is suggested that a Kappa of:
0.75 and above is excellent agreement beyond chance;
0.40 and below is poor agreement;
between 0.40 and 0.75 is intermediate agreement.
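The whole Kappa calculation for the pathologist example above can be verified in a few lines of Python, using the observed cell counts a = 41, b = 3, c = 4, d = 27:

```python
# Kappa for the pathologist example: a=41, b=3, c=4, d=27 (N = 75 slides).

a, b, c, d = 41, 3, 4, 27
n = a + b + c + d                         # 75 slides in total

observed = (a + d) / n                    # 68/75 = 90.7% observed agreement

# Chance agreement from each reader's marginal proportions:
p_a_ii = (a + c) / n                      # A calls 60% of slides grade II
p_b_ii = (a + b) / n                      # B calls 58.7% of slides grade II
chance = p_a_ii * p_b_ii + (1 - p_a_ii) * (1 - p_b_ii)   # 51.7%

kappa = (observed - chance) / (1 - chance)
print(f"observed {observed:.1%}, chance {chance:.1%}, kappa = {kappa:.2f}")
```

The result, kappa ≈ 0.81, falls in the "excellent agreement beyond chance" band just described.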