10 May 2010
Approaches to test evaluation
Evan Sergeant
AusVet Animal Health Services
Comparing tests
• Kappa: how well do the tests agree?
• McNemar’s chi-squared: are the tests significantly different?
Kappa
Test 1 result   Test 2 result
                T2+    T2-    Total
T1+             121    36     157
T1-             34     931    965
Total           155    967    1122

Expected no. both +ve = (157 x 155)/1122 = 21.7
Expected no. both -ve = (965 x 967)/1122 = 831.7
Total agreement = 121 + 931 = 1052
Chance agreement = 21.7 + 831.7 = 853.4
K = (1052 - 853.4)/(1122 - 853.4) = 0.739
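The kappa arithmetic for this table can be checked with a few lines of Python. This is a minimal sketch only; the function name `cohen_kappa` and the cell labels a to d are illustrative and do not come from the slides.

```python
def cohen_kappa(a, b, c, d):
    """Kappa for a 2x2 table.

    a = both tests +ve, b = T1+/T2-, c = T1-/T2+, d = both tests -ve
    (cell labels are an illustrative convention, not from the slides).
    """
    n = a + b + c + d
    observed = a + d  # total agreement
    # Chance agreement: expected both +ve plus expected both -ve.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    return (observed - expected) / (n - expected)


kappa = cohen_kappa(121, 36, 34, 931)
print(round(kappa, 3))
```

Working from unrounded expected values avoids small rounding differences in the chance-agreement term.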
McNemar Chi-Squared
McNemar's Chi-squared test with continuity correction
McNemar's chi-squared = 22.881, df = 1, p-value = 1.724e-06
Test 1 result   Test 2 result
                T2+    T2-    Total
T1+             58     37     95
T1-             5      196    201
Total           63     233    296
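The McNemar statistic uses only the two discordant cells (37 and 5 here). The sketch below assumes the usual continuity-corrected form, (|b - c| - 1)^2 / (b + c), and gets the p-value from the chi-squared distribution with 1 degree of freedom; the function name is illustrative.

```python
import math


def mcnemar_corrected(b, c):
    """McNemar's chi-squared with continuity correction.

    b and c are the two discordant cells (T1+/T2- and T1-/T2+).
    """
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # For 1 degree of freedom, the chi-squared upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = mcnemar_corrected(b=37, c=5)
print(round(statistic, 3), p_value)
```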
OJD AGID and ELISA
                ELISA
AGID            +      –      Total
+               34     21     55
–               15     154    169
Total           49     175    224

Enter data into EpiTools
• Application of diagnostic tests > Compare 2 tests
• See kappa, McNemar’s and level of agreement
Statistic                         Value
Kappa                             0.5496
SE for kappa = 0                  0.0666
Z(kappa)                          8.25
p(kappa) - one-tailed             0
Proportion positive agreement     0.6538
Proportion negative agreement     0.8953
Overall proportion agreement      0.8393
McNemar's Chi sq                  0.6944
p(Chi sq)                         0.4
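The three agreement proportions can be reproduced from the AGID/ELISA table. The formulas below (2a / (2a + b + c) for positive agreement and 2d / (2d + b + c) for negative agreement) give the figures shown; that EpiTools computes them exactly this way is an assumption, and the function name is illustrative.

```python
def agreement_proportions(a, b, c, d):
    """Positive, negative and overall agreement for a 2x2 table.

    a = both +ve, b and c = discordant cells, d = both -ve.
    """
    positive = 2 * a / (2 * a + b + c)
    negative = 2 * d / (2 * d + b + c)
    overall = (a + d) / (a + b + c + d)
    return positive, negative, overall


positive, negative, overall = agreement_proportions(34, 21, 15, 154)
print(round(positive, 4), round(negative, 4), round(overall, 4))
```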
Gold Standard Tests
Use tests with perfect sensitivity and/or specificity to identify the true disease status of the individual from which the samples were taken.
What are the advantages and disadvantages of this approach?
Gold Standard Tests
Advantages
• Known disease status
• Relatively simple calculations
Disadvantages
• May not exist, or be prohibitively expensive
• Rare diseases may only allow small sample size
• Disease may not be present in the country?
• Difficult to get representative (or even comparable) samples of diseased/non-diseased individuals
Exercises
Calculate Se and Sp for OJD AGID using data provided in OJD_AGID_Data.xls
• Calculate confidence limits using EpiTools
Non-gold standard methods
Do not depend on determining true infection status of individual.
Rely on statistical approaches to calculate best fit values for Se and Sp.
Tests must satisfy some important assumptions.
Comparison with a known reference test
Assumptions
• Independence of tests
• Se/Sp of reference test is known
For a ~100% specific reference test:
• Se(new test) = Number positive to both tests / Total number positive to the reference test
Culture vs Serology
Estimate sensitivity of culture and serology (as flock tests)
Serology followed up by histopathology to confirm flock status
Both tests 100% specificity (as flock tests)
• How would you estimate sensitivity for these test(s)?
• Which test has better Se?
• Is the difference significant?

All flocks      Serology
PFC             +ve    -ve    Total
+ve             58     37     95
-ve             5      196    201
Total           63     233    296
Example
Se (PFC) = 58/63 = 92% (83% - 97%)
Se (Serology) = 58/95 = 61% (51% - 70%)
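Because both flock tests are taken as 100% specific, each can serve as the reference for the other: every flock positive to the reference is truly infected, so the proportion of those also positive to the other test estimates its sensitivity. A minimal sketch of the point estimates (the function name is illustrative, and confidence limits are not computed here):

```python
def sensitivity_vs_specific_reference(both_positive, reference_positive):
    """Se of a test, given a reference test assumed to be ~100% specific."""
    return both_positive / reference_positive


# Serology as the reference: 63 serology-positive flocks, 58 also PFC-positive.
se_pfc = sensitivity_vs_specific_reference(58, 63)
# PFC as the reference: 95 PFC-positive flocks, 58 also serology-positive.
se_serology = sensitivity_vs_specific_reference(58, 95)
print(round(se_pfc, 2), round(se_serology, 2))
```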
Statistic                         Value
Kappa                             0.6427
SE for kappa = 0                  0.0559
Z(kappa)                          11.49
p(kappa) - one-tailed             0
Proportion positive agreement     0.7342
Proportion negative agreement     0.9032
Overall proportion agreement      0.8581
McNemar's Chi sq                  22.881
p(Chi sq)                         0
Estimation from routine testing data
Assumes that test-positives are followed up and that truly infected animals are identified and removed from the population.
Can be used to estimate specificity when the disease is rare in the population of interest.
Sp = 1 – (Number of reactors / Total number tested)
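As a quick illustration of the formula, the sketch below uses made-up numbers (15 reactors out of 12,000 animals tested); these figures are hypothetical and are not from the slides.

```python
def specificity_from_routine_testing(reactors, total_tested):
    """Sp = 1 - (reactors / total tested), valid when disease is rare."""
    return 1 - reactors / total_tested


# Hypothetical figures for illustration only.
sp = specificity_from_routine_testing(reactors=15, total_tested=12000)
print(round(sp, 5))
```

If some reactors are truly infected, this estimate is biased downwards, so it is best read as a lower bound on specificity.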
Se and Sp of equine influenza ELISA
During the equine influenza outbreak in Australia, horses were tested by PCR and serology:
• to confirm infection;
• to demonstrate seroconversion and/or absence of infection >30 days later;
• as part of random and targeted surveillance for case detection, to confirm area status and for zone progression in presumed “EI free” areas.
How could you use the resulting data to estimate sensitivity and specificity of the ELISA?
Equine influenza ELISA
475 PCR-positive horses, 471 also positive on ELISA
1323 horses from properties in areas with no infection, 1280 ELISA negative
Analyse in EpiTools
• Application of diagnostic tests > Test evaluation against gold standard
Sergeant, E. S. G., Kirkland, P. D. & Cowled, B. D. 2009. Field evaluation of an equine influenza ELISA used in New South Wales during the 2007 Australian outbreak response. Preventive Veterinary Medicine, 92, 382-385.
              Point Estimate   Lower 95% CL   Upper 95% CL
Sensitivity   0.9916           0.9786         0.9977
Specificity   0.9675           0.9565         0.9764
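The point estimates are simple proportions (471/475 and 1280/1323). The slides do not say how the confidence limits were calculated; the sketch below assumes exact binomial (Clopper-Pearson) limits, which come out close to the values in the table. Function names are illustrative.

```python
import math


def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if x <= 0:
        return 1.0
    total = 0.0
    for k in range(x, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def exact_ci(x, n, conf=0.95):
    """Clopper-Pearson limits for x successes in n trials (assumed method)."""
    alpha = 1 - conf

    def solve(tail, target):
        # tail(p) increases with p, so bisection finds tail(p) = target.
        lo, hi = 1e-12, 1 - 1e-12
        for _ in range(100):
            mid = (lo + hi) / 2
            if tail(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(lambda p: binom_tail_ge(x, n, p), alpha / 2)
    upper = 1.0 if x == n else solve(lambda p: binom_tail_ge(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


se, se_ci = 471 / 475, exact_ci(471, 475)      # PCR-positive horses
sp, sp_ci = 1280 / 1323, exact_ci(1280, 1323)  # horses from uninfected areas
print(round(se, 4), se_ci)
print(round(sp, 4), sp_ci)
```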
Mixture modelling
Assumptions
• The observed distribution of test results (for a test with a continuous outcome reading, such as an ELISA) is actually a mixture of two frequency distributions, one for infected individuals and one for uninfected individuals.
Opsteegh, M., Teunis, P., Mensink, M., Zuchner, L., Titilincu, A., Langelaar, M. & van der Giessen, J. 2010. Evaluation of ELISA test characteristics and estimation of Toxoplasma gondii seroprevalence in Dutch sheep using mixture models. Preventive Veterinary Medicine.
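To make the mixture idea concrete, the sketch below fits two normal components to simulated ELISA-like readings using a basic EM algorithm. Everything here is an assumption for illustration: the data are simulated, the normal shape of the components is assumed, and this is not the model used in the cited paper.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def fit_two_normal_mixture(values, iterations=200):
    """EM fit of a two-component normal mixture.

    Returns (weight_high, means, sds); weight_high is the estimated
    proportion in the higher (assumed infected) component.
    """
    low, high = min(values), max(values)
    spread = high - low
    means = [low + 0.25 * spread, low + 0.75 * spread]
    sds = [spread / 4, spread / 4]
    weight_high = 0.5
    n = len(values)
    for _ in range(iterations):
        # E-step: probability each reading belongs to the higher component.
        resp = []
        for x in values:
            p_high = weight_high * normal_pdf(x, means[1], sds[1])
            p_low = (1 - weight_high) * normal_pdf(x, means[0], sds[0])
            resp.append(p_high / (p_high + p_low))
        # M-step: update the weight, means and standard deviations.
        total_high = sum(resp)
        total_low = n - total_high
        weight_high = total_high / n
        means[1] = sum(r * x for r, x in zip(resp, values)) / total_high
        means[0] = sum((1 - r) * x for r, x in zip(resp, values)) / total_low
        sds[1] = math.sqrt(sum(r * (x - means[1]) ** 2
                               for r, x in zip(resp, values)) / total_high)
        sds[0] = math.sqrt(sum((1 - r) * (x - means[0]) ** 2
                               for r, x in zip(resp, values)) / total_low)
    return weight_high, means, sds


# Simulated readings (hypothetical): 1400 uninfected, 600 infected.
random.seed(1)
simulated = ([random.gauss(0.2, 0.1) for _ in range(1400)]
             + [random.gauss(1.0, 0.2) for _ in range(600)])
weight_high, means, sds = fit_two_normal_mixture(simulated)
print(round(weight_high, 2), [round(m, 2) for m in means])
```

Once the two distributions are fitted, Se and Sp at any cut-off follow from the tail areas of the infected and uninfected components.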
Latent Class Analysis
What is Latent Class Analysis?
• Maximum likelihood
• Bayesian
Maximum likelihood estimation
Assumptions
• The tests are independent conditional on disease status (the sensitivity [specificity] of one test is the same, regardless of the result of the other test);
• The tests are compared in two or more populations with different prevalence between populations;
• Test sensitivity and specificity are constant across populations; and
• There are at least as many populations as there are tests being evaluated.
TAGS software
• Hui, S. L. & Walter, S. D. 1980. Estimating the error rates of diagnostic tests. Biometrics, 36, 167-171.
TAGS
Open R (shortcut in root directory of stick)
Open tags.R in a text editor or Word
Select all and copy/paste into the R console
Type TAGS() and <Enter> to run
Hui-Walter example
• 2 tests for TB
• Test 1 = Mantoux
• Test 2 = Tine test
Follow the prompts to enter data:
• Data set = new
• Name = test
• Number of tests = 2, Number of populations = 2
• Reference population? = No (0)
• Enter results for each population from table below
• Best guesses: use defaults
• Bootstrap CI = Yes (1000 iterations)
Data
Test 1   Test 2   Population 1   Population 2
0        0        528            367
1        0        4              31
0        1        9              37
1        1        14             887

$Estimations
         pre1     pre2     Sp1      Sp2      Se1      Se2
Est      0.0268   0.7168   0.9933   0.9841   0.9661   0.9688
CIinf    0.0159   0.6911   0.9797   0.9684   0.9495   0.9540
CIsup    0.0450   0.7412   0.9978   0.9921   0.9774   0.9790
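The maximum likelihood point estimates can be approximated outside TAGS with a short EM algorithm for the two-test, two-population model. This is a sketch under the assumptions listed earlier (conditional independence, constant Se and Sp across populations); it is not the TAGS code, the starting values are arbitrary, and no bootstrap confidence limits are produced.

```python
def hui_walter_em(populations, iterations=2000):
    """EM estimates of prevalence, Se and Sp for 2 tests in several populations.

    populations: list of dicts mapping (test1, test2) results (0/1) to counts.
    """
    prev = [0.5] * len(populations)  # arbitrary starting values
    se = [0.9, 0.9]
    sp = [0.9, 0.9]
    for _ in range(iterations):
        infected_by_pop = [0.0] * len(populations)
        se_num = [0.0, 0.0]
        sp_num = [0.0, 0.0]
        for i, counts in enumerate(populations):
            for pattern, n in counts.items():
                # E-step: probability that animals with this pattern are infected.
                p_inf = prev[i]
                p_un = 1 - prev[i]
                for j, t in enumerate(pattern):
                    p_inf *= se[j] if t else 1 - se[j]
                    p_un *= (1 - sp[j]) if t else sp[j]
                w = p_inf / (p_inf + p_un)
                infected_by_pop[i] += n * w
                for j, t in enumerate(pattern):
                    if t:
                        se_num[j] += n * w        # infected and test-positive
                    else:
                        sp_num[j] += n * (1 - w)  # uninfected and test-negative
        # M-step: update the parameters from the expected counts.
        totals = [sum(counts.values()) for counts in populations]
        infected = sum(infected_by_pop)
        uninfected = sum(totals) - infected
        prev = [x / t for x, t in zip(infected_by_pop, totals)]
        se = [x / infected for x in se_num]
        sp = [x / uninfected for x in sp_num]
    return prev, se, sp


population_1 = {(0, 0): 528, (1, 0): 4, (0, 1): 9, (1, 1): 14}
population_2 = {(0, 0): 367, (1, 0): 31, (0, 1): 37, (1, 1): 887}
prev, se, sp = hui_walter_em([population_1, population_2])
print([round(p, 4) for p in prev], [round(s, 4) for s in se], [round(s, 4) for s in sp])
```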
Bayesian estimation
What is Bayesian estimation?
• Combines prior knowledge/belief (what you think you know) with data to give best estimate
• Incorporates existing knowledge on parameters (Se, Sp, prevalence)
• “Priors” entered as probability (usually Beta) distributions
• Uses Monte Carlo simulation to solve
• Outputs also as probability distributions
• Can get very complex
Assumptions
• Independence of the tests
• Appropriate prior distributions chosen
• Need information on prior probabilities
• Some methods can adjust for correlated tests
• Multiple tests in multiple populations
Methods
• EpiTools (only allows one population so must have good information on one or more test characteristics)
• WinBUGS models
Bayesian analysis: surra data
                Test 2 (ELISA)
Test 1 (CATT)   +ve    -ve    Total
+ve             0      39     39
-ve             0      251    251
Total           0      290    290

Inputs for Bayesian analysis for revised sensitivity and specificity estimates
Prior distributions for Bayesian analysis (alpha = x + 1, beta = n - x + 1)
                      n      x      alpha   beta
Prev                                1       1
Se_CATT (81%)         100    81     82      20
Sp_CATT (99.4%)       160    159    160     2
Se_ELISA_2 (75%)      100    75     76      26
Sp_ELISA_2 (97.5%)    120    117    118     4
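The alpha and beta values in the prior table follow from treating each prior as x "successes" out of n trials: alpha = x + 1 and beta = n - x + 1. A minimal sketch that reproduces the table (the function and dictionary names are illustrative):

```python
def beta_prior(successes, trials):
    """Beta(alpha, beta) parameters for x successes out of n trials."""
    return successes + 1, trials - successes + 1


priors = {
    "Se_CATT": beta_prior(81, 100),
    "Sp_CATT": beta_prior(159, 160),
    "Se_ELISA_2": beta_prior(75, 100),
    "Sp_ELISA_2": beta_prior(117, 120),
}
for name, (alpha, beta) in priors.items():
    print(name, alpha, beta)
```

A larger n gives a narrower prior, so n reflects how much confidence is placed in the prior estimate.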
EpiTools
Run EpiTools > Estimating true prevalence > Bayesian estimation with two tests
Enter parameters:
• Data from 2x2 table: 0, 39, 0, 251
• Prevalence = Beta(1, 1) (uniform = don’t know)
• Test 1 (CATT): Se = Beta(82, 20), Sp = Beta(160, 2)
• Test 2 (ELISA): Se = Beta(76, 26), Sp = Beta(118, 4)
• Starting values: 0, 38, 0, 245
• Other values as defaults and click submit
             Prevalence   Sensitivity-1   Specificity-1   Sensitivity-2   Specificity-2
Minimum      <0.0001      0.6219          0.8535          0.5475          0.9554
2.5%         0.0001       0.7210          0.8818          0.6510          0.9789
Median       0.0038       0.8064          0.9109          0.7418          0.9910
97.5%        0.0201       0.8749          0.9354          0.8217          0.9973
Maximum      0.0567       0.9370          0.9517          0.8891          0.9998
Mean         0.0055       0.8044          0.9103          0.7406          0.9903
SD           0.0055       0.0393          0.0136          0.0436          0.0048
Iterations   20000        20000           20000           20000           20000