06/06/22 Validity and reliability of Test s 1 VALIDITY AND RELIABILITY OF SCREENING TESTS Rashida B Syed, Epidemiologist Consultant Faculty Field Epidemiology Training Program (FETP)-Pakistan


Page 1: Validity-reliability Screening Tests Rbs-feltp

04/08/23 Validity and reliability of Tests 1

VALIDITY AND RELIABILITY OF SCREENING TESTS

Rashida B Syed, Epidemiologist Consultant Faculty Field Epidemiology Training Program (FETP)-Pakistan


Objectives

Calculate and interpret measures of the validity of a screening test: sensitivity and specificity.

Understand the relationship between sensitivity and specificity.

Calculate and interpret measures of the performance (yield) of a screening test: predictive value positive (PV+) and predictive value negative (PV-).

Understand factors that influence PV+ and PV-

Recognize issues and sources of bias in evaluating screening programs.


Purpose of screening

The early detection of disease in individuals who do not show any signs of disease.

Aims to reduce morbidity and mortality from disease among persons being screened.

Is the application of relatively simple, inexpensive tests, examinations, or other procedures to people.

Is a means of identifying persons at increased risk for the presence of disease who warrant further evaluation.


Diagnosis ≠ Screening

Screening tests can also often be used as diagnostic tests

Diagnosis involves confirmation of presence or absence of disease in someone suspected of or at risk for disease

Screening is generally done among individuals who are not suspected of having disease


Requirements

Is there a truly effective treatment available for the discovered disease?

Is that treatment more effective in screened than non-screened cases?

What are the side effects of the screening process?

How efficient is screening? Do we have the right threshold? i.e. how many people must be screened to obtain a case?


Natural History of Disease

[Figure: timeline of the natural history of disease. Stages: Susceptible Host; Subclinical Disease; Clinical Disease; Stage of Recovery, Disability, or Death. Marked events and periods: Point of Exposure; Detectable sub-clinical disease; Screening; Onset of symptoms; Diagnosis sought.]


Examples of Screening Tests

Questions

Clinical Examinations

Laboratory Tests

Genetic Tests

X-rays

Goel


Diseases for which screening has been recommended

Cervical cancer

Breast cancer

Prostate cancer

Colon cancer

Diabetes

Hypertension


Terminology

Validity is analogous to accuracy

The validity of a screening test is how well the given screening test reflects another test of known greater accuracy

Validity assumes that there is a gold standard to which a test can be compared

Paneth


Three key measures of validity

• Sensitivity
• Specificity
• Predictive value


Sensitivity and Specificity

Sensitivity tells us how well a positive test detects disease.

It is defined as the ability of the test to identify correctly as diseased, those who have the disease.

---------------------------------------------------------------------------------

Specificity tells us how well a negative test detects non-disease.

It is defined as the ability of the test to identify correctly those who do not have the disease as test negative.


                              Disease
Screening test     Present              Absent
Positive           True positives       False positives
Negative           False negatives      True negatives


                              Disease
Screening test     Present     Absent     Total
Positive           a           b          a + b
Negative           c           d          c + d
Total              a + c       b + d      N


Sensitivity

Proportion of individuals who have the disease who test positive (true positive rate). It tells us how well a “+” test picks up disease.

Sensitivity = a / (a + c)

where a = diseased individuals who test positive (true positives) and c = diseased individuals who test negative (false negatives).


Specificity

Proportion of individuals who don’t have the disease who test negative (true negative rate). It tells us how well a “-” test detects no disease.

Specificity = d / (b + d)

where d = non-diseased individuals who test negative (true negatives) and b = non-diseased individuals who test positive (false positives).


Predictive value

Positive predictive value: the proportion of all those who test positive who have the condition.

Negative predictive value: the proportion of all those who test negative who do not have the condition.


Positive Predictive Value

Proportion of individuals who test positive who actually have the disease

P.P.V. = a / (a + b)

where a = diseased individuals who test positive (true positives) and b = non-diseased individuals who test positive (false positives).


Negative Predictive Value

Proportion of individuals who test negative who don’t have the disease

N.P.V. = d / (c + d)

where d = non-diseased individuals who test negative (true negatives) and c = diseased individuals who test negative (false negatives).


Determinants of predictive value

The predictive value of a test is determined by 3 factors:

1. Sensitivity
2. Specificity
3. Prevalence of the disease in the population being tested


Effect of prevalence on PPV

As prevalence rates decrease, the positive predictive value of a test also decreases

This explains why diagnostic tests which are developed in clinical populations (where the prevalence of the disease being tested is often high) often perform poorly in general population settings (where disease prevalence tends to be lower).

In our example: prove it.
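The dependence of predictive value on prevalence can be checked numerically. The sketch below is an illustration only and is not part of the original slides: it applies Bayes' theorem with a fixed sensitivity of 0.90 and specificity of 0.80 (the values from the hypothyroidism example in this deck) and varies the prevalence. The function names `ppv` and `npv` and the three prevalence values are assumptions chosen for illustration.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value via Bayes' theorem."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Illustrative prevalences (assumed values, not taken from the slides).
prevalences = [0.50, 0.10, 0.01]
ppv_by_prevalence = [ppv(0.90, 0.80, p) for p in prevalences]
npv_by_prevalence = [npv(0.90, 0.80, p) for p in prevalences]

for p, pos, neg in zip(prevalences, ppv_by_prevalence, npv_by_prevalence):
    print(f"prevalence={p:.2f}  PPV={pos:.3f}  NPV={neg:.3f}")
```

With sensitivity and specificity held fixed, PPV falls and NPV rises as prevalence drops.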


Scenarios

Tests with dichotomous results (positive or negative)

Tests with continuous results. Examples: systolic blood pressure (mm Hg); tuberculin reaction (induration diameter, mm)


Examples

In a sample of 200 people: 100 people have the disease Hypothyroidism, and 100 people do not have it.

In the same sample of 200 people: 110 people test positive for Hypothyroidism using a new diagnostic test, and 90 people test negative for Hypothyroidism using the same diagnostic test.

Of the 110 people who test positive, 90 do have the disease and 20 do not.

Of the 90 people who test negative, 10 do have the disease and 80 do not.

Sensitivity and Specificity?


Solution

Sensitivity = TP / (TP + FN) = 90 / (90 + 10) = 90%

Specificity = TN / (TN + FP) = 80 / (80 + 20) = 80%
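The same calculation can be written as a small function over the four cells of the 2×2 table. This is a minimal sketch, not the author's method; the function name `screening_measures` is hypothetical, and the counts are those of the hypothyroidism example (90 true positives, 20 false positives, 10 false negatives, 80 true negatives).

```python
def screening_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return the four standard measures computed from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased who test positive
        "specificity": tn / (tn + fp),  # non-diseased who test negative
        "ppv": tp / (tp + fp),          # test positives who are diseased
        "npv": tn / (tn + fn),          # test negatives who are non-diseased
    }


# Counts from the hypothyroidism example.
result = screening_measures(tp=90, fp=20, fn=10, tn=80)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```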


A test is used in 50 people with disease and 50 people without. These are the results.

                              Disease
Screening test     Present     Absent     Total
Positive           48          3          51
Negative           2           47         49
Total              50          50         100

Paneth


From the table of results (48 true positives, 3 false positives, 2 false negatives, 47 true negatives; N = 100):

Sensitivity = 48/50 = 96%

Specificity = 47/50 = 94%

Positive Predictive Value = 48/51 ≈ 94.1%

Negative Predictive Value = 47/49 ≈ 95.9%

Paneth


So… you understand the accuracy of a screening test …

What is the next step?

Put screening to use in the population


Sensitive vs. Specific tests

A test with high levels of sensitivity is usually positive when disease is present and has few false negatives – useful when it is important not to miss a diagnosis (e.g. if the disease is dangerous but has an effective treatment)

A test with high levels of specificity is usually negative when disease is absent and has few false positives – useful when a false positive diagnosis would be harmful (e.g. if it resulted in unnecessary treatment)


Balancing sensitivity vs. specificity

A really good test would be highly sensitive and highly specific. In practice, this is often not the case.

Instead, there is often a trade-off between the sensitivity and the specificity of diagnostic tests

This occurs in cases where the test result is expressed on a continuous scale (e.g. blood pressure, blood sugar levels)

In such circumstances, a cut-point has to be chosen to define normal vs. abnormal

The decision for the cut point involves weighing the consequences of leaving cases undetected (false negatives) against erroneously classifying healthy persons as diseased (false positives).

Refer to Gordis


NET SENSITIVITY AND SPECIFICITY

Use of multiple tests. Refer to Gordis.


Balancing sensitivity vs. specificity

Blood sugar level 2 hrs after eating (mg/100 ml)     Sensitivity (%)     Specificity (%)
70                                                   98.6                8.8
90                                                   94.3                47.6
110                                                  85.7                84.1
130                                                  64.3                96.9
170                                                  42.9                100.0


ROC curves

One method for determining the best cut-off point is by constructing a ROC curve

ROC=receiver operating characteristic, a term that comes from radar science

ROC curves are constructed by plotting the sensitivity (or true positive rate) against the false positive rate (1-specificity)
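As a sketch of this construction, the code below turns the blood sugar sensitivity and specificity values from this deck into ROC points and estimates the area under the curve. It is an illustration under stated assumptions: the names `roc_points` and `trapezoid_auc` are hypothetical, the curve is anchored at (0, 0) and (1, 1), and the trapezoidal rule over five cut-offs gives only a rough approximation of the area.

```python
# (cut-off in mg/100 ml, sensitivity %, specificity %) from the blood sugar table
blood_sugar_table = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]


def roc_points(table):
    """Return (false positive rate, true positive rate) pairs sorted by FPR."""
    points = [(1 - spec / 100, sens / 100) for _, sens, spec in table]
    # Anchor the curve at the two trivial cut-offs (assumption of this sketch).
    return sorted(points + [(0.0, 0.0), (1.0, 1.0)])


def trapezoid_auc(points):
    """Approximate the area under the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(blood_sugar_table)
auc = trapezoid_auc(points)
for fpr, tpr in points:
    print(f"FPR={fpr:.3f}  TPR={tpr:.3f}")
print(f"approximate AUC: {auc:.3f}")
```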


ROC curve for blood sugar readings

Source: Fletcher, Fletcher and Wagner, Clinical epidemiology: the essentials (3rd ed)


Shows trade-off between sensitivity and specificity

The closer the curve lies to the left-hand and top borders, the more accurate the test

Slope of tangent at cut point gives the Likelihood Ratio (LR) for that value of the test

The area under the curve is a measure of test accuracy


The Area under an ROC Curve


Good tests lie close to the upper left hand corner of the graph – where sensitivity and specificity are both high

Generally the best cut-off point lies at or near the “shoulder” of the curve*

The overall accuracy of the test is represented by the area under the curve

Tests that plot close to the diagonal across the middle of the graph are least useful, as this is where the test is no better than chance

ROC curves can also be used to compare different tests

*unless there are clinical reasons for preferring a highly sensitive or highly specific test
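One way to make the "shoulder" idea concrete is to pick the cut-off whose ROC point lies closest to the upper left-hand corner. This is a common heuristic assumed here for illustration, not a rule stated in the slides, and it ignores the clinical reasons noted above. The function name `distance_to_corner` is hypothetical; the data are the blood sugar values from this deck.

```python
import math

# (cut-off in mg/100 ml, sensitivity %, specificity %) from the blood sugar table
blood_sugar_table = [
    (70, 98.6, 8.8),
    (90, 94.3, 47.6),
    (110, 85.7, 84.1),
    (130, 64.3, 96.9),
    (170, 42.9, 100.0),
]


def distance_to_corner(sens_pct: float, spec_pct: float) -> float:
    """Euclidean distance from the ROC point to the ideal corner (FPR=0, TPR=1)."""
    false_positive_rate = 1 - spec_pct / 100
    true_positive_rate = sens_pct / 100
    return math.hypot(false_positive_rate, 1 - true_positive_rate)


# Choose the row with the smallest distance to the corner.
best_cutoff, best_sens, best_spec = min(
    blood_sugar_table, key=lambda row: distance_to_corner(row[1], row[2])
)
print(f"cut-off {best_cutoff} mg/100 ml: "
      f"sensitivity {best_sens}%, specificity {best_spec}%")
```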


Sources of Bias in the Evaluation of Screening Programs

Lead time bias

Length bias

Volunteer bias


Lead time bias

Lead time: interval between the diagnosis of a disease at screening and the usual time of diagnosis (by symptoms)

[Figure: timeline from diagnosis by screening to diagnosis via symptoms; the interval between the two is labelled Lead Time.]


Lead-Time Bias

Consider a condition whose natural history allows for an earlier diagnosis but where survival does not improve despite identifying it earlier.

A screening program here will… survival will appear to increase,

but in reality it is increased by exactly the amount of time by which the screening program advanced the diagnosis.

Thus there is no benefit to screening from a survival standpoint.


Lead time bias

Assumes survival is the time between screening and death.

Does not take into account the lead time between diagnosis at screening and usual diagnosis.

[Figure: timeline.] Diagnosis by screening in 1994; death in 2008. Survival = 14 years


Lead time bias

[Figure: timeline.] Diagnosis by screening in 1994; usual time of diagnosis via symptoms in 1998; death in 2008.

Lead time = 4 years

True survival = 10 years

Survival measured from screening = 14 years
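The arithmetic in this lead-time example can be laid out explicitly. The short sketch below uses only the three years given on the slide; the variable names are assumptions made for illustration.

```python
# Years taken from the lead-time example on the slide.
screen_diagnosis_year = 1994   # diagnosis by screening
symptom_diagnosis_year = 1998  # usual time of diagnosis via symptoms
death_year = 2008

lead_time = symptom_diagnosis_year - screen_diagnosis_year
apparent_survival = death_year - screen_diagnosis_year  # measured from screening
true_survival = death_year - symptom_diagnosis_year     # measured from usual diagnosis

# The apparent gain in survival equals the lead time, with no change in date of death.
print(f"lead time: {lead_time} years")
print(f"apparent survival: {apparent_survival} years")
print(f"true survival: {true_survival} years")
print(f"apparent gain: {apparent_survival - true_survival} years")
```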


Length Bias

Most chronic diseases, especially cancers, do not progress at the same rate in everyone.

Any group of diseased people will include some in whom the disease developed slowly and some in whom it developed rapidly.

Screening will preferentially pick up slowly developing disease (longer opportunity to be screened) which usually has a better prognosis

Paneth


Length bias

[Figure: timelines for several cases of disease plotted against Time, with the time of Screening marked. Legend: O = biological onset of disease; P = disease detectable via screening; Y = symptoms begin; D = death.]


Volunteer bias

Type of bias where those who choose to participate are likely to be different from those who don’t

Volunteers tend to have:

Better health

Lower mortality

Greater likelihood of adhering to prescribed medical regimens


A worked example: the fecal occult blood (FOB) screen test is used in 203 people to look for bowel cancer, with cancer status confirmed on endoscopy. The counts used in the calculations that follow are TP = 2, FP = 18, FN = 1, and TN = 182.

False positive rate (α) = FP / (FP + TN) = 18 / (18 + 182) = 9% = 1 − specificity.

False negative rate (β) = FN / (TP + FN) = 1 / (2 + 1) = 33% = 1 − sensitivity.

Power = sensitivity = 1 − β

Hence, with large numbers of false positives and few false negatives, a positive FOB screen test is in itself poor at confirming cancer (PPV = 10%), and further investigations must be undertaken; it will, though, pick up 66.7% of all cancers (the sensitivity). However, as a screening test, a negative result is very good at reassuring that a patient does not have cancer (NPV = 99.5%), and at this initial screen it correctly identifies 91% of those who do not have cancer (the specificity).
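The figures quoted for the FOB example can be reproduced from the four counts implied by the formulas above (TP = 2, FP = 18, FN = 1, TN = 182). The sketch below is illustrative only; the class name `TwoByTwo` and its property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a 2x2 table of screening result against true disease status."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def false_positive_rate(self) -> float:  # alpha = 1 - specificity
        return self.fp / (self.fp + self.tn)

    @property
    def false_negative_rate(self) -> float:  # beta = 1 - sensitivity
        return self.fn / (self.tp + self.fn)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


fob = TwoByTwo(tp=2, fp=18, fn=1, tn=182)
print(f"sensitivity {fob.sensitivity:.1%}, specificity {fob.specificity:.1%}")
print(f"alpha {fob.false_positive_rate:.1%}, beta {fob.false_negative_rate:.1%}")
print(f"PPV {fob.ppv:.1%}, NPV {fob.npv:.1%}")
```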


Reliability

Validity (accuracy) Reliability (Repeatability)

Refer to Epidemiology by Gordis


Review questions from Gordis


Likelihood-ratio positive = sensitivity / (1 − specificity) = 66.67% / (1 − 91%) = 7.4

Likelihood-ratio negative = (1 − sensitivity) / specificity = (1 − 66.67%) / 91% = 0.37
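These two ratios can be recomputed from the FOB counts as a quick check. This is a minimal sketch; the function name `likelihood_ratios` is hypothetical, and it assumes specificity is below 1 and above 0 so that neither division is by zero.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR+, LR-); assumes 0 < specificity < 1."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# FOB example: sensitivity = 2 / (2 + 1), specificity = 182 / (182 + 18)
sensitivity = 2 / 3
specificity = 182 / 200
lr_positive, lr_negative = likelihood_ratios(sensitivity, specificity)
print(f"LR+ = {lr_positive:.1f}, LR- = {lr_negative:.2f}")
```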