
Course overview, the diagnostic process, and measures of interobserver agreement

Thomas B. Newman, MD, MPH

September 18, 2008

Overview

Administrative stuff
Overview of the course
The diagnostic process
Interobserver agreement
– Continuous variables
– Categorical variables
  • Concordance
  • Kappa
    – Regular
    – Weighted

Administrative stuff

Introductions
Basic structure of course
– New material each week in lecture
– Read material before lecture if possible
– HW on that material due the following week in section
– Exceptions:
  • No class October 9
  • Penultimate class 12/4 – Chapter 12 (Challenges for EBD) and course review; pass out take-home exam; no HW on Ch 12
  • Last lecture 12/11: review of take-home exam
Lectures: mixture of PPT and whiteboard
– How many want paper copies of PPT slides?

Sections

Section assignments: click ROSTER on the Epi 204 website
Section rooms: click SCHEDULE on the website
Faculty will rotate; students, rooms, and TAs will be constant for the quarter

Homework

Required – a key way of learning the material
Which problems are assigned is announced in SECTION and (later) posted on the web
Not graded if late, but can still be turned in; answers on the web
Use fresh sheets of paper with your name on each – not syllabus pages, not e-mail. (You can download and word-process if you want, but print a copy unless your section leader prefers electronic.)
Will be graded by section leaders and returned the following week

Getting help

Classmates, then section leaders, then faculty
Ambiguous/confusing problems – send e-mail to your section leader or me
– Unless you indicate otherwise, we will assume we can cc the whole class when we respond, if we think the question is of general interest

Textbook

TBN and MAK have almost finished a book, “Evidence-Based Diagnosis” (Cambridge University Press, 2009)
Other texts are listed on the web
Copies of other books are in the bookstore, on reserve in the library, and available for browsing here

Grading, honor code, etc.

Worst HW score dropped; all other HW scores count equally
2/3 homework average + 1/3 final exam, OR 1/3 homework average + 2/3 final exam, whichever is better
Try all problems on your own first; OK to help each other with HW, but:
– Acknowledge help
– Write answers in your own words
Do not collaborate on the final exam
Honor code taken seriously

Course overview

Diagnosis
– Theory
– Inter-rater reliability
– Dichotomous tests
– Multilevel tests
– Studies of tests
– Combining tests
Screening and prognostic tests
Treatments: randomized trials
Alternatives to randomized trials
P-values and confidence intervals; Bayes’ theorem
Clinicians and probability

Diagnostic process

Why do we want to assign a name to this person’s illness?
Different reasons lead to different classification schemes

Examples

Acute nephrotic syndrome
Acute leukemia
Attention deficit disorder
Dysuria worth a course of antibiotics
SLUBI = self-limited undiagnosed benign illness

Simplified Generic Decision Problem

Patient either has the disease or not
If D+, net benefit of treatment
If D–, better not to treat
(“Treat” could include doing more tests)

Simplifying assumptions (often wrong)

Test results are dichotomous
– Most tests have more than two possible answers
Disease states are dichotomous
– Many diseases occur on a spectrum
– There are many kinds of “nondisease”

Evaluating diagnostic tests

Reliability
Accuracy
Usefulness

Today we do reliability

Types of variables

Categorical
– Dichotomous – 2 values
– Nominal – no intrinsic ordering
– Ordinal – intrinsic ordering

Continuous (infinite number of values) vs. discrete (limited number of values)

Measuring interobserver agreement for categorical variables

                          Gallop heard    No gallop heard    Total,
                          by Observer B   by Observer B      Observer A
Gallop heard by A               20               15              35
No gallop heard by A            10               55              65
Total, Observer B               30               70             100

What is agreement?

Concordance rate

What percent of the time do the two observers agree (exactly)?
Advantage: easy to understand
Disadvantage: may be misleading if observers agree on the prevalence of the abnormality
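For example, in the gallop table above the observers agree on 20 + 55 = 75 of 100 patients, so the concordance rate is 75%.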

Concordance rate problem

                          Gallop heard    No gallop heard    Total,
                          by Observer B   by Observer B      Observer A
Gallop heard by A                0                5               5
No gallop heard by A             5               90              95
Total, Observer B                5               95             100
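Here the concordance rate is (0 + 90)/100 = 90%, yet the two observers never agree on a single gallop: all of the apparent agreement comes from agreeing that gallops are rare. (Using the kappa statistic defined below, expected agreement is (5 × 5 + 95 × 95)/100² = 90.5%, so kappa = (0.90 − 0.905)/(1 − 0.905) ≈ −0.05, slightly worse than chance.)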

Unbalanced Disagreement

Lesion #   Rater A   Rater B
   1          S         S
   2          S         S
   3          S         M
   4          S         M
   5          S         M
   6          M         M
   7          M         L
   8          L         L
   9          L         L
  10          L         L

                Rater B
Rater A      S     M     L    Total
S            2     3     0      5
M            0     1     1      2
L            0     0     3      3
Total        2     4     4     10

What is going on here?

Look for lack of balance above and below the diagonal
This pattern results when the observers have different thresholds
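For instance, a minimal Stata sketch that reproduces this analysis (do-file style; the variable names ratera and raterb are made up, with S/M/L coded as 1/2/3):

* Re-enter the 10 lesions above (1=S, 2=M, 3=L; hypothetical variable names)
clear
input ratera raterb
1 1
1 1
1 2
1 2
1 2
2 2
2 3
3 3
3 3
3 3
end
kap ratera raterb

With these data, observed agreement is 6/10 = 60%; kap also reports kappa, which works out to about 0.43.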

Definition of Kappa

The amount of agreement beyond what would be expected by chance*

Formula:

Kappa = (Observed agreement – Expected agreement) / (1 – Expected agreement)

Practice:
– Obs = 90%, Exp = 80%, K = ?
– Obs = 70%, Exp = 60%, K = ?
– Obs = 60%, Exp = 70%, K = ?

*Given the observed marginals
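Worked answers, using the formula above: (0.90 − 0.80)/(1 − 0.80) = 0.50; (0.70 − 0.60)/(1 − 0.60) = 0.25; and (0.60 − 0.70)/(1 − 0.70) ≈ −0.33. The last case shows that kappa is negative when observed agreement is worse than chance.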

Calculation of Expected Agreement from Marginals

                          Gallop heard    No gallop heard    Total,
                          by Observer B   by Observer B      Observer A
Gallop heard by A               20               15              35
No gallop heard by A            10               55              65
Total, Observer B               30               70             100
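Working through this table: under chance, the expected counts on the diagonal are 35 × 30/100 = 10.5 (gallop/gallop) and 65 × 70/100 = 45.5 (no gallop/no gallop), so expected agreement = (10.5 + 45.5)/100 = 56%. With observed agreement of 75%, kappa = (0.75 − 0.56)/(1 − 0.56) ≈ 0.43. A minimal Stata sketch that reproduces this from the cell counts (do-file style; the variable names obsa, obsb, and n are made up):

* Re-enter the 2x2 gallop table as cell counts (hypothetical variable names)
clear
input obsa obsb n
1 1 20
1 0 15
0 1 10
0 0 55
end
* n enters as a frequency weight
kap obsa obsb [fweight=n]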

GCS Eye Opening: Observed

                                      Emergency Physician #2
Emergency Physician #1   None   To Pain   To Command   Spontaneous   Total
None                      11       2          0             4          17
To Pain                    4       1          2             0           7
To Command                 0       3          8             3          14
Spontaneous                2       1          7            68          78
Total                     17       7         17            75         116

GCS Eye Opening: Expected

                                      Emergency Physician #2
Emergency Physician #1   None   To Pain   To Command   Spontaneous   Total
None                      2.5     1.0        2.5          11.0         17
To Pain                   1.0     0.4        1.0           4.5          7
To Command                2.1     0.8        2.1           9.1         14
Spontaneous              11.4     4.7       11.4          50.4         78
Total                    17       7         17            75          116

Example cell: 17 × 78/116 = 1326/116 = 11.4 (the expected count when Physician #1 says Spontaneous and Physician #2 says None)
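Carrying the same calculation through here: observed agreement = (11 + 1 + 8 + 68)/116 ≈ 0.76; expected agreement = (2.5 + 0.4 + 2.1 + 50.4)/116 ≈ 0.48; so kappa ≈ (0.76 − 0.48)/(1 − 0.48) ≈ 0.54.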

Why does multiplying row total by column total and dividing by N give you the expected agreement?
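One way to see it: if the two raters classified patients independently, the probability of landing in row i and column j would be (row i total/N) × (column j total/N); multiplying by the N patients gives the expected cell count, row total × column total/N. Summing the expected counts on the diagonal and dividing by N gives the expected agreement.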

Weighted Kappa

Weighted kappa
– Linear
– Quadratic
– Custom

Real-life illustration: rating of a neurological examination
Types of weights – Stata illustration:

. tab ex1 ex2
. kap ex1 ex2, w(w)
. kap ex1 ex2, w(w2)

(See Appendix 2.1)
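For reference, Stata's prerecorded weights: w gives linear weights, 1 − |i − j|/(k − 1), and w2 gives quadratic weights, 1 − [(i − j)/(k − 1)]², where k is the number of categories and i, j index the two raters' categories. Near-misses therefore earn partial credit, with quadratic weights penalizing large disagreements more heavily. Custom weights can be defined with kapwgt; a minimal sketch for a hypothetical 3-level scale (the weight name "mine" and the values below are made up):

. kapwgt mine 1 \ .8 1 \ 0 .5 1
. kap ex1 ex2, wgt(mine)

The kapwgt matrix lists the lower triangle row by row, with 1s on the diagonal for exact agreement.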

What does observed kappa depend upon?

How well people agree
SPECTRUM within classifications
– E.g., are the abnormal ones VERY abnormal?
– Difficult cases can be excluded or over-sampled
PREVALENCE of classifications by the various observers (and whether they agree on prevalence)
Chance (random error; people can get lucky/unlucky)
Weighting scheme used

Wireless Internet Access

Key is n2xa8!wr
