NONPARAMETRIC ESTIMATION AND INFERENCE
FOR POLYTOMOUS DISCRIMINATION INDEX
JIALIANG LI, QUNQIANG FENG, JASON P. FINE,
MICHAEL J. PENCINA, BEN VAN CALSTER
MULTI-CATEGORY CLASSIFICATION ACCURACY
• THE OUTCOMES OF DIAGNOSTIC PROBLEMS IN MEDICINE SOMETIMES INVOLVE MORE THAN TWO DISTINCT CATEGORIES.
• TO EXAMINE THE CLASSIFICATION ACCURACY WE MUST EMPLOY NON-STANDARD ACCURACY MEASURES.
• THERE ARE TWO GENERAL APPROACHES TO EXTENDING THE MEASURE OF DICHOTOMOUS DISCRIMINATION (THE C-STATISTIC, OR AREA UNDER THE RECEIVER OPERATING CHARACTERISTIC CURVE (AUC)) TO POLYTOMOUS PROBLEMS.
CONSIDER A PAIR
• THE FIRST APPROACH EVALUATES PAIRS OF SUBJECTS FROM DIFFERENT CATEGORIES, AS IN THE M-INDEX (HAND AND TILL (2001), MACHINE LEARNING) AND OBUCHOWSKI’S PAIRWISE C-STATISTIC.
• IN GENERAL THERE ARE M CHOOSE 2 PAIRS OF CATEGORIES, WHICH CAN BE TOO MANY WHEN M IS LARGE.
• THE M-INDEX IS AN AVERAGE OF PAIRWISE AUC VALUES AND DOES NOT CORRESPOND TO THE PROBABILITY OF ANY SINGLE RANDOM EVENT.
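THE PAIRWISE APPROACH CAN BE SKETCHED IN PYTHON AS FOLLOWS. THIS IS A MINIMAL ILLUSTRATION IN THE STYLE OF HAND AND TILL (2001), NOT THE AUTHORS' SOFTWARE: THE FUNCTION NAMES (auc, m_index) AND THE TOY DATA ARE ASSUMPTIONS MADE FOR THE EXAMPLE, AND TIES ARE COUNTED AS ONE HALF.

```python
from itertools import combinations


def auc(pos_scores, neg_scores):
    """Mann-Whitney estimate of P(pos > neg); ties count as 1/2."""
    total = 0.0
    for a in pos_scores:
        for b in neg_scores:
            if a > b:
                total += 1.0
            elif a == b:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


def m_index(probs, labels, n_classes):
    """Average of pairwise AUCs over all M-choose-2 category pairs."""
    pair_values = []
    for i, j in combinations(range(n_classes), 2):
        # A(i|j): class-i probability separating class i from class j
        a_ij = auc([p[i] for p, y in zip(probs, labels) if y == i],
                   [p[i] for p, y in zip(probs, labels) if y == j])
        # A(j|i): class-j probability separating class j from class i
        a_ji = auc([p[j] for p, y in zip(probs, labels) if y == j],
                   [p[j] for p, y in zip(probs, labels) if y == i])
        pair_values.append((a_ij + a_ji) / 2)
    return sum(pair_values) / len(pair_values)


# Hypothetical toy data: predicted probability vectors and true labels (M = 3).
toy_probs = [(0.8, 0.1, 0.1), (0.7, 0.2, 0.1),
             (0.1, 0.8, 0.1), (0.2, 0.7, 0.1),
             (0.1, 0.1, 0.8), (0.1, 0.2, 0.7)]
toy_labels = [0, 0, 1, 1, 2, 2]

print(m_index(toy_probs, toy_labels, 3))
# Number of category pairs grows quadratically in M, e.g. for M = 6:
print(len(list(combinations(range(6), 2))))
```

THE RESULT IS AN AVERAGE OVER CATEGORY PAIRS, WHICH IS WHY IT IS NOT THE PROBABILITY OF ONE JOINT EVENT.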
CONSIDER A SET OF SUBJECTS SELECTED FROM ALL CATEGORIES
• THE SECOND APPROACH EVALUATES SETS OF M SUBJECTS, WHERE EACH IS FROM A DIFFERENT CATEGORY.
• VOLUME UNDER THE ROC SURFACE (VUS): M=3.
• HYPERVOLUME UNDER THE ROC MANIFOLD (HUM): M>=3.
• R PACKAGE HUM IS AVAILABLE.
• HUM EXTENDS AUC WITH SIMILAR PROBABILISTIC INTERPRETATION.
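THE SET-BASED IDEA CAN BE SKETCHED FOR A SINGLE CONTINUOUS MARKER AS FOLLOWS. THIS IS AN ILLUSTRATIVE BRUTE-FORCE COUNT, NOT THE INTERFACE OF THE R PACKAGE HUM: THE FUNCTION NAME hum_estimate, THE ASSUMED CATEGORY ORDER, AND THE TOY VALUES ARE ALL ASSUMPTIONS, AND TIES ARE TREATED AS INCORRECT ORDERINGS.

```python
from itertools import product


def hum_estimate(groups):
    """Proportion of M-tuples (one subject per category) whose marker
    values are strictly increasing in the assumed category order.

    groups[k] is the list of marker values observed in category k.
    """
    correct = 0
    total = 0
    for combo in product(*groups):
        total += 1
        # The tuple is correctly ordered if each value is below the next one.
        if all(a < b for a, b in zip(combo, combo[1:])):
            correct += 1
    return correct / total


# Hypothetical marker values for M = 3 categories (the VUS case).
well_separated = [[1, 2], [3, 4], [5, 6]]
overlapping = [[1, 4], [2, 3], [5, 6]]

print(hum_estimate(well_separated))
print(hum_estimate(overlapping))
```

THE LOOP VISITS EVERY COMBINATION OF ONE SUBJECT PER CATEGORY, SO ITS COST GROWS WITH THE PRODUCT OF THE GROUP SIZES.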
POLYTOMOUS DISCRIMINATION INDEX (PDI)
• SIMILAR TO HUM, PDI EVALUATES THE PROBABILITY OF AN EVENT RELATED TO SIMULTANEOUSLY CLASSIFYING M SUBJECTS FROM M CATEGORIES.
• VAN CALSTER ET AL. (2012) INTRODUCED THE SAMPLE DEFINITION OF PDI BUT DID NOT DISCUSS A POPULATION DEFINITION OR PROVIDE INFERENCE METHODS.
• HIGHER PDI VALUES SUGGEST BETTER MULTI-CLASS DISCRIMINATION. BIOMARKERS WITH TOO LOW A PDI VALUE SHOULD BE REMOVED FROM CONSIDERATION SO THAT MORE ATTENTION CAN BE RESERVED FOR THE REMAINING QUALIFIED BIOMARKERS.
• USEFUL FOR A SCREENING STUDY WITH THOUSANDS OF CANDIDATE BIOMARKERS, OF WHICH ONLY A FEW ARE TRULY USEFUL FOR DIFFERENTIATING DISEASE STATUS.
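A SAMPLE PDI IN THE SPIRIT OF VAN CALSTER ET AL. (2012) CAN BE SKETCHED AS FOLLOWS. THIS IS A MINIMAL BRUTE-FORCE ILLUSTRATION, NOT THE AUTHORS' SOFTWARE: THE FUNCTION NAME pdi_estimate AND THE TOY DATA ARE ASSUMPTIONS, AND TIES ARE SIMPLY COUNTED AS FAILURES RATHER THAN GIVEN PARTIAL CREDIT. FOR EACH SET OF M SUBJECTS, ONE PER CATEGORY, CATEGORY I SCORES A HIT WHEN THE SUBJECT TRULY FROM CATEGORY I HAS THE HIGHEST PREDICTED PROBABILITY OF CATEGORY I WITHIN THE SET.

```python
from itertools import product


def pdi_estimate(groups):
    """groups[i] lists the probability vectors of subjects truly in category i.

    Returns (overall PDI, list of category-specific components).
    """
    m = len(groups)
    hits = [0] * m
    total = 0
    for subjects in product(*groups):  # one subject from each category
        total += 1
        for i in range(m):
            own = subjects[i][i]
            # Hit if the category-i subject beats every other subject
            # in the set on the predicted probability of category i.
            if all(own > subjects[j][i] for j in range(m) if j != i):
                hits[i] += 1
    per_category = [h / total for h in hits]
    return sum(per_category) / m, per_category


# Hypothetical toy data with one subject per category (M = 3).
toy_groups = [
    [(0.6, 0.3, 0.1)],  # truly category 0
    [(0.5, 0.2, 0.3)],  # truly category 1
    [(0.2, 0.3, 0.5)],  # truly category 2
]

overall, components = pdi_estimate(toy_groups)
print(overall, components)
```

IN A SCREENING SETTING THE SAME FUNCTION COULD BE APPLIED TO EACH CANDIDATE BIOMARKER'S FITTED PROBABILITIES, KEEPING ONLY THOSE WITH A SUFFICIENTLY HIGH VALUE.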
CONCEPTUAL DEFINITION
REMARKS
EQUALLY LIKELY CATEGORIES
ACTIONABLE DEFINITION
COMPARISON WITH SENSITIVITY
COMPARING IN TWO DIRECTIONS
SUBJECT 1: $(p_{1,1}, p_{1,2}, \ldots, p_{1,M})$
SUBJECT 2: $(p_{2,1}, p_{2,2}, \ldots, p_{2,M})$
$\vdots$
SUBJECT M: $(p_{M,1}, p_{M,2}, \ldots, p_{M,M})$
ACTIONABLE DEFINITION
NULL VALUE OF PDI
SAMPLE ESTIMATION
U-STATISTIC FORM
UNBIASEDNESS
EXACT VARIANCE
SAMPLE ESTIMATES
ASYMPTOTIC VARIANCE
CENTRAL LIMIT THEOREM
COVARIATE ADJUSTMENT
SIMULATION STUDY
SIMULATION STUDY: UNEQUAL GROUP SIZE
LIVER CANCER EXAMPLE
• 202 PARTICIPANTS FROM CAIRO, EGYPT: 73 HEPATOCELLULAR CARCINOMA (HC) CASES, 77 HEALTHY INDIVIDUALS (NC) AND 52 WITH CHRONIC LIVER DISEASE (QC).
• FOCUS ON A TOTAL OF 484 PEAKS RESULTING FROM PREPROCESSING OF THE RAW MASS SPECTROMETRY DATA.
SYNOVIAL BIOMARKERS: 6 CATEGORIES
• THE FOLLOWING SPECIMEN SAMPLES WERE INCLUDED:
(1) NON-INFLAMED CONTROL SPECIMENS (N=22);
(2) RHEUMATOID ARTHRITIS (RA) WITH ACTIVE DISEASE DESPITE DMARD TREATMENT (N=28);
(3) EARLY UNDIFFERENTIATED ARTHRITIS (DURATION < 12 MONTHS, N=10);
(4) CHRONIC (DISEASE DURATION > 4 WEEKS) SEPTIC ARTHRITIS (SEA) PROVEN BY POSITIVE BACTERIAL CULTURE (N=11);
(5) NON-INFLAMMATORY ORTHOPEDIC ARTHROPATHIES (ORTH.A, N=23, CONSISTING OF FEMUR FRACTURE, N=3; AVASCULAR NECROSIS OF THE FEMUR, N=3; MENISCUS AND/OR LIGAMENT INJURY, N=13; AND PLICA SYNDROME, N=4);
(6) OSTEOARTHRITIS (OA, N=31).
PDI FOR MARKERS
SOFTWARE
• DOWNLOADABLE FROM MY WEBSITE:
• HTTP://WWW.STAT.NUS.EDU.SG/~STALJ/