Copyright © Siemens Medical Solutions, USA, Inc.; 2008. All rights reserved.
Polyhedral Classifier for Target Detection
A Case Study: Colorectal Cancer
Murat Dundar, Matthias Wolf, Sarang Lakare, Marcos Salganicoff, Vikas C. Raykar
Siemens Medical Solutions, Inc. USA, Malvern, PA 19355
July 2008. CAD & Knowledge Solutions / Malvern, USA / IKM.
Computer Aided Diagnosis (CAD) for Colon Cancer
Identify suspicious regions (candidates)
Extract features for each candidate
Classify candidates as a polyp or non-polyp
Multi-mode nature of CAD data
The only ground truth available is the location of the polyp.
All other candidates that are not pointing to a known polyp are pooled into the negative class.
Variation among the different negatives is large.
A CAD Example: Colorectal Cancer
Polyps vs. common false positives
Fold
Stool Noise
Rectal tube
Sessile polyp
Pedunculated polyp
State-of-the-Art – Finite Mixture Models
Model each class distribution by a mixture model, one mode for each subclass, then design a maximum a posteriori or maximum likelihood classifier.
Too few positives, too many features with redundancy!
Robust estimation of model parameters for the positive class is very difficult, if not impractical.
State-of-the-Art – Discriminative Techniques
Pool all negative candidates into a single class and learn a binary classifier, i.e. polyps vs. negatives.
A kernel-based discriminative technique (SVM, RVM, KFD) can yield nonlinear decision boundaries suitable for classifying multi-mode data.
Too few positive candidates, too many features with redundancy! Data can easily be overfit by a nonlinear classifier.
State-of-the-Art – One-Class Classifiers
Omit the negative class and learn a model from positive samples only.
Kernel-based and neural network implementations yield nonlinear decision boundaries suitable for classifying multi-mode data.
Like other nonlinear classifiers, they are susceptible to overfitting.
State-of-the-art in a Nutshell
Linear classifiers
less prone to overfitting
not enough capacity to deal with multi-mode data
Finite mixture models
Parameter estimation is an issue!
Discriminative & One-class Classifiers
good capacity
more prone to overfitting
A Viable Solution
A series of linear classifiers, one for each subclass of the negatives
More capacity than a linear classifier, yet less prone to overfitting than a nonlinear classifier
An unseen sample is classified as positive only if all the classifiers classify it as positive
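The decision rule above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function name `polyhedral_predict`, the optional `biases` argument, and the convention that a score above zero means "positive" are assumptions made for the example.

```python
import numpy as np

def polyhedral_predict(X, alphas, biases=None):
    """Label a sample positive only if every linear classifier accepts it."""
    X = np.asarray(X, dtype=float)            # shape (n, d): one row per sample
    alphas = np.asarray(alphas, dtype=float)  # shape (K, d): one row per classifier
    if biases is None:
        biases = np.zeros(alphas.shape[0])    # assumption: no offset unless given
    scores = X @ alphas.T + biases            # shape (n, K)
    return np.all(scores > 0, axis=1)         # logical AND across the K classifiers

# Toy data: two classifiers that together accept only the positive quadrant.
alphas = np.array([[1.0, 0.0], [0.0, 1.0]])
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0]])
labels = polyhedral_predict(X, alphas)
```

The intersection of the K half-spaces is a convex polyhedron, which is why only the first toy sample is accepted.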
Training Multiple Linear Classifiers
Train each classifier independently: negative subclass k vs. positives, for k = 1, …, K.
Inefficient! A positive sample misclassified by several classifiers is penalized once per classifier, which is potentially excessive.
Proposed Approach
Optimize classifiers jointly
One classifier for each subclass of negative data
Objective function is penalized only once for a misclassified positive sample
Yields a polyhedral decision surface
A Toy Example
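Hyperplane Classifiers with Hinge Loss
$\xi_i = \max\{0,\ 1 - y_i \alpha^T x_i\}$
[Figure: hyperplane $\alpha^T x = 0$ separating true positives (TP, +) from false positives (FP, -); $\xi$ marks the hinge loss of samples that violate the margin]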
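As a small worked example, the standard hinge loss $\max\{0, 1 - y_i \alpha^T x_i\}$ of a single hyperplane classifier can be computed as below. The function name `hinge_loss` and the toy numbers are illustrative assumptions; labels are taken to be +1 or -1 and no offset term is used.

```python
import numpy as np

def hinge_loss(alpha, X, y):
    """Per-sample hinge loss max(0, 1 - y_i * alpha^T x_i) for labels y in {+1, -1}."""
    margins = y * (X @ alpha)
    return np.maximum(0.0, 1.0 - margins)

alpha = np.array([1.0, 0.0])
X = np.array([[2.0, 0.0],    # beyond the margin: loss 0
              [0.5, 0.0],    # right side, inside the margin: loss 0.5
              [-1.0, 0.0]])  # wrong side of the hyperplane: loss 2
y = np.array([1.0, 1.0, 1.0])
losses = hinge_loss(alpha, X, y)
```

A sample on the correct side but inside the margin still receives a positive loss; the loss reaches zero only once the margin of 1 is met.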
Polyhedral Classifier with AND Framework
Let $\xi_{ik}$ be the hinge loss of the i-th example induced by classifier k.
If the hinge loss = 0, the example is correctly classified. If the hinge loss > 0, the example is misclassified.
i-th positive example: $\max(0, \xi_{i1}, \xi_{i2}, \dots, \xi_{iK})$ ("AND")
i-th negative example: $\max(0, \xi_{ik})$
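Objective Function with the AND Framework
$J(\alpha_1, \alpha_2, \dots, \alpha_K) = \sum_{k=1}^{K} P(\alpha_k) + \nu_1 \sum_{k=1}^{K} \sum_{i \in C_k^-} \max(0, \xi_{ik}) + \nu_2 \sum_{i \in C^+} \max(0, \xi_{i1}, \xi_{i2}, \dots, \xi_{iK})$
First term: regularization to control complexity
Second term: error on negative examples
Third term: error on positive examples
Convex problem!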
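The three labelled terms of the AND objective (error on negatives, error on positives, regularization) can be evaluated with a short function such as the one below. This is a sketch under stated assumptions rather than the authors' code: the trade-off weights `nu1` and `nu2`, the 1-norm penalty, the absence of an offset term, and the function name `and_objective` are all choices made for the example.

```python
import numpy as np

def and_objective(alphas, X_pos, X_neg, neg_subclass, nu1=1.0, nu2=1.0):
    """Sketch of the AND objective with a 1-norm penalty (assumed)."""
    alphas = np.asarray(alphas, dtype=float)        # shape (K, d)
    # Regularization term: sum of absolute coefficient values (assumption).
    reg = np.abs(alphas).sum()
    # Negatives (label -1): each is charged only by its own subclass classifier.
    neg_scores = X_neg @ alphas.T                   # shape (n_neg, K)
    own = neg_scores[np.arange(len(X_neg)), neg_subclass]
    neg_err = np.maximum(0.0, 1.0 + own).sum()
    # Positives (label +1): charged once, by the worst of the K classifiers.
    pos_slack = 1.0 - X_pos @ alphas.T              # shape (n_pos, K)
    pos_err = np.maximum(0.0, pos_slack.max(axis=1)).sum()
    return reg + nu1 * neg_err + nu2 * pos_err

alphas = np.array([[1.0, 0.0], [0.0, 1.0]])
X_pos = np.array([[2.0, 2.0], [2.0, 0.5]])
X_neg = np.array([[-2.0, 3.0], [3.0, -0.5]])
neg_subclass = np.array([0, 1])                     # subclass index of each negative
J = and_objective(alphas, X_pos, X_neg, neg_subclass)
```

Taking the maximum over classifiers before summing is what charges a misclassified positive once rather than once per classifier. Each term is a maximum of affine functions or a norm, so the sum stays convex.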
Incomplete Ground Truth for Subclasses
The AND algorithm assumes that subclass membership is known for all samples. Not realistic!
Annotate a small portion of the negatives
identify potential subclasses
pool training samples for each subgroup.
Three different types of samples in the training data
Positives
Negatives with known subclass membership
Negatives with unknown subclass membership
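Objective Function with the AND-OR Framework
$J(\alpha_1, \alpha_2, \dots, \alpha_K) = \sum_{k=1}^{K} P(\alpha_k) + \nu_1 \sum_{k=1}^{K} \sum_{i \in C_k^-} \max(0, \xi_{ik}) + \nu_1 \sum_{i \in \hat{C}^-} \prod_{k=1}^{K} \max(0, \xi_{ik}) + \nu_2 \sum_{i \in C^+} \max(0, \xi_{i1}, \xi_{i2}, \dots, \xi_{iK})$
First term: regularization to control complexity
Second term: error on negative examples with known subclasses
Third term: error on negative examples with unknown subclasses, OR operation
Fourth term: error on positive examples, AND operation
Not convex!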
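The OR operation for negatives of unknown subclass can be illustrated as follows. The sketch assumes that the OR term is the product of the per-classifier hinge losses, which vanishes as soon as any one classifier rejects the sample; that form, the function name `or_loss`, and the toy data are assumptions made for the example and are not confirmed by the slide text.

```python
import numpy as np

def or_loss(alphas, X_unknown):
    """Product of per-classifier hinge losses for negatives (label -1) of unknown subclass.

    Assumed form: the loss is zero if at least one classifier rejects the sample
    with margin, i.e. if at least one factor of the product is zero.
    """
    alphas = np.asarray(alphas, dtype=float)               # shape (K, d)
    slack = np.maximum(0.0, 1.0 + X_unknown @ alphas.T)    # shape (n, K)
    return slack.prod(axis=1)

alphas = np.array([[1.0, 0.0], [0.0, 1.0]])
X_unknown = np.array([[-2.0, 3.0],   # rejected by classifier 0: loss 0
                      [1.0, 1.0]])   # accepted by both classifiers: loss 2 * 2 = 4
losses = or_loss(alphas, X_unknown)
```

Under this assumed form the term is not jointly convex in all classifiers. With every classifier but one held fixed, it reduces to a non-negatively weighted hinge loss in the remaining classifier, which is convex.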
Alternating Optimization Iterative Algorithm
Each iteration contains K steps, and each step optimizes a single classifier
At the k-th step, fix all classifiers (α's) except classifier k
Minimize $J(\alpha_1, \dots, \alpha_k, \dots, \alpha_K)$ with respect to $\alpha_k$
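A minimal sketch of the alternating scheme is given below. To stay short it uses only positives and negatives with known subclass, a 1-norm penalty, unit trade-off weights, and no offset term, and it replaces the per-step minimization with a few subgradient steps that keep the best iterate. All of these are assumptions made for illustration; the slides do not say how each sub-problem is solved.

```python
import numpy as np

def objective(alphas, X_pos, X_neg, sub, nu1=1.0, nu2=1.0):
    """Regularization + error on negatives + error on positives (assumed form)."""
    reg = np.abs(alphas).sum()
    own = (X_neg @ alphas.T)[np.arange(len(X_neg)), sub]
    neg = np.maximum(0.0, 1.0 + own).sum()
    pos = np.maximum(0.0, (1.0 - X_pos @ alphas.T).max(axis=1)).sum()
    return reg + nu1 * neg + nu2 * pos

def block_subgradient(alphas, k, X_pos, X_neg, sub, nu1=1.0, nu2=1.0):
    """One subgradient of the objective with respect to classifier k only."""
    g = np.sign(alphas[k])                                  # 1-norm penalty
    Xk = X_neg[sub == k]                                    # negatives of subclass k
    g = g + nu1 * Xk[1.0 + Xk @ alphas[k] > 0].sum(axis=0)
    slack = 1.0 - X_pos @ alphas.T
    # A positive contributes only if classifier k is its worst violated classifier.
    active = (slack.argmax(axis=1) == k) & (slack.max(axis=1) > 0)
    return g - nu2 * X_pos[active].sum(axis=0)

def alternating_optimization(alphas, X_pos, X_neg, sub,
                             n_iter=20, n_steps=50, lr=0.05):
    alphas = alphas.copy()
    best = objective(alphas, X_pos, X_neg, sub)
    for _ in range(n_iter):                                 # one iteration = K steps
        for k in range(alphas.shape[0]):                    # step k: only alpha_k moves
            trial = alphas.copy()
            for _step in range(n_steps):
                trial[k] -= lr * block_subgradient(trial, k, X_pos, X_neg, sub)
                value = objective(trial, X_pos, X_neg, sub)
                if value < best:                            # keep the best iterate
                    best, alphas = value, trial.copy()
    return alphas, best

X_pos = np.array([[2.0, 2.0], [3.0, 2.0]])
X_neg = np.array([[-2.0, 2.0], [2.0, -2.0]])
sub = np.array([0, 1])                                      # subclass of each negative
start = np.zeros((2, 2))
j0 = objective(start, X_pos, X_neg, sub)
alphas, best = alternating_optimization(start, X_pos, X_neg, sub)
```

Because only improving iterates are kept, the objective never increases from one step to the next, which mirrors the behaviour expected from exact block-wise minimization.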
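Cascaded Design
[Figure: cascade of classifiers 1, 2, …, K. Candidates enter classifier 1; at each stage k, accepted candidates (T_k) pass to the next stage and rejected candidates (F_k) leave the cascade; candidates accepted by all K stages are reported as TP.]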
Cascade Design with Sparse Linear Classifiers
Setting $P(\alpha_k) = |\alpha_k|$ yields K sparse classifiers, each with a different number of non-zero coefficients
Run-time order does not change the outcome
Start with the classifier that has the fewest non-zero coefficients
Classify the sample: if negative, reject it; if positive, pass it to the next classifier that requires computing the fewest additional features. Continue until all K classifiers have been run
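The ordering and early-rejection logic described above might look like the sketch below. The names `cascade_order`, `cascade_classify`, and the on-demand callback `feature_fn` are hypothetical, and a score above zero is assumed to mean "positive".

```python
import numpy as np

def cascade_order(alphas):
    """Greedy run-time order: next comes the classifier needing the fewest new features."""
    supports = [set(int(j) for j in np.flatnonzero(a)) for a in alphas]
    computed, order = set(), []
    remaining = list(range(len(alphas)))
    while remaining:
        k = min(remaining, key=lambda j: len(supports[j] - computed))
        order.append(k)
        computed |= supports[k]
        remaining.remove(k)
    return order

def cascade_classify(feature_fn, alphas, order):
    """Run the classifiers in order, computing features lazily; stop at the first rejection."""
    cache = {}
    for k in order:
        idx = [int(j) for j in np.flatnonzero(alphas[k])]
        for j in idx:
            if j not in cache:
                cache[j] = feature_fn(j)      # hypothetical on-demand feature computation
        score = sum(alphas[k][j] * cache[j] for j in idx)
        if score <= 0:
            return False, len(cache)          # rejected; later features never computed
    return True, len(cache)

# Three sparse classifiers over four features (toy coefficients).
alphas = np.array([[1.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0, 0.0]])
order = cascade_order(alphas)
rejected = cascade_classify(lambda j: [5.0, -3.0, -1.0, 9.0][j], alphas, order)
accepted = cascade_classify(lambda j: [1.0, 1.0, 1.0, 0.0][j], alphas, order)
```

In the toy run the first sample is rejected after a single feature has been computed, while the accepted sample needs only the three features that carry non-zero coefficients. The final label is the same for any ordering because the AND rule is order independent; only the cost changes.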
Experiments – Automatic Polyp Detection
Data
           Volumes   Polyps   Negative candidates
Training   316       226      1,249
Test       385       245      1,920
98 numerical image features are computed. Out of the 1,249 training negatives, 177 are annotated and 9 subclasses are identified.
ROC plots
Run-time Performance
About 24% reduction in execution time relative to SVDD and RBF-SVM (452 vs. 595)
Classifier   Sensitivity (%) at 3 FP/vol   Time (t)
Polyhedral   84                            452
SVDD         80                            595
RBF-SVM      60                            595
Linear-SVM   45                            437
Conclusions
Polyhedral classifier for multi-mode data
AND framework when subclass information is fully available
AND-OR framework when subclass information is partially available
Cascade design as a by-product to speed up online execution
Thank you! Questions and Comments