
Automated Classification of Crystallization Images


Christian A. Cumbaa and Igor Jurisica, Division of Signaling Biology, Ontario Cancer Institute, Toronto, Ontario


Acknowledgements
All images in these studies were generated at the High-Throughput Screening lab at The Hauptman-Woodward Institute. Multi-outcome truth data was painstakingly generated by eight heroes at HWI, and carefully organized, cleaned, and curated by Max Thayer and Raymond Nagel at HWI.

This work was funded by the following grants and organizations: NIH U54 GM074899-01, Genome Canada, IBM, NSERC RGPIN 203833-02. Earlier work was supported by NIH P50 GM62413-05, NSERC and CITO.


Table 2: The confusion matrix summarizing the match between actual crystallization outcomes and the labels assigned by the image analysis system (Experiment 2). Numbers indicate counts of actual images. Rows give the truth; columns give the machine classification.

| Truth \ Machine | Crystal | Crystal/Phase Sep. | Crystal/Precip. | Precip. | Precip./Skin | Precip./Phase Sep. | Phase Sep./Skin | Phase Sep. | Clear | Garbage |
|---|---|---|---|---|---|---|---|---|---|---|
| Crystal | 3934 | 354 | 1039 | 280 | 88 | 1 | 4 | 268 | 1174 | 21 |
| Crystal/Phase Sep. | 578 | 433 | 281 | 117 | 51 | 14 | 0 | 421 | 94 | 2 |
| Crystal/Precip. | 1016 | 153 | 2972 | 1721 | 296 | 23 | 2 | 211 | 69 | 0 |
| Precip. | 397 | 49 | 1325 | 24547 | 987 | 52 | 4 | 1213 | 810 | 27 |
| Precip./Skin | 120 | 24 | 206 | 1201 | 2557 | 5 | 3 | 98 | 29 | 8 |
| Precip./Phase Sep. | 19 | 13 | 101 | 199 | 38 | 18 | 1 | 49 | 1 | 0 |
| Phase Sep./Skin | 7 | 2 | 4 | 2 | 16 | 0 | 11 | 24 | 1 | 0 |
| Phase Sep. | 422 | 115 | 77 | 274 | 73 | 29 | 2 | 3721 | 1229 | 32 |
| Clear | 101 | 1 | 12 | 128 | 9 | 0 | 1 | 123 | 28482 | 174 |
| Garbage | 19 | 1 | 0 | 33 | 4 | 0 | 1 | 4 | 163 | 246 |
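The per-category Recall and Precision scores shown on the diagonals of Figures 1 and 2 can be recomputed from the counts in Table 2: recall divides each diagonal count by its row total (truth), and precision divides it by its column total (machine label). The following is a minimal sketch of that arithmetic; the names `CATEGORIES`, `TABLE_2`, and `recall_and_precision` are illustrative and not taken from the poster.

```python
# Category order used for both rows (truth) and columns (machine label).
CATEGORIES = [
    "Crystal", "Crystal/Phase Sep.", "Crystal/Precip.", "Precip.",
    "Precip./Skin", "Precip./Phase Sep.", "Phase Sep./Skin",
    "Phase Sep.", "Clear", "Garbage",
]

# Counts copied from Table 2 (Experiment 2).
TABLE_2 = [
    [3934, 354, 1039, 280, 88, 1, 4, 268, 1174, 21],
    [578, 433, 281, 117, 51, 14, 0, 421, 94, 2],
    [1016, 153, 2972, 1721, 296, 23, 2, 211, 69, 0],
    [397, 49, 1325, 24547, 987, 52, 4, 1213, 810, 27],
    [120, 24, 206, 1201, 2557, 5, 3, 98, 29, 8],
    [19, 13, 101, 199, 38, 18, 1, 49, 1, 0],
    [7, 2, 4, 2, 16, 0, 11, 24, 1, 0],
    [422, 115, 77, 274, 73, 29, 2, 3721, 1229, 32],
    [101, 1, 12, 128, 9, 0, 1, 123, 28482, 174],
    [19, 1, 0, 33, 4, 0, 1, 4, 163, 246],
]


def recall_and_precision(matrix):
    """Return (recall, precision) lists for a square confusion matrix
    whose rows are true categories and whose columns are assigned labels."""
    n = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    recall = [matrix[i][i] / row_totals[i] for i in range(n)]
    precision = [matrix[i][i] / col_totals[i] for i in range(n)]
    return recall, precision


recall, precision = recall_and_precision(TABLE_2)
for name, r, p in zip(CATEGORIES, recall, precision):
    print(f"{name:20s} recall={r:.2f} precision={p:.2f}")
```

Rounded to two decimals, these values agree with the scores reported for Figures 1 and 2.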

Figure 2: Distributions of observed crystallization outcomes (rows) grouped by labels (columns) applied by the image analysis system. Elements on the diagonal indicate correct classifications. Numbers indicate Precision scores for each class.
[Figure 2 image not reproduced. Rows (truth) and columns (machine classification) both list the ten categories: Crystal, Crystal/Phase Sep., Crystal/Precip., Precip., Precip./Skin, Precip./Phase Sep., Phase Sep./Skin, Phase Sep., Clear, Garbage.]

Figure 3: Example classifications and misclassifications for each category (Experiment 2).
[Figure 3 image not reproduced. Columns: True positives (highest-scoring), False negatives (lowest-scoring), False positives (highest-scoring). Rows: Crystal, Crystal + Phase Sep., Crystal + Precip., Precip., Precip. + Skin, Precip. + Phase Sep., Phase Sep. + Skin, Phase Sep., Clear, Garbage.]

Goal: We aim to automatically classify all images generated by the HWI robotic imaging system, and eliminate the need for a crystallographer to search among hundreds of images for crystal hits, or other conditions of interest.

Data source: Truth data for 147456 images from Hauptman-Woodward's High-Throughput Screening (HTS) Laboratory
• Each image evaluated by 3 or more experts
• Scored for presence/absence of 7 independent crystallization conditions: clear, phase separation, precipitate, skin, crystal, garbage, unsure

Experiment 2 was supplemented with
• 6456 crystal images (NESG-sourced proteins)
• 11504 crystal images (SGPP-sourced proteins)

Image analysis: Each image was processed by our image processing algorithms in order to extract 840 numeric measures of image texture. These features measure the presence of straight edges, grey-tone statistics, etc., each measured at multiple scale and contrast levels.

Feature selection: For each target category of images, we select a subset of the 840 features that most effectively distinguishes positive from negative examples of that category. Images are therefore reduced to short vectors of numeric values.
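The poster does not state which selection criterion was used. As an illustration only, the sketch below ranks features by a Fisher-style score (squared gap between the positive and negative class means, divided by the sum of the class variances) and keeps the top k. The scoring rule, the function name `top_k_features`, and the synthetic data are all assumptions, not the authors' method.

```python
import numpy as np


def top_k_features(X, is_positive, k):
    """Return indices of the k features that best separate positive
    from negative examples under a Fisher-style score (an assumed
    criterion; the poster does not name one)."""
    X = np.asarray(X, dtype=float)
    mask = np.asarray(is_positive, dtype=bool)
    pos, neg = X[mask], X[~mask]
    # Squared difference of class means over the sum of class variances.
    gap = (pos.mean(axis=0) - neg.mean(axis=0)) ** 2
    spread = pos.var(axis=0) + neg.var(axis=0) + 1e-12  # avoid division by zero
    score = gap / spread
    # Highest-scoring features first.
    return np.argsort(score)[::-1][:k]


# Synthetic demo: 5 features, only feature 2 differs between the classes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 5))
labels_demo = np.arange(100) < 50
X_demo[labels_demo, 2] += 5.0

selected = top_k_features(X_demo, labels_demo, k=1)
reduced = X_demo[:, selected]  # each image reduced to a short feature vector
```

In the setting described above, `X` would hold the 840 texture measures per image and the selection would be repeated once per target category.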

Image classification: To train a classifier, we construct statistical models of the probability distribution of feature-vector values: one for each category. For these experiments, we use multivariate Gaussians to estimate probability density.

New images are classified by comparing their feature vectors to each category's probability distribution. The result, for each image, is itself a probability distribution across all categories. The category with the highest probability will be output by the classifier.
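The classification step described above can be sketched as follows: fit one multivariate Gaussian per category, then turn the per-category densities into a probability distribution over categories and report the most probable one. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `GaussianClassifier`, the use of class frequencies as priors, the small `ridge` term added to each covariance matrix, and the synthetic data are all illustrative choices.

```python
import numpy as np


class GaussianClassifier:
    """One multivariate Gaussian per category; predict the category
    with the highest posterior probability."""

    def fit(self, X, y, ridge=1e-6):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            # Assumption: a small ridge keeps the covariance invertible.
            cov = np.atleast_2d(np.cov(Xc, rowvar=False))
            cov = cov + ridge * np.eye(X.shape[1])
            _, logdet = np.linalg.slogdet(cov)
            # Assumption: priors are the class frequencies in the training set.
            log_prior = np.log(len(Xc) / len(X))
            self.params_.append((mean, np.linalg.inv(cov), logdet, log_prior))
        return self

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        d = X.shape[1]
        log_post = np.empty((X.shape[0], len(self.classes_)))
        for k, (mean, inv_cov, logdet, log_prior) in enumerate(self.params_):
            diff = X - mean
            # Squared Mahalanobis distance of every row to this category's mean.
            maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
            log_post[:, k] = log_prior - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
        # Normalize in log space so each row sums to 1.
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        return post / post.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[self.predict_proba(X).argmax(axis=1)]


# Synthetic demo with two well-separated categories.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 1.0, (200, 3)), rng.normal(4.0, 1.0, (200, 3))])
y_train = np.array(["clear"] * 200 + ["precipitate"] * 200)

model = GaussianClassifier().fit(X_train, y_train)
X_new = [[0.1, -0.2, 0.3], [4.2, 3.9, 4.1]]
probs = model.predict_proba(X_new)
labels = model.predict(X_new)
```

Working in log space avoids numerical underflow when the feature vectors are long, which matters once dozens of selected features are used per category.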

To avoid bias in our models, each data point is used in turn for training and testing in a 10-fold cross-validation process.
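A 10-fold split of the kind described above can be sketched in a few lines: the images are partitioned into 10 folds, and each fold serves once as the test set while the other nine are used for training. The function name `k_fold_indices`, the shuffling step, and the fixed seed are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs so that every sample
    is used for testing exactly once across the k folds."""
    indices = list(range(n_samples))
    # Assumption: samples are shuffled once before being split into folds.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Each pair in `splits` would drive one round of training and testing; the confusion-matrix counts are then accumulated over all 10 rounds.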

Measuring performance: Two important performance measures are used, precision and recall.

Precision measures the fraction of images classified as category C that actually belong to C.

Recall measures the fraction of images belonging to C that were classified as C.
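Written as formulas, with TP, FP, and FN denoting the counts of true positives, false positives, and false negatives for category C (these symbols are introduced here for illustration), the two measures are as follows. The worked example uses the crystal counts from Table 1 (234 true positives, 1167 false positives, 448 false negatives).

```latex
\[
\mathrm{Precision}(C) = \frac{TP_C}{TP_C + FP_C},
\qquad
\mathrm{Recall}(C) = \frac{TP_C}{TP_C + FN_C}
\]
\[
\mathrm{Precision}(\text{crystal}) = \frac{234}{234 + 1167} \approx 0.17,
\qquad
\mathrm{Recall}(\text{crystal}) = \frac{234}{234 + 448} \approx 0.34
\]
```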

Experiment 1: Independent crystallization conditions.
6 classifiers trained to detect clear, phase separation, precipitate, skin, crystal, garbage.
Training/test images limited to unanimously-scored images (per-category).
• Table 1 summarizes the performance of each.

Experiment 2: Compound crystallization conditions.
One 10-way classifier trained to distinguish between 10 compound categories: crystal only, crystal+phase separation, crystal+precipitate, precipitate only, precipitate+skin, precipitate+phase separation, phase separation+skin, phase separation only, clear drop, and garbage.
Training/test images limited to unanimously-scored images belonging to one of the 10 categories.
• Table 2 summarizes the performance of the classifier.
• Figure 1 illustrates the distribution of true positives and false negatives.
• Figure 2 illustrates the distribution of true positives and false positives.
• Figure 3 gives example images of each.

Discussion
Experiments 1 and 2 reveal degrees of difficulty in recognizing crystallization outcomes.
Most singleton categories in Experiment 2 performed generally well. Clear drops are most accurately classified (recall .98, precision .89).
Many compound categories demonstrate the classifier's confusion between certain mixtures of outcomes. All crystal-bearing categories are confused to a degree. Precipitates as a whole are easily detected, but compound precipitates are difficult to subdivide.

Results

New image analysis system (under development)
• Revised and expanded feature set
• Textural features of local regions of the image
• More precise texture, straight edge, and discrete object metrics

World Community Grid
• New system will run on the World Community Grid
• 150 CPU-years compute time per day
• Will compute features for 60 million images
• Project launch Spring/Summer 2007
• http://www.worldcommunitygrid.org/

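Figure 1: Distributions of classification labels (columns), as applied by the image analysis system to observed crystallization outcomes (rows). Elements on the diagonal indicate correct classifications. Numbers indicate Recall scores for each outcome.
[Figure 1 image not reproduced. Rows (truth) and columns (machine classification) both list the ten categories: Crystal, Crystal/Phase Sep., Crystal/Precip., Precip., Precip./Skin, Precip./Phase Sep., Phase Sep./Skin, Phase Sep., Clear, Garbage.]

Recall scores (Figure 1 diagonal): Crystal .55, Crystal/Phase Sep. .22, Crystal/Precip. .46, Precip. .83, Precip./Skin .60, Precip./Phase Sep. .04, Phase Sep./Skin .16, Phase Sep. .62, Clear .98, Garbage .52

Precision scores (Figure 2 diagonal): Crystal .59, Crystal/Phase Sep. .38, Crystal/Precip. .49, Precip. .86, Precip./Skin .62, Precip./Phase Sep. .13, Phase Sep./Skin .38, Phase Sep. .61, Clear .89, Garbage .48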

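Table 1: The confusion matrices summarizing the match between actual crystallization outcomes and the labels assigned by the classification system (Experiment 1). Numbers indicate counts of actual images. Each row is one binary classifier; "+" and "-" denote presence and absence of the condition.

| Classifier | Truth +, Machine + | Truth +, Machine - | Truth -, Machine + | Truth -, Machine - | Precision | Recall |
|---|---|---|---|---|---|---|
| Crystal | 234 | 448 | 1167 | 144223 | 0.17 | 0.34 |
| Phase Sep. | 9315 | 630 | 47059 | 41261 | 0.17 | 0.94 |
| Precip. | 60528 | 2849 | 3429 | 52395 | 0.95 | 0.96 |
| Skin | 5783 | 1453 | 13103 | 106919 | 0.31 | 0.80 |
| Clear | 28278 | 782 | 5093 | 89159 | 0.85 | 0.97 |
| Garbage | 407 | 122 | 2068 | 138131 | 0.16 | 0.77 |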