Artificial Intelligence in High-Content Screening and Cervical Cancer Diagnosis. Lukasz Miroslaw, PhD. Organic Chemistry Institute, Grid Computing Competence Center, University of Zurich, Switzerland. ETH LMC, 17.10.2012

Artificial Intelligence (AI) studies and designs intelligent agents, i.e. systems that perceive their environment and take actions that maximize their chances of success. In microscopy, success usually means that automated image analysis efficiently detects phenotypes, as in biological screens, or retrieves diagnostically relevant statistics from images, as in medical diagnosis. During the talk I will present two applications aimed at supporting high-content screening and the diagnosis of cervical cancer, in which subdomains of AI, e.g. Evolutionary Algorithms, Neural Networks and Machine Learning techniques, were applied.

Page 1: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Artificial Intelligence in High-Content Screening and Cervical Cancer Diagnosis

Lukasz Miroslaw, PhD

Organic Chemistry Institute, Grid Computing Competence Center, University of Zurich, Switzerland

ETH LMC, 17.10.2012

Page 2: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Table of Contents

Introduction:
-  Why Artificial Intelligence?
-  Loss-of-function screens.
-  Cervical Cancer Diagnosis.

GC3Pie: Software for Workflow-Management in High Content Screening.

Demo.

Conclusions.

Page 3: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Why Artificial Intelligence?

Algorithm design is difficult because of the high number of parameters that must be estimated. Some numbers:

Bridge: 52! ≈ 8.07×10^67 = 80,658,175,170,943,878,571,660,636,856,403,766,975,289,505,440,883,277,824,000,000,000,000

Chess: 4.52×10^46 is a proven upper bound for the number of legal chess positions.

Cosmology: There are about 10^24 stars.

The only reasonable approach is to use AI.
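The bridge figure is easy to check. A minimal sketch using only the Python standard library; the variable name is illustrative:

```python
import math

# Number of distinct orderings of a 52-card deck.
deck_orderings = math.factorial(52)

print(deck_orderings)
print(f"{deck_orderings:.2e}")  # 8.07e+67
```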

Page 4: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Artificial Intelligence

Search and optimization

Logic

Probabilistic methods for uncertain reasoning

Classifiers and statistical learning methods

Control theory

Languages

Neural networks

… and more

Page 5: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

esiRNA: Knockdown efficacy

QRT-PCR analyses 24 hours after transfection of HeLa cells with indicated esiRNAs are shown. Because of the complex mixture of different siRNAs all targeting the same mRNA, the esiRNA pool typically produces excellent silencing of the transcripts.

Credits: Prof. Frank Buchholz

Page 6: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Objective and Assay Setup

Objective: Automated estimation of the Mitotic Index from time-lapse movies.

Segmentation and classification of three types of HeLa cells: normal, apoptotic and mitotic cells.

Methodology: Multimodal Image Analysis.

Fig. GFP-tagged HeLa cells imaged with Positive (Left) and Negative (Right) Phase-Contrast and Fluorescence Microscopy. TDS (right) and Kyoto (bottom).
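Once every detected cell carries a class label, the Mitotic Index is the fraction of mitotic cells among all cells in a frame. The sketch below assumes that apoptotic cells are counted in the denominator, which the slides do not state; the function name and the example labels are hypothetical.

```python
def mitotic_index(labels):
    """Fraction of cells labeled 'mitotic' among all classified cells."""
    if not labels:
        raise ValueError("at least one classified cell is required")
    mitotic = sum(1 for label in labels if label == "mitotic")
    return mitotic / len(labels)


# Hypothetical classification result for a single frame.
frame = ["normal"] * 17 + ["mitotic"] * 2 + ["apoptotic"]
print(mitotic_index(frame))  # 0.1
```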

Page 7: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Cell Model

Mitotic cells: cell boundaries well distinguishable, rounded shape.

Fig. Distribution of Features for TDS HeLa cells.

Normal and apoptotic cells have slightly different levels of GFP signal; in metaphase the signal gets stronger.

Page 8: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Apoptotic and Normal Cells

Detection of GFP signal: local background subtraction with rolling-ball algorithm [1] followed by watershed.

[1] Sternberg S., “Biomedical Image Processing”, IEEE Computer, January 1983.

Fig. Validation: Specificity: 98% measured on 578 cells.

Fig. Discriminant Function Analysis, linear vs. quadratic classifiers. Best specificity: 71% measured on 6 randomly picked images from the training set.

Classification of detected objects as apoptotic or normal cells.

Page 9: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Detection of Mitotic Cells

Cross-correlation based approach [3]:

1.  Given N cell models g_i and a target image f.

2.  Cross-correlate f with g_i in Fourier space, i = 1,…,N.

3.  Validate the correlation peaks.

[3] Miroslaw L., Chorazyczewski A., Correlation-based method for automatic mitotic cell detection in phase contrast microscopy, Proc. 4th Int. Conf. Computer Recognition Systems CORES'05, pp. 627-635, Springer-Verlag Berlin Heidelberg 2005.
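Step 2 relies on the correlation theorem: the cross-correlation of f and g is the inverse transform of F multiplied by the complex conjugate of G. The sketch below shows this in one dimension with a naive DFT so that it needs only the standard library. It is an illustration under those assumptions, not the implementation from [3]; a real image pipeline would use a 2-D FFT, and the signal and model values are made up.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def cross_correlate(f, g):
    """Circular cross-correlation c[s] = sum_k f[(k + s) % n] * g[k]."""
    n = len(f)
    g = list(g) + [0.0] * (n - len(g))  # zero-pad the model to the signal length
    F, G = dft(f), dft(g)
    product = [a * b.conjugate() for a, b in zip(F, G)]
    return [v.real for v in dft(product, inverse=True)]


# Hypothetical 1-D "image" containing one copy of the model at offset 2.
signal = [0, 0, 1, 3, 1, 0, 0, 0]
model = [1, 3, 1]
scores = cross_correlate(signal, model)
peak = max(range(len(scores)), key=scores.__getitem__)
print(peak)  # 2
```

The position of the highest peak gives the candidate location; step 3 then decides which peaks are real detections.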

Page 10: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Evolution-Driven Validation

[4] Miroslaw L., Chorazyczewski A., Buchholz F., Kittler R., EA validation method in detection of mitotic cells, Proc. 8th National Conference on Evolutionary Computation and Global Optimization, pp. 157-163, Korbielow 2005.

No teaching.

Just one parameter (σ).

Specificity: 81%

Page 11: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Summary

[5] Kittler R, Pelletier L, Heninger AK, Slabicki M, Theis M, Miroslaw L, Poser I, Lawo S, Grabner H, Kozak K, Wagner J, Surendranath V, Richter C, Bowen W, Habermann B, Hyman AA, Buchholz F. (2007) Genome-wide RNAi profiling of cell cycle progression in human tissue culture cells. Nat Cell Biol. 9(12): 1401-12.

Fig. Estimated Mitotic Index for wild-type cells (blue) and cells with CDC16 knocked down. One of the developed methods was used in genome-scale screening.

Genome-scale RNA-mediated interference screen in HeLa cells to identify human genes that are important for cell division [5]. Cited 133 times.

Page 12: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

5 years later …

Objective: Estimation of the Mitotic Index from negative phase contrast images.

Pre-processing: Shading correction to reduce uneven illumination.

Nuclei Segmentation: isodata algorithm [1] followed by dilation to segment the nuclei in the fluorescence image.

Classification: Neural Network.

[1] T.W. Ridler, S. Calvard, Picture thresholding using an iterative selection method, IEEE Trans. Systems, Man, and Cybernetics, SMC-8 (1978) 630-632.
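The isodata (iterative selection) threshold from [1] can be sketched in a few lines: start from the mean gray level, split the pixels into two groups, and move the threshold to the midpoint of the two group means until it stops changing. The function name, the tolerance and the pixel values are assumptions for illustration; the dilation step is not shown.

```python
def isodata_threshold(pixels, tol=0.5):
    """Iterative selection threshold on a flat list of gray levels."""
    t = sum(pixels) / len(pixels)  # initial guess: global mean
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:  # degenerate case, e.g. a uniform image
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t


# Hypothetical pixels: dark background plus a few bright nuclei.
pixels = [10, 12, 11, 13, 200, 210, 205, 190]
threshold = isodata_threshold(pixels)
nuclei_mask = [p > threshold for p in pixels]
print(threshold, nuclei_mask)
```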

Learning Set: 20x20 px images originating from detected nuclei. For each sub-image 11 Texture Features were calculated (1st Order Statistics)
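First-order statistics describe the gray-level histogram of a patch and ignore the spatial arrangement of pixels. The slides do not list the 11 features that were used, so the sketch below computes a handful of common ones as an assumption; the function name and the tiny patch are hypothetical.

```python
import math
from collections import Counter


def first_order_features(patch):
    """Histogram-based texture features of a flat list of gray levels."""
    n = len(patch)
    mean = sum(patch) / n
    variance = sum((p - mean) ** 2 for p in patch) / n
    std = math.sqrt(variance)
    # Standardized third and fourth moments; defined as 0 for a flat patch.
    skewness = sum((p - mean) ** 3 for p in patch) / n / std ** 3 if std else 0.0
    kurtosis = sum((p - mean) ** 4 for p in patch) / n / std ** 4 if std else 0.0
    probs = [count / n for count in Counter(patch).values()]
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": -sum(p * math.log2(p) for p in probs),
        "energy": sum(p * p for p in probs),
    }


# Hypothetical 2x2 patch flattened to a list; a real patch would be 20x20.
features = first_order_features([0, 0, 255, 255])
print(features)
```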

Page 13: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Neural Networks and Markov Chains

Artificial Neural Network

•  11 input neurons.
•  17 hidden layers (rule of thumb).
•  3 output neurons representing apoptotic cells, mitotic cells and background.
•  Back-Propagation based learning.
•  Stop Condition: Learning Error < 1e-7.

Fig: Artificial Neural Network with Back-Propagation Learning scheme. Image Source: Theodor Tanner Jr.

Post-processing

Motivated by Markov-Chain Transition Probability Estimation. The transition probabilities a_ij are estimated from the training set.
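A common way to estimate transition probabilities a_ij is to count how often state i is followed by state j in labeled sequences and to normalize each row. The sketch below assumes that the training set provides per-cell label sequences over time, which the slides do not spell out; the function name and the tracks are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequences):
    """Maximum-likelihood estimate of a_ij from observed state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


# Hypothetical label sequences of two tracked cells.
tracks = [
    ["normal", "normal", "mitotic", "normal"],
    ["normal", "apoptotic", "apoptotic"],
]
a = estimate_transitions(tracks)
print(a["mitotic"])  # {'normal': 1.0}
```

Each row of the result sums to 1, so it can be read directly as one row of a transition matrix.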

Page 14: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Example Markov Chain

Fig. Example Transition Matrix.

Page 15: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Summary

Program Features:
•  Off-line teaching module.
•  Very fast classification.
•  XML-based statistics generation.
•  Automated plot generation.
•  Run/Pause Button.

Performance: Sensitivity: 82%, Specificity: 94%.

Segmentation: Sensitivity: 85%.

Criticism:
•  Mitotic arrest is estimated. Detection of ALL cells must be done to provide better estimation.
•  Time-consuming teaching.

Acknowledgments: Karol Radziszewski, Krzysztof Sikora, Marek Skowroński, Krzysztof Stępień, Wroclaw University of Technology 2011, Poland.
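Sensitivity and specificity follow from the four cells of a confusion matrix. The counts in the sketch below are hypothetical and are not the data behind the figures quoted on this slide.

```python
def sensitivity(tp, fn):
    """True-positive rate: detected positives / all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True-negative rate: rejected negatives / all actual negatives."""
    return tn / (tn + fp)


# Hypothetical counts for a mitotic-cell detector.
tp, fn, tn, fp = 41, 9, 188, 12
print(sensitivity(tp, fn))  # 0.82
print(specificity(tn, fp))  # 0.94
```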

Click: http://goo.gl/mZpRU

Page 16: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Cervical Cancer Diagnosis

•  Worldwide, cervical cancer is the second most common and the fifth deadliest cancer in women.

•  HPV vaccines are still being investigated. The Pap test is a long examination (2-3 weeks).

•  Phase contrast allows for immediate examination.

Objective: automated segmentation of epithelial cells and detection of atypical cell nuclei.

Fig: Typical image with epithelial cells. Image Source: Dr Grzegorz Glab, Opole Hospital of Gynaecology, Poland.

Page 17: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Algorithm

1.  80 Texture Features were computed for each image subregion.

2.  Selection of the most relevant features.

3.  Post-processing.

4.  Active Contour in cell membrane detection.

Fig: Typical image with epithelial cells. Image Source: Dr Grzegorz Glab, Opole Hospital of Gynaecology, Poland.

Page 18: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

How to Limit Number of Features?

Fig. Mean Classification Rate for different numbers of features. The Sequential Forward Floating Selection scheme and 10-fold cross-validation were used.

Metric: B. distance | FLD Classifier | Scatter Matrices

Classification Error for 20 features: 15.6% | 16.9% | 15.2%

Page 19: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Classification

Mean Classification Error (MCE) was estimated with cross-validation. For k=20, MCE=15.2%, for h=5.5 MCE=12.9%.

1.  k-Nearest Neighbor (kNN) classifier

2.  Kernel Fisher Discriminant

3.  Linear Fisher Discriminant - a linear combination of features that best separates two or more classes

Problem: How to estimate parameters?

Objective: assign each subimage to one of the classes: background, cell membrane, epithelial cell.
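Of the three classifiers, k-nearest neighbors is the simplest to sketch: a sub-image gets the majority label among the k training samples closest to it in feature space. The two-dimensional feature vectors below are made up for illustration (the talk selects about 20 of 80 texture features), and the function name is hypothetical.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority vote among the k training samples nearest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (feature vector, class label).
train = [
    ((0.10, 0.20), "background"),
    ((0.00, 0.10), "background"),
    ((0.50, 0.50), "cell membrane"),
    ((0.55, 0.45), "cell membrane"),
    ((0.90, 0.80), "epithelial cell"),
    ((1.00, 0.90), "epithelial cell"),
]
print(knn_classify(train, (0.95, 0.85)))  # epithelial cell
```

The value of k is the parameter that the "Problem: How to estimate parameters?" line refers to for this classifier; cross-validation over candidate values is one way to choose it.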

Page 20: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Final Classification

Fig: Decision Matrix for kNN, FLD and KFD (best 86.8% classification rate).

Cell Membrane Validation

1.  Ask biologist!

2.  Active Contour Initialization.

3.  Calculate Gradient Flow and iteratively adapt the contour to the membrane.

Page 21: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Final Segmentation

Fig: Some examples of detection of epithelial cells.

Page 22: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Nuclei Detection

Fig. Four-fold cross-validation on a training set with 57 pathological nuclei and 2379 other oval objects. Ten classifiers have been tested. Specificity: 95%, sensitivity: 96%. (Marcin Smereka)

[6] Schilling T.*, Miroslaw L.*, Glab G., Smereka M., Towards rapid cervical cancer diagnosis: automated detection of cells in phase contrast images with texture features and active contours, Int J Gynecol Cancer 2007, 17(1):118-26.

* The first and second authors contributed equally to the work.

Page 23: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Problems with HCS

Typical image-based assays generate hundreds of thousands of images. Image analysis is often unique and composed of different algorithms. They form sequential or parallel workflows, or a combination of both. Algorithms have many parameters, and estimating them is a big challenge. Control and management of these workflows is a highly complex problem.

Common approach: in-house scripts that call image processing modules. This leads to problems:
•  Portability: cannot run on a different cluster without rewriting all the scripts.
•  Code reuse: scripts are often tightly tied to a certain purpose, so they are difficult to reuse.
•  Heavy maintenance: the better a script does its job, the more you'll find yourself adding "generic" features and handling requests from other users.

Page 24: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

GC3Pie for HCS by Grid Computing Competence Center

GC3Pie is a suite of Python classes (and command-line tools built upon them) to aid in submitting and controlling batch jobs on clusters and grid resources seamlessly. It provides building blocks by which a dynamic workflow can be quickly developed.

http://gc3pie.googlecode.com

GC3Libs functionality: submit/monitor/kill a job, retrieve output, etc. Core operations: submit, update state, retrieve (a snapshot of) output, cancel job.

Additional features:
•  Get access to the Grid (e.g., authentication step).
•  Prepare files for submission.
•  Re-submit failed jobs.
•  Monitor job status (loop).
•  Retrieve results.
•  Postprocess and display.
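The submit, update-state and re-submit cycle listed above can be pictured as a small monitoring loop. The sketch below is a toy model of that cycle and deliberately does not use the GC3Pie API: `State`, `Job` and `run_all` are hypothetical names, and the scripted outcomes stand in for a real cluster backend.

```python
import enum


class State(enum.Enum):
    NEW = "new"
    SUBMITTED = "submitted"
    DONE = "done"
    FAILED = "failed"


class Job:
    """Toy job whose results are scripted instead of coming from a cluster."""

    def __init__(self, name, outcomes):
        self.name = name
        self.state = State.NEW
        self.attempts = 0
        self._outcomes = list(outcomes)

    def submit(self):
        self.attempts += 1
        self.state = State.SUBMITTED

    def update(self):
        # A real backend would be polled here; the toy pops a scripted result.
        if self.state is State.SUBMITTED:
            self.state = self._outcomes.pop(0)


def run_all(jobs, max_attempts=3):
    """Submit every job, poll until all finish, and re-submit failed ones."""
    for job in jobs:
        job.submit()
    pending = list(jobs)
    while pending:
        for job in list(pending):
            job.update()
            if job.state is State.FAILED and job.attempts < max_attempts:
                job.submit()
            elif job.state in (State.DONE, State.FAILED):
                pending.remove(job)
    return {job.name: job.state for job in jobs}


jobs = [Job("plate-1", [State.DONE]), Job("plate-2", [State.FAILED, State.DONE])]
print(run_all(jobs))
```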

Page 25: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Conclusions

•  Image Analysis can be complex, e.g. the search space has too many parameters.
•  Artificial Intelligence may be helpful.
•  Some experience is needed to adapt AI to a given problem.
•  A few applications of AI were presented: Classifiers and statistical learning methods (non-linear and linear classifiers), Search and optimization (Evolutionary Algorithm), Probabilistic methods for uncertain reasoning (Markov Chain), Neural networks (NN with Back-Propagation Learning).
•  Management and control of Image Analysis in High Content Screening can be simpler (-> GC3Pie).

http://gc3pie.googlecode.com

Page 26: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Hard vs. Soft Selection

Hard selection: the best individuals always win.
Pros: local minima are located easily.
Cons: crossing saddles is almost impossible.

Soft selection: the probability of selection depends on the fitness.
Pros: better saddle crossing.
Cons: parameter-dependent method.
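The two schemes differ only in how the survivors are drawn. The sketch below uses truncation selection as the hard variant and fitness-proportional (roulette-wheel) sampling as the soft variant; both choices, the function names and the toy population are assumptions, and the soft variant requires non-negative fitness values.

```python
import random


def hard_selection(population, fitness, n):
    """Truncation: keep the n fittest individuals, deterministically."""
    return sorted(population, key=fitness, reverse=True)[:n]


def soft_selection(population, fitness, n, rng=random):
    """Roulette wheel: sample n individuals with probability ~ fitness."""
    weights = [fitness(ind) for ind in population]
    return rng.choices(population, weights=weights, k=n)


population = [1, 5, 3, 9, 7]
print(hard_selection(population, fitness=lambda x: x, n=2))  # [9, 7]
print(soft_selection(population, fitness=lambda x: x, n=2, rng=random.Random(0)))
```

With soft selection a weak individual can still survive, which is what lets the population cross a saddle instead of collapsing onto the current best.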

Page 27: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Appendix

Additional Material

Page 28: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Evolutionary Algorithm

Baldrige Group, group meeting

Individuals are the legal solutions to our problem. They form a population that 'evolves' in time and adapts to the environment. The fitness function is a measure of the adaptation. Diversity is crucial. Extrema and saddle points are found more frequently than with gradient searches. Operators that drive the evolution: Selection, Reproduction (Recombination), Mutation.

Page 29: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Cross-over

Recombination: Mating process: two parents create offspring. The offspring consists of the genetic material from both parents. Weaker offspring tend to die out in time. Goal: variation allows the offspring to search out different available niches and find better fitness values, ergo better solutions.
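One simple recombination operator is one-point crossover: both parents are cut at the same random position and their tails are swapped. The slides do not say which operator was used, so this is an illustrative choice with hypothetical names.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length parents at a random cut point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cut = rng.randrange(1, len(parent_a))  # never cut at the very ends
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


child_1, child_2 = one_point_crossover([0] * 6, [1] * 6, rng=random.Random(42))
print(child_1, child_2)
```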

Page 30: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Mutation

Mutation occurs in nature. Although it occurs very infrequently, many believe it is a main driving force of evolution. Mutation often results in a weaker individual; occasionally it produces a stronger one.
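For real-valued individuals, mutation is often modeled as small Gaussian noise added to a few genes. The sketch below assumes that representation; the mutation rate, the noise width `sigma` and the function name are illustrative and not taken from the talk.

```python
import random


def mutate(individual, rate=0.1, sigma=1.0, rng=random):
    """Add Gaussian noise to each gene independently with probability `rate`."""
    return [
        gene + rng.gauss(0.0, sigma) if rng.random() < rate else gene
        for gene in individual
    ]


parent = [0.5, 1.5, 2.5, 3.5]
print(mutate(parent, rate=0.25, sigma=0.1, rng=random.Random(7)))
```

A low rate keeps most offspring close to their parents, matching the observation that mutation is infrequent and usually harmful but occasionally produces a fitter individual.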

Page 31: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Classification Scheme

Page 32: Artificial Intelligence in High Content Screening and Cervical Cancer Diagnosis

Rolling-Ball Algorithm

[2] Stanley Sternberg, “Biomedical Image Processing”, IEEE Computer, January 1983.

The Rolling Ball Radius is the radius of curvature of the paraboloid. As a rule of thumb, for 8-bit or RGB images it should be at least as large as the radius of the largest object in the image that is not part of the background [2].
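Rolling-ball background subtraction can be viewed as a grayscale morphological opening followed by a subtraction. The sketch below is a simplified one-dimensional stand-in that swaps the ball for a flat window (erosion, then dilation), so it illustrates the idea and is not the algorithm from [2]; the function names and the intensity profile are hypothetical. As in the rule of thumb above, the window must be wider than the objects to be kept.

```python
def opening_1d(signal, radius):
    """Flat-window grayscale opening: local minimum, then local maximum."""
    n = len(signal)

    def window(i):
        return range(max(0, i - radius), min(n, i + radius + 1))

    eroded = [min(signal[j] for j in window(i)) for i in range(n)]
    return [max(eroded[j] for j in window(i)) for i in range(n)]


def subtract_background(signal, radius):
    """Estimate the slowly varying background and remove it."""
    background = opening_1d(signal, radius)
    return [s - b for s, b in zip(signal, background)]


# Hypothetical intensity profile: a bright object on a slowly rising background.
profile = [10, 10, 11, 50, 52, 11, 12, 12, 13, 13]
print(subtract_background(profile, radius=2))
```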