24
Computational Biology: A Measurement Perspective Alden Dima Information Technology Laboratory [email protected]

Computational Biology: A Measurement Perspective

  • Upload
    oliver

  • View
    34

  • Download
    0

Embed Size (px)

DESCRIPTION

Computational Biology: A Measurement Perspective. Alden Dima Information Technology Laboratory a [email protected]. Problem. High-throughput technologies are generating large amounts of complex data that is difficult to process and convert into knowledge - PowerPoint PPT Presentation

Citation preview

Page 1: Computational Biology: A Measurement Perspective

Computational Biology:A Measurement Perspective

Alden DimaInformation Technology Laboratory

[email protected]

Page 2: Computational Biology: A Measurement Perspective

Problem

• High-throughput technologies are generating large amounts of complex data that is difficult to process and convert into knowledge

• Issues exist throughout data lifecycle:– Acquisition– Analysis– Archiving– Interchange

Page 3: Computational Biology: A Measurement Perspective

Imaging Technologies

• Imaging technologies are increasingly being used both as diagnostic tools and as research tools in the biosciences– Novel methods are needed for automated analysis and

comparisons– Correlation among studies is difficult at best– Off-the-shelf methods are not well characterized and can

contribute significantly measurement uncertainty– Need to combine features from images with other

biological or medical sources

Page 4: Computational Biology: A Measurement Perspective

Example from Literature

Original Data –64%

PoorlySegmente

d

Only Well Segmente

d Cells

Source:Hill, LaPan, Li, Haney (Wyeth Research)Impact of image segmentation on high content screening data quality for SK-BR-3 cellsBioMed Central Bioinformatics 2007

• Relies heavily on cell imaging

• Gigabytes of images collected

• Algorithms treated as “black boxes”

• Many algorithms published but rarely compared

• Poor segmentation significantly effects conclusions

High Content Screening

Page 5: Computational Biology: A Measurement Perspective

Computational Biology: Single Cell AnalysesIntracellular molecular reactions and interactions that control the response

and fate of cells and organisms cannot be unambiguously compared and combined due to a lack of standards and validated protocols

NIST RoleProvide the measurement tools and standards that enable quantifiable and

reproducible measurements of cells and their interactions through:• Standard data/metadata format for image capture, storage, retrieval, analysis• Software to enable high throughput cell image analysis and interoperability• Standards and validation required to ensure reproducible image analysis• Multi-site experiments to test software and validation protocols

Technical Approach• Create and evaluate an integrated data collection, organization and analysis

infrastructure for cellular imaging • Experimentalists and computational scientists - focus on the physical standards and

protocols for data collection, image processing and analysis, storage of data and metadata, and evaluation of results

Page 6: Computational Biology: A Measurement Perspective

Segmentation Evaluation

• Determine which segmentation technique and associated parameters can be used to reliably determine the morphology of cells for the purposes of comparing cell lines as part of a new standard procedure under development

Page 7: Computational Biology: A Measurement Perspective

Variability Across MethodsDifferent segmentation techniques can change results

Page 8: Computational Biology: A Measurement Perspective

A10 Cell LineThree different segmentation techniques

Cell 1 was an edgecell in one case

In one casethis cell was segmentedas three particles

Red = k-means (k = 2)Blue = OtsuGreen = Maximum EntropyWhite = Ground TruthCell should

have been ignored

Preliminary evaluation shows that results can vary by more than ± 40%

Page 9: Computational Biology: A Measurement Perspective

Variability Across Implementations

NIST Reference

Implementation by

Javier Bernal

Implementation from ImageJ Community

Yellow = same

Some differences

Different implementations of the same technique can change results as well

“Software as measurement”

Comparing two implementations

across three images

Page 10: Computational Biology: A Measurement Perspective

Experimental Design

• Two cell lines – A10 & 3T3• Cell preparation follows ASTM document• Different image exposure and filter levels• No flat-field correction• Multiple sampling to capture noise• Ground truth: expert manual segmentation• Result: approximately 8000 images – 80 of

which are used for initial evaluation work

Page 11: Computational Biology: A Measurement Perspective

Red Channel Images - A10

Low Exposure, Optimal Filter Medium Exposure, Optimal Filter

High Exposure, Optimal Filter Medium Exposure, Non-optimal Filter

16-bit Gray-scale Images

Page 12: Computational Biology: A Measurement Perspective

Red Channel Images - 3T3

16-bit Gray-scale Images

Low Exposure, Optimal Filter Medium Exposure, Optimal Filter

High Exposure, Optimal Filter Medium Exposure, Non-optimal Filter

Page 13: Computational Biology: A Measurement Perspective

Pipeline

Images

ImageJ

Evaluation

Post-processing

Masks,Statistics

Data Set

ImageJScripts

CodeGenerator

ReferenceSeg.

Methods

Plug-inWrappers

TRUTH

ROINormalization

Page 14: Computational Biology: A Measurement Perspective

Planned Progression of EvaluationsBasic Algorithms

Edge-based Methods

Advanced Techniques

Page 15: Computational Biology: A Measurement Perspective

Algorithms Evaluated

Basic Methods

OtsuNIST ImageJ

Maximum Entropy

NIST ImageJ

K-Means Clustering

K=2NIST

ImageJ

Threshold Clusters (K=3,4,5)

NIST

Combined with Otsu (K=3,4,5)

ImageJ

Combined with Max. Entropy

(K=3,4,5)

ImageJ

Page 16: Computational Biology: A Measurement Perspective
Page 17: Computational Biology: A Measurement Perspective
Page 18: Computational Biology: A Measurement Perspective
Page 19: Computational Biology: A Measurement Perspective

Live Cell Tracking

• There are few automated image analysis options available to the cell biologists to quantify live cell image data

• Segment and track cells in an image sequence to quantify the total fluorescence intensity of individual cells over time

Page 20: Computational Biology: A Measurement Perspective

Manually Tracking NIH 3T3 Cells

Phase contrast on left, GFP on right

Page 21: Computational Biology: A Measurement Perspective

CompBio Cell Tracker

SegmentationSegmentation

Live Cell ImagesLive Cell ImagesLive Cell Images

Cell Tracking

Cell Lineage Intensity Plot Other Outputs: MigrationVelocity, etc

Live Cell Images

SegmentationSegmentation

Page 22: Computational Biology: A Measurement Perspective

Automatic Tracking of NIH 3T3 Cells

Page 23: Computational Biology: A Measurement Perspective

Automatic Tracking of NIH 3T3 Cells

Page 24: Computational Biology: A Measurement Perspective

Thank You!

[email protected]