Page 1: Image Processing for cDNA Microarray Data

Department of Statistics, University of California, Berkeley, and Division of Genetics and Bioinformatics, Walter and Eliza Hall Institute of Medical Research

Image Processing for cDNA Microarray Data

Prepared with massive assistance from Yee Hwa Yang (Berkeley, WEHI), and reporting on work done jointly with her, Sandrine Dudoit (Stanford) and Mike Buckley (CSIRO, Sydney).

References : M Eisen and P Brown, Methods in Enzymology vol 303, 1999; Chapter 2, DNA Microarrays (ed M Schena, OUP 1999) by Mack J Schermer; Chapter 13, Microarray Biochip Technology (ed M Schena, Eaton 2000) by Basarsky et al.

Page 2: Image Processing for cDNA Microarray Data

Scanner Process

Dye -> Photons -> Electrons -> Signal

Laser: excitation of the dye
PMT: amplification (photons to electrons)
A/D converter: filtering and time-space averaging (electrons to signal)

Page 3: Image Processing for cDNA Microarray Data

GenePix 4000a Microarray Scanner Protocol

1. Turn on scanner.

2. Slide scanner door open. Insert chip hyb side down and clip chip holder gently around the slide.

3. Set PMTs to 600 in both the 635 nm (Cy5) and 532 nm (Cy3) channels.

4. Perform low resolution “PREVIEW SCAN” to determine location of spots and initial hyb intensities

5. Once the scan location is determined, draw a “SCAN AREA” marquee around the array.

6. Perform quick visual inspection of hyb and make initial adjustments to PMTs

7. For gene expression hybs, raise or lower the red and green PMTs to achieve color balance

8. Before you perform your data scan, change “LINES TO AVERAGE” to 2.

9. Perform a high-resolution “DATA SCAN” (continued on next slide).

Page 4: Image Processing for cDNA Microarray Data

GenePix 4000a Microarray Scanner Protocol, ctd

10. Observe the histograms and make adjustments to PMTs.

11. Once the PMT level has been set so that the Intensity Ratio is near 1.00, perform a “DATA SCAN” over “SCAN AREA” and save the results.

12. To save your image, select “SAVE IMAGES”.

13. Save as type=Multi-image TIFF files.

14. Once scanned and saved, you are ready to assign spot identities and calculate results.

Note: For us, normalization is performed later during data analysis; see next lecture.
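
The protocol balances the two PMTs until the Intensity Ratio is near 1.00. As a rough illustration only, the sketch below computes a red/green ratio from two channel images while ignoring saturated pixels. The slides do not say how GenePix defines its Intensity Ratio, so the function name `intensity_ratio`, the use of summed intensities and the 16-bit saturation value of 65535 are all assumptions.

```python
import numpy as np

def intensity_ratio(red, green, saturation=65535):
    """Ratio of summed red to summed green intensity over unsaturated pixels.

    Hypothetical helper: the scanner software may define its ratio differently.
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    # Keep only pixels that are below saturation in both channels.
    usable = (red < saturation) & (green < saturation)
    return red[usable].sum() / green[usable].sum()

# Toy 2 x 2 images; the saturated red pixel is excluded from the ratio.
red = np.array([[100, 200], [300, 65535]])
green = np.array([[100, 200], [300, 10]])
ratio = intensity_ratio(red, green)
```

A ratio well above 1 would suggest lowering the red PMT (or raising the green one), and vice versa.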

Page 5: Image Processing for cDNA Microarray Data

Scanner

Figure labels: Laser, PMT, Dye, Glass Slide, Objective Lens, Detector Lens, Pinhole, Beam-splitter

Page 6: Image Processing for cDNA Microarray Data

How to adjust for PMT?

PMT settings used for each scan:

Scan  Cy3  Cy5
1     600  600
2     650  600
3     650  650
4     700  650
5     650  700
6     700  700
7     750  750

Figure annotations: saturated; very weak

Page 7: Image Processing for cDNA Microarray Data

After normalisation

In addition, the ranking of the genes stays pretty much the same.
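
One way to see why the ranking can survive a change of PMT settings: if a scan shifts all log ratios by roughly the same amount, centring removes the shift and leaves the order untouched. The sketch below assumes a simple global median centring of log2(R/G), which is not necessarily the normalisation used for these slides; the names `median_normalise` and `ranks` are hypothetical.

```python
import numpy as np

def median_normalise(log_ratios):
    """Centre log ratios so that their median is zero (assumed normalisation)."""
    m = np.asarray(log_ratios, dtype=float)
    return m - np.median(m)

def ranks(values):
    """Rank of each value, 0 for the smallest (ties are not handled)."""
    return np.argsort(np.argsort(values))

# Log ratios of the same genes from two scans that differ by a constant shift.
scan_a = np.array([0.5, 1.5, -0.2, 2.0])
scan_b = scan_a + 0.8
norm_a = median_normalise(scan_a)
norm_b = median_normalise(scan_b)
same_ranking = np.array_equal(ranks(norm_a), ranks(norm_b))
```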

Page 8: Image Processing for cDNA Microarray Data

Practical Problems 1

• Comet tails
• Likely caused by insufficiently rapid immersion of the slides in the succinic anhydride blocking solution.

Page 9: Image Processing for cDNA Microarray Data

Practical Problems 2

Page 10: Image Processing for cDNA Microarray Data

Practical Problems 3

High Background
• 2 likely causes:
  – Insufficient blocking.
  – Precipitation of the labeled probe.

Weak Signals

Page 11: Image Processing for cDNA Microarray Data

Practical Problems 4

Spot overlap. Likely cause: too much rehydration during post-processing.

Page 12: Image Processing for cDNA Microarray Data

Practical Problems 5

Dust

Page 13: Image Processing for cDNA Microarray Data

Steps in Image Processing

1. Addressing: locate spot centers.

2. Segmentation: classify pixels as either signal or background (e.g. using seeded region growing).

3. Information extraction: for each spot of the array, calculate signal intensity pairs, background and quality measures.

Page 14: Image Processing for cDNA Microarray Data

Steps in Image Processing

3. Information Extraction

• Spot Intensities
  – mean (pixel intensities).
  – median (pixel intensities).
  – Pixel variation (IQR of log (pixel intensities)).
• Background values
  – Local
  – Morphological opening
  – Constant (global)
  – None
• Quality Information

Figure labels: Signal; Background
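
The spot intensity summaries listed on this slide can be sketched as follows. This is an illustration under assumptions: the slide only says "log", so base 2 is assumed, and the name `spot_summary` is hypothetical. Pixel values must be positive for the log to be defined.

```python
import numpy as np

def spot_summary(pixels):
    """Mean, median and IQR of log2 intensities for one spot's signal pixels."""
    p = np.asarray(pixels, dtype=float)
    # Interquartile range on the log scale (base 2 is an assumption).
    q75, q25 = np.percentile(np.log2(p), [75, 25])
    return {"mean": p.mean(), "median": np.median(p), "log_iqr": q75 - q25}

# Toy signal pixels for a single spot in one channel.
summary = spot_summary([1, 2, 4, 8, 16])
```

The median is less sensitive than the mean to a few very bright pixels, for example from dust inside the spot mask.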

Page 15: Image Processing for cDNA Microarray Data

Addressing

This is the process of assigning coordinates to each of the spots.

Automating this part of the procedure permits high throughput analysis.

4 by 4 grids
19 by 21 spots per grid
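
As a starting point for addressing, the nominal spot centres for this layout can be generated from the print geometry and then refined against the image. The sketch below is only an illustration: it assumes 19 rows by 21 columns per grid (the slide does not say which is which), and the spot pitch, grid gap and origin are made-up values in pixels.

```python
import numpy as np

def nominal_centres(grid_rows=4, grid_cols=4, spot_rows=19, spot_cols=21,
                    spot_pitch=10.0, grid_gap=15.0, origin=(0.0, 0.0)):
    """Return an array of shape (n_spots, 2) holding (y, x) nominal centres."""
    # Distance between the first spots of two neighbouring grids.
    grid_height = (spot_rows - 1) * spot_pitch + grid_gap
    grid_width = (spot_cols - 1) * spot_pitch + grid_gap
    centres = []
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            for sr in range(spot_rows):
                for sc in range(spot_cols):
                    y = origin[0] + gr * grid_height + sr * spot_pitch
                    x = origin[1] + gc * grid_width + sc * spot_pitch
                    centres.append((y, x))
    return np.array(centres)

centres = nominal_centres()
n_spots = len(centres)  # 4 * 4 * 19 * 21 spots
```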

Page 16: Image Processing for cDNA Microarray Data

Addressing

Registration

Page 17: Image Processing for cDNA Microarray Data

Problems in automatic addressing

Misregistration of the red and green channels

Rotation of the array in the image

Skew in the array

Figure label: Rotation
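
For the first of these problems, a simple check is to estimate the offset between the red and green images by cross-correlation. The sketch below is a minimal illustration, not the method used in any of the packages discussed: it handles only whole-pixel translation (not rotation or skew), treats the images as periodic, and the name `estimate_shift` is hypothetical.

```python
import numpy as np

def estimate_shift(red, green):
    """Integer (dy, dx) such that np.roll(green, (dy, dx), axis=(0, 1))
    best matches red, found from the peak of the circular cross-correlation."""
    r = np.asarray(red, dtype=float)
    g = np.asarray(green, dtype=float)
    r = r - r.mean()
    g = g - g.mean()
    # Circular cross-correlation via the FFT.
    xcorr = np.fft.ifft2(np.fft.fft2(r) * np.conj(np.fft.fft2(g))).real
    dy, dx = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Indices past the midpoint correspond to negative shifts.
    if dy > r.shape[0] // 2:
        dy -= r.shape[0]
    if dx > r.shape[1] // 2:
        dx -= r.shape[1]
    return int(dy), int(dx)

# Synthetic example: the red image is the green image shifted by (3, -2).
rng = np.random.default_rng(0)
green = rng.random((32, 32))
red = np.roll(green, (3, -2), axis=(0, 1))
shift = estimate_shift(red, green)
```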

Page 18: Image Processing for cDNA Microarray Data

Segmentation methods

• Fixed circles
• Adaptive Circle
• Adaptive Shape
  – Edge detection.
  – Seeded Region Growing (R. Adams and L. Bischof, 1994): regions grow outwards from the seed points preferentially according to the difference between a pixel’s value and the running mean of values in an adjoining region.
• Histogram Methods
  – Adaptive threshold.
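
The seeded region growing idea described above can be sketched in a few lines. This is a simplified illustration rather than the published algorithm or the implementation behind these slides: it uses 4-connected neighbours, computes each candidate pixel's distance to the region mean once (when it is queued), and does not treat ties between regions specially. The name `seeded_region_growing` and the seed format are assumptions.

```python
import heapq
import numpy as np

def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds.

    image: 2-D array. seeds: dict mapping a positive integer label to a list
    of (row, col) seed pixels. Returns an integer label image.
    """
    img = np.asarray(image, dtype=float)
    labels = np.zeros(img.shape, dtype=int)  # 0 means "not yet assigned"
    sums, counts = {}, {}
    heap = []
    counter = 0  # unique tie-breaker for heap entries

    def push_neighbours(r, c, lab):
        nonlocal counter
        mean = sums[lab] / counts[lab]  # running mean of the region
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < img.shape[0] and 0 <= cc < img.shape[1]
            if inside and labels[rr, cc] == 0:
                delta = abs(img[rr, cc] - mean)
                heapq.heappush(heap, (delta, counter, rr, cc, lab))
                counter += 1

    # Initialise every region from its seed pixels.
    for lab, points in seeds.items():
        for r, c in points:
            labels[r, c] = lab
        sums[lab] = sum(img[r, c] for r, c in points)
        counts[lab] = len(points)
    for lab, points in seeds.items():
        for r, c in points:
            push_neighbours(r, c, lab)

    # Always assign the queued pixel closest in value to its adjoining region.
    while heap:
        _, _, r, c, lab = heapq.heappop(heap)
        if labels[r, c] != 0:
            continue  # already claimed by some region
        labels[r, c] = lab
        sums[lab] += img[r, c]
        counts[lab] += 1
        push_neighbours(r, c, lab)
    return labels

# Toy image: a bright 3 x 3 spot on a flat background.
image = np.full((7, 7), 10.0)
image[2:5, 2:5] = 100.0
labels = seeded_region_growing(image, {1: [(3, 3)], 2: [(0, 0)]})
```

Here label 1 is seeded inside the spot and label 2 in the background, so the resulting mask follows the spot's actual shape instead of an imposed circle.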

Page 19: Image Processing for cDNA Microarray Data

Examples of algorithms and software implementation

Methods            Software / algorithms
Fixed Circle       ScanAlyze, GenePix, QuantArray
Adaptive Circle    GenePix
Adaptive Shape     Edging and region growing.
Histogram Method   QuantArray and adaptive thresholding.
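
For comparison with the adaptive methods, fixed-circle segmentation amounts to a circular mask of constant radius at each spot centre. The sketch below is a generic illustration, not code from ScanAlyze, GenePix or QuantArray; `fixed_circle_mask` is a hypothetical name.

```python
import numpy as np

def fixed_circle_mask(shape, centre, radius):
    """Boolean mask of pixels within `radius` of `centre` (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - centre[0]) ** 2 + (cols - centre[1]) ** 2 <= radius ** 2

# A spot smaller than the circle: background pixels dilute the signal mean.
image = np.full((7, 7), 10.0)
image[3, 3] = 100.0
mask = fixed_circle_mask(image.shape, (3, 3), 1)
foreground_mean = image[mask].mean()  # (100 + 4 * 10) / 5
```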

Page 20: Image Processing for cDNA Microarray Data

Limitation of fixed circle method

Figure panels: SRG; Fixed Circle

Page 21: Image Processing for cDNA Microarray Data

Limitation of circular segmentation

• Small spot
• Not circular

Results from SRG

Page 22: Image Processing for cDNA Microarray Data

Information Extraction

• Spot Intensities
  – mean (pixel intensities).
  – median (pixel intensities).
• Background values
  – Local
  – Morphological opening
  – Constant (global)
  – None
• Quality Information

Take the average
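
Of the background options listed, morphological opening is the least self-explanatory: an erosion (local minimum) followed by a dilation (local maximum), using a window larger than any spot, removes the spots and leaves a smooth background image. The sketch below assumes a square window and a hypothetical function name `opening_background`; the slides do not specify the structuring element. It needs NumPy 1.20 or later for `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _window_filter(img, size, reduce, pad_value):
    """Apply `reduce` (np.min or np.max) over size x size windows."""
    pad = size // 2
    padded = np.pad(img, pad, mode="constant", constant_values=pad_value)
    windows = sliding_window_view(padded, (size, size))
    return reduce(windows, axis=(-2, -1))

def opening_background(image, size):
    """Grey-scale opening with an odd square window of side `size`."""
    img = np.asarray(image, dtype=float)
    # Erosion: padding with +inf keeps the border from lowering the minimum.
    eroded = _window_filter(img, size, np.min, np.inf)
    # Dilation: padding with -inf keeps the border from raising the maximum.
    return _window_filter(eroded, size, np.max, -np.inf)

# Toy image: a 3 x 3 spot on a flat background, opened with a 5 x 5 window.
image = np.full((9, 9), 100.0)
image[3:6, 3:6] = 1000.0
background = opening_background(image, 5)
```

The background estimate for a spot can then be read off the opened image at the spot's location.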

Page 23: Image Processing for cDNA Microarray Data

Local Backgrounds
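
Local background methods estimate the background from pixels near each spot but outside it. The packages differ in exactly which pixels they use; the sketch below assumes one simple variant, the median over a ring around the spot centre, with hypothetical names and radii.

```python
import numpy as np

def local_background(image, centre, inner_radius, outer_radius):
    """Median intensity in the ring inner_radius < distance <= outer_radius."""
    img = np.asarray(image, dtype=float)
    rows, cols = np.ogrid[:img.shape[0], :img.shape[1]]
    dist2 = (rows - centre[0]) ** 2 + (cols - centre[1]) ** 2
    ring = (dist2 > inner_radius ** 2) & (dist2 <= outer_radius ** 2)
    return float(np.median(img[ring]))

# Toy image: a 3 x 3 spot centred at (5, 5) on a flat background of 50.
image = np.full((11, 11), 50.0)
image[4:7, 4:7] = 500.0
bg = local_background(image, (5, 5), inner_radius=2, outer_radius=4)
```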

Page 24: Image Processing for cDNA Microarray Data

Information

• Quality
  – Area
  – Circularity
  – Signal to Noise ratio
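
These three quality measures can be computed from a spot mask in several ways; the slides do not give formulas. The sketch below assumes common definitions: area as the pixel count, circularity as 4π·area / perimeter² with the perimeter counted as exposed pixel edges, and signal to noise as (mean signal minus mean background) divided by the background SD. The name `spot_quality` is hypothetical.

```python
import numpy as np

def spot_quality(image, mask, background_mask):
    """Area, circularity and signal-to-noise ratio for one spot."""
    img = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    # Perimeter: number of pixel edges between spot and non-spot pixels.
    padded = np.pad(mask, 1)  # adds a border of False
    perimeter = 0
    for axis in (0, 1):
        for shift in (1, -1):
            neighbour = np.roll(padded, shift, axis=axis)
            perimeter += int(np.count_nonzero(padded & ~neighbour))
    circularity = 4 * np.pi * area / perimeter ** 2
    bg = img[np.asarray(background_mask, dtype=bool)]
    snr = (img[mask].mean() - bg.mean()) / bg.std(ddof=1)
    return {"area": area, "circularity": circularity, "snr": snr}
```

With this edge-count perimeter, a pixelated square scores π/4 and even a well-formed round spot scores below 1, so the value is best used to compare spots with each other rather than against an absolute threshold.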

Page 25: Image Processing for cDNA Microarray Data

Quality Measurements

• Array
  – Correlation between spot intensities.
  – Percentage of spots with no signals.
  – Distribution of spot signal area.
• Spot
  – Signal / Noise ratio.
  – Variation in pixel intensities.
  – Identification of “bad spot” (spots with no signal).
• Ratio (2 spots combined)
  – Circularity

Page 26: Image Processing for cDNA Microarray Data

Quality of Array

Distribution of areas.
- Judge by eye
- Look at variation (e.g., SD)

Cy3 area
• mean 57
• median 56
• SD 20.67

Cy5 area
• mean 59
• median 57
• SD 24.34

Page 27: Image Processing for cDNA Microarray Data

Does the image analysis matter?

Figure panels: Spot.nbg, Spot.morph, Spot.valley, ScanAlyze

Page 28: Image Processing for cDNA Microarray Data

Background makes a difference

Background method   Segmentation method   Exp1   Exp2
No background       S.nbg                 6      6
                    Gp.nbg                7      6
                    SA.nbg                6      6
                    QA.fix.nbg            7      6
                    QA.hist.nbg           7      6
                    QA.adp.nbg            14     14
Local surrounding   S.valley              17     21
                    GP                    11     11
                    SA                    12     14
                    QA.fix                18     23
                    QA.hist               9      8
                    QA.adp                27     26
Others              S.morph               9      9
                    S.const               14     14

Medians of the SD of log2(R/G) for 8 replicated spots, multiplied by 100 and rounded to the nearest integer.
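
The summary statistic in this table can be sketched as below. This reflects one reading of the caption, namely that an SD is computed across the replicate spots of each gene and the median is then taken over genes; the array layout (one row per gene, one column per replicate), the use of the sample SD and the name `replicate_sd_summary` are assumptions.

```python
import numpy as np

def replicate_sd_summary(red, green):
    """100 x median over genes of the SD of log2(R/G) across replicates.

    red, green: arrays of shape (n_genes, n_replicates) of background-adjusted
    intensities, all positive.
    """
    log_ratio = np.log2(np.asarray(red, dtype=float) /
                        np.asarray(green, dtype=float))
    sd_per_gene = log_ratio.std(axis=1, ddof=1)  # sample SD per gene
    return int(round(100 * float(np.median(sd_per_gene))))

# Toy data: three genes with two replicates each (the slides used eight).
log_ratios = np.array([[0.0, 2.0], [1.0, 1.0], [0.0, 4.0]])
red = 2.0 ** log_ratios
green = np.ones_like(red)
score = replicate_sd_summary(red, green)
```

Smaller values mean the replicate log ratios agree more closely, which is how the table compares segmentation and background methods.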