Department of Statistics, University of California, Berkeley, and Division of Genetics and Bioinformatics, Walter and Eliza Hall Institute of Medical Research
Image Processing for cDNA Microarray Data
Prepared with massive assistance from Yee Hwa Yang (Berkeley, WEHI), and reporting on work done jointly with her, Sandrine Dudoit (Stanford) and Mike Buckley (CSIRO, Sydney).
References: M. Eisen and P. Brown, Methods in Enzymology, vol. 303, 1999; Chapter 2, DNA Microarrays (ed. M. Schena, OUP 1999) by Mack J. Schermer; Chapter 13, Microarray Biochip Technology (ed. M. Schena, Eaton 2000) by Basarsky et al.
Scanner Process
Dye → Photons → Electrons → Signal
Laser: excitation
PMT: amplification
A/D converter: filtering, time-space averaging
GenePix 4000a Microarray Scanner Protocol
1. Turn on scanner.
2. Slide the scanner door open. Insert the chip hyb side down and clip the chip holder easily around the slide.
3. Set PMTs to 600 in both the 635 nm (Cy5) and 532 nm (Cy3) channels.
4. Perform a low-resolution “PREVIEW SCAN” to determine the location of spots and initial hyb intensities.
5. Once the scan location is determined, draw a “SCAN AREA” marquee around the array.
6. Perform quick visual inspection of hyb and make initial adjustments to PMTs
7. For gene expression hybs, raise or lower the red and green PMTs to achieve color balance
8. Before you perform your data scan, change “LINES TO AVERAGE” to 2.
9. Perform a high-resolution “DATA SCAN” (ctd)
GenePix 4000a Microarray Scanner Protocol, ctd
10. Observe the histograms and make adjustments to PMTs.
11. Once the PMT level has been set so that the Intensity Ratio is near 1.00, perform a “DATA SCAN” over “SCAN AREA” and save the results.
12. To save your image, select “SAVE IMAGES”.
13. Save as type=Multi-image TIFF files.
14. Once scanned and saved, you are ready to assign spot identities and calculate results.
Note: For us, normalization is performed later during data analysis; see next lecture.
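The protocol asks for PMT settings that bring the Intensity Ratio near 1.00. The slides do not say how the scanner software computes that ratio, so the sketch below assumes a simple definition: total red intensity divided by total green intensity over the scanned pixels. The function name `intensity_ratio` and the pixel values are hypothetical.

```python
def intensity_ratio(red_pixels, green_pixels):
    """Ratio of total red to total green intensity.

    Assumed definition for illustration only; the scanner software
    may compute its Intensity Ratio differently.
    """
    total_green = sum(green_pixels)
    if total_green == 0:
        raise ValueError("green channel has no signal")
    return sum(red_pixels) / total_green


# Hypothetical pixel intensities from a preview scan.
red = [1200, 800, 400, 100]
green = [1000, 900, 500, 100]
print(round(intensity_ratio(red, green), 2))  # both channels sum to 2500
```

A ratio well above or below 1 would suggest raising or lowering one PMT before the data scan.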
Scanner
[Figure: scanner optics, with labels Laser, PMT, Dye, Glass Slide, Objective Lens, Detector Lens, Pinhole, Beam-splitter]
How to adjust for PMT?
Scan  Cy3  Cy5
1     600  600
2     650  600
3     650  650
4     700  650
5     650  700
6     700  700
7     750  750
[Figure labels: saturated; very weak]
After normalisation
In addition, the ranking of the genes stays pretty much the same.
Practical Problems 1
• Comet tails
• Likely caused by insufficiently rapid immersion of the slides in the succinic anhydride blocking solution.
Practical Problems 2
Practical Problems 3
High background
• 2 likely causes:
– Insufficient blocking.
– Precipitation of the labeled probe.
Weak signals
Practical Problems 4
Spot overlap. Likely cause: too much rehydration during post-processing.
Practical Problems 5
Dust
Steps in Image Processing
1. Addressing: locate spot centers.
2. Segmentation: classify pixels as either signal or background (e.g., using seeded region growing).
3. Information extraction: for each spot of the array, calculate signal intensity pairs, background and quality measures.
Steps in Image Processing
3. Information Extraction
• Spot intensities
– Mean (pixel intensities).
– Median (pixel intensities).
– Pixel variation (IQR of log (pixel intensities)).
• Background values
– Local
– Morphological opening
– Constant (global)
– None
• Quality information
[Figure labels: Signal; Background]
Addressing
This is the process of assigning coordinates to each of the spots.
Automating this part of the procedure permits high throughput analysis.
4 by 4 grids
19 by 21 spots per grid
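As a rough illustration of addressing, the sketch below generates nominal spot centers for a 4 by 4 layout of grids with 19 by 21 spots each. It assumes a perfectly regular, unrotated array; the origin, the pitches, and the reading of "19 by 21" as 19 rows by 21 columns are all assumptions, and `nominal_spot_centers` is a hypothetical helper, not part of any package named in these slides.

```python
def nominal_spot_centers(origin, grid_pitch, spot_pitch,
                         grid_rows=4, grid_cols=4,
                         spot_rows=19, spot_cols=21):
    """Map (grid_row, grid_col, spot_row, spot_col) to a nominal (x, y) center.

    origin     -- (x, y) of the first spot of the first grid
    grid_pitch -- (dx, dy) distance between corresponding spots of adjacent grids
    spot_pitch -- (dx, dy) distance between adjacent spots within a grid
    """
    x0, y0 = origin
    centers = {}
    for gr in range(grid_rows):
        for gc in range(grid_cols):
            for sr in range(spot_rows):
                for sc in range(spot_cols):
                    x = x0 + gc * grid_pitch[0] + sc * spot_pitch[0]
                    y = y0 + gr * grid_pitch[1] + sr * spot_pitch[1]
                    centers[(gr, gc, sr, sc)] = (x, y)
    return centers


# Hypothetical geometry, in pixels.
centers = nominal_spot_centers(origin=(10.0, 20.0),
                               grid_pitch=(500.0, 450.0),
                               spot_pitch=(20.0, 20.0))
print(len(centers))  # 4 * 4 * 19 * 21 spots
```

In practice these nominal positions are only a starting point; the fitted positions must absorb rotation, skew and channel misregistration.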
Addressing
Registration
Problems in automatic addressing
Misregistration of the red and green channels
Rotation of the array in the image
Skew in the array
Segmentation methods
• Fixed circles
• Adaptive circle
• Adaptive shape
– Edge detection.
– Seeded region growing (R. Adams and L. Bischof, 1994): regions grow outwards from the seed points preferentially according to the difference between a pixel’s value and the running mean of values in an adjoining region.
• Histogram methods
– Adaptive threshold.
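The region-growing idea described above can be sketched in a few lines. This is a simplified illustration, not the published algorithm in full and not the implementation used for the results in these slides: it uses 4-connected neighbours, computes each candidate's priority from the region mean at the time the candidate is queued, and does not mark boundary pixels. The image values and seed positions are made up.

```python
import heapq


def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds; returns a grid of labels.

    image -- list of rows of pixel intensities
    seeds -- dict mapping a label to a list of (row, col) seed pixels
    """
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    total, count = {}, {}
    heap, tie = [], 0  # tie keeps heap entries comparable

    def push_neighbours(r, c, label):
        nonlocal tie
        mean = total[label] / count[label]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] is None:
                # Priority: distance of the pixel value from the region mean.
                delta = abs(image[rr][cc] - mean)
                heapq.heappush(heap, (delta, tie, rr, cc, label))
                tie += 1

    for label, pixels in seeds.items():
        total[label], count[label] = 0.0, 0
        for r, c in pixels:
            labels[r][c] = label
            total[label] += image[r][c]
            count[label] += 1
    for label, pixels in seeds.items():
        for r, c in pixels:
            push_neighbours(r, c, label)

    while heap:
        _, _, r, c, label = heapq.heappop(heap)
        if labels[r][c] is not None:
            continue  # already claimed by some region
        labels[r][c] = label
        total[label] += image[r][c]
        count[label] += 1
        push_neighbours(r, c, label)
    return labels


# Made-up 5 x 5 image with an irregular bright spot.
image = [
    [10, 10, 10, 10, 10],
    [10, 90, 95, 10, 10],
    [10, 92, 100, 94, 10],
    [10, 10, 96, 10, 10],
    [10, 10, 10, 10, 10],
]
labels = seeded_region_growing(image, {"spot": [(2, 2)], "background": [(0, 0)]})
for row in labels:
    print(" ".join("S" if lab == "spot" else "." for lab in row))
```

Because the spot region follows the pixel values rather than a template, it recovers the non-circular shape that a fixed circle would miss.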
Examples of algorithms and software implementation
Methods            Software / algorithms
Fixed circle       ScanAlyze, GenePix, QuantArray
Adaptive circle    GenePix
Adaptive shape     Edging and region growing
Histogram method   QuantArray and adaptive thresholding
Limitation of fixed circle method
[Figure panels: SRG; Fixed Circle]
Limitation of circular segmentation
• Small spot
• Not circular
Results from SRG
Information Extraction
• Spot intensities
– Mean (pixel intensities).
– Median (pixel intensities).
• Background values
– Local
– Morphological opening
– Constant (global)
– None
• Quality information
Take the average
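A minimal sketch of the extraction step for one spot in one channel, assuming segmentation has already produced the foreground and background pixel lists. The use of log base 2, the median as the background summary, and the name `spot_summary` are assumptions made for illustration.

```python
import math
import statistics


def spot_summary(foreground, background):
    """Summarise one spot from its foreground and background pixel values."""
    # Pixel variation: IQR of log2 intensities (zero pixels are skipped).
    log_fg = [math.log2(v) for v in foreground if v > 0]
    q1, _, q3 = statistics.quantiles(log_fg, n=4)
    bg = statistics.median(background)  # assumed background summary
    mean_fg = statistics.mean(foreground)
    return {
        "mean": mean_fg,
        "median": statistics.median(foreground),
        "log_iqr": q3 - q1,
        "background": bg,
        "mean_minus_bg": mean_fg - bg,
    }


# Hypothetical pixel values.
summary = spot_summary(foreground=[100, 200, 400, 800], background=[50, 60, 70])
print(summary["mean"], summary["median"], summary["mean_minus_bg"])
```

Running this once per channel gives the signal intensity pair for the spot.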
Local Backgrounds
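One way to picture a local background estimate: take a window around the spot and summarise the pixels in it that were not classified as signal. The window shape, the use of the median, and the name `local_background` are assumptions; the packages compared in these slides each define their local region differently.

```python
import statistics


def local_background(image, signal_pixels, center, half_width):
    """Median of non-signal pixels in a square window around a spot.

    signal_pixels -- set of (row, col) pixels classified as signal
    center        -- (row, col) of the spot center
    half_width    -- window extends this many pixels in each direction
    """
    rows, cols = len(image), len(image[0])
    r0, c0 = center
    values = []
    for r in range(max(0, r0 - half_width), min(rows, r0 + half_width + 1)):
        for c in range(max(0, c0 - half_width), min(cols, c0 + half_width + 1)):
            if (r, c) not in signal_pixels:
                values.append(image[r][c])
    if not values:
        raise ValueError("no background pixels in the window")
    return statistics.median(values)


# Hypothetical 3 x 3 window with a single signal pixel in the middle.
image = [[12, 10, 11],
         [9, 100, 13],
         [10, 8, 14]]
print(local_background(image, {(1, 1)}, center=(1, 1), half_width=1))
```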
Information
• Quality
– Area
– Circularity
– Signal to noise ratio
Quality Measurements
• Array
– Correlation between spot intensities.
– Percentage of spots with no signals.
– Distribution of spot signal area.
• Spot
– Signal / Noise ratio.
– Variation in pixel intensities.
– Identification of “bad spot” (spots with no signal).
• Ratio (2 spots combined)
– Circularity
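The slides name area, circularity and signal-to-noise ratio without defining them. The sketch below uses common textbook choices, which are assumptions here: circularity as 4πA/P² with the perimeter counted as exposed pixel edges, and signal-to-noise as the difference of mean signal and mean background divided by the background standard deviation.

```python
import math
import statistics


def area_and_perimeter(mask_pixels):
    """Area in pixels and perimeter as the number of exposed pixel edges."""
    pixels = set(mask_pixels)
    perimeter = 0
    for r, c in pixels:
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            if (r + dr, c + dc) not in pixels:
                perimeter += 1
    return len(pixels), perimeter


def circularity(mask_pixels):
    """4*pi*A / P**2 (assumed definition); larger means more compact."""
    area, perimeter = area_and_perimeter(mask_pixels)
    if perimeter == 0:
        raise ValueError("empty spot mask")
    return 4 * math.pi * area / perimeter ** 2


def signal_to_noise(signal_values, background_values):
    """(mean signal - mean background) / SD of background (assumed definition)."""
    noise = statistics.stdev(background_values)
    return (statistics.mean(signal_values) - statistics.mean(background_values)) / noise


square = [(r, c) for r in range(3) for c in range(3)]  # compact spot
streak = [(0, c) for c in range(4)]                    # elongated spot
print(circularity(square) > circularity(streak))
print(signal_to_noise([100, 110, 90], [10, 12, 8]))
```

With edge-counted perimeters a pixelated disc never reaches 1, so the value is best used to compare spots with each other rather than against an absolute threshold.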
Quality of Array
Distribution of areas.
- Judge by eye.
- Look at variation (e.g., SD).
Cy3 area
• mean 57
• median 56
• SD 20.67
Cy5 area
• mean 59
• median 57
• SD 24.34
Does the image analysis matter?
[Figure panels: Spot.nbg; Spot.morph; Spot.valley; ScanAlyze]
Background makes a difference
Background method   Segmentation method   Exp1   Exp2
No background       S.nbg                    6      6
                    Gp.nbg                   7      6
                    SA.nbg                   6      6
                    QA.fix.nbg               7      6
                    QA.hist.nbg              7      6
                    QA.adp.nbg              14     14
Local surrounding   S.valley                17     21
                    GP                      11     11
                    SA                      12     14
                    QA.fix                  18     23
                    QA.hist                  9      8
                    QA.adp                  27     26
Others              S.morph                  9      9
                    S.const                 14     14
Medians of the SD of log2(R/G) for 8 replicated spots, multiplied by 100 and rounded to the nearest integer.
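The summary statistic in this table can be sketched as follows: for each gene, take the SD of log2(R/G) across its replicated spots, then take the median of those SDs over genes, multiply by 100 and round. The use of the sample SD, the data layout, and the name `replicate_variability_score` are assumptions, and the intensities below are made up.

```python
import math
import statistics


def replicate_variability_score(replicates):
    """Median over genes of SD(log2(R/G)) across replicate spots, times 100, rounded.

    replicates -- dict mapping a gene id to a list of (R, G) intensity pairs
    """
    sds = []
    for pairs in replicates.values():
        log_ratios = [math.log2(r / g) for r, g in pairs]
        sds.append(statistics.stdev(log_ratios))  # sample SD (assumed)
    return round(100 * statistics.median(sds))


# Made-up data: three genes, eight replicate spots each.
replicates = {
    "geneA": [(1, 1)] * 4 + [(2, 1)] * 4,
    "geneB": [(3, 3)] * 8,
    "geneC": [(1, 2)] * 4 + [(2, 1)] * 4,
}
print(replicate_variability_score(replicates))
```

Smaller scores mean that replicate spots agree more closely, which is how the segmentation and background methods are being compared.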