The following slides have been adapted from http://www.tm4.org/ to be presented at the Follow-up course on Microarray Data Analysis (Nov 20-24 2006, PICB Shanghai) by Peter Serocka



Page 2:

THE INSTITUTE FOR GENOMIC RESEARCH (TIGR)

TIGR Spotfinder: a tool for microarray image processing

Developer: Vasily Sharov

Page 3:

Microarray Data Flow

[Flow diagram. Processing steps: Printer, Scanner, Image Analysis, Normalization / Filtering, Expression Analysis. Data passed between steps: Gene Annotation, Image File, Raw Gene Expression Data, Normalized Data with Gene Annotation, Interpretation of Analysis Results.]

Page 4:

Microarray Data Flow

[Same flow diagram as on the previous slide, annotated with file types: .tif (Image File); .mev (.gpr) and .mev (.gpr, .txt) (gene expression data); .ann (.gal) (Gene Annotation).]

Page 5:

Data File Formats

                                               TIGR                     Others
Slide Images                                   .tif                     .tif
Gene Expression tables                         .mev (.tav - outdated)   .gpr (GenePix), .txt (tab-delimited, Excel)
Gene Annotations and Array layout information  .ann                     .gal

Page 6:

Process Overview

[Diagram: Sample1 mRNA and Sample2 mRNA are each converted by RT into dye-labeled cDNA (Cy3-cDNA, Cy5-cDNA) and applied to the cDNA array, which is read out as Cy3 intensity and Cy5 intensity.]

Page 7:

Basic Steps from Image to File

1.) Image File Loading

2.) Construct or Apply an Overlay Grid

3.) Computations
  • Find Spot Boundary and Area
  • Intensity Calculation
  • Background Calculation and Correction

4.) Quality Control

5.) Text File Output

Page 8:

Basic Demonstration: Exploring the Interface

(Using An Existing Grid File)

Page 9:

Microarray Image Parameters

The microarray (MA) scanner generates two 16-bit grayscale TIFF images: one image for each labeling probe (Cy3 and Cy5)

The 16-bit schema provides 2^16 = 65536 signal levels (pixel values from 0 to 65535)

Each image size varies from 20 to 30 MB for a scanning resolution of 10 µm/pixel
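The figures on this slide can be checked with a little arithmetic. The Python sketch below does so; the scanned area of 22 mm x 60 mm is an assumed example value chosen for illustration and does not come from the slides.

```python
BITS_PER_PIXEL = 16
MICRONS_PER_PIXEL = 10  # scanning resolution quoted on the slide


def gray_levels(bits):
    """Number of distinct values an unsigned pixel of this bit width can take."""
    return 2 ** bits


def image_size_mb(width_mm, height_mm,
                  microns_per_pixel=MICRONS_PER_PIXEL, bits=BITS_PER_PIXEL):
    """Uncompressed size of one channel image in MB (1 MB = 10**6 bytes)."""
    width_px = width_mm * 1000 // microns_per_pixel
    height_px = height_mm * 1000 // microns_per_pixel
    return width_px * height_px * (bits // 8) / 1e6


levels = gray_levels(BITS_PER_PIXEL)
print("levels:", levels, "largest pixel value:", levels - 1)

# Assumed scan area (hypothetical): 22 mm x 60 mm
print("size per channel in MB:", image_size_mb(22, 60))
```

With these assumed dimensions a single channel comes to 26.4 MB, which falls inside the 20 to 30 MB range quoted above.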

Page 10:

Typical layout of microarray image

(images scanned at 10 µm/pixel resolution)

[Two example images: image size 22 MB; image size 28 MB]

Page 11:

Processing Overview

Apply the Grid

Determine Spot Boundary

Calculate Spot Intensity

Determine Background and Correct Intensity

Page 12:

Applying an Overlay Grid

• What does it accomplish?

– The grid cells set a boundary for the spot finding algorithms.

– The grid cells also define an area for background correction.

Page 13:

Gridding Dimension Parameters

[Figure: array image annotated with the gridding dimensions pin X and pin Y]

Page 14:

Spot Spacing Parameter

[Figure: array image annotated with the spot spacing]
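To illustrate how a spot spacing turns into grid cells, here is a minimal Python sketch for a single block of spots. It is not SpotFinder's gridding code: the function name `build_grid`, the origin, and the row and column counts are hypothetical, and each cell is assumed to be a square whose side equals the spot spacing.

```python
def build_grid(origin_x, origin_y, rows, cols, spacing):
    """Return one square cell (left, top, right, bottom) per spot, row by row.

    All arguments are in pixels. The cell side equals the spot spacing,
    which is an assumption made for this sketch.
    """
    cells = []
    for r in range(rows):
        for c in range(cols):
            left = origin_x + c * spacing
            top = origin_y + r * spacing
            cells.append((left, top, left + spacing, top + spacing))
    return cells


# Hypothetical block: 2 rows x 3 columns, 18 pixels between spot centers
cells = build_grid(origin_x=100, origin_y=200, rows=2, cols=3, spacing=18)
for cell in cells:
    print(cell)
```

Each cell then serves two purposes named on the earlier slide: it bounds the search for the spot, and its non-spot pixels supply the local background.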

Page 15:

Spot Finding

Spot finding requires an estimated spot size. The spot can be drawn as an irregular contour, as an ellipse, or as unconnected pixels.

Area inside contour is used for spot intensity calculation

Area outside contour is used for local background calculation

Page 16:

Processing Overview

Apply the Grid

Determine Spot Boundary

Calculate Spot Intensity

Determine Background and Correct Intensity

Page 17:

Background Calculation

Background intensity is calculated as the median pixel intensity from the area within the square and outside the spot.

A separate local background is calculated for each spot using the non-spot pixels from its square.

[Figure label: local background area]
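The local background rule can be sketched in a few lines of Python. This is an illustration under assumptions, not SpotFinder's implementation: the grid cell is represented as a small list of pixel rows, the spot as a boolean mask of the same shape, and all pixel values are made up.

```python
from statistics import median


def local_background(cell, spot_mask):
    """Median of the pixels inside the cell that do not belong to the spot."""
    values = [
        pixel
        for row, mask_row in zip(cell, spot_mask)
        for pixel, in_spot in zip(row, mask_row)
        if not in_spot
    ]
    return median(values)


# Hypothetical 4x4 cell; the value 500 stands for a bright speck outside the spot
cell = [
    [100, 110, 105, 95],
    [102, 900, 950, 98],
    [101, 920, 940, 500],
    [99, 104, 103, 97],
]
spot_mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]

bkg = local_background(cell, spot_mask)
print(bkg)
```

In this made-up cell the median of the twelve non-spot pixels is 101.5, whereas their mean would be 134.5, which shows why a median is less sensitive to an isolated bright pixel in the background area.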

Page 18:

Spot Definition and Calculations

Spot Area, A = number of pixels within the defined spot boundary

BKG = median pixel value within the cell (excluding the spot pixels)

Integral = Sum of all spot pixels excluding saturated pixels

Reported “Intensity” = Integral - BKG * A
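The definitions on this slide combine as in the following Python sketch. The function name, the pixel values, and the saturation value of 65535 (the largest 16-bit value) are assumptions for illustration. The sketch also assumes that saturated pixels are left out of both the area and the integral, following the Saturation Factor slide; the slides do not spell out which area enters the subtraction.

```python
from statistics import median

SATURATED = 65535  # assumed saturation value: largest 16-bit pixel value


def reported_intensity(spot_pixels, background_pixels, saturated=SATURATED):
    """Background-corrected spot intensity: Integral - BKG * A."""
    good = [p for p in spot_pixels if p < saturated]
    area = len(good)                  # A, counting non-saturated pixels only (assumption)
    integral = sum(good)              # Integral, excluding saturated pixels
    bkg = median(background_pixels)   # BKG, median of the non-spot pixels in the cell
    return integral - bkg * area


# Hypothetical pixel values
spot_pixels = [900, 950, 920, 940]
background_pixels = [100, 102, 98, 101, 99]

intensity = reported_intensity(spot_pixels, background_pixels)
print(intensity)
```

Here the integral is 3710, the background median is 100 and the area is 4, so the reported intensity is 3710 - 100 * 4 = 3310.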

Page 19:

Spot Integration with Background Correction

Page 20:

Quality Control Issues

Two measures of spot quality are reported by SpotFinder:

• Saturation Factor

• QC Score: reports shape and signal to noise ratio

Page 21:

Saturation Examples

Partially saturated spots can look like this:

[Image: spot with a saturated area and a non-saturated area]

Completely saturated spots can look like this:

[Image: fully saturated spot]

Page 22:

Saturation, Pixel Value Limit

[Plot: output pixel value versus input fluorescence dye light signal; the output is capped at the 16-bit limit, 2^16 = 65536]

Page 23:

Saturation Factor

- Partially saturated spots can be handled in SpotFinder by excluding the saturated pixels from spot area and intensity calculations.

- Fully saturated spots cannot be recovered in SpotFinder. In this case rescanning with lower excitation power or PMT gain could be considered, although faint spots may possibly be lost as a result.

Saturation Factor = (# good pixels in spot) / (total number of spot pixels)
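As a small worked example, the ratio can be computed as below. The sketch treats a "good" pixel as one below the saturation value and takes 65535 as that value; both are assumptions made for illustration, as are the pixel values.

```python
def saturation_factor(spot_pixels, saturated=65535):
    """Fraction of spot pixels that are not saturated (1.0 means none are saturated)."""
    if not spot_pixels:
        raise ValueError("spot has no pixels")
    good = sum(1 for p in spot_pixels if p < saturated)
    return good / len(spot_pixels)


# Hypothetical spots
partially_saturated = [1000, 65535, 65535, 2000]
fully_saturated = [65535, 65535, 65535]

print(saturation_factor(partially_saturated))
print(saturation_factor(fully_saturated))
```

A partially saturated spot gives a value between 0 and 1 (0.5 here), while a fully saturated spot gives 0, the case the slide describes as not recoverable.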

Page 24:

Saturation, RI Plot

RI plot: log(IB/IA) vs. 1/2 log(IA*IB)

clearly displays the saturation limits
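The coordinates of one point on such a plot can be computed as in the sketch below. The slide does not state the base of the logarithm; base 2 is assumed here for both axes, and the intensity values are made up.

```python
import math


def ri_point(ia, ib):
    """Return (x, y) for one spot: x = 1/2 log2(IA*IB), y = log2(IB/IA).

    Base-2 logarithms are an assumption; the slide only says "log".
    """
    x = 0.5 * math.log2(ia * ib)
    y = math.log2(ib / ia)
    return x, y


# Hypothetical intensities: channel B is four times channel A
x, y = ri_point(1024, 4096)
print(x, y)
```

For IA = 1024 and IB = 4096 this gives x = 11 and y = 2. Because neither intensity can exceed the saturated maximum, spots affected by saturation cannot spread freely at the high-intensity end of the plot, which is how the saturation limits become visible.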

Page 25:

Quality Control, QC Score

A QC Score is generated for each spot and is based on the spot shape and a measure of signal to noise ratio.

[Diagram: in each channel, shape and signal/noise combine into a channel score (QCA, QCB); QCA and QCB combine into the QC Score]

Page 26:

Spot Shape Parameter

Shape Factor = (Spot Area / Perimeter)

Spots with large perimeters relative to spot area will have a low shape factor.
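The effect can be seen by comparing a compact spot with a ragged one of the same area. In this sketch a spot is a set of pixel coordinates, and the perimeter is taken to be the number of pixel edges that face a non-spot pixel. That perimeter definition is an assumption for illustration; the slides do not say how SpotFinder measures it.

```python
def shape_factor(spot):
    """Spot area divided by perimeter, for a spot given as a set of (x, y) pixels."""
    if not spot:
        raise ValueError("spot has no pixels")
    area = len(spot)
    # Count every pixel edge whose neighbor lies outside the spot (assumed definition)
    perimeter = sum(
        1
        for (x, y) in spot
        for (dx, dy) in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (x + dx, y + dy) not in spot
    )
    return area / perimeter


compact = {(x, y) for x in range(4) for y in range(4)}  # 4 x 4 block, 16 pixels
ragged = {(x, 0) for x in range(16)}                    # 16 pixels in a single line

print(shape_factor(compact))
print(shape_factor(ragged))
```

Both spots cover 16 pixels, but the compact block has a perimeter of 16 edges (factor 1.0) while the line has 34 (factor about 0.47), so the ragged shape scores lower.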

Page 27:

Signal to Noise Ratio

S/N factor = fraction of spot pixels exceeding:

*med(BKG) + * SD(BKG)

[Plot: pixel values on an axis from 0 to 2^16, with med(BKG) and SD(BKG) marked]
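The S/N factor can be sketched as follows. The two coefficients in the threshold are not legible in this transcript, so `k_med` and `k_sd` below are hypothetical parameters with arbitrary example values, not SpotFinder's settings. The use of the population standard deviation is likewise an assumption, and the pixel values are made up.

```python
from statistics import median, pstdev


def sn_factor(spot_pixels, background_pixels, k_med=1.0, k_sd=2.0):
    """Fraction of spot pixels above k_med * med(BKG) + k_sd * SD(BKG).

    k_med and k_sd are placeholders; the real coefficients are unknown here.
    """
    if not spot_pixels:
        raise ValueError("spot has no pixels")
    threshold = k_med * median(background_pixels) + k_sd * pstdev(background_pixels)
    above = sum(1 for p in spot_pixels if p > threshold)
    return above / len(spot_pixels)


# Hypothetical pixel values
background_pixels = [90, 100, 110]
spot_pixels = [110, 120, 300, 400]

print(sn_factor(spot_pixels, background_pixels))
```

With these example values the threshold is about 116.3, three of the four spot pixels exceed it, and the factor is 0.75.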

Page 28:

Quality Control Calculation

QC Score = (QCA + QCB) / 2

QCA = sqrt(QC shape * QC S/N) for channel A

QCB = sqrt(QC shape * QC S/N) for channel B
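A direct transcription of these formulas into Python looks like this. The function names and the input values are made up for illustration, and the shape and S/N components are assumed to be already scaled to the range 0 to 1.

```python
import math


def channel_qc(qc_shape, qc_sn):
    """Per-channel score: geometric mean of the shape and S/N components."""
    return math.sqrt(qc_shape * qc_sn)


def qc_score(shape_a, sn_a, shape_b, sn_b):
    """Return (QCA, QCB, QC Score), where QC Score is the mean of the channel scores."""
    qca = channel_qc(shape_a, sn_a)
    qcb = channel_qc(shape_b, sn_b)
    return qca, qcb, (qca + qcb) / 2


# Hypothetical component values for channels A and B
qca, qcb, qc = qc_score(shape_a=0.81, sn_a=1.0, shape_b=0.25, sn_b=0.64)
print(round(qca, 3), round(qcb, 3), round(qc, 3))
```

Because each channel score is a geometric mean, a component of zero (for example, no spot pixel above the noise threshold) drives that channel's score to zero regardless of the other component.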

Page 29:

Quality Control, RI Plot

RI plot: log(IB/IA) vs. 1/2 log(IA*IB), plotted for means, clearly shows low-intensity distortion due to background overestimation.

Data from earlier slide processed without QC filter

Page 30:

Quality Control

(data provided by E. Snesrud)

Page 31:

Quality Control

(data provided by E. Snesrud)

Page 32:

SpotFinder Flag Descriptions

A - Spot area is larger than 50 pixels

B - Spot area is between 30 and 50 pixels

C - Spot area is smaller than 30 pixels

X - Spot rejected by QC based on spot shape and spot intensity relative to surrounding background

U - Spot rejected (“flagged”) by user

Y - Bad spot, background is higher than spot intensity

Z - Spot was not detected by the program

S - Warning: some spot pixels are saturated
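The three size-based flags amount to a simple threshold rule, sketched below. The upper bound of 50 pixels for flag B is inferred from the threshold given for flag A, and the treatment of exactly 30 and exactly 50 pixels as B is an assumption. The remaining flags (X, U, Y, Z, S) depend on QC results, user input, and detection, and are not modeled here.

```python
def area_flag(spot_area):
    """Size-based flag for a spot area given in pixels (covers A, B and C only)."""
    if spot_area > 50:
        return "A"
    if spot_area >= 30:
        return "B"  # between 30 and 50 pixels, boundaries included (assumption)
    return "C"


for area in (80, 50, 30, 12):
    print(area, area_flag(area))
```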

Page 33:

Output data (.mev) per spot:

UID   Unique identifier for this spot
IA    Intensity value in channel A
IB    Intensity value in channel B
R     Row (slide row)
C     Column (slide column)
MR    Meta-row (block row)
MC    Meta-column (block column)
SR    Sub-row
SC    Sub-column

Page 34:

Output data (.mev) per spot:

FlagA   TIGR Spotfinder flag value in channel A
FlagB   TIGR Spotfinder flag value in channel B
SA      Actual spot area (in pixels)
SF      Saturation factor
QC      Cumulative quality control score
QCA     Quality control score in channel A
QCB     Quality control score in channel B

Page 35:

Output data (.mev) per spot:

BkgA     Background value in channel A
BkgB     Background value in channel B
SDA      Standard deviation for spot pixels in channel A
SDB      Standard deviation for spot pixels in channel B
SDBkgA   Standard deviation of the background in channel A
SDBkgB   Standard deviation of the background in channel B

Page 36:

Output data (.mev) per spot:

MedA      Median intensity value in channel A
MedB      Median intensity value in channel B
MNA       Mean intensity value in channel A
MNB       Mean intensity value in channel B
X, Y      X and Y coordinates of the spot cell
PValueA   P-value in channel A
PValueB   P-value in channel B
DBID      Database ID (if UID is substituted)
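A table with these columns can be read with Python's standard `csv` module, as in the sketch below. The file layout is an assumption: tab-delimited text with one header row of column names and optional comment lines starting with "#". The sample rows are invented and contain only a few of the columns listed above.

```python
import csv
import io

# Hypothetical file content, not real data
SAMPLE = (
    "# example only\n"
    "UID\tIA\tIB\tR\tC\tFlagA\tFlagB\tQC\n"
    "1\t15320\t14877\t1\t1\tA\tA\t0.82\n"
    "2\t210\t65\t1\t2\tC\tX\t0.11\n"
)


def read_spots(handle):
    """Read spots from a tab-delimited table, skipping blank and '#' comment lines."""
    rows = (line for line in handle if line.strip() and not line.startswith("#"))
    spots = []
    for row in csv.DictReader(rows, delimiter="\t"):
        row["IA"] = float(row["IA"])
        row["IB"] = float(row["IB"])
        spots.append(row)
    return spots


spots = read_spots(io.StringIO(SAMPLE))

# Keep only the spots that QC did not reject in either channel
accepted = [s["UID"] for s in spots if s["FlagA"] != "X" and s["FlagB"] != "X"]
print(accepted)
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle, after checking that the actual header and comment conventions match the assumptions made here.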