
The following slides have been adapted from http://www.tm4.org/ to be presented at the Follow-up course on Microarray Data Analysis (Nov 20-24 2006, PICB Shanghai) by Peter Serocka.


TIGR Spotfinder: a tool for microarray image processing

The Institute for Genomic Research (TIGR)

Developer: Vasily Sharov

Microarray Data Flow

Printer / Scanner
  → Image File (.tif)
  → Image Analysis
  → Raw Gene Expression Data: .mev (.gpr)
  → Normalization / Filtering
  → Normalized Data with Gene Annotation: .mev (.gpr, .txt)
  → Expression Analysis
  → Interpretation of Analysis Results

Gene Annotation: .ann (.gal), combined with the expression data.

Data File Formats

                              TIGR                Others
Slide Images                  .tif                .tif
Gene Expression tables        .mev                .gpr (GenePix)
                              .tav (outdated)     .txt (tab-delimited, Excel)
Gene Annotations and
Array layout information      .ann                .gal

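Process Overview

[Figure: Sample1 mRNA and Sample2 mRNA are each reverse-transcribed (RT) into dye-labeled cDNA (Cy3-cDNA, Cy5-cDNA) and applied to the cDNA array, which yields a Cy3 intensity and a Cy5 intensity.]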

Basic Steps from Image to File

1.) Image File Loading

2.) Construct or Apply an Overlay Grid

3.) Computations
• Find Spot Boundary and Area
• Intensity Calculation
• Background Calculation and Correction

4.) Quality Control

5.) Text File Output

Basic Demonstration: Exploring the Interface

(Using An Existing Grid File)

Microarray Image Parameters

The microarray (MA) scanner generates two 16-bit grayscale TIFF images: one image for each labeling dye (Cy3 and Cy5).

The 16-bit schema provides a signal dynamic range of 2^16 = 65536 levels (pixel values 0 to 65535).

Image size varies from 20 to 30 MB per image at a scanning resolution of 10 µm/pixel.

Typical layout of microarray image
(images scanned at 10 µm/pixel resolution)

[Figure: two example images, with image sizes of 22 MB and 28 MB.]
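The quoted file sizes follow from simple arithmetic: one pixel per 10 µm in each direction, two bytes per 16-bit pixel. The sketch below is only a plausibility check; the function name and the 22 mm x 60 mm scan area are hypothetical and not taken from the slides.

```python
def image_size_mb(width_mm, height_mm, um_per_pixel=10.0, bits_per_pixel=16):
    """Approximate uncompressed size of one scanned channel, in MB (10^6 bytes)."""
    pixels_x = round(width_mm * 1000 / um_per_pixel)
    pixels_y = round(height_mm * 1000 / um_per_pixel)
    total_bytes = pixels_x * pixels_y * bits_per_pixel // 8
    return total_bytes / 1e6


# Largest value a 16-bit pixel can hold.
MAX_PIXEL_VALUE = 2 ** 16 - 1

# Hypothetical scan area of 22 mm x 60 mm at 10 um/pixel.
print(image_size_mb(22, 60), "MB per channel")
print("maximum pixel value:", MAX_PIXEL_VALUE)
```

With these assumed dimensions the result falls inside the 20 to 30 MB range quoted above.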

Processing Overview

Apply the Grid

Determine Spot Boundary

Calculate Spot Intensity

Determine Background and Correct Intensity

Applying an Overlay Grid

• What does it accomplish?

– The grid cells set a boundary for the spot finding algorithms.

– The grid cells also define an area for background correction.

Gridding Dimension Parameters

[Figure: array image with the grid dimensions labeled "pin X" and "pin Y".]

Spot Spacing Parameter

[Figure: "spot spacing" marked on the grid.]
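How a spot spacing turns into grid cells can be sketched as follows. This is a minimal illustration that assumes a perfectly regular grid of square cells for a single block; the function and parameter names are hypothetical and do not describe Spotfinder's internal data structures.

```python
def grid_cells(origin_x, origin_y, rows, cols, spacing):
    """Return {(row, col): (x0, y0, x1, y1)} pixel boxes for one block of spots.

    Each cell is a square of side `spacing`; the spot is searched for inside
    its cell, and the rest of the cell serves as local background.
    """
    cells = {}
    for r in range(rows):
        for c in range(cols):
            x0 = origin_x + c * spacing
            y0 = origin_y + r * spacing
            cells[(r, c)] = (x0, y0, x0 + spacing, y0 + spacing)
    return cells


# Hypothetical block: 2 rows x 3 columns, 20-pixel spacing, origin at (100, 200).
block = grid_cells(100, 200, rows=2, cols=3, spacing=20)
print(block[(1, 2)])
```

A full array would repeat such a block once per pin in X and in Y, each with its own origin.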

Spot Finding

Spot finding requires an estimated spot size. The spot can be drawn as an irregular contour, as an ellipse, or as unconnected pixels.

The area inside the contour is used for spot intensity calculation.

The area outside the contour is used for local background calculation.


Background Calculation

Background intensity is calculated as the median pixel intensity from the area within the square and outside the spot.

A separate local background is calculated for each spot using the non-spot pixels from its square.

[Figure: grid square with the local background area marked.]
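The local background rule can be sketched in a few lines. The cell is represented here as a list of pixel rows and the spot as a boolean mask of the same shape; both representations and the function name are assumptions made for illustration.

```python
from statistics import median


def local_background(cell, spot_mask):
    """Median of the pixels in a grid cell that lie outside the spot."""
    values = [
        pixel
        for row, mask_row in zip(cell, spot_mask)
        for pixel, in_spot in zip(row, mask_row)
        if not in_spot
    ]
    return median(values)


# Hypothetical 4 x 4 cell with a bright 2 x 2 spot in the middle.
cell = [
    [10, 12, 11, 10],
    [11, 900, 950, 12],
    [10, 920, 940, 13],
    [12, 11, 10, 11],
]
spot_mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]
print(local_background(cell, spot_mask))
```

Using the median rather than the mean keeps a few bright contaminating pixels in the background area from shifting the estimate.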

Spot Definition and Calculations

Spot Area, A = number of pixels within the defined spot boundary

BKG = median pixel value within the cell (excluding the spot pixels)

Integral = sum of all spot pixels, excluding saturated pixels

Reported “Intensity” = Integral - BKG * A
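The definitions above translate directly into a short calculation. The sketch follows them literally: A counts every spot pixel, and a pixel is treated as saturated when it reaches 65535, the 16-bit maximum. Both choices are assumptions, and SpotFinder's handling of A for partially saturated spots may differ.

```python
from statistics import median


def reported_intensity(spot_pixels, background_pixels, saturation=65535):
    """Background-corrected spot intensity: Integral - BKG * A."""
    area = len(spot_pixels)                # A: pixels inside the spot boundary
    bkg = median(background_pixels)        # BKG: median of the non-spot cell pixels
    integral = sum(p for p in spot_pixels if p < saturation)  # skip saturated pixels
    return integral - bkg * area


# Hypothetical spot and background pixel values.
spot = [900, 950, 920, 940]
background = [10, 12, 11, 10, 11, 12, 10, 13, 12, 11, 10, 11]
print(reported_intensity(spot, background))
```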

Spot Integration with Background Correction

Quality Control Issues

Two measures of spot quality are reported by SpotFinder:

• Saturation Factor

• QC Score: reports shape and signal to noise ratio

Saturation Examples

Partially saturated spots can look like this:

[Figure: spot with a saturated area and a non-saturated area labeled.]

Completely saturated spots can look like this:

[Figure: fully saturated spot.]

Saturation, Pixel Value Limit

[Figure: output (pixel value) plotted against input (fluorescence dye light signal); the pixel value axis is marked at 2^16 = 65536.]

Saturation Factor

- Partially saturated spots can be handled in SpotFinder by excluding the saturated pixels from spot area and intensity calculations.

- Fully saturated spots cannot be recovered in SpotFinder. In this case rescanning with lower excitation power or PMT gain could be considered.*

* Faint spots may possibly be lost.

Saturation Factor = (# good pixels in spot) / (total number of spot pixels)
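As a minimal sketch of this ratio, assume that a "good" pixel is one below the 16-bit maximum of 65535. That threshold and the function name are assumptions, not values taken from SpotFinder.

```python
def saturation_factor(spot_pixels, saturation=65535):
    """Fraction of spot pixels that are not saturated (1.0 means no saturation)."""
    if not spot_pixels:
        raise ValueError("spot has no pixels")
    good = sum(1 for p in spot_pixels if p < saturation)
    return good / len(spot_pixels)


# Hypothetical spots: half saturated, and not saturated at all.
print(saturation_factor([100, 65535, 200, 65535]))
print(saturation_factor([100, 200, 300]))
```

A factor of 0 corresponds to a fully saturated spot, which cannot be recovered.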

Saturation, RI Plot

The RI plot, log(IB/IA) vs. 1/2 log(IA*IB), clearly displays the saturation limits.
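The two RI coordinates for a spot can be computed as below. The slide does not state the base of the logarithm; base 2 is assumed here, and the function name is hypothetical.

```python
import math


def ri_point(ia, ib):
    """Return (intensity, ratio) = (1/2 * log2(IA * IB), log2(IB / IA)) for one spot."""
    ratio = math.log2(ib / ia)
    intensity = 0.5 * math.log2(ia * ib)
    return intensity, ratio


# Hypothetical spot with channel B four times brighter than channel A.
intensity, ratio = ri_point(1024, 4096)
print(intensity, ratio)
```

Spots near saturation in either channel cluster along an edge of the plot, because IA and IB cannot exceed the pixel value limit times the spot area.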

Quality Control, QC Score

A QC Score is generated for each spot and is based on the spot shape and a measure of signal to noise ratio.

[Diagram: shape and signal/noise are combined into QCA and QCB, one per channel, which are combined into the QC Score.]

Spot Shape Parameter

Shape Factor = (Spot Area / Perimeter)

Spots with large perimeters relative to spot area will have a low shape factor.
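The effect of the perimeter can be illustrated on a binary spot mask. In this sketch the perimeter is taken as the number of pixel edges that separate a spot pixel from a non-spot pixel or from the mask border. That definition is an assumption, and how SpotFinder measures the perimeter or scales the ratio into a score is not stated in the slides.

```python
def shape_factor(mask):
    """Spot area divided by perimeter, for a boolean mask given as a list of rows."""
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Count each of the four pixel edges that faces a non-spot pixel.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not mask[nr][nc]:
                    perimeter += 1
    if area == 0:
        raise ValueError("mask contains no spot pixels")
    return area / perimeter


compact = [[True] * 3 for _ in range(3)]   # 3 x 3 block, 9 pixels
stretched = [[True] * 9]                   # 1 x 9 line, also 9 pixels
print(shape_factor(compact), shape_factor(stretched))
```

Both masks cover nine pixels, but the stretched one has the longer outline and therefore the lower shape factor.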

Signal to Noise Ratio

S/N factor = fraction of spot pixels exceeding: *med(BKG) + *SD(BKG)

[Figure: pixel value axis from 0 to 2^16, with med(BKG) and SD(BKG) marked.]
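The S/N factor can be sketched as a thresholded pixel count. The coefficients in front of med(BKG) and SD(BKG) are not legible in this transcript, so the sketch takes them as required arguments, `k_med` and `k_sd`, which are hypothetical names. The values used in the example are purely illustrative, and the use of the population standard deviation is also an assumption.

```python
from statistics import median, pstdev


def sn_factor(spot_pixels, background_pixels, k_med, k_sd):
    """Fraction of spot pixels above k_med * med(BKG) + k_sd * SD(BKG)."""
    if not spot_pixels:
        raise ValueError("spot has no pixels")
    threshold = k_med * median(background_pixels) + k_sd * pstdev(background_pixels)
    above = sum(1 for p in spot_pixels if p > threshold)
    return above / len(spot_pixels)


# Illustrative coefficients only; the real ones are not given in the slides.
background = [10, 10, 12, 12]
spot = [5, 13, 14, 100]
print(sn_factor(spot, background, k_med=1, k_sd=2))
```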

Quality Control Calculation

QC Score = (QCA+QCB)/2

QCA=sqrt(QC shape*QC S/N) for channel A

QCB=sqrt(QC shape*QC S/N) for channel B
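The combination rule above is a geometric mean within each channel followed by an arithmetic mean across channels. A minimal sketch, assuming the shape and S/N scores are already available as numbers between 0 and 1 (the function names are hypothetical):

```python
import math


def channel_qc(shape_score, sn_score):
    """Per-channel QC: geometric mean of the shape and S/N scores."""
    return math.sqrt(shape_score * sn_score)


def qc_score(shape_a, sn_a, shape_b, sn_b):
    """Overall QC Score: average of the two per-channel scores."""
    qca = channel_qc(shape_a, sn_a)
    qcb = channel_qc(shape_b, sn_b)
    return (qca + qcb) / 2


# Hypothetical scores for channels A and B.
print(qc_score(0.81, 1.0, 0.25, 1.0))
```

Because of the geometric mean, a channel scores zero as soon as either its shape or its S/N score is zero.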

Quality Control, RI Plot

RI plot: log(IB/IA) vs. 1/2 log(IA*IB), plotted for means, clearly shows low-intensity distortion due to background overestimation.

Data from earlier slide processed without QC filter

Quality Control

(data provided by E. Snesrud)


SpotFinder Flag Descriptions

A - Spot area is larger than 50 pixels

B - Spot area is between 30 and 50 pixels

C - Spot area is smaller than 30 pixels

X - Spot rejected by QC based on spot shape and spot intensity relative to surrounding background

U - Spot rejected (“flagged”) by user

Y - Bad spot, background is higher than spot intensity

Z - Spot was not detected by the program

S - Warning: some spot pixels are saturated
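The three area-based flags can be sketched as a simple classification. This assumes that the upper bound for class B is 50 pixels, as the definition of class A implies, and that areas of exactly 30 or 50 pixels fall into class B. The function name is hypothetical, and the rejection flags (X, U, Y, Z, S) depend on other criteria that are not modelled here.

```python
def area_flag(spot_area, small=30, large=50):
    """Return 'A', 'B' or 'C' from the spot area in pixels."""
    if spot_area > large:
        return "A"
    if spot_area < small:
        return "C"
    return "B"


for area in (80, 40, 12):
    print(area, area_flag(area))
```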

Output data (.mev) per spot:

UID      Unique identifier for this spot
IA       Intensity value in channel A
IB       Intensity value in channel B
R        Row (slide row)
C        Column (slide column)
MR       Meta-row (block row)
MC       Meta-column (block column)
SR       Sub-row
SC       Sub-column
FlagA    TIGR Spotfinder flag value in channel A
FlagB    TIGR Spotfinder flag value in channel B
SA       Actual spot area (in pixels)
SF       Saturation factor
QC       Cumulative quality control score
QCA      Quality control score in channel A
QCB      Quality control score in channel B
BkgA     Background value in channel A
BkgB     Background value in channel B
SDA      Standard deviation for spot pixels in channel A
SDB      Standard deviation for spot pixels in channel B
SDBkgA   Standard deviation of the background in channel A
SDBkgB   Standard deviation of the background in channel B
MedA     Median intensity value in channel A
MedB     Median intensity value in channel B
MNA      Mean intensity value in channel A
MNB      Mean intensity value in channel B
X, Y     X and Y coordinates of the spot cell
PValueA  P-value in channel A
PValueB  P-value in channel B
DBID     Database ID (if UID is substituted)
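A per-spot table like this is easy to filter once it is loaded. The sketch below assumes a tab-delimited layout with one header row and comment lines starting with "#". That layout is an assumption about the file, and the sample rows are invented for illustration, not real data.

```python
import csv

# Hypothetical excerpt with only a few of the columns listed above.
SAMPLE = (
    "# hypothetical example, not real data\n"
    "UID\tIA\tIB\tFlagA\tFlagB\tQC\n"
    "1\t15000\t30000\tA\tA\t0.9\n"
    "2\t200\t150\tX\tX\t0.1\n"
    "3\t8000\t4000\tB\tB\t0.7\n"
)


def read_spots(text):
    """Parse tab-delimited spot rows, skipping blank and '#' comment lines."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    return list(csv.DictReader(lines, delimiter="\t"))


def good_spots(rows, accepted=("A", "B", "C")):
    """Keep spots whose flag is an accepted area class in both channels."""
    return [r for r in rows if r["FlagA"] in accepted and r["FlagB"] in accepted]


rows = read_spots(SAMPLE)
kept = good_spots(rows)
print([r["UID"] for r in kept])
```

All values come back as strings, so IA, IB and QC need converting with `float()` before any arithmetic.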
