Introduction to DNA Microarrays
Michael F. Miles, M.D., Ph.D.
Depts. of Pharmacology/Toxicology and Neurology and the Center for Study of
Biological Complexity
225-4054
Biological Regulation: “You are what you express”
• Levels of regulation
• Methods of measurement
• Concept of genomics
Regulation of Gene Expression
• Transcriptional
  – Altered DNA binding protein complex abundance or function
• Post-transcriptional
  – mRNA stability
  – mRNA processing (alternative splicing)
• Translational
  – RNA trafficking
  – RNA binding proteins
• Post-translational
  – Many forms!
Regulation of Gene Expression
• Genes are expressed when they are transcribed into RNA
• Amount of mRNA indicates gene activity
• Some genes are expressed in all tissues -- but are still regulated!
• Some genes are expressed selectively depending on tissue, disease, environment
• Dynamic regulation of gene expression allows long-term responses to environment
[Figure: progression from acute to chronic to compulsive drug use. Acute Drug Use (mesolimbic dopamine? other): reinforcement, intoxication. Chronic Drug Use (altered signaling, gene expression, ?synaptic remodeling): tolerance, dependence, sensitization. Compulsive Drug Use, “Addiction” (?synaptic remodeling, persistent gene expression).]
Progress in Studies on Gene Regulation
1960 1970 1980 1990 2000
mRNA, tRNA discovered
Nucleic acid hybridization, protein/RNA electrophoresis
Molecular cloning; Southern, Northern & Western blots; 2-D gels
Subtractive Hybridization, PCR, Differential Display, MALDI/TOF MS
Genome Sequencing
DNA/Protein Microarrays
Nucleic Acid Hybridization: How It Works
Primer on Nucleic Acid Hybridization
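• Hybridization rate depends on time, the concentration of nucleic acids, and the reassociation rate constant for the nucleic acid:
C/C0 = 1/(1 + k·C0·t)
where C is the concentration of nucleic acid still single-stranded at time t, C0 is the starting concentration, and k is the reassociation rate constant.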
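A minimal sketch of the reassociation equation above, assuming ideal second-order kinetics. The function names and the numeric values of k and C0 are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fraction_single_stranded(c0, k, t):
    """Return C/C0 = 1 / (1 + k*C0*t) for second-order reassociation."""
    return 1.0 / (1.0 + k * c0 * t)


def cot_half(k):
    """C0*t value at which half of the strands have reassociated (C/C0 = 0.5)."""
    return 1.0 / k


if __name__ == "__main__":
    k = 2.0      # hypothetical rate constant, L/(mol*s)
    c0 = 0.001   # hypothetical starting concentration, mol/L
    for t in (0, 100, 500, 5000):  # seconds
        print(t, round(fraction_single_stranded(c0, k, t), 3))
```

Setting C/C0 = 0.5 gives C0·t = 1/k, so a larger rate constant means a smaller C0·t is needed for half-reassociation.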
High Density DNA Microarrays
A Bit of History
~1992-1996: Oligo arrays developed by Fodor, Stryer, Lockhart, and others at Stanford/Affymetrix, and by Southern in Great Britain
~1994-1995: cDNA arrays, usually attributed to Pat Brown and Dari Shalon at Stanford, who first used a robot to print the arrays. In 1994, Shalon started Synteni, which was bought by Incyte in 1998.
However, in 1982 Augenlicht and Kobrin proposed a DNA array (Cancer Research), and in 1984 they made a 4000-element array to interrogate human cancer cells.
(Rejected by Science, Nature and the NIH)
Biological Networks
Types of Biological Networks
Gene Regulation Network
Examining Biological Networks: Experimental Design
Examining Biological Networks
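Use of S-score in Hierarchical Clustering of Brain Regional Expression Patterns
[Figure: hierarchical clustering of expression patterns across PFC, HIP, NAC and VTA, shown for AvgDiff and for S-score values. Color scale: relative change, -2 to +2.]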
Expression Profiling: A Non-biased, Genomic Approach to Resolving the Mechanisms of Addiction
Candidate Gene Studies
Cycles of Expression Profiling
Merge with Biological Databases
Utility of Expression Profiling
• Non-biased, genome-wide
• Hypothesis generating
• Gene hunting
• Pattern identification:
  – Insight into gene function
  – Molecular classification
  – Phenotypic mechanisms
[Figure: analysis workflow. Labels: Hybridization and Scanning; GE Database (SQL Server); Comparisons (S-score, d-chip); De-noise; Clustering Techniques; Statistical Filtering (e.g. SAM); Overlay Biological Databases (PubGen, GenMAPP, QTL, etc.); Provisional Gene “Patterns”; Filtered Gene Lists; Candidate Genes; Molecular Validation (RT-PCR, in situ, Western); Behavioral Validation.]
Experimental Design
Experimental Design with DNA Microarrays
High Density DNA Microarrays
Synthesis and Analysis of 2-color Spotted cDNA Arrays: “Brown Chips”
Comparative Hybridization with Spotted cDNA Microarrays
Synthesis of High Density Oligonucleotide Arrays by Photolithography/Photochemistry
GeneChip Features
• Parallel analysis of >30K human, rat or mouse genes/EST clusters with 15-20 oligos (25-mer) per gene/EST
• Entire genome analysis (human, yeast, mouse)
• 3-4 orders of magnitude dynamic range (1-10,000 copies/cell)
• Quantitative for changes >25% ??
• SNP analysis
Oligonucleotide Array Analysis
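[Figure: target preparation and detection for oligonucleotide arrays. Labels, in order: total RNA (poly-A tail); Oligo(dT)-T7 primer; RTase/Pol II; dsDNA carrying the T7 sequence; T7 pol with CTP-biotin; biotin-cRNA; hybridization; streptavidin-phycoerythrin; scanning; PM and MM probes.]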
Stepwise Analysis of Microarray Data
• Low-level analysis -- image analysis, expression quantitation
• Primary analysis -- is there a change in expression?
• Secondary analysis -- what genes show correlated patterns of expression? (supervised vs. unsupervised)
• Tertiary analysis -- is there a phenotypic “trace” for a given expression pattern?
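The primary-analysis question ("is there a change in expression?") is often first approached with a log fold change. The sketch below is only a naive illustration, not the method used in these slides: the gene names, the intensity floor, and the 2-fold threshold are all assumptions.

```python
import math


def ln_fold_change(treated, control, floor=1.0):
    """ln(treated/control); values are floored so the log is always defined."""
    return math.log(max(treated, floor) / max(control, floor))


def changed_genes(treated, control, threshold=math.log(2)):
    """Return gene names whose |ln(FC)| meets the threshold (default: 2-fold)."""
    return sorted(
        gene for gene in treated
        if abs(ln_fold_change(treated[gene], control[gene])) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical expression values for two conditions.
    treated = {"geneA": 400.0, "geneB": 110.0, "geneC": 50.0}
    control = {"geneA": 100.0, "geneB": 100.0, "geneC": 200.0}
    print(changed_genes(treated, control))
```

A fixed threshold ignores measurement noise, which is one reason replicate arrays and noise-weighted statistics matter.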
Affymetrix Arrays: Image Analysis
[Figure: “.DAT” image file and the “.CEL” file derived from it]
Affymetrix Arrays: PM-MM Difference Calculation
Probe pairs control for non-specific hybridization of oligonucleotides
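A simplified sketch of a PM-MM difference calculation: the plain mean of (PM - MM) over the probe pairs of one gene. This is an assumption-laden illustration; production algorithms typically also trim outlier probe pairs, which is omitted here. The function name and intensities are hypothetical.

```python
def average_difference(pm, mm):
    """Mean of PM - MM across the probe pairs of one probe set."""
    if not pm or len(pm) != len(mm):
        raise ValueError("pm and mm must be non-empty and of equal length")
    diffs = [p - m for p, m in zip(pm, mm)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Hypothetical intensities for four probe pairs.
    pm = [1200, 900, 1500, 800]   # perfect-match probes
    mm = [300, 400, 500, 700]     # mismatch probes
    print(average_difference(pm, mm))
```

Subtracting MM from PM removes the part of the signal attributed to non-specific hybridization before averaging.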
Variability and Error in DNA Microarray Hybridizations
[Figure (a): “Variability in Ln(FC)”. Scatter plot of Ln(FC1) against Ln(FC2), both axes from -4 to 4, for ln(PFC1AS/VTA1AS); R = 0.71. Panels: ln(FoldChange) and S-score.]
• Position Dependent Nearest Neighbor (PDNN) - 2003
Zhang, Miles and Aldape (2003) A model of molecular interactions on short oligonucleotide microarrays: implications for probe design and data analysis. Nature Biotech. In Press.
Chip Normalization Procedures
• Whole chip intensity
  – Assumes relatively few changes, uniform error/noise across chip and abundance classes
• Spiked standards
  – Requires exquisite technical control, assumes uniform behavior
• Internal Standards
  – Assumes no significant regulation
• “Piece-wise” linear normalization
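Whole chip intensity normalization can be sketched as scaling every chip to a common mean. The target value of 1000 and the function name are assumptions for illustration; the method is only appropriate under the assumptions listed above (few changes, uniform noise).

```python
def scale_to_target(intensities, target=1000.0):
    """Scale one chip so its mean intensity equals `target`.

    Returns the scaled values and the normalization factor that was applied.
    """
    mean = sum(intensities) / len(intensities)
    if mean <= 0:
        raise ValueError("mean intensity must be positive")
    factor = target / mean
    return [x * factor for x in intensities], factor


if __name__ == "__main__":
    chip = [100.0, 200.0, 300.0]   # hypothetical raw intensities
    scaled, factor = scale_to_target(chip)
    print(scaled, factor)
```

The returned factor is worth keeping: an unusually large or small normalization factor is itself a quality-control signal for the array.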
Normalization Confounds: Non-uniform Chip Behavior
[Figure: plot with axes labeled S-score and Gene]
Normalization Confounds: Non-linearity
[Figure panels: “Lowess” normalization; pin-specific profiles; after print-tip normalization]
Slide Normalization: Pieces and Pins
See also: Schuchhardt, J. et al., NAR 28: e47 (2000)
http://www.ipam.ucla.edu/publications/fg2000/fgt_tspeed9.pdf
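A deliberately simplified stand-in for print-tip normalization: each spot's log ratio is centered on the median of the spots printed by the same pin. Real print-tip lowess fits an intensity-dependent curve per pin; this sketch only removes a constant per-pin offset. The function name and data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def center_by_pin(log_ratios, pins):
    """Subtract each pin group's median log ratio from its spots."""
    groups = defaultdict(list)
    for value, pin in zip(log_ratios, pins):
        groups[pin].append(value)
    offsets = {pin: median(values) for pin, values in groups.items()}
    return [value - offsets[pin] for value, pin in zip(log_ratios, pins)]


if __name__ == "__main__":
    # Hypothetical log ratios; pin 1 shows a systematic offset.
    log_ratios = [0.5, 0.7, 0.9, -0.2, 0.0, 0.2]
    pins = [1, 1, 1, 2, 2, 2]
    print(center_by_pin(log_ratios, pins))
```

After centering, both pin groups have a median log ratio of zero, so a pin-wide bias no longer looks like differential expression.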
Quality Assessment
• Gene specific: R/G correlation, %BG, %spot
• Array specific: normalization factor, % genes present, linearity, control/spike performance (e.g. 5’/3’ ratio, intensity)
• Across arrays: linearity, correlation, background, normalization factors, noise
Statistical Analysis of Microarrays: “Not Your Father’s Oldsmobile”
[Figure panels: Normal vs. Normal; Normal vs. Tumor]
Sources of Variability
• Target Preparation
  – Group target preps
• Chip Run
  – Minor, BUT…
  – Be aware of processing order
• Chip Lot
  – Stagger lots across experiment if necessary
• Chip Scanning Order
  – Cross and block chip scanning order
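One way to read "cross and block" is to process chips in blocks that contain one chip from each experimental group, reversing the group order in alternate blocks so that no group is always first. This is an interpretation, not a protocol from the slides; the function name and chip labels are hypothetical.

```python
from itertools import zip_longest


def blocked_order(groups):
    """Interleave chips from each group, flipping group order every other block.

    `groups` maps a group name to its list of chip identifiers.
    """
    order = []
    for i, block in enumerate(zip_longest(*groups.values())):
        chips = [chip for chip in block if chip is not None]
        if i % 2 == 1:
            chips.reverse()  # cross over so each group takes turns going first
        order.extend(chips)
    return order


if __name__ == "__main__":
    groups = {
        "control": ["C1", "C2", "C3"],
        "treated": ["T1", "T2", "T3"],
    }
    print(blocked_order(groups))
```

With this order, drift over a scanning session affects both groups roughly equally instead of being confounded with treatment.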
Secondary Analysis: Expression Patterns
• Supervised multivariate analyses
  – Support vector machines
• Non-supervised clustering methods
  – Hierarchical
  – K-means
  – SOM
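Of the non-supervised methods listed, K-means is the simplest to sketch. The version below is a minimal pure-Python illustration under stated assumptions: centroids are seeded from the first k profiles (real implementations usually use random or smarter seeding), and the expression profiles are invented values.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Cluster profiles into k groups; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # deterministic seeding
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical relative-change profiles across three conditions.
    profiles = [
        (2.0, 1.8, -1.9),
        (-2.0, -1.7, 2.1),
        (1.9, 2.1, -2.0),
        (-1.8, -2.0, 1.9),
    ]
    labels, _ = kmeans(profiles, k=2)
    print(labels)
```

Genes with similar profiles end up with the same label; the choice of k and of the distance measure both shape the resulting patterns.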
Expression Networks
[Figure: expression profiling linked to behavior, pharmacology, genetics, protein-protein interactions, ontology, HomoloGene, and biomedical literature relations]
Array Analysis: Conclusions
• Be careful! Assess quality control parameters rigorously
• Single arrays or experiments are of limited value
• Normalization and weighting for noise are critical procedures
• Across investigator/platform/species comparisons will most easily be done with relative data
Comparison of Primary Analysis Algorithms II
Spotted cDNA Microarrays