Toward a Characterization of Gene Expression in Single Tumor Samples
Joseph Lucas


DESCRIPTION

Overview
  Laboratory bias
    Experiments on the lab bench
    Correction doping controls
    Modeling to alleviate bias
  Tumor expression
    Factors as markers of pathway activity
    Biological relevance
    Clinical relevance
  Beyond


The Power of Microarrays
  Promise of personalized medicine
  Lack of consistency/reproducibility
  Problem with overfitting

Microarrays from lab bench to clinic
  Data collection bias and quality control
  In vitro -> in vivo
  In vivo -> in vitro
  Translation of meta-genes from in vitro to in vivo

Overview
  Laboratory bias
    Experiments on the lab bench
    Correction doping controls
    Modeling to alleviate bias
  Tumor expression
    Factors as markers of pathway activity
    Biological relevance
    Clinical relevance
  Beyond

Oncogene Upregulation
  Human Mammary Epithelial Cells (HMEC)
  9 upregulated oncogenes and one set of controls
  Data collected in three batches
  Demonstrates collection bias
  Bild et al., Oncogenic pathway signatures in human cancers as a guide to targeted therapies, Nature 439 (19), 2006

Collection Bias
  Doping control
  Should be identical across all observations

Collection Bias
  Consistent errors across many genes
  May obscure interesting biology

Modeling to Correct Collection Bias

Systematic Errors
  [Figure: NFE2L1, before subtracting error]

Systematic Errors, Corrected
  [Figure: NFE2L1, before subtracting error and after subtracting error]
  Upregulation of MYC
  NFE2L1 regulation of apoptosis
  MYC binding sequence in promoter
  GENES & DEVELOPMENT ( )

Single Sample Factor Modeling
  Personalized medicine
  Need to deal with one array at a time
  Cannot use the same correction technique
  Relative levels of genes within a sample should be informative
  [Equation labels: Design Matrix, Latent Factors]

Latent factors for Correction of Lab Bias

Microenvironment Experiments
  Chen et al., Genomic analysis of response to lactic acidosis in human cancers
  Exact same conditions as oncogene experiment
  24 control arrays split across 2 labs and 4 time points
  Uncorrelated measurements of gene expression?

Correlation between Two Different Samples
  [Figure: Microenvironment, array #1]

Consistently Correlated across all Pairs
  Correlation of -0.6
  [Figure: Microenvironment, array #1]

Factor Model almost Eliminates Correlation
  [Figure: before correction and after correction; Microenvironment, array #1; Oncogene, array #7]

Microarray Quality Control (MAQC)
  120 arrays, also U different labs
  5 repetitions per group
  4 groups
    Universal Human Reference RNA
    Human Brain Reference RNA
    Titration of RNA to form groups
  Nature Biotechnology, all of volume 24 (2006)

Example 2
  We believe these are collection errors
    Due to pH, temperature, duration before washing, etc.
    Errors should be universal for U arrays
  Keep all oncogene and microenvironment control observations
  Keep all 120 observations from MAQC
  Mean expression for each gene is different between MAQC and HMECs
  Refit model, but assume error correction is same!

Retain Ability to Correct Bias in HMEC
  Have we improved the fidelity?

Improved Fidelity
  [Figure: groups UH (Universal Human Reference RNA), 75% UH, 25% UH, HB (Human Brain Reference RNA); panels: raw data, Lab 4; raw data, Lab 6; raw data, Labs 1,2,3,5; corrected data]
  Very different error types, both corrected

Improved Fidelity
  [Figure: differentially expressed gene across groups UH, 75% UH, 25% UH, HB]

Defining Success
  By design, should be monotone ordering
  Does probability of correctly ordering increase?
  Before correction: red points are not monotone -> failure
  After correction: red points are monotone -> success
  [Figure: groups UH, 75% UH, 25% UH, HB, before and after correction]

MAQC Experiment

More than Error Correction?
  Can correct biases from vastly different experiments
  Aggregate data from multiple labs across multiple time points
  Analyze and incorporate new data as it comes in

Metagenes discovered in vitro can be used as in vivo phenotypes, however
  Signatures developed in cloned cells
  Lack biological variability
  In vivo, other pathways will be active/inactive

Factor Evolution
  Break down into multiple pathways in vivo
  Evolutionary factor search to dissect and enhance signatures
  Carvalho et al., High-dimensional sparse factor modelling - Applications in gene expression genomics, submitted

Consider behavior of genes in vivo
  Miller et al., An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival, PNAS, 102, (2005)
  [Figure: probe sets '225378_at', '225399_at', '225407_at', '225493_at', '225527_at', '225681_at', '225768_at'; labels: Initial genes, New genes, factors]
  Expression differences from lactic acidosis experiment

in vitro -> in vivo
  Highly differentially expressed genes

P53 Wild Type versus Mutant
  Each factor is a collection of genes that are expressed together across all samples

P53 Wild Type versus Mutant
  Combinations of factors are predictive of important phenotypes

Tamoxifen
  [Figure: legend Dark Blue, Light Blue; groups: Didn't receive Tamoxifen, Treated with Tamoxifen]
  All patients receiving Tamoxifen were ER positive
  Tamoxifen sensitivity independent of ER status

Endothelial Cell Signature?
  Contains 143 of the 188 genes in a known microvascular endothelial cell signature
    Chi et al., Endothelial cell diversity revealed by global expression profiling, PNAS, 100 (19), 2003
  "independently of ER status of tumor cells, Tam could affect the microvessel structure through the antagonism with endothelial cells ER"
    Clinical Cancer Research Vol. 7, , September 2001
  "[Tamoxifen] inhibited tube formation by rat microvascular endothelial cells"
    Gen Pharmacol (2000) 34:

Estrogen Receptors
  Trained on Miller
  Predictive on others (Wang)

Breast Tumor Factors in Lung
  Factor behavior in Lung tissue
    Endothelial cell factor
    Estrogen Receptor factor

Endothelial Cell Factor
  [Figure: Breast Cancer Samples; Lung Cancer Samples]

Estrogen Receptor Factor
  [Figure: Breast Cancer Samples; Lung Cancer Samples]

Summary
  Correction of laboratory biases
    Allows aggregation of multiple data sets
  Discovery of conserved metagenes relevant to
    Survival
    Cellular phenotypes, ER, PgR, P53
  Identification of novel biology
  Within a framework that allows identification of meta-genes on single arrays

Beyond?
  Concurrent modeling of multiple different tumor types

Collaborators
  Statistics
    Mike West
    Carlos Carvalho
    Dan Merl
    Quanli Wang
  Biology
    Jen-Tsan Ashley Chi
    Joe Nevins
    Julia Ling-Yu Chen
    Andrea Bild

Microarrays to Identify Phenotypes
  Disease Diagnosis
    Alzheimer's
    Infection
    Psoriatic Arthritis
    Leber's Congenital Amaurosis
    Usher syndrome
  Cancer
    Survival prediction
    Metastasis prediction
    Drug susceptibility
  Development
    Embryonic development
    Cellular differentiation
    Radial symmetry
    Internal structure

Obesity
  Oligo GEArray Mouse Obesity Microarray: OMM-017
  Liver Int Dec;25(6):
  Obesity Research 11: (2003)
  Physiological Genomics 20: (2005)
  PharmaFrontier Co., Ltd.
  Genetel Pharmaceuticals
  Hong Kong DNA Chips

Alzheimer's
  Oligo GEArray Human Alzheimer's Disease Microarray
  PNAS 2004 Feb 17;101(7): Epub 2004 Feb 9
  The Journal of Neuroscience, Feb 9, 2005, 25(6):
  Ann Neurol Dec;58(6):
  Primorigen Biosciences
  ProteomTech
  Ciphergen Biosystems, Inc

Cancer

Disease Diagnosis
  Microarray Analyses of Peripheral Blood Cells Identifies Unique Gene Expression Signature in Psoriatic Arthritis
    Mol Med Jan-Dec; 11(1-12): 21-29
  Incipient Alzheimer's disease: Microarray correlation analyses reveal major transcriptional and tumor suppressor responses
    PNAS, February 17, 2004, vol. 101 no. 7,

Microarrays at IGSP
  Joseph Nevins: Rb/E2F pathway
  Jen-Tsan Ashley Chi: tumor microenvironment
  Phil Febbo: gene expression as phenotypes
  Anil Potti: individualized chemotherapy
  Tom Kepler: activation of dendritic cells
  Gregory Wray: development in echinoderms
  Paul Magwene: co-expression in microorganisms
  Geoffrey Ginsburg: expression in peripheral blood
  John Olson: surgical oncology
  Ornit Chiba-Falek
  Philip Benfey: development and cell differentiation in Arabidopsis
  >95 papers published by the IGSP microarray facility since 1999

Expanding the Role of
  Need not include only collection bias
  Identifying signatures in other samples
    Factors associated with:
      Lactic acidosis
      Hypoxia
      Various oncogenes
  Other sources
    Gene lists
    Simple experiment
  Change in g,j

Estrogen Receptors
  Trained on Miller
  Predictive on others (Wang)

Progesterone Receptors
  Trained on Miller
  Predictive on others (Massague)
  Makes use of the ER factor and a new PgR specific factor

Predicting Survival
  Trained on Miller
  Predictive on the others (Wang)

TGF -
  [Figure: Breast Cancer Samples; Lung Cancer Samples]

Progesterone Receptor
  [Figure: Breast Cancer Samples; Lung Cancer Samples]

P53 Mutants Breast vs. Lung
  [Figure: Breast Cancer Samples; Lung Cancer Samples]

Estrogen Receptor Breast vs. Ovarian
  [Figure: Breast Cancer Samples; Ovarian Cancer Samples]
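
Illustration: the monotone-ordering check from "Defining Success"

The "Defining Success" slide says that, by design, expression across the MAQC titration groups (UH, 75% UH, 25% UH, HB) should be monotone, and asks whether correction makes correct ordering more likely. The sketch below is one possible way to score that and is not the analysis used in the talk. It assumes a genes-by-arrays matrix, compares group means rather than individual arrays, and reports the fraction of monotone genes as a stand-in for the slide's "probability of correctly ordering". The names `is_monotone` and `fraction_monotone`, and the toy numbers, are hypothetical.

```python
import numpy as np

# Titration order taken from the slides: UH, 75% UH, 25% UH, HB.
GROUP_ORDER = ["UH", "75% UH", "25% UH", "HB"]


def is_monotone(values):
    """Return True if values are entirely non-decreasing or non-increasing."""
    diffs = np.diff(np.asarray(values, dtype=float))
    return bool(np.all(diffs >= 0) or np.all(diffs <= 0))


def fraction_monotone(expr, labels):
    """Fraction of genes whose group means are monotone along GROUP_ORDER.

    expr   : genes x arrays matrix of expression values (assumed layout)
    labels : one titration-group label per array (column)
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    # Mean expression per gene within each group, columns in design order.
    means = np.column_stack(
        [expr[:, labels == g].mean(axis=1) for g in GROUP_ORDER]
    )
    return float(np.mean([is_monotone(row) for row in means]))


# Toy data (made up): two genes, two arrays per titration group.
labels = ["UH", "UH", "75% UH", "75% UH", "25% UH", "25% UH", "HB", "HB"]
raw = [
    [1, 1, 2, 2, 3, 3, 4, 4],  # monotone before correction
    [4, 4, 1, 1, 3, 3, 2, 2],  # ordering broken before correction
]
corrected = [
    [1, 1, 2, 2, 3, 3, 4, 4],
    [4, 4, 3, 3, 2, 2, 1, 1],  # ordering restored after correction
]

print("before correction:", fraction_monotone(raw, labels))
print("after correction: ", fraction_monotone(corrected, labels))
```

Comparing the two fractions on the same genes gives a simple before/after summary. A per-array or per-lab version would apply the same check to individual measurements instead of group means.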