Environmental Cancer Genomics: from High-Throughput Assays to Prevention
Stefano Monti
Art beCAUSE consortium meeting
8/4/2015 – 9/1/2015
Overall Goal
Predict long-term in vivo carcinogenicity of chemical compounds from
short-term in vitro genomic assays of exposure
Carcinogenicity Screening
Underlying Hypotheses
Short-term exposure assays can predict long-term phenotype
In-vitro assays can predict in-vivo response
Goals
Development of “Carcinogenicity Biomarker(s)”
Carcinogenicity Prediction Model
Chemical
Carcinogen
Non-carcinogen
Understand Why
• Pathways affected
• Driver alterations
• Biomarkers
• …
Expression Profiling: measuring transcriptional “activity” genome-wide
DNA → RNA → mRNA → Proteins (Transcription / Post-transcription, then Translation)
[Figure: expression heatmaps (color scale: Low to High expression) of Compounds 1 to 16, shown unsorted and then sorted into non-carcinogens and carcinogens]
Expression Profiling to predict chemical carcinogenicity
Stefano Monti − BUSM
Transcriptional Signatures
?
non-carcinogens carcinogens
New Compound
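The idea on this slide can be illustrated with a minimal sketch, which is not the project's actual model: pick the genes whose mean expression differs most between carcinogens and non-carcinogens (a crude "signature"), then assign a new compound to the class whose centroid it is closer to. The function names, the nearest-centroid rule, and the toy values are all assumptions made for illustration.

```python
from statistics import mean

def signature_genes(profiles, labels, k):
    """Indices of the k genes with the largest absolute difference in mean
    expression between carcinogens (label 1) and non-carcinogens (label 0)."""
    n_genes = len(profiles[0])
    diffs = []
    for g in range(n_genes):
        carc = [p[g] for p, y in zip(profiles, labels) if y == 1]
        non = [p[g] for p, y in zip(profiles, labels) if y == 0]
        diffs.append(abs(mean(carc) - mean(non)))
    return sorted(range(n_genes), key=lambda g: diffs[g], reverse=True)[:k]

def centroid(profiles, labels, cls, genes):
    """Mean expression of the signature genes over all compounds of one class."""
    rows = [p for p, y in zip(profiles, labels) if y == cls]
    return [mean(r[g] for r in rows) for g in genes]

def predict(new_profile, profiles, labels, k=2):
    """Nearest-centroid call for a new compound, restricted to signature genes."""
    genes = signature_genes(profiles, labels, k)
    x = [new_profile[g] for g in genes]
    dist = {}
    for cls in (0, 1):
        c = centroid(profiles, labels, cls, genes)
        dist[cls] = sum((a - b) ** 2 for a, b in zip(x, c))
    return "carcinogen" if dist[1] < dist[0] else "non-carcinogen"

# Toy data (made up): 4 compounds x 3 genes
profiles = [[1.0, 5.0, 2.0], [1.2, 5.1, 2.1], [3.0, 5.0, 0.5], [3.2, 4.9, 0.4]]
labels = [0, 0, 1, 1]
print(predict([3.1, 5.0, 0.6], profiles, labels))  # prints "carcinogen"
```

In the toy data, gene 1 barely differs between the classes and is left out of the signature, so only genes 0 and 2 drive the call.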
Project Design Overview
…Genotoxicity
Carcinogenicity
Compound 1, Compound 2, Compound 3, …, Compound N
Prediction Evaluation
• Classification Accuracy
• Sensitivity/Specificity
• ROC curve
• …
Biology of Exposure
• Exposure MoA
• Pathways
• “Drivers”
• Exposure risk models
Carcinogenicity Prediction
“New” compound
Carcinogen
Non-Carcinogen
Cell lines/iPSC treated w/ compounds …
… and profiled on L1000 / 3’DGE / SFL
Project depends on high-throughput, cost-effective gene expression assay
Long-term Phenotypes
Short-term Assay
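The prediction evaluation measures named on this slide (classification accuracy, sensitivity/specificity, ROC curve) can be sketched as follows. This is a minimal illustration on made-up labels and scores, not the evaluation code used in the project. The helper names and the 0.5 decision threshold are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Counts of true/false positives/negatives (1 = carcinogen, 0 = non-carcinogen)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate), specificity (true-negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive scores higher than a random negative (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (made up): 3 carcinogens, 5 non-carcinogens
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
metrics = evaluate(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, while the AUC summarizes performance over all thresholds.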
Deliverables
• The Carcinogenome DB (CGDB): genome-wide transcriptional profiles of 10,000s of compounds and mixtures on multiple cell types and at multiple doses/times
• Carcinogenicity Biomarker(s): predictive models of carcinogenicity from in-vitro profiling
• Signatures and Pathways of Carcinogenicity: an annotated compendium of biological pathways whose (aberrant) activation is associated with carcinogenicity/cancer induction
Can Carcinogenicity be predicted from GEP? The DrugMatrix/TG-GATEs answer
Rat-based datasets from NIEHS & Japan (thanks Scott Auerbach & Ray Tice @ NTP)
…Genotoxicity
Carcinogenicity
Compound 1, Compound 2, Compound 3, …, Compound N
Prediction Evaluation
• Classification Accuracy
• Sensitivity/Specificity
• ROC curve
• …
Biology of Exposure
• Exposure MoA
• Pathways
• “Drivers”
• Exposure risk models
Carcinogenicity Prediction
“New” compound
Carcinogen
Non-Carcinogen
Cell lines/iPSC treated w/ compounds …
… and profiled on Luminex-1000
Rats exposed to compounds …
… and profiled on Affymetrix
…
Gusenleitner et al., PLoS ONE 2014
Long-Term Carcinogenicity can be Predicted from Short-term Expression Assays
Dose-independent labeling
Dose-dependent labeling
Gusenleitner et al., PLoS ONE 2014
Carcinogenicity Prediction can be Improved
by increasing the number of chemicals used to build the model
[Figure axis: “Predictive Accuracy”]
Gusenleitner et al., PLoS ONE 2014
Genomic Modeling Helps Identify Pathways of Carcinogenicity
[Figure: Pathways vs. CHEMICALS (non-Carcinogens, Carcinogens)]
Carcinogenicity can be Captured by in-vitro (human) models
Enrichment Score
p < 0.005
carc non-carc
L1000-based gene ranking
DrugMatrix signature genes
Rat carcinogenicity signature can be mapped to human data
Significant similarity of Rat and Human signatures
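The slide does not say which enrichment statistic was used to test the rat (DrugMatrix) signature genes against the L1000-based gene ranking. As one common choice, a minimal sketch of an unweighted Kolmogorov-Smirnov-style running-sum enrichment score is shown here. The function name and the toy gene identifiers are hypothetical, and the weighting scheme is an assumption.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list; step up by 1/n_hit at each gene in the set and
    down by 1/n_miss otherwise. Return the signed maximum deviation from zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Toy ranking (made up): genes ordered from most up-regulated to most down-regulated
ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_set = {"g1", "g2"}     # signature genes concentrated at the top
bottom_set = {"g5", "g6"}  # signature genes concentrated at the bottom
print(enrichment_score(ranking, top_set), enrichment_score(ranking, bottom_set))
```

A score near +1 means the signature genes cluster at the top of the ranking, and a score near -1 means they cluster at the bottom. A significance level such as the p < 0.005 quoted on the slide would additionally require a null distribution, for example from permutations, which this sketch omits.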
36 genes (FDR ≤ 0.05 | FC ≥ 2)
121 samples (39 C vs. 82 NC)
Human lung cell lines exposed to carcinogens and non-carcinogens
Statistically significant markers identified
Luminex-1000 data
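The marker selection criteria on this slide (FDR ≤ 0.05, FC ≥ 2) can be sketched as a two-step filter. This assumes Benjamini-Hochberg FDR correction and a symmetric fold-change cutoff on the ratio scale, with both criteria required together. The slide specifies neither the correction method nor the test that produced the p-values, so these are illustrative assumptions, and the gene names and values are made up.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

def select_markers(genes, pvalues, fold_changes, fdr=0.05, min_fc=2.0):
    """Keep genes with q <= fdr and at least min_fc-fold change, up or down."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        g
        for g, q, fc in zip(genes, qvalues, fold_changes)
        if q <= fdr and (fc >= min_fc or fc <= 1.0 / min_fc)
    ]

# Toy input (made up): fold changes are carcinogen/non-carcinogen ratios
genes = ["A", "B", "C", "D"]
pvalues = [0.001, 0.01, 0.03, 0.5]
fold_changes = [3.0, 1.2, 0.4, 2.5]
print(select_markers(genes, pvalues, fold_changes))  # prints ['A', 'C']
```

Gene B is significant but changes too little, and gene D changes enough but is not significant, so only A and C pass both filters.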
In Progress: high-throughput data generation
Multi-platform (mirror) experiments
Multiple platform comparison
• Luminex-1000 (L1000)
• 3’ Digital Gene Expression (3’DGE)
• Sparse Full Length/RNA-tag seq (SFL)
Experimental design
• Chemicals selection (and dose/concentration)
• Tissue types (liver – HepG2; breast – MCF7, MCF10; lung – A549)
• Challenging set-up (chemical procurement and dose determination)
Multiple funding sources
• Evans ARC
• BUSRP admin supplement
• NIH/LINCS 1-year grant w/ Broad
• Art beCAUSE
Network-Based Analysis of Chemical Perturbations
Discrimination carcinogens/non-carcinogens
Genes driving the response to chemical exposure
Predictive model
• Compare control state to multiple perturbation states
• Capture aggregate differences difficult to see with standard analysis
New goals
[Figure: differential expression (standard) vs. differential connectivity, each comparing control and chemically perturbed states]
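The contrast between differential expression and differential connectivity can be made concrete with a small sketch. It assumes that a gene's connectivity is the sum of its absolute Pearson correlations with all other genes. The slide does not define the measure, so this definition, the function names, and the toy data are assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def connectivity(expr):
    """expr maps gene -> expression values across samples.
    Connectivity of a gene = sum of |correlation| with every other gene."""
    genes = list(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) for h in genes if h != g)
        for g in genes
    }

def differential_connectivity(control, perturbed):
    """Per-gene gain (positive) or loss (negative) of connectivity."""
    c0, c1 = connectivity(control), connectivity(perturbed)
    return {g: c1[g] - c0[g] for g in control}

# Toy data (made up): gene B keeps the same values, hence the same mean,
# in both states, but its pattern across samples is rearranged.
control = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [3, 1, 4, 2]}
perturbed = {"A": [1, 2, 3, 4], "B": [4, 8, 2, 6], "C": [3, 1, 4, 2]}
diff = differential_connectivity(control, perturbed)
print(diff)
```

In this toy example no gene changes its mean expression, so a standard differential expression test would find nothing, yet A loses its link to B while C gains one.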
2015 ACS Meeting
Network Analysis Overview
[Figure: loss/gain of connectivity for Module 1, Module 2, …, Module p across Compound 1, Compound 2, …, Compound n, relative to the Wild-Type Network, with module Annotation]
Results Summary
Network structure captures the grouping of compounds with similar functions and genotoxicity/carcinogenicity
Differentially connected gene modules are enriched for pathways related to the chemicals’ action
Statins
• Cholesterol biosynthesis, Lipid Metabolism, Steroid biosynthesis, ...
Chemotherapeutics
• Cell cycle, DNA replication, DNA damage response (P53)
…
The Team
Broad Institute: Aravind Subramanian, Xiaodong Lu, cMap/LINCS team
NTP/NIEHS: Scott Auerbach, Ray Tice
BU CBM/Bioinformatics/SPH: David Sherr, Amy Li, Daniel Gusenleitner, Francesca Mulas
The End