Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
Expression-based variant impact phenotyping of somatic mutations in cancer
Angela Brooks, Ph.D. Alice Berger, Ph.D. Xiaoyun Wu, Ph.D.
Matthew Meyerson, M.D., Ph.D.
Jesse Boehm, Ph.D.
May 11, 2015
Cancer sequencing studies are underpowered to find all functionally relevant mutated genes
Lawrence et al. Nature 2014
No.
of s
ampl
es n
eede
d fo
r 90%
pow
er in
90
% o
f gen
es
20,000
10,000
5,000
3,000 2,000
1,000
500
300 200
100
50
0.1 0.2 0.3 0.5 Somatic mutation frequency (per Mb)
1 2 3 5 10 20
1% 2% 3% 5%
10%
Mutation rate
Rhabdoid Medulloblastoma
AML
Carcinoid
Neuroblastoma
CLL
Prostate
Breast Multiple myeloma
Ovarian
Kidney clear cell
GBM Endometrial
Colorectal
DLBCL
Esophageal adeno.
Head and neck Bladder
Lung adenocarcinoma Lung squamous
Melanoma
Sequencing studies are even more underpowered to find impactful variants from passengers
AML Bladder
Breast CRC
DLBCL Endometrial Esophageal
GBM
Head/Neck Kidney (CC) Lung adeno.
Lung squamous Melanoma
MM Neuroblastoma
Ovarian Prostate
www.tumorportal.org
0 100 200 300 400 500 600 700 800 900
EGFR (amino acid coordinates)
Approaches to high-throughput experimental variant impact phenotyping
Somatic variants from cancer genome studies
Expression profiling
Multiplexed tumorigenesis
Variant-specific ORFs and CRISPRs
Morphologic profiling
Genetic interactions
Lab of Jesse Boehm
Variant-specific reagents: open-reading frame (ORF) or CRISPR
Pathway-specific assays Gene-agnostic assays
Expression-based variant impact phenotyping
Somatic variants from cancer genome studies
Expression profiling
Multiplexed tumorigenesis
Variant-specific ORFs and CRISPRs
Morphologic profiling
Genetic interactions
Lab of Jesse Boehm
Pathway-specific assays Gene-agnostic assays
Variant-specific reagents: open-reading frame (ORF) or CRISPR
Testing functional impact of somatic mutations in lung adenocarcinomas
Lawrence et al. Nature 2014
No.
of s
ampl
es n
eede
d fo
r 90%
pow
er in
90
% o
f gen
es
20,000
10,000
5,000
3,000 2,000
1,000
500
300 200
100
50
0.1 0.2 0.3 0.5 Somatic mutation frequency (per Mb)
1 2 3 5 10 20
Rhabdoid Medulloblastoma
AML
Carcinoid
Neuroblastoma
CLL
Prostate
Breast Multiple myeloma
Ovarian
Kidney clear cell
GBM Endometrial
Colorectal
DLBCL
Esophageal adeno.
Head and neck Bladder
Lung adenocarcinoma Lung squamous
Melanoma
1% 2% 3% 5%
10%
Mutation rate
Approach: Compare gene expression changes upon introduction of WT and mutant variants
ORF library • 47 genes, 303 alleles • Missense and indel variants
found in >400 lung adenocarcinomas*
• KEAP1: 38 alleles + WT • KRAS: 12 alleles + WT • EGFR: 11 alleles + WT
96h
A549 lung adenocarcinoma
cell line
(luminex-based profiling of 1000
transcripts)
384 well plates & replicates
L1000 gene expression profiling WT
MUT
*Imielinski et al. 2012, TCGA Nature 2014
Broad Genetics Perturbation Platform Broad Connectivity Map
Comparing changes in gene expression can predict variant impact
rep 1
……
Likely inert: ARAF V145L
1 ……. . 8
rep 8
-1
0
1
Correlation (connectivity score)
rep MUT
MU
T
rep
WT
WT
MU
T
WT
WT WT vs. MUT MUT
Comparing changes in gene expression can predict variant impact
rep 1
……
1 ……. . 8
WT WT vs. MUT MUT
Functional impact:
ARAF S214F
rep 8
-1
0
1
Correlation (connectivity score)
Likely inert: ARAF V145L
MUT
MU
T
WT
WT
MU
T
MUT
MU
T
WT
MU
T
WT
WT
rep
rep WT
rep 1
rep 8
……
Likely inert: ARAF V145L
1 ……. . 8
Variant impact prediction is determined by a Kruskal-Wallis test (FDR 5%)
Impact: ARAF S214F
WT WT vs. MUT MUT 1
-1
0
row
med
ian
scor
e
1
-1
0
row
med
ian
scor
e
Corrected p-value = 0.10
Corrected p-value = 0.0001
WT
WT vs.
MUT MUT
rep 1
rep 8
……
Likely inert: ARAF V145L
1 ……. . 8
Impact direction score can predict loss-of-function or gain-of-function change
-1
0
1
connectivity score
Impact-GOF: CTNNB1 S33N
Impact-LOF: STK11 D194Y
Impact: ARAF S214F
WT MUT
Sparkler plot of variant impact phenotyping for 151 lung cancer somatic mutations
ARAF V145L
Ryo Sakai Bang Wong
Impa
ct d
irect
ion
scor
e
-log10(corrected p-value) Gain-of-function (n=23) Loss-of-function (n=49)
Likely inert (n=45) Impact (n=34)
-3
0
1
2
-2
-1
3
0 2 3 1 4
CTNNB1 S33N
STK11 D194Y
ARAF S214F
Gene-by-gene variant impact phenotyping
Known oncogenes
Known tumor suppressors
Genes of unknown function
Gain-of-function Loss-of-function
Likely inert Impact
CTNNB1 EGFR KRAS
KEAP1 STK11 FBXW7
DCAF8 TPK1
Gain- and change-of-function predictions of rare mutations in known oncogenes
Known oncogenes
Known tumor suppressors
Genes of unknown function
Gain-of-function Loss-of-function
Likely inert Impact
CTNNB1 EGFR KRAS
KEAP1 STK11 FBXW7
DCAF8 TPK1
CTNNB1
V600G G367V I140T F777S
S45F
G34V S33N
KRAS
G12A G12C G12D G12F G12R G12S G13V
G12Y
D33E
Loss-of-function mutations in known tumor suppressor genes
Known oncogenes
Known tumor suppressors
Genes of unknown function
Gain-of-function Loss-of-function
Likely inert Impact
CTNNB1 EGFR KRAS
DCAF8 TPK1
STK11
G56V
G251F G251V
D194Y G242W
FBXW7
D211N
P620R
V464E
Predicted impact of mutations in genes with unknown function
Known oncogenes
Known tumor suppressors
Genes of unknown function
Gain-of-function Loss-of-function
Likely inert Impact
CTNNB1 EGFR KRAS
DCAF8 TPK1
DCAF8
E63K
G588C
R559P P156S
TPK1
T213S P152T K111M
T205S
E81Q
G48C
L185I
How do we know these predictions are correct?
1) Literature Benchmarks - 100% accuracy with 21 previously studied mutations 2) Subsampling from high replicate experiment to assess false positive rate
- FDR 5% cutoff gives 4.84% false positive calls 3) Correspondence with genetic patterns (e.g. hotspot)
Functional mutation in FBXW7 is adjacent to a hotspot mutation
D211N
V464E
Impa
ct d
irect
ion
scor
e
-log10(corrected p)
0
-3
3
0 2 4 1 3
FBXW7
D211N
P620R
V464E
P620
How do we know these predictions are correct?
1) Literature Benchmarks
2) Simulation to assess false positive rate 3) Correspondence with genetic patterns (e.g. hotspot) 4) Signatures are similar to alleles of known gene function
ORF: 1
272
1 272 …………………………………………….
……
……
……
……
……
……
……
……
….
Correlation
1 0.8
-0.8 -1
EGFR/RAS pathway driver cluster
KEAP1-NFE2L2 anti-correlation
STK11 WT cluster
Identification of major transcriptional classes of lung adenocarcinoma genes
Rare variants cluster with known EGFR/RAS pathway drivers
Rare variants cluster with known EGFR/RAS pathway drivers
KRAS
D33E
Impa
ct d
irect
ion
scor
e
0
3
-3 -log10(corrected p) 0 4
Impa
ct d
irect
ion
scor
e
0
3
-3 -log10(corrected p) 0 4
ARAF
S214C S214F
Impa
ct d
irect
ion
scor
e
0
3
-3 -log10(corrected p) 0 4
Gain-of-function Loss-of-function
Likely inert Impact EGFR
K754E
S645C
Unique signature of a rare non-canonical EGFR mutation is currently being investigated
EGFR/RAS cluster
EGFR variants
Non-canonical signature
EGFR
H1129Y
Gain-of-function Loss-of-function
Likely inert Impact
How do we know these predictions are correct?
1) Literature Benchmarks
2) Simulation to assess false positive rate 3) Correspondence with genetic patterns (e.g. hotspot) 4) Signatures are similar to alleles of known gene function 5) Orthogonal assays - Genetic rescue screens, multiplex tumor xenografts, cell morphologic profiling
Variant impact phenotyping using a gene-agnostic assay
• Variant impact phenotyping is critical for understanding and treating cancer and other diseases.
• Gene-agnostic bioassays can more rapidly characterize
variant impact
• This approach can be used for any gene, regardless of disease type and regardless of knowledge of gene function.
Jesse Boehm Matthew Meyerson Todd Golub Yashaswi Shrestha Candace Chouinard John Doench Itay Tirosh Pablo Tamayo Bang Wong Ryo Sakai
Aravind Subramanian Larson Hogstrom Ted Natoli David Lahr Jackie Rosains John Davis Dan Lam Corey Flynn Josh Gould Rajiv Narayan Ian Smith Federica Piccioni Mukta Bagul Sasha Pantel Cong Zhu Thomas Green Dave Root Xiaoping Yang Steven Corsello
Meyerson Lab Tanaz Sharifnia Heidi Greulich Josh Campbell Jenny Chen Peter Choi Fujiko Duke Hugh Gannon Marcin Imielinski Bethany Kaplan Craig Strathdee Alison Taylor Luc DeWaal Xiaoyang Zhang Kasper Lage Johnathan Mercer Reza Kashani
Gaddy Getz Bill Hahn Nina Ilic Lihua Zou Atanas Kamburov Eejung Kim Mike Lawrence Anne Carpenter Shantanu Singh Mark Bray Paula Keskula Sara Seepo
Alice Berger Xiaoyun Wu
Jesse Boehm Matthew Meyerson Todd Golub Yashaswi Shrestha Candace Chouinard John Doench Itay Tirosh Pablo Tamayo Bang Wong Ryo Sakai
Aravind Subramanian Larson Hogstrom Ted Natoli David Lahr Jackie Rosains John Davis Dan Lam Corey Flynn Josh Gould Rajiv Narayan Ian Smith Federica Piccioni Mukta Bagul Sasha Pantel Cong Zhu Thomas Green Dave Root Xiaoping Yang Steven Corsello
Meyerson Lab Tanaz Sharifnia Heidi Greulich Josh Campbell Jenny Chen Peter Choi Fujiko Duke Hugh Gannon Marcin Imielinski Bethany Kaplan Craig Strathdee Alison Taylor Luc DeWaal Xiaoyang Zhang Kasper Lage Johnathan Mercer Reza Kashani
Gaddy Getz Bill Hahn Nina Ilic Lihua Zou Atanas Kamburov Eejung Kim Mike Lawrence Anne Carpenter Shantanu Singh Mark Bray Paula Keskula Sara Seepo
Alice Berger Xiaoyun Wu
Brooks Lab positions available in cancer transcriptome analysis and functional genomics at UC Santa Cruz! [email protected]