
Psb tutorial cancer_pathways

Page 1: Psb tutorial cancer_pathways

Translational Genomics Research Institute | www.tgen.org

Cancer Pathway Analysis and Personalized Medicine

Jeff Kiefer

Research Associate Investigator

Translational Genomics Research Institute

Page 2: Psb tutorial cancer_pathways

Outline

Big Cancer Data Resources and Secondary Data Tools

Pathway Analysis - Resources, Methods, and Tools

Personalized Medicine - ‘Interpretation Bottleneck’, Drug to Genomic Event Matching

Page 3: Psb tutorial cancer_pathways

Cancer Genome Data Repositories

https://www.ebi.ac.uk/arrayexpress/

http://www.ncbi.nlm.nih.gov/geo/

http://cancergenome.nih.gov/

https://icgc.org/

Page 4: Psb tutorial cancer_pathways

Cancer Genome Data Repositories and Data Portals

https://genome-cancer.ucsc.edu/

http://www.cbioportal.org/public-portal/

http://cancergenome.broadinstitute.org (TumorPortal)

https://dcc.icgc.org/

http://genomeportal.stanford.edu/pan-tcga

http://cancer.sanger.ac.uk/cancergenome/projects/cosmic/

Page 5: Psb tutorial cancer_pathways

http://www.cbioportal.org/public-portal/

Page 6: Psb tutorial cancer_pathways

http://www.cbioportal.org/public-portal/

Page 7: Psb tutorial cancer_pathways

Pathways Analysis

Pathway analysis encompasses a number of different approaches and methods applied to large-scale -omic data sets.

The goal is to discover meaningful biological knowledge from large data sets, often in the form of a gene list.

Strictly, a ‘pathway’ describes a step-wise signal transduction cascade. However, the term is also loosely used to encompass gene sets derived from signatures or from other biological groupings such as the Gene Ontology.

Page 8: Psb tutorial cancer_pathways

Pathways Analysis

Good general review outlining techniques, resources, and issues in pathway analysis

(2012). PLOS Computational Biology, 8(2), e1002375. doi:10.1371/journal.pcbi.1002375.t001

Page 9: Psb tutorial cancer_pathways

Pathways Analysis

Threshold-Based = Enrichment analysis performed on a gene list derived from a statistical test (only genes passing a cutoff are kept).

Non-Threshold-Based = All data are used, with no cutoff applied to the gene list. First popularized by gene set enrichment analysis (GSEA).

‘De novo’-Based = Pathways or gene sets are derived from the primary data.
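As a rough illustration of the threshold-based approach, the sketch below scores a gene list against gene sets with a one-sided hypergeometric (over-representation) test. This is a generic formulation rather than the exact statistic of any tool named in this tutorial; the function names, the toy gene identifiers (`g0`, `g1`, ...), and the gene set names are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe that contains set_size pathway genes."""
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrich(gene_list, gene_sets, universe):
    """Return (set name, overlap size, p-value) tuples sorted by p-value."""
    universe = set(universe)
    hits = set(gene_list) & universe
    results = []
    for name, members in gene_sets.items():
        members = set(members) & universe
        k = len(hits & members)
        p = hypergeom_enrichment_p(len(universe), len(members), len(hits), k)
        results.append((name, k, p))
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): 20 background genes, two gene sets of 5 genes each.
universe = [f"g{i}" for i in range(20)]
gene_sets = {
    "setA": [f"g{i}" for i in range(0, 5)],
    "setB": [f"g{i}" for i in range(10, 15)],
}
gene_list = ["g0", "g1", "g2", "g3", "g10"]  # e.g. genes passing a cutoff
results = enrich(gene_list, gene_sets, universe)
```

In practice the p-values would also need correction for the number of gene sets tested, which this sketch leaves out.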

Page 10: Psb tutorial cancer_pathways

Pathway Resources

http://www.reactome.org/

http://www.genome.jp/kegg/pathway.html

http://www.broadinstitute.org/gsea/msigdb/index.jsp

http://www.pathwaycommons.org/about/#main-container

Commercial Resources

Page 11: Psb tutorial cancer_pathways

Threshold-based Pathway Enrichment Tools

https://toppgene.cchmc.org/

http://amp.pharm.mssm.edu/Enrichr

http://www.ici.upmc.fr/cluego/

Page 12: Psb tutorial cancer_pathways

ToppGene: extensive pathway gene sets are available for enrichment analysis

Page 13: Psb tutorial cancer_pathways

Easy-to-use web interface

Add a list of gene identifiers to perform enrichment analysis on.

Page 14: Psb tutorial cancer_pathways

Results sorted based on significance.

Page 15: Psb tutorial cancer_pathways

Page 16: Psb tutorial cancer_pathways

Results

Gene Set/Pathway Categories

Page 17: Psb tutorial cancer_pathways

Different Result Outputs

Page 18: Psb tutorial cancer_pathways

http://www.ici.upmc.fr/cluego/

ClueGO integrates Gene Ontology (GO) terms as well as pathways and creates a functionally organized GO/pathway term network.

[Figure: ClueGO network in which member genes are nodes grouped under enriched GO/pathway terms. Term labels: collagen biosynthesis and modifying enzymes; collagen formation; forebrain development; transport of glucose and other sugars, bile salts and organic acids, metal ions and amine compounds; regulation of cell development; regulation of cell differentiation; negative regulation of cell differentiation; regulation of Wnt signaling pathway; regulation of canonical Wnt signaling pathway; negative regulation of Wnt signaling pathway; regulation of cysteine-type endopeptidase activity involved in apoptotic process; regulation of intrinsic apoptotic signaling pathway; intrinsic apoptotic signaling pathway; p53 signaling pathway; regulation of ion transport; positive regulation of ion transport; osteoblast differentiation; rheumatoid arthritis; Toll-like receptor signaling pathway.]

Cytoscape App

Page 19: Psb tutorial cancer_pathways

Non-Threshold Pathway Enrichment Tools

http://www.broadinstitute.org/gsea/index.jsp

Page 20: Psb tutorial cancer_pathways

GSEA

Can be accessed through a number of resources and methods

Java Desktop, R-GSEA, GenePattern
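Whichever front end is used, the statistic at the core of GSEA is a weighted running sum over the full ranked gene list. The sketch below is a minimal version of that enrichment score only, without the permutation-based significance estimate or normalization that GSEA adds on top. The function name, parameter names, and toy gene names are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score over a ranked gene list.

    ranked_genes: genes sorted from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    gene_set: genes belonging to the pathway of interest.
    p: weight exponent; p=0 gives the unweighted Kolmogorov-Smirnov-like form.
    """
    gene_set = set(gene_set)
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list without covering it")
    hit_weight = [abs(s) ** p if h else 0.0 for s, h in zip(scores, in_set)]
    total_hit_weight = sum(hit_weight)
    if total_hit_weight == 0:
        raise ValueError("all gene-set members have a zero ranking metric")
    running, best = 0.0, 0.0
    for w, h in zip(hit_weight, in_set):
        # Step up at gene-set members, step down at every other gene.
        running += w / total_hit_weight if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best  # maximum deviation from zero, sign preserved


# Toy ranked list (hypothetical genes and scores).
ranked = ["a", "b", "c", "d"]
metric = [2.0, 1.0, -1.0, -2.0]
es_top = enrichment_score(ranked, metric, {"a", "b"})
es_bottom = enrichment_score(ranked, metric, {"c", "d"})
```

A gene set concentrated at the top of the list gives a positive score and one concentrated at the bottom gives a negative score, which is why no significance cutoff on individual genes is needed.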

Page 21: Psb tutorial cancer_pathways

GSEA Use Case: Anaplastic Thyroid Cancer vs Non-Tumor Thyroid

Page 22: Psb tutorial cancer_pathways

GSEA Use Case

Page 23: Psb tutorial cancer_pathways

GSEA Visualization with Enrichment Map

Page 24: Psb tutorial cancer_pathways

GSEA Visualization with Enrichment Map

(2010) PLoS ONE, 5(11), 1–12. doi:10.1371/journal.pone.0013984.t001

http://www.baderlab.org/Software/EnrichmentMap

Cytoscape App

Page 25: Psb tutorial cancer_pathways

EDDY computes the discrepancy between the probability distributions of candidate network structures across classes of samples, where each distribution is based on the likelihood of each network in that class.

Page 26: Psb tutorial cancer_pathways

A methodology that can exploit differences in complex gene interactions between two conditions, such as tumor vs. normal, that might be missed by traditional approaches based on differential gene expression.

Page 27: Psb tutorial cancer_pathways

Investigate differential dependencies between conditions

–  Evaluation of Differential DependencY (EDDY)

–  Computes the differential dependency statistic (Jensen-Shannon divergence, JS) and its statistical significance (p-value, via permutation) between conditions, based on the likelihoods of genetic networks (a probability distribution)

[Figure: for a gene set of interest taken from a gene set database (MSigDB, …), the likelihoods of possible (or probable) dependency structures among genes A, B, and C are compared between Class 1 and Class 2 using JS, separating Class 1 specific, Class 2 specific, and common dependencies.]

EDDY computes the discrepancy between the probability distributions of candidate network structures across classes of samples, where each distribution is based on the likelihood of each network in that class.
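The two quantities named above, the JS statistic and its permutation p-value, can be sketched as follows. This is not EDDY's implementation: how EDDY enumerates and scores candidate network structures is not shown in these slides, so the sketch assumes the per-structure log-likelihoods for each class are already available and simply normalizes them into distributions. All function names are hypothetical, and the `+1` in the p-value is a common convention assumed here, not one taken from the slides.

```python
import math


def normalize_loglik(logliks):
    """Turn per-structure log-likelihoods into a probability distribution."""
    top = max(logliks)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logliks]
    total = sum(weights)
    return [w / total for w in weights]


def js_divergence(p, q):
    """Jensen-Shannon divergence in base 2, so values range from 0 to 1."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def permutation_p_value(observed, permuted_stats):
    """Share of label-permuted statistics at least as large as the observed one."""
    count = sum(1 for s in permuted_stats if s >= observed)
    return (count + 1) / (len(permuted_stats) + 1)


# Hypothetical log-likelihoods of three candidate structures in each class.
class1 = normalize_loglik([-10.0, -12.0, -15.0])
class2 = normalize_loglik([-15.0, -12.0, -10.0])
js = js_divergence(class1, class2)
```

A JS value near 0 means both classes favor the same dependency structures; a value near 1 means the classes place their probability on different structures.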

Page 28: Psb tutorial cancer_pathways

[Figure: likelihoods of possible (or probable) dependency structures among genes A, B, and C for Class 1 and Class 2, partitioned into Class 1 specific, Class 2 specific, and common dependencies.]

Page 29: Psb tutorial cancer_pathways

•  GSEA appears under-powered and also selects a disproportionate number of gene sets for a single subtype (mesenchymal).

•  GSCA appears to be overly sensitive, with a high false-positive rate.

The number of identified subtype-specific gene sets

(#): Overlap with EDDY gene sets

Simulation II: comparison of EDDY with other methods

Figure 5 illustrates the ROC curves of GSCA, GSEA and EDDY from Simulation II, and Table 1 lists the area under curve values of the corresponding ROC curves in Figure 5.

From the results of the interaction-focused simulation experiments (Figure 5), EDDY demonstrates superior performance than the other two methods. This is partly due to the fact that the data were generated from models assuming conditional dependencies in gene expressions rather than simple linear correlations, which is also assumed by the Bayesian network model that the current implementation of EDDY uses. The performance of EDDY declines with increasing the size of gene sets, but it improves with more computations (by using larger M as shown in Figure 5). Another observation is significantly lower false positive rates of EDDY than that of other methods (Supplementary Figure S10–S12).

This simulation study clearly indicates that EDDY significantly outperforms other methods when differential gene sets are defined in the sense of gene interactions, with significantly lower false-positive rates.

Besides the simulation scenario covered in this study, there can be various different simulation configurations depending on the methods to generate synthetic data sets and the definition of differential gene sets. However, it is not feasible to enumerate and cover all such different cases, and thus, they are left for future studies.

Comparison of EDDY with other methods in application to TCGA GBM gene expression data

Table 2 lists the number of statistically significant gene sets identified with the three different methods for each subtype. EDDY and GSEA produced different results, as EDDY identified 10–22 gene sets for each subtype, whereas GSEA identified 245 gene sets for mesenchymal but just a few for other subtypes. Moreover, there is only one common gene set (for mesenchymal) between the results from the two methods. A possible hypothesis of GSEA identifying many gene sets only for mesenchymal is that mesenchymal is the most differentiated form of GBM (physiologically or genotypically) (27), and many genes are differentially expressed in mesenchymal compared with other subtypes. Compared with GSEA, the results of EDDY are relatively less biased to a specific subtype (for the lists of identified gene sets from EDDY and GSEA, see Supplementary Tables S1–S8). Compared with the other two methods, GSCA identified much more gene sets as statistically significant, from 68 to 80% of the tested gene sets, making it almost noninformative (for the lists of identified gene sets from GSCA, see the supplementary file provided in http://biocomputing.tgen.org/software/EDDY). This becomes clearer from Figure 6, where the P-values from GSCA are much closer to 0 in general than that of EDDY and GSEA (for direct comparison of P-values from

Figure 5. The comparison of ROC of GSCA, GSEA and EDDY in identifying differential gene sets from the interaction-focused simulation experiments. (A) v = 10. (B) v = 20. (C) v = 30.

Table 1. The area under curve values of GSCA, GSEA and EDDY in identifying differential gene sets from Simulation II

Method          v = 10   v = 20   v = 30
GSCA            0.5774   0.5822   0.5965
GSEA            0.4911   0.5574   0.6075
EDDY (M=1000)   0.7440   0.6768   0.6704
EDDY (M=5000)   0.8287   0.7580   0.7064

Top performance in each column: EDDY (M=5000).

Table 2. The number of statistically significant gene sets for each subtype

Method   Classical   Mesenchymal   Neural      Proneural
EDDY     13          10            22          22
GSEA     1 (0)       245 (1)       6 (0)       3 (0)
GSCA     1590 (11)   1432 (7)      1681 (21)   1563 (17)

The number of common cases with EDDY is indicated in the parentheses.

Nucleic Acids Research, 2014

Page 30: Psb tutorial cancer_pathways

Two Pathways Identified with EDDY Enriched in Proneural Glioblastoma Phenotype

EDDY found the G2 pathway and p53 pathway gene sets to have differential dependencies that are related to the enrichment of p53 mutations in the proneural subtype. Heat maps show that the genes in these pathways are not differentially expressed, so they would not be identified by the GSEA technique.

Page 31: Psb tutorial cancer_pathways

PARADIGM

Nature, Vol 507, March 20, 2014

MEMo

https://www.genome.gov/Multimedia/Slides/TCGA1/TCGA1_Ciriello.pdf

Both methods employ multiple genomic data types to identify altered pathways.

Employed in TCGA studies

Page 32: Psb tutorial cancer_pathways

Personalized Medicine ‘Interpretation Bottleneck’

Drug Target Annotation

Page 33: Psb tutorial cancer_pathways

Personalized Medicine Pipeline

Good, B. M., Ainscough, B. J., McMichael, J. F., Su, A. I., & Griffith, O. L. (2014). Organizing knowledge to enable personalization of medicine in cancer, 1–9. doi:10.1186/s13059-014-0438-7

Page 34: Psb tutorial cancer_pathways

Drug Target Matching for Personalized Medicine

Good, B. M., Ainscough, B. J., McMichael, J. F., Su, A. I., & Griffith, O. L. (2014). Organizing knowledge to enable personalization of medicine in cancer, 1–9. doi:10.1186/s13059-014-0438-7

Page 35: Psb tutorial cancer_pathways

Framework for Clinical Mapping of Genomic Aberrations to Drugs

Good, B. M., Ainscough, B. J., McMichael, J. F., Su, A. I., & Griffith, O. L. (2014). Organizing knowledge to enable personalization of medicine in cancer, 1–9. doi:10.1186/s13059-014-0438-7

Page 36: Psb tutorial cancer_pathways

Drug Target Resources

A number of resources are available for mapping drugs to gene targets.

Issues with available sources

•Different annotation schemes and data structures lead to misleading results for the end user.

•Contextual information around the drug and target is often not annotated.

•Not all annotations are therapeutically actionable or appropriate.

Page 37: Psb tutorial cancer_pathways

Drug to Target Annotation

Information for linking drugs to genes should be based on primary literature.

Curated information should be annotated with a controlled vocabulary and arrayed in a structured format.

Rules need to capture explicit drug-target response information but also be flexible enough to capture inferred information that may not always be explicitly stated. The inferred information is important for further research.
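One way to picture a structured, controlled-vocabulary annotation is sketched below. The vocabulary values (CNV, SNV, EXP, Other, and labels such as yes_direct) are taken from the legend of the patient network slide later in this tutorial, but the record layout, the field names, and the reading of "yes/no" as predicted response and "direct/inferred" as explicit versus inferred evidence are assumptions made for illustration. The drug, gene, and source identifier are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class AberrationType(Enum):
    CNV = "CNV"      # copy number variant
    SNV = "SNV"      # single nucleotide variant
    EXP = "EXP"      # expression change
    OTHER = "Other"


class Evidence(Enum):
    DIRECT = "direct"      # response stated explicitly in the source (assumed meaning)
    INFERRED = "inferred"  # response inferred by the curator (assumed meaning)


@dataclass(frozen=True)
class DrugTargetAnnotation:
    drug: str
    gene: str
    aberration: AberrationType
    sensitive: bool   # True if the aberration is annotated as predicting response
    evidence: Evidence
    source_id: str    # pointer to the primary-literature source

    def interaction_label(self):
        """Combine response and evidence into a label such as 'yes_direct'."""
        return f"{'yes' if self.sensitive else 'no'}_{self.evidence.value}"


# Placeholder record; none of these values come from a real curation.
example = DrugTargetAnnotation(
    drug="DrugX",
    gene="GENE1",
    aberration=AberrationType.SNV,
    sensitive=True,
    evidence=Evidence.DIRECT,
    source_id="SOURCE-PLACEHOLDER",
)
```

Using enumerations instead of free text means a misspelled aberration type or evidence level fails at record creation rather than silently producing a mismatch downstream.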

Page 38: Psb tutorial cancer_pathways

Example annotation workflow for capturing drug to target information.

Page 39: Psb tutorial cancer_pathways

Visualization of Drug Target Network

Page 40: Psb tutorial cancer_pathways

Patient Specific Drug Target Network

Patient Genomic Information

[Figure: network linking a patient's genomic aberrations to drugs. Legend entries: Aberration Type Color Key (CNV, SNV, EXP, Other, DRUG); Edge Interaction Key (no_direct, no_inferred, yes_inferred, yes_direct); DRUG, BIOMARKER, MODIFIER.]
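A minimal sketch of how such a patient-specific network could be assembled: each patient aberration is looked up in a table of curated drug-target annotations, and the matching entries become the edges. The dictionary keys, the ordering of interaction labels, and every drug and gene name below are hypothetical; only the label strings themselves come from the figure legend.

```python
def match_patient_to_drugs(patient_events, annotations):
    """Return annotations whose (gene, aberration type) occurs in the patient.

    patient_events: iterable of (gene, aberration_type) tuples.
    annotations: iterable of dicts with keys
        'drug', 'gene', 'aberration', 'interaction'.
    """
    observed = set(patient_events)
    matches = [a for a in annotations if (a["gene"], a["aberration"]) in observed]
    # Assumed priority: direct evidence before inferred, 'yes' before 'no'.
    order = {"yes_direct": 0, "no_direct": 1, "yes_inferred": 2, "no_inferred": 3}
    return sorted(matches, key=lambda a: order.get(a["interaction"], len(order)))


# Placeholder patient profile and annotation table.
patient = [("GENE1", "SNV"), ("GENE2", "CNV")]
annotation_table = [
    {"drug": "DrugB", "gene": "GENE2", "aberration": "CNV", "interaction": "yes_inferred"},
    {"drug": "DrugA", "gene": "GENE1", "aberration": "SNV", "interaction": "yes_direct"},
    {"drug": "DrugC", "gene": "GENE3", "aberration": "EXP", "interaction": "yes_direct"},
]
edges = match_patient_to_drugs(patient, annotation_table)
matched_drugs = [a["drug"] for a in edges]
```

Matching on the gene and aberration type together, rather than on the gene alone, reflects the point made earlier that not every annotation for a gene is therapeutically appropriate for a given event.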

Page 41: Psb tutorial cancer_pathways

Impact Areas for Text Mining

•Identify and extract interaction information for network and pathway reconstruction.

•Aid in identifying and extracting genomic events linked to drug response to better enable personalized medicine.