METACORE January 2020 Data-Mining and Pathway Analysis

Page 1: Data-Mining and Pathway Analysis January 2020 · Why Pathway Analysis Software? • A learning tool – Study a group of gene products. • A data analysis tool. – Which pathways

METACORE

January 2020

Data-Mining and Pathway Analysis

Big data challenges in the Life Sciences

• As technological advances lead to the generation of more omics data, scientists face a new challenge in making sense of it all.

• New tools are needed to meet this challenge and enable researchers to generate actionable hypotheses from their data.

MetaCore: Your GPS in pathway analysis

• Gain molecular understanding of disease

• Analyze and understand experimental findings (Omics data) in the context of validated biological pathways

• Generate and confirm hypotheses for novel biomarkers, targets, mechanisms of action

Knowledge Mining

‘Omics’ Data Analysis

Pathway Analysis Platform

MetaCore delivers high-quality biological systems content in context, giving you essential data and analytics to accelerate your scientific research. MetaCore contains sophisticated integrated pathway and network analysis for multi-omics data.

With MetaCore you can analyze and understand experimental findings in the context of validated biological pathways.

Why Pathway Analysis Software?

• A learning tool
– Study a group of gene products.

• A data analysis tool
– Which pathways are particularly affected?
– What disease has similar biomarkers?

• A hypothesis generation tool
– Can provide insight into the mechanism of regulation of your genes. Which is the likely causative agent for the observed changes? What is likely to happen as a result of these changes?
– Suggest effects of gene knock-ins or knock-outs.
– Suggest side effects of drugs.
– Can highlight new phenomena that need further investigation. What does the program not explain?

2018 Enhancements: More data, more interactions and new maps

o 158,000 new network interactions, including
  - 82,000 new protein-protein interactions
  - 13,000 new substrate-product reactions (interactions)
  - 13,000 new RNA-protein interactions

o 57,000 new unique gene-disease associations

o 10 new regulatory, metabolic and disease maps

o 14,500 new articles (network – interactions, reactions)

MetaCore Specialty Modules are now available under standard MetaCore license

o MetaCore Specialty Modules are available

MetaCore Allergic Contact Dermatitis Module

MetaCore Lung Disease Module

MetaCore Metabolic Disease Module

MetaCore Neurology Module

MetaCore Normal Stem Cells Module

MetaCore Oncology Module

MetaCore Systems Toxicology Module

31,545

8,441

16,956

68,968

20,251

60,492

METABASE/METACORE CONTENT OVERVIEW

• Human genes

• Human SwissProt proteins

• Mouse genes

• Mouse SwissProt proteins

• Rat genes

• Rat SwissProt proteins

• Compounds

• Compounds with structure

• Endogenous compounds

• Nutritional compounds

• Metabolites of xenobiotic

• Drugs

- Biologics

- Small Molecules

- Approved drugs

- Withdrawn drugs

- Clinical trial drugs

- Discontinued drugs

- Preclinical drugs

- Unknown

- Drug combination regimens

• Human genes in network

• Mouse genes in network

• Rat genes in network

• Chemical compounds

• Drugs

• Endogenous compounds

• Metabolic reactions

• Transport reactions

• Processing reactions

• PubMed journals

• PubMed records

• PubMed articles (unique)

• Total amount of interactions

- Protein – Protein

- Compound – Protein

- Compound – Compound

- Metabolic enzyme – Reaction

- Transporter – Reaction

- Substrate, Product – Reaction

- RNA – Protein

• Pathway maps

- Human genes in maps

- Mouse genes in maps

- Rat genes in maps

- Interactions in maps

849,694

760,787

11,340

47,130

4,681

101,732

148,689

1,550

5,457

7,529

6,856

6,704

47,959

8,021

831,988

126

29,754

9,166

1,344

7,822

2,293

261

5,037

1,188

136

251

24,798

21,519

18,730

4,603

3,557

36,945

4,162

3,700

2,141,916

273,267

1,924,033

850,906

4,681

390,215

METABASE/METACORE CHEMICAL COMPOUND CONTENT

Compound distribution by target

Kinases (30174)
Binding Proteins (11886)
Phospholipases (2186)
Proteases (39135)
Phosphatases (5005)
Transporters (15849)
Ion Channels (30043)
Transcriptional Factors (16949)
Enzymes (93557)
Nuclear Receptors (15404)
Receptors with Kinase Activity (13026)
Receptors GPCR (122598)

Compounds                                        850,906   100.0%
Compounds in Network                             390,215    45.9%
Compounds in reactions                            38,267     4.5%
Chemical compounds related to toxic pathology      3,548     0.4%
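The percentages in the compound table can be re-derived from the counts. The sketch below is only an arithmetic check on the slide's figures; the helper name `share` is hypothetical and not part of any MetaCore tooling.

```python
# Counts copied from the slide; `share` is a hypothetical helper name.
counts = {
    "Compounds": 850_906,
    "Compounds in Network": 390_215,
    "Compounds in reactions": 38_267,
    "Chemical compounds related to toxic pathology": 3_548,
}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total = counts["Compounds"]
shares = {label: share(n, total) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The same check applies to any of the count-and-percentage charts on the following slides.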

INTERACTIONS BY TYPE

Protein-Protein, RNA-Protein Interactions

Regulation of Transcription (381290)        38.2%
Influence on Expression (52041)              5.2%
Unspecified Regulation (3629)                0.4%
Covalent Modification (24435)                2.4%
Direct interaction (470914)                 47.1%
Co-regulation of transcription (66530)       6.7%

Chemical Compound-Protein Interactions

Other (1649)                                 0.1%
Unspecified (48672)                          3.2%
Influence on expression (4858)               0.3%
Binding (705547)                            46.4%
Small Molecule-Prot interactions (760726)   50.0%

METABASE PTM CONTENT

4,108 sites of modification

1,328 modified sites involved in interactions after modification

Enzyme-Substrate interactions

Columns:
(a) Amount of interactions by mechanism of modification
(b) Amount of interactions with defined sites of modification
(c) Amount of interactions with effect and with defined sites of modification
(d) Amount of records with defined sites of modification
(e) Number of unique substrate proteins with defined sites of modification
(f) Number of unique sites of modification

                                      (a)      (b)     (c)     (d)     (e)     (f)
Total                                 23,023   4,102   2,575   7,330   2,423   4,108
Phosphorylation                       16,661   3,482   2,141   6,264
Dephosphorylation                      1,234     226     169     410
Phosphorylation + Dephosphorylation                                    2,236   3,558
Other types                            5,128     394     265     656     337     550

Interactions with modified proteins

Columns:
(a) Amount of interactions with defined modified sites of proteins
(b) Amount of records with defined modified sites of proteins
(c) Number of unique proteins interacting in modified status
(d) Number of unique modified sites in interactions

                        (a)     (b)     (c)     (d)
Total                   3,013   3,855   1,177   1,328
Phosphorylated sites    2,631   3,357   1,018     981
Other sites               410     501     232     347

1,024

NON-CODING RNAs

Human RNAs in Network (pie chart): microRNA, asRNA, lincRNA, snoRNA, other RNAs. Chart values: 59%, 5%, 4%, 4%, 28%, 41%.

Non-coding RNAs interactions

Columns:
(a) Total amount of interactions
(b) Incoming interactions
(c) Amount of records for incoming interactions
(d) Outgoing interactions
(e) Amount of records for outgoing interactions
(f) Predicted vs validated interactions

                                 (a)       (b)      (c)      (d)       (e)       (f)
microRNA                         143,083   10,935   14,255   132,183   183,029   109136/33947
Other types of non-coding RNAs     8,101    5,743    6,447     2,365     2,652   3722/4379

Non-coding RNAs per organism in Network

                                 Human   Mouse   Rat
microRNA                         4,455   3,102   1,254
Other types of non-coding RNAs   1,918     544      44

INTERACTIONS BY MECHANISM

Interaction substrates and products with reactions (101732)    6.7%
Unspecified (50628)                                            3.3%
Covalent modification (7137)                                   0.5%
Phosphorylation (16735)                                        1.1%
Dephosphorylation (1247)                                       0.1%
Binding (632478)                                              41.6%
Competition (1113)                                             0.1%
Transformation (1391)                                          0.1%
Cleavage (7065)                                                0.5%
Transcription regulation (381290)                             25.1%
Co-regulation of transcription (66530)                         4.4%
Influence on expression (55979)                                3.7%
Catalysis (47130)                                              3.1%
Transport (5468)                                               0.4%
Pharmacological effect (2159)                                  0.1%
Toxic effect (8077)                                            0.5%
miRNA binding (132423)                                         8.7%

DRUG-TARGET INTERACTION STATISTICS

Drug Network Objects (pie chart): Small molecules, Biologics. Chart values: 6%, 94%.

Interaction by mechanism (pie chart): Binding, Unspecified, Influence on Expression, Covalent modification. Chart values: 77.3%, 14.6%, 7.7%, 0.5%.

Target Network Objects (pie chart): RNAs, Proteins. Chart values: 0.4%, 99.6%.

Chart labels: (30367), (5019), (4068).

Summary figures: 6,532; 19,561; 65,135; 30,367 (interactions with records; unique targets from unique articles).

NETWORK OBJECTS

Proteins (25424)                8.5%
RNA (9611)                      3.2%
Chemical Compounds (220717)    73.6%
Metabolic reactions (36899)    12.3%
Transport Reactions (3628)      1.2%
Processing Reactions (3742)     1.2%

METABASE TOXICITY CONTENT

Database objects related to Toxic pathology Volume

Gene aberrations in Tox notes 366

Genes with aberrations in tox notes 323

Chemical compounds related to toxic pathology 3,548

Chemical Toxic agents 2,585

Chemical Protective agents 1,159

Chemical markers 309

Proteins and RNAs as markers 4,067

Genes encoding marker Proteins/RNAs 3,027

References for Toxicity content Volume

Total amount of tox notes 90,228

Notes with toxic agents 60,078

Notes with protective agents 17,343

PubMed articles in tox notes 8,339

Other refs in tox notes 648

Associations related to Toxicity content Volume

Chemical agent-Protein-Pathology-PMID associations 212,125

Chemical agent-RNA-Pathology-PMID associations 52,403

Chemical agent-Endogenous compound-Pathology-PMID associations 123,128

Toxic/Protective agents

Chemical Toxic agents 69%

Chemical Protective agents 31%

TISSUE-SPECIFIC TOXIC PATHOLOGY STATISTICS

Database objects related to Toxic pathology. Columns, left to right: Genes with aberrations related to pathology; Chemical Toxic agents; Chemical Protective agents; Chemical markers; Proteins and RNAs as markers; Genes encoding marker Proteins/RNAs.

Bone joint pathology 0 8 8 7 60 54

Bone marrow pathology 5 154 19 23 150 140

Bone pathology 3 67 29 25 178 163

Epididymis pathology 1 326 44 54 289 271

Esophagus pathology 4 118 0 8 47 53

Forestomach pathology 4 196 10 14 67 75

Glandular stomach pathology 8 180 130 28 149 135

Heart pathology 8 330 95 51 489 438

Intestine pathology 276 545 558 148 1,538 1,163

Kidney pathology 25 1,078 231 96 922 737

Liver pathology 42 1,334 327 128 1,503 1,216

Lung pathology 22 471 159 71 562 458

Nose pathology 3 200 3 8 35 47

Testicular pathology 7 479 77 83 512 436

Trachea pathology 1 73 0 7 38 45

PRE-BUILT MAPS AND NETWORKS

Graphic content

Maps: 1,550 (example: ACM2 and ACM4 activation of ERK)

Unique content     Volume
Regulatory maps    662
Disease maps       721
Metabolic maps     138
Toxicity maps      29

Networks: 1,102 (example: Inflammation_IL-6 signaling)

Unique content                 Volume
Processes networks             159
Metabolic endo networks        118
Toxicity networks              395
Metabolic networks             250
Disease biomarkers networks    88
Drug target networks           92

DISEASE STATISTICS (MetaCore)

Human genes total: 60,492
Genes linked to Diseases (20445): 34%
Genes NOT linked to Diseases (40047): 66%

Diseases, based on MESH+OMIM: 9,422
Diseases linked to Genes: 29%
Diseases NOT linked to Genes: 71%

20,445 genes are linked to 2,762 diseases

182,786 unique gene-disease associations from 94,252 articles

DISEASE STATISTICS (METABASE)

Human genes total: 60,492
Genes linked to Diseases (21545): 36%
Genes NOT linked to Diseases (38947): 64%

Diseases, based on MESH+OMIM: 9,422
Diseases linked to Genes (3478): 37%
Diseases NOT linked to Genes (5944): 63%

21,545 genes are linked to 3,478 diseases

240,223 unique gene-disease associations from 117,374 articles

OBJECT TYPES IN GENE-DISEASE ASSOCIATIONS

Database object    Number of Genes    Number of Objects

Genetic variants and epigenetics 18,618 331,948

Amplification 999 1,051

Rearrangement 942 3,221

Locus change 125 128

Fusion gene 201 135

Haplotype/SNP 18,323 325,417

Epigenetic modification 1,513 1,693

mRNA level 7,647 8,979

Major transcript , prot-coding 6,673 6,573

Alternative transcript, prot-coding 418 931

miRNA/other non-coding RNA 921 1,427

Protein abundance, activity, concentration or localization change 5,653 6,779

Generic protein 5,597 5,737

Peptide 27 52

Posttranslational modifications 265 432

Isoform 233 429

Mutant protein 47 129

Endogenous metabolites - 583

Genetic variants and epigenetics

- Genes (18443)

mRNA (7411)

Protein (5595)

38,832

14,738

Aberration biomarkers

Quantitative biomarkers

What do you need to do?

Disease pathway modeling and investigation of causal mechanisms: Knowledge mine manually curated data from peer-reviewed sources to generate hypotheses. Incorporate ‘omics’ data to further validate these hypotheses.

Target and biomarker assessment and validation: Pathway analysis of ‘omics’ data for drug and biomarker discovery.

Understand the mechanisms of genes associated with variants: Combine variant data with other ‘omics’ data for a systems view of your disease.

Patient stratification and comprehensive sample comparison: Compare multiple ‘omics’ datasets at once to uncover differences and similarities in patient groups, time courses, or drug treatments.

Systems Biology Solutions

Content

Add-ons

Analysis Platforms

METABASE

Manually curated from Journal Literature

Programmatic Access: API of SQL ACCESS

METACORE: Pathway Analysis Platform

METADRUG: Compound Activity Prediction Platform

Disease Specialty Modules

Systems Toxicology Modules

Pipeline Pilot

Confidence in interactions used for interpretation

ONE Complete Global Network

“The results of this study show that for the majority of pathway databases, the overlap between experimentally obtained target genes and targets reported in transcriptional regulatory pathway databases is surprisingly small and often is not statistically significant. The only exception is MetaCore pathway database which yields statistically significant intersection with experimental results in 84% cases.”

Percentage of statistically significant intersections with gold standards

Transcription factor/Gold standard ID#

Systematic study of transcription factors and their targets identified through “gold standard experiments” and intersection with transcriptional regulatory interactions in free and commercially available databases

16% Ingenuity (Transcription)

36% Ingenuity (All)

32% TransPath

16% TransFac

16% Biocarta

24% KEGG

8% Wikipathways

16% Cell Signaling Technology

16% GeneSpring (Expression or Binding)

4% GeneSpring (Expression and Binding)

28% PathwayStudio

84% MetaCore

Assessing quality and completeness of human transcriptional regulatory pathways on a genome-wide scale. Biology Direct 2011, 6:15. doi: 10.1186/1745-6150-6-15

From peer reviewed articles to signaling pathways

Publications
195 publications for EGF-EGFR interaction

Molecular Interaction
Manual annotation from publications
• Team of PhDs, MDs
• More than 10 years

Global Network
1,800,000 molecular interactions
1,600 canonical and disease signaling pathways

Unique features of MetaCore interactions

For each molecular function feature: what MetaBase provides, and whether it exists in the public domain.

Directionality
MetaBase: There is always an initiator molecule and an effector (acceptor) molecule.
• Directionality exists for all types of interactions, both presented and not presented on pathway maps.
Public domain: Directionality, effect and mechanism may be available for interactions presented in pathway databases, but only for interactions found in pathways, not for the entire network.

Effect
MetaBase: Indicates whether an initiator molecule activates or inhibits an effector.
• Effect exists for all types of interactions, both presented and not presented on pathway maps.
Public domain: Some specialized databases that focus on miRNA action, transcription factor regulation or PTMs have directionality and mechanism due to the nature of their interaction mechanisms. However, they are source-specific and often include low-trust experimental sources. Effect is usually unavailable.

Mechanism of interaction
MetaBase: Describes the physico-chemical process by which molecules interact.
• Mechanisms are not limited to physical interactions (e.g. binding, phosphorylation); they also cover catalysis, miRNA binding, transcriptional regulation and indirect causal effects, such as influence on expression caused by other molecules, including natural ligands and compounds (including drugs).
Public domain: Most databases offer different kinds of gene/protein associations: 1) physical binding, 2) co-expression, 3) disease co-association, 4) genome neighborhood. However, these databases lack functional feature data, especially mechanism of interaction.
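The three features above (directionality, effect, mechanism) can be pictured as fields on a single interaction record. The following is a minimal sketch under that assumption; the class names, field names and the toy molecules `KinaseA`, `ProteinB` and `ProteinC` are hypothetical and do not reflect the actual MetaBase schema.

```python
from dataclasses import dataclass
from enum import Enum


class Effect(Enum):
    ACTIVATION = "activation"
    INHIBITION = "inhibition"
    UNSPECIFIED = "unspecified"


@dataclass(frozen=True)
class Interaction:
    source: str     # initiator molecule
    target: str     # effector (acceptor) molecule
    effect: Effect  # does the initiator activate or inhibit the effector?
    mechanism: str  # e.g. "binding", "phosphorylation", "transcription regulation"


def downstream(interactions, molecule, effect=None):
    """Effectors directly regulated by `molecule`, optionally filtered by effect."""
    return sorted(
        i.target
        for i in interactions
        if i.source == molecule and (effect is None or i.effect == effect)
    )


# Toy records: only the EGF-EGFR pair is a real interaction; the rest are placeholders.
edges = [
    Interaction("EGF", "EGFR", Effect.ACTIVATION, "binding"),
    Interaction("KinaseA", "ProteinB", Effect.INHIBITION, "phosphorylation"),
    Interaction("KinaseA", "ProteinC", Effect.ACTIVATION, "phosphorylation"),
]

print(downstream(edges, "KinaseA"))
print(downstream(edges, "KinaseA", Effect.INHIBITION))
print(downstream(edges, "EGFR"))  # empty: the edge points from EGF to EGFR only
```

Because each record is directed, a query from the effector side returns nothing, which is the practical difference from an undirected "association" between two genes.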

Flexibility in data analysis

11 different network building algorithms, all with written and visual descriptions

Multiple automated workflows to save, share and export

But also One-Click analysis for instant answers

Causal reasoning algorithm to find key hubs

MetaCore: Genomic Analysis Tool

Genomic Analysis Tools (GAT) analyze gene variant data obtained by Next Generation Sequencing techniques. GAT contains tools for cohort and family trio analysis, followed by the comprehensive annotation and interpretation of gene variants needed to identify potentially causal ones. All gene variants can be used for pathway and network analysis.

Annotation: Filter based on standard and knowledge-based options

Functional Prediction: Predict damaging changes due to variant

Pathway Analysis: Understand variant role in disease through pathway analysis

Diagram labels: Gene Variant; Identify top-scored variants; Disease; Variants With Known Phenotypic Impact; Variants Predicted To Be Damaging

Multi Omics Simultaneous Analysis

Combining metabolomic, transcriptomic and gene variant data for side-by-side omics enrichment analysis

Companion modules available with MetaCore

The Data Annotation and Processing Tool (DAPT) processes raw microarray data and uploads the resulting differentially expressed gene sets directly to MetaCore. You can also load a dataset from a public repository and derive a differentially expressed gene list from it.

Pathway Map Creator (PMC) is a companion application for creating custom pathway diagrams in MetaCore style. Such maps can visualize an analysis made in MetaCore, adapted for publication, or be uploaded to MetaCore and used in the analysis of new data.

Pathway Creation Algorithms in MetaCore

Direct Interactions Algorithm

• Draws direct interactions between selected objects. No additional objects are added to the network.
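In graph terms, this description corresponds to taking the subgraph induced by the selected objects. The sketch below illustrates that idea on a toy edge list; `direct_interactions` is a hypothetical name, and this is not MetaCore's implementation.

```python
def direct_interactions(edges, selected):
    """Keep only edges whose source and target are both in `selected`."""
    chosen = set(selected)
    return [(s, t) for s, t in edges if s in chosen and t in chosen]


# Toy directed edges (source, target); node names are placeholders.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
network = direct_interactions(toy_edges, ["A", "B", "D"])
print(network)
```

Node C is never pulled in, even though it would connect B to D, because no objects outside the selection are added.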

Self regulatory Networks

Finds the shortest directed paths containing transcription factors between your genes in the gene list.

(best used with a small number of targets)
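One way to picture this is a breadth-first search for the shortest directed path between each ordered pair of listed genes, keeping only the paths that contain a transcription factor. The sketch below makes that simplifying assumption (a listed gene that is itself a TF also counts); the function names and toy graph are hypothetical, and the slide does not describe MetaCore's actual procedure in this detail.

```python
from collections import deque
from itertools import permutations


def shortest_path(graph, start, goal):
    """Breadth-first search for one shortest directed path, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def tf_paths(graph, genes, tfs):
    """Shortest paths between listed genes that contain at least one TF."""
    found = []
    for a, b in permutations(genes, 2):
        path = shortest_path(graph, a, b)
        if path and any(node in tfs for node in path):
            found.append(path)
    return found


# Toy directed graph: node -> list of downstream nodes.
toy_graph = {"G1": ["TF1"], "TF1": ["G2"], "G2": ["X"], "X": ["G3"]}
paths = tf_paths(toy_graph, ["G1", "G2", "G3"], tfs={"TF1"})
print(paths)
```

The number of ordered gene pairs grows quadratically with the list size, which is consistent with the advice to use this algorithm on a small number of targets.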

Auto expand

Draws sub-networks around the selected objects, stopping the expansion when the sub-networks intersect
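A rough sketch of this behaviour: grow one sub-network per selected object, one hop at a time, and stop once any two sub-networks share a node. The stopping rule (any pair, not all pairs), the undirected neighbour map and the `max_steps` guard are assumptions made for illustration; the slide does not specify them.

```python
def auto_expand(neighbors, seeds, max_steps=10):
    """Grow one sub-network per seed until two sub-networks share a node."""
    subnets = {seed: {seed} for seed in seeds}

    def any_overlap(nets):
        groups = list(nets.values())
        return any(
            groups[i] & groups[j]
            for i in range(len(groups))
            for j in range(i + 1, len(groups))
        )

    steps = 0
    while steps < max_steps and not any_overlap(subnets):
        grown = {
            seed: nodes | {n for node in nodes for n in neighbors.get(node, ())}
            for seed, nodes in subnets.items()
        }
        if grown == subnets:  # nothing left to add anywhere
            break
        subnets = grown
        steps += 1
    return subnets, steps


# Toy neighbour map: A - x - y - B
toy_neighbors = {"A": ["x"], "x": ["A", "y"], "y": ["x", "B"], "B": ["y"]}
subnets, steps = auto_expand(toy_neighbors, ["A", "B"])
print(steps, subnets)
```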

Pathway Creation Algorithms in MetaCore

• Analyze Network: Creates a list of possible networks, ranked according to how many objects in the network correspond to the user's list of genes, how many nodes are in the network, and how many nodes are in each smaller network.

• Analyze Transcription Network: similar to the above, but the sub-networks created are centered on TFs.

• Analyze Networks (Transcription Factors): focuses on the presence of TFs at end nodes.

• Analyze Networks (Receptors): focuses on the presence of receptors at the end points of a network.

Analyze Network Algorithm

Generates sub-networks highly saturated with selected objects. Sub-networks are ranked by a P-value and G-Score and interpreted in terms of Gene Ontology
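The slide does not give the formulas behind the P-value or the G-Score. A common way to score how saturated a sub-network is with list genes is a hypergeometric tail probability, which is what the sketch below assumes; the function name, the background size `N`, the list size `K` and the toy sub-networks are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """P(X >= k): chance of seeing at least k list genes among n nodes,
    when K of the N background objects belong to the user's list."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy sub-networks: name -> (nodes in sub-network, nodes from the user's list).
toy_subnetworks = {"net_1": (50, 12), "net_2": (50, 5), "net_3": (30, 9)}
N, K = 20_000, 400  # assumed background size and gene-list size

ranked = sorted(
    toy_subnetworks,
    key=lambda name: hypergeom_pvalue(N, K, *toy_subnetworks[name]),
)
print(ranked)
```

A smaller p-value means the overlap is less likely to arise by chance, so the most saturated sub-network sorts first.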

Analyze Networks (Transcription Factors) Algorithm: an example

Favors network construction where the end-nodes of transcriptionally regulated pathways are present in the original gene list.

P=7.2e-46

Example from an mRNA expression analysis data set comparing healthy and lesion skin.

Analyze Network (Receptors) Algorithm: an example

Favors network construction where the end-point of a pathway leads to a receptor (through “receptor binding”) and the starting point of a pathway (a transcription factor, or ligands, etc.) is present in the original gene list, regardless of the presence of the end-point receptor in the list.

Transcription Regulation Algorithm

13 targets/14 nodes

P=7.3e-31

Generates sub-networks centered on transcription factors. Sub-networks are ranked by a P-value and interpreted in terms of Gene Ontology

Immune response: Histamine H1 receptor signaling in immune response (p=1e-4)

Disease biomarker enrichment

Network-Disease Associations

1) Carcinoma (72% coverage, p=3.3e-10)

2) Neoplasms, connective and soft tissue. (42% coverage, p=8e-10)

Use of Pathway Analysis in Candidate Gene Identification

1,061 genes are located in the mapped region for the disease

360 genes up- or down-regulated by >2x

17 receptor ligand genes are important “input” nodes to pathways formed by genes with changed expression:

FGF2, WNT5A, Tenascin-C, EGF, ILI1RN, BDNF, TGF-beta2, FGF2, OSF-2, CSPG4(NG2), IL-8, ENA-78, GCP2, SLIT2, SLIT3, Activin beta A, Annexin I

Other up- or down-regulated genes
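The ">2x" criterion can be read as an expression ratio above 2 or below 1/2. The sketch below applies that reading to a toy table; the function name, gene names and ratios are hypothetical, and the slide does not say how the 360 genes were actually selected.

```python
def changed_genes(fold_changes, threshold=2.0):
    """Genes whose expression ratio is above `threshold` or below 1/threshold."""
    return sorted(
        gene
        for gene, ratio in fold_changes.items()
        if ratio > threshold or ratio < 1 / threshold
    )


# Toy expression ratios (condition / control); all values are made up.
toy_ratios = {"GENE_A": 3.1, "GENE_B": 0.4, "GENE_C": 1.5, "GENE_D": 0.9}
selected = changed_genes(toy_ratios)
print(selected)
```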

Pathway analysis narrows down number of candidate genes for disease

ErbB2, PECAM1, DDX5, BCAS3, microRNA1, RARalpha, MUL, VHR, WIP, ErbB2, NIK, Plakoglobin, HEXIM1, Prohibitin, STAT5A, STAT3, Clathrin, PSME3, PSMC5, ErbB2

FGF2, ILI1RN, ErbB2

360 genes up- or down-regulated by >2x

Other up- or down-regulated genes

These genes, from the mapped region of interest, can form interaction pathways going through the receptor ligands identified by the first analysis.

A caveat

Not every gene belongs to a pathway in the database…
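A quick way to see how much of a gene list this caveat affects is to split the list into genes found in at least one pathway and genes found in none. The sketch below uses a toy pathway dictionary; the function name and all identifiers are hypothetical.

```python
def pathway_coverage(gene_list, pathways):
    """Split a gene list into genes found in some pathway and genes found in none."""
    annotated = set().union(*pathways.values())
    mapped = [g for g in gene_list if g in annotated]
    unmapped = [g for g in gene_list if g not in annotated]
    return mapped, unmapped


# Toy pathway membership; names are placeholders.
toy_pathways = {"Pathway 1": {"G1", "G2"}, "Pathway 2": {"G2", "G3"}}
mapped, unmapped = pathway_coverage(["G1", "G4", "G3", "G5"], toy_pathways)
coverage = len(mapped) / (len(mapped) + len(unmapped))
print(mapped, unmapped, coverage)
```

Reporting the unmapped genes alongside any enrichment result makes it clear which part of the data the pathway analysis could not speak to.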