
NetBioSIG2013-Talk David Amar


Presentation for Network Biology SIG 2013 by David Amar, Tel-Aviv University, Israel. “Algorithms for Mapping Modules in Pairs of Biological Networks”


Page 1: NetBioSIG2013-Talk David Amar


From gene networks to module-maps: improving interpretability and prediction in systems biology

David Amar
School of Computer Science, Tel Aviv University
July 2013

Page 2: NetBioSIG2013-Talk David Amar


Biological interaction networks
  Nodes: genes/proteins or other molecules
  Edges: based on evidence for interaction

Voineagu et al. 2011 Nature

Breker and Schuldiner 2009

Gene co-expression

Protein-protein interaction

Genetic interaction

Goal: Integrated analysis of different types of networks

Page 3: NetBioSIG2013-Talk David Amar


Integration of networks
  Better picture, reduces noise
  Traditional approaches:
    Look for “conserved” clusters: co-clustering (Hanisch et al. 2002); JointCluster (Narayanan et al. 2011)
    Look for clusters with special properties: MATISSE (Ulitsky and Shamir 2008)

Page 4: NetBioSIG2013-Talk David Amar


Analysis of network pairs
  Interaction types can differ: within (“positive”) vs. between (“negative”) functional units
  Input: networks P, N with the same vertex set
  Goal: summarize both networks in a module map
    Node (module): gene set highly connected in P
    Link: two modules highly interconnected in N
  Between-pathway models:
    Kelley and Ideker 2005
    Ulitsky et al. 2008
    Kelley and Kingsford 2011
    Leiserson et al. 2011

(Figure panels: P, N)
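To make the node and link definitions concrete, here is a minimal Python sketch. It assumes unweighted networks stored as sets of unordered pairs and uses plain edge density as the measure of "highly connected"; the function names and the toy data are hypothetical, and the talk's own scoring (introduced on later slides) is statistical rather than density based.

```python
from itertools import combinations

def density_within(module, p_edges):
    """Fraction of node pairs inside `module` that are connected in P."""
    pairs = list(combinations(sorted(module), 2))
    if not pairs:
        return 0.0
    hits = sum(1 for u, v in pairs if frozenset((u, v)) in p_edges)
    return hits / len(pairs)

def density_between(mod_a, mod_b, n_edges):
    """Fraction of cross pairs (one node per module) that are connected in N."""
    total = len(mod_a) * len(mod_b)
    if total == 0:
        return 0.0
    hits = sum(1 for u in mod_a for v in mod_b if frozenset((u, v)) in n_edges)
    return hits / total

# Toy input (hypothetical): two triangles in P, fully cross-connected in N.
p_edges = {frozenset(e) for e in [("a", "b"), ("a", "c"), ("b", "c"),
                                  ("d", "e"), ("d", "f"), ("e", "f")]}
n_edges = {frozenset((u, v)) for u in "abc" for v in "def"}
modules = {"M1": {"a", "b", "c"}, "M2": {"d", "e", "f"}}

within_m1 = density_within(modules["M1"], p_edges)
between = density_between(modules["M1"], modules["M2"], n_edges)
```

In this toy case both modules are cliques in P and the pair forms a biclique in N, so the module map would have two nodes joined by one link.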

Page 5: NetBioSIG2013-Talk David Amar


Algorithms
  Different definitions for the links and the optimization objective function
  Problems are NP-hard
  Approximation is also hard (weighted graphs)
  Our algorithmic strategy:
    Initiators: find a good initial solution
    Improvers: refine by merging/excluding modules

Page 6: NetBioSIG2013-Talk David Amar


Initiators
  Cluster P
    Hierarchical
    Node addition
  Find linked module pairs
    DICER: local search in P and N (Kelley and Ideker 2005; Amar et al. 2013)
    MBC-DICER: find bicliques
      Define candidate sets U and V that are bicliques in N
      Exhaustive solver (FP-MBC, Li et al. 2007); requires tuning

Page 7: NetBioSIG2013-Talk David Amar


Local improvement (DICER algorithm, Amar et al., PLoS CB 2013)
  Link: sum of N weights between modules is positive
  Goal: enlarge links
  Greedy approach: merge module links or add single nodes to a link
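The "add single nodes to a link" step can be sketched as follows. This is an illustrative greedy loop under stated assumptions, not the published DICER procedure: it looks only at N weights (DICER also uses P), it treats missing pairs as weight -1, and the names `link_score`, `greedy_enlarge`, and the toy weights are hypothetical.

```python
def link_score(mod_a, mod_b, weight):
    """Sum of N weights over all cross pairs; a link requires a positive sum."""
    return sum(weight(u, v) for u in mod_a for v in mod_b)

def greedy_enlarge(mod_a, mod_b, candidates, weight):
    """Repeatedly add the single node that increases the link score the most."""
    a, b = set(mod_a), set(mod_b)
    pool = set(candidates) - a - b
    while pool:
        best_gain, best_move = 0.0, None
        for node in sorted(pool):
            gain_a = sum(weight(node, v) for v in b)  # score change if node joins a
            gain_b = sum(weight(u, node) for u in a)  # score change if node joins b
            for gain, side in ((gain_a, "a"), (gain_b, "b")):
                if gain > best_gain:
                    best_gain, best_move = gain, (node, side)
        if best_move is None:  # no addition improves the link
            break
        node, side = best_move
        (a if side == "a" else b).add(node)
        pool.remove(node)
    return a, b

# Toy N weights (hypothetical): a, b, c each interact with x and y.
n_weights = {frozenset((u, v)): 1.0 for u in "abc" for v in "xy"}

def weight(u, v):
    return n_weights.get(frozenset((u, v)), -1.0)  # assumed non-edge weight

side_a, side_b = greedy_enlarge({"a"}, {"x"}, {"b", "c", "y", "z"}, weight)
final_score = link_score(side_a, side_b, weight)
```

Node z has no positive N weight to either side, so it is never added; the loop stops once no single addition raises the score.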

Page 8: NetBioSIG2013-Talk David Amar


Global analysis: node vs. module
  Null hypothesis: edges between v and M are drawn randomly (n = deg(v))
  Hypergeometric p-value
  Options for weighted graphs:
    Use the Wilcoxon rank-sum test
    Set a weight threshold and use the same (hypergeometric) test

(Figure: node v and its edges to module M vs. nodes not in M)
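The hypergeometric test for a node against a module can be written with the standard library alone. The sketch below assumes an unweighted graph given as an adjacency dictionary and assumes the population is all nodes other than v; the function names and the toy graph are hypothetical.

```python
from math import comb

def hypergeom_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when drawing n_draws items without replacement
    from n_total items, of which n_success are 'successes'."""
    denom = comb(n_total, n_draws)
    upper = min(n_draws, n_success)
    numer = sum(comb(n_success, i) * comb(n_total - n_success, n_draws - i)
                for i in range(k, upper + 1))
    return numer / denom

def node_vs_module_pvalue(v, module, neighbors, n_nodes):
    """p-value for the number of edges between v and module M,
    under the null that v's deg(v) edges are placed at random."""
    target = set(module) - {v}
    degree = len(neighbors[v])          # n = deg(v)
    observed = len(neighbors[v] & target)
    return hypergeom_tail(observed, degree, len(target), n_nodes - 1)

# Toy graph (hypothetical): 11 nodes, v has 3 edges, all into a 4-node module.
module = {"m1", "m2", "m3", "m4"}
neighbors = {"v": {"m1", "m2", "m3"}}
p_value = node_vs_module_pvalue("v", module, neighbors, n_nodes=11)
```

Here the p-value is C(4,3) / C(10,3) = 4/120, since all three edges of v must fall inside the four module nodes out of ten possible endpoints.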

Page 9: NetBioSIG2013-Talk David Amar


Global analysis: module vs. module
  Calculate a p-value for each node in V and each node in U
  Merge p-values using Fisher’s method
  Under the null hypothesis, the Fisher statistic -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom, where k is the number of p-values

(Figure: modules U and V, and other nodes)
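Fisher's method in its standard form can be sketched without external libraries, because the chi-square survival function has a closed form when the degrees of freedom are even. The sketch assumes independent p-values in (0, 1], which is the textbook setting and may only hold approximately for nodes of the same module; the function name is hypothetical.

```python
from math import exp, factorial, log

def fisher_combine(pvalues):
    """Combine k p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) is chi-square with 2k degrees of
    freedom under the null. For even degrees of freedom the upper tail is
    exp(-X/2) * sum_{i<k} (X/2)^i / i!, so no special functions are needed.
    """
    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)  # this is X / 2
    return exp(-half_stat) * sum(half_stat ** i / factorial(i) for i in range(k))

combined = fisher_combine([0.05, 0.05])
single = fisher_combine([0.3])
```

With a single p-value the combined value equals the input, which is a quick sanity check on the formula.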

Page 10: NetBioSIG2013-Talk David Amar


Global analysis
  Given a set of modules M and a set of significant links L, the solution is assigned a score
  Improvement steps: merge modules if the score improves (select the best step iteratively)
  Fast and accurate analysis:
    Decide when to recalculate p-values
    Perform many merges simultaneously

Page 11: NetBioSIG2013-Talk David Amar


Experimental Results

Page 12: NetBioSIG2013-Talk David Amar


(0) Simulations
  Graphs with 500 nodes; edge weight 1, non-edge weight -1
  Plant a tree map with 6 modules (module size 10-20)
  Add random Gaussian noise (mean 0, SD = 1.2), additional modules, bicliques

(Figure: Jaccard score, on an axis from 0 to 1, for the initiators MBC-DICER, DICER5, Hierarchical, NodeAddition, and DICER, in three panels: Global, Local, Initiator only)
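The simulation setup can be approximated with the sketch below. Several details are assumptions, since the slide does not state them: modules are disjoint, the tree map is built by linking each module to a random earlier one, P has weight 1 inside modules and N has weight 1 between linked modules (-1 elsewhere), and the extra modules and bicliques are left out. The function names are hypothetical, and the Jaccard helper is the plain set version; how the talk aggregated Jaccard scores over modules is not stated.

```python
import random

def simulate_planted_graph(n_nodes=500, n_modules=6, size_range=(10, 20),
                           noise_sd=1.2, seed=0):
    """Return (modules, links, p_weights, n_weights) for one simulated instance."""
    if n_modules * size_range[1] > n_nodes:
        raise ValueError("not enough nodes for disjoint modules")
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    modules, start = [], 0
    for _ in range(n_modules):
        size = rng.randint(*size_range)
        modules.append(set(nodes[start:start + size]))
        start += size
    # Tree map (assumed construction): attach each module to a random earlier one.
    links = [(rng.randrange(i), i) for i in range(1, n_modules)]
    link_set = {frozenset(pair) for pair in links}
    module_of = {v: i for i, m in enumerate(modules) for v in m}
    p_weights, n_weights = {}, {}
    for u in range(n_nodes):
        for v in range(u + 1, n_nodes):
            mu, mv = module_of.get(u), module_of.get(v)
            same = mu is not None and mu == mv
            linked = (mu is not None and mv is not None
                      and frozenset((mu, mv)) in link_set)
            # Edge weight 1, non-edge weight -1, plus Gaussian noise.
            p_weights[(u, v)] = (1.0 if same else -1.0) + rng.gauss(0.0, noise_sd)
            n_weights[(u, v)] = (1.0 if linked else -1.0) + rng.gauss(0.0, noise_sd)
    return modules, links, p_weights, n_weights

def jaccard(a, b):
    """Jaccard index of two sets; defined as 1.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Smaller instance than the 500-node default, to keep the demo quick.
modules, links, p_weights, n_weights = simulate_planted_graph(n_nodes=200, seed=1)
```

A tree over 6 modules has 5 links, and each weight dictionary holds one entry per unordered node pair.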

Page 13: NetBioSIG2013-Talk David Amar


(1) Yeast PPI and GI networks
  3979 genes
  P: protein-protein interactions (45,456 edges)
  N: negative genetic interactions (76,267 edges)
  Local improvers: poor results (fewer than 3 links)
  Results for global improver:

Initiator     Modules  Gene coverage  Max module size  Enriched GO terms  Enriched modules (%)  Enriched links (%)  Links
MBC-DICER     100      946            49               243                87                    80                  430
DICER5        103      957            46               249                82                    74                  438
DICER         104      837            34               192                67                    61                  498
Hierarchical  123      877            30               186                68                    59                  394
NodeAddition  102      950            49               240                83                    79                  430

Page 14: NetBioSIG2013-Talk David Amar


The yeast module map
  Link p < 10^-50
  Chromatin-related hubs, similar to Baryshnikova et al. 2011

Page 15: NetBioSIG2013-Talk David Amar


The top links in the map (p < 10^-70)
  Between complexes
  Between subcomplexes

Page 16: NetBioSIG2013-Talk David Amar


Comparison to extant methods
  Analysis of the Collins et al. 2007 data
  Comparing to extant methods that exploit both positive and negative GIs and their weights

Algorithm                            Number of modules  Gene coverage  Maximal module size  Number of enriched GO terms  Percent enriched modules  Percent enriched links  Number of links
MBC-DICER (Global)                   32                 238            20                   53                           84                        79                      67
Genecentric (Leiserson et al. 2011)  116                1248           25                   39                           63                        43                      58
Kelley and Kingsford 2011            117                355            17                   32                           17                        6                       403

Page 17: NetBioSIG2013-Talk David Amar


(2) Arabidopsis PPI & MD networks
  P: PPIs. N: metabolic dependencies (Tzfadia et al. 2012)
  Discover protein complexes and their metabolic links

Page 18: NetBioSIG2013-Talk David Amar


Using the module map for function prediction
  Validated modules by their ability to predict gene functions in MapMan
  Function assignment: the best assignment of the gene’s module
  LOOCV: precision and recall > 80%

New predictions

Gene       MapMan term                                Module p-value
AT5G48000  sulfur-containing.glucosinolates           0.0001
AT5G42590  sulfur-containing.glucosinolates           0.0001
AT2G30870  redox.ascorbate and glutathione.ascorbate  0.0028
AT4G15440  isoprenoids.carotenoids                    0.0002
AT1G62830  isoprenoids.carotenoids                    0.0003
AT4G01690  isoprenoids.carotenoids                    0.0003

Page 19: NetBioSIG2013-Talk David Amar


(3) Human case-control profiles
  Data: expression profiles of lung cancer (blood)
  P: multi-phenotype co-expression network
  N: differential correlation (DC): change in correlation in disease vs. controls
  Cross-validation: most links show high DC in the test set
  Link example:
    Breakage of immune activation in cancer (enrichment q-value < 1E-10)
    Enrichment for NSCLC-specific causal miRNA (mir-34 family, p = 0.002, miR2Disease DB)
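Differential correlation for a gene pair can be illustrated as the difference between its Pearson correlation in cases and in controls. This is an assumed, simplified definition for illustration only; the talk does not specify the exact DC statistic, and the function names and toy profiles below are hypothetical. Constant expression profiles would cause a division by zero and are not handled.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def differential_correlation(gene_a, gene_b, cases, controls):
    """Correlation in cases minus correlation in controls (assumed DC measure)."""
    r_cases = pearson(cases[gene_a], cases[gene_b])
    r_controls = pearson(controls[gene_a], controls[gene_b])
    return r_cases - r_controls

# Toy profiles (hypothetical): co-expressed in controls, anti-correlated in cases.
controls = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [2.0, 4.0, 6.0, 8.0]}
cases = {"g1": [1.0, 2.0, 3.0, 4.0], "g2": [8.0, 6.0, 4.0, 2.0]}
dc = differential_correlation("g1", "g2", cases, controls)
```

A strongly negative value like the one in this toy example corresponds to a pair whose co-expression breaks down in disease, the kind of change the "breakage of immune activation" link describes.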

Page 20: NetBioSIG2013-Talk David Amar


Summary
  Integration of networks
    Considering different interaction types
    A summary module-map
  Algorithms
    Initiators
    Improvers
  Algorithms perform well in simulations and real data
    PPI+GI
    PPI+MD
    Human disease: correlation and differential correlation
  Next steps (?)
    Cytoscape app (maybe next year…)
    Can we use module maps instead of gene networks for network inference?

Page 21: NetBioSIG2013-Talk David Amar


Thank you!

Ron Shamir