Network inference from repeated observations of node sets

Neil Clark, Avi Ma'ayan

Network Inference

[Figure: protein-protein interaction network; cell signaling network]

Overview

• Network inference - the deduction of an underlying network of interactions from indirect data.

1. A general class of network inference problem
2. Network inference approach
3. Applications:
   1. Inference of physical interactions: PPI
   2. Inference of gene associations: stem cell genes
   3. Inference of statistical interactions: drug/side effect network

GMT files

The inference problem

• Input: a set of entities (genes or proteins or ...) in the form of a GMT file - the results of experiments, or sampling more generally.

• Assumptions:
  1. An underlying network exists which relates the interactions between the entities in the GMT file.
  2. Each line of the GMT file contains information on the connectivity of the underlying network.

• The problem: Given a GMT file can we extract enough information to resolve the underlying network?
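To make the input concrete, here is a minimal sketch of reading a GMT file into node sets. It assumes the usual tab-separated GMT layout (set name, description, then the members); the function name `parse_gmt` and the example entries are hypothetical and only for illustration.

```python
from typing import Dict, List


def parse_gmt(text: str) -> Dict[str, List[str]]:
    """Parse GMT-formatted text into {set name: [entities]}.

    Assumes each line is tab-separated: name, description, then members.
    """
    sets: Dict[str, List[str]] = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        members = [f for f in fields[2:] if f]
        # Drop repeated entities within a line, keeping first-seen order.
        sets[fields[0]] = list(dict.fromkeys(members))
    return sets


# Toy input (made up): two "experiments", each observing a set of entities.
example = "exp1\tna\tA\tB\tC\nexp2\tna\tB\tC\tD\n"
gene_sets = parse_gmt(example)
```

Each value in `gene_sets` is one repeated observation of a node set, which is the only information the inference is allowed to use.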

A synthetic example

Approach...

• Forget for the moment that we know the underlying network and pretend we only have the GMT file.

• Attempt to use the accumulation of our coarse data to infer the fine details of the underlying network.

• Consider the set of all networks that are consistent with our data - there are likely to be many.

• Use an algorithm to sample this ensemble of networks randomly.

• The mean adjacency matrix gives the probability of each link being present within the ensemble.
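The sampling idea can be sketched as follows. The slides do not spell out what "consistent with our data" means or which sampling algorithm is used, so this sketch makes its own assumption: a network is consistent if the entities on every GMT line are connected to each other by links within that line. Under that assumption, one simple (hypothetical) sampler draws a random tree over each line and takes the union of the trees.

```python
import random


def sample_network(lines, rng):
    """Draw one network in which every GMT line is internally connected.

    Assumption (not from the slides): a random tree is built on each line
    by shuffling its nodes and attaching each node to a random earlier one.
    """
    edges = set()
    for members in lines:
        order = list(dict.fromkeys(members))  # unique nodes of this line
        rng.shuffle(order)
        for pos in range(1, len(order)):
            parent = order[rng.randrange(pos)]
            edges.add(frozenset((parent, order[pos])))
    return edges


def mean_adjacency(lines, n_samples=2000, seed=0):
    """Estimate, for each node pair, the fraction of sampled networks
    that contain a link between them (the mean adjacency matrix)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_samples):
        for edge in sample_network(lines, rng):
            counts[edge] = counts.get(edge, 0) + 1
    return {tuple(sorted(e)): c / n_samples for e, c in counts.items()}


# Toy GMT lines (made up for illustration).
toy_lines = [["A", "B", "C"], ["B", "C", "D"], ["A", "B"]]
link_probability = mean_adjacency(toy_lines, n_samples=4000, seed=1)
```

Pairs that never appear on the same line get no entry (probability zero), while pairs that co-occur often, or on short lines, accumulate a high link probability.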

Inference live!

Information content

Analytic Approximation

• When applying this approach to real data there are typically large numbers of nodes

• Sample space of networks can be very large -> computationally demanding

• Write a simple analytical approximation which mimics the action of the algorithm:

$$p_{ij} = 1 - \prod_{k} \left(1 - \frac{2\alpha}{n_{ij}^{k}}\right)$$
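A minimal sketch of evaluating this approximation is given below. The slide does not define its symbols, so the reading used here is an assumption: the product runs over the GMT lines k that contain both i and j, n is the number of entities on that line, and α is a free parameter. The function name and the clamping of each factor to a valid probability are additions for illustration.

```python
from itertools import combinations


def analytic_edge_probabilities(lines, alpha=1.0):
    """Approximate link probabilities as p_ij = 1 - prod_k (1 - 2*alpha/n_k).

    Assumed reading: k ranges over lines containing both i and j, and
    n_k is the number of distinct entities on line k.
    """
    miss = {}  # running product of (1 - 2*alpha/n_k) for each pair
    for members in lines:
        nodes = sorted(set(members))
        n = len(nodes)
        if n < 2:
            continue
        q = min(1.0, 2.0 * alpha / n)  # clamp so it stays a probability
        for pair in combinations(nodes, 2):
            miss[pair] = miss.get(pair, 1.0) * (1.0 - q)
    return {pair: 1.0 - m for pair, m in miss.items()}


# Toy GMT lines (made up for illustration).
toy_lines = [["A", "B", "C"], ["B", "C", "D"], ["A", "B"]]
approx = analytic_edge_probabilities(toy_lines, alpha=1.0)
```

With α = 1 the factor 2/n is the chance that a given pair is linked in a random tree on n nodes, which is why a closed form of this shape can stand in for explicit sampling when the sample space is too large.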

Compare analytic approximation

Correction for sampling bias

• Destroy any information by a random permutation of the GMT file and compare the actual edge weight to the distribution of edge weights from the randomly permuted GMT files.
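The permutation comparison might look like the sketch below. The slides do not say how the GMT file is permuted or how the comparison is scored, so two assumptions are made here: entities are swapped between lines so that line sizes and entity frequencies are preserved, and the actual edge weight is compared to the permuted distribution with a z-score. All function names are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean, pstdev


def edge_weights(lines, alpha=1.0):
    """Edge weights 1 - prod_k (1 - 2*alpha/n_k) over lines with both nodes."""
    miss = {}
    for members in lines:
        nodes = sorted(set(members))
        if len(nodes) < 2:
            continue
        q = min(1.0, 2.0 * alpha / len(nodes))
        for pair in combinations(nodes, 2):
            miss[pair] = miss.get(pair, 1.0) * (1.0 - q)
    return {pair: 1.0 - m for pair, m in miss.items()}


def permute_gmt(lines, rng, n_swaps=None):
    """Shuffle entities between lines, keeping line sizes and entity counts."""
    shuffled = [list(dict.fromkeys(m)) for m in lines]
    if len(shuffled) < 2:
        return shuffled
    if n_swaps is None:
        n_swaps = 10 * sum(len(m) for m in shuffled)
    for _ in range(n_swaps):
        a, b = rng.sample(range(len(shuffled)), 2)
        if not shuffled[a] or not shuffled[b]:
            continue
        i = rng.randrange(len(shuffled[a]))
        j = rng.randrange(len(shuffled[b]))
        x, y = shuffled[a][i], shuffled[b][j]
        # Only swap if it does not create a duplicate within a line.
        if x not in shuffled[b] and y not in shuffled[a]:
            shuffled[a][i], shuffled[b][j] = y, x
    return shuffled


def z_scores(lines, n_perm=200, alpha=1.0, seed=0):
    """Score each actual edge weight against its permuted distribution."""
    rng = random.Random(seed)
    actual = edge_weights(lines, alpha)
    null = {pair: [] for pair in actual}
    for _ in range(n_perm):
        permuted = edge_weights(permute_gmt(lines, rng), alpha)
        for pair in null:
            null[pair].append(permuted.get(pair, 0.0))
    scores = {}
    for pair, observed in actual.items():
        mu, sd = mean(null[pair]), pstdev(null[pair])
        scores[pair] = (observed - mu) / sd if sd > 0 else 0.0
    return scores


# Toy GMT lines (made up for illustration).
toy_lines = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "E"],
             ["C", "D", "E", "F"], ["F", "G"]]
scores = z_scores(toy_lines, n_perm=50, seed=1)
```

Because the permuted files keep how often each entity was sampled, an entity that simply appears on many lines no longer earns high scores by frequency alone.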

Application: inference of PPIs

Malovannaya A et al. Analysis of the human endogenous coregulator complexome. Cell. 2011 May 27;145(5):787-99

PPI network

Validation

• Compare inferred PPI network to the following databases:
  – BioCarta
  – HPRD PPI
  – InnateDB
  – IntAct
  – KEGG
  – MINT mammalia
  – MIPS
  – BioGrid
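One way to quantify such a comparison is sketched below. The slides do not state which metric was used, so this is only an assumed example: it ranks inferred edges by weight and reports the fraction of the top k that also appear in a reference set of known interactions. The function name and the toy values are hypothetical.

```python
def precision_at_k(weights, reference, k):
    """Fraction of the k highest-weight inferred edges found in `reference`.

    `weights` maps node pairs to inferred edge weights; `reference` is an
    iterable of known interacting pairs (order within a pair is ignored).
    """
    known = {frozenset(pair) for pair in reference}
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if frozenset(pair) in known)
    return hits / len(top)


# Toy values (made up): inferred weights and a small reference list.
inferred = {("A", "B"): 0.9, ("B", "C"): 0.5, ("A", "C"): 0.1}
reference = [("B", "A"), ("A", "C")]
top2 = precision_at_k(inferred, reference, 2)
```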

Comparison

Validation

Validation

Application to stem cells

• We used two types of high-throughput data from the ESCAPE database (www.maayanlab.net/ESCAPE).

• ChIP-X data: from ChIP-chip and ChIP-seq experiments.
  – 203,190 protein-DNA binding interactions in the proximity of coding regions from 48 ESC-relevant source proteins.

• Logof followed by microarray data: a manually compiled database of protein-mRNA regulatory interactions derived from loss-of-function/gain-of-function experiments followed by microarray profiling.
  – 154,170 interactions from 16 ESC-relevant regulatory proteins from loss-of-function studies, and 54 from gain-of-function studies.

ChIP-X network

Logof network

Combining networks

• Each data source gives a different perspective on the associations between the genes

• New insights may be gained by combining the different perspectives, e.g. small but consistent associations across different perspectives will be revealed by the enhanced signal-to-noise ratio.

$$p_{ij} = 1 - \prod_{k_1} \left(1 - \frac{2\alpha}{n_{ij}^{k_1}}\right) \prod_{k_2} \left(1 - \frac{2\beta}{n_{ij}^{k_2}}\right) \cdots$$
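Because the combined formula is a product of one factor per data source, it can be evaluated from per-source link probabilities: each source contributes the probability that the link is absent, and the combined probability is one minus the product of those. The sketch below assumes the per-source probabilities have already been computed; the function name and the toy numbers are hypothetical.

```python
def combine_edge_probabilities(*sources):
    """Combine per-source link probabilities as 1 - prod_s (1 - p_s).

    Each source is a dict mapping a node pair to a probability; a pair
    missing from a source is treated as having probability 0 there.
    """
    miss = {}
    for probabilities in sources:
        for pair, p in probabilities.items():
            key = tuple(sorted(pair))  # treat (A, B) and (B, A) as one pair
            miss[key] = miss.get(key, 1.0) * (1.0 - p)
    return {pair: 1.0 - m for pair, m in miss.items()}


# Toy per-source probabilities (made up for illustration).
source_one = {("A", "B"): 0.2, ("B", "C"): 0.5}
source_two = {("B", "A"): 0.3}
combined = combine_edge_probabilities(source_one, source_two)
```

A pair with a weak association in each source ends up with a larger combined probability than in either source alone, which is the signal-to-noise gain the slide refers to.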

Combination of ChIP-X and Logof

An extension of the approach...

Application II: inference of a network of statistical relationships in the AERS database

• Adverse Event Reporting System (AERS) database contains records of ....

AERS Record 1 | Drug 1, Drug 2, ... | Side-effect 1, Side-effect 2, ...
AERS Record 2 | Drug 3, Drug 4, ... | Side-effect 3, Side-effect 4, ...
… | … | …
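Each AERS record can be treated like one line of a GMT file: a single observation of a set of nodes, here drugs and side-effects together. The sketch below shows that conversion under assumptions of its own: the record layout (a pair of lists) and the `drug:` / `effect:` prefixes are hypothetical, chosen only to keep the two node types distinguishable.

```python
def records_to_node_sets(records):
    """Turn (drugs, side_effects) records into GMT-like node sets.

    Prefixes keep drug and side-effect names apart; this naming scheme
    is an illustrative assumption, not the AERS format itself.
    """
    node_sets = []
    for drugs, effects in records:
        nodes = [f"drug:{d}" for d in drugs] + [f"effect:{e}" for e in effects]
        node_sets.append(list(dict.fromkeys(nodes)))  # drop repeats
    return node_sets


# Placeholder records mirroring the layout on the slide.
records = [
    (["Drug 1", "Drug 2"], ["Side-effect 1", "Side-effect 2"]),
    (["Drug 3", "Drug 4"], ["Side-effect 3", "Side-effect 4"]),
]
node_sets = records_to_node_sets(records)
```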

AERS sub network

AERS Large-scale Adjacency Matrix

And finally…

Summary

• We described a general class of problem in network inference.

• A network of physical interactions between proteins is inferred based on high-throughput IP/MS experiments

• The method has been applied to examine associations between stem-cell genes from multiple perspectives

• We have begun to apply the approach to the inference of statistical interactions between drugs and side-effects based on the AERS database

• More details can be found on the website: www.maayanlab.net/S2N