Network-augmented Genomic Analysis (NAGA) Applied to Cystic Fibrosis studies Salvatore Loguercio, Ph.D. [email protected] @sal99k http://sulab.org July 11, 2014 Network Biology SIG – ISMB 2014


NetBioSIG2014 at ISMB in Boston, MA, USA on July 11, 2014


Page 1: NetBioSIG2014-Talk by Salvatore Loguercio

Network-augmented Genomic Analysis (NAGA) Applied to Cystic Fibrosis studies

Salvatore Loguercio, Ph.D. [email protected]

@sal99k http://sulab.org

July 11, 2014 Network Biology SIG – ISMB 2014

Page 2: NetBioSIG2014-Talk by Salvatore Loguercio

Cystic fibrosis overview

• Inherited, recessive, chronic disease causing chest infections, lung damage, and bowel obstruction.

• 30,000 children and adults in the US (70,000 worldwide); 1,000 new cases diagnosed each year.

• Predicted median age of survival for a person with CF: late 30s.

• Primary therapy: airway clearance techniques (ACT)

Source: Cystic Fibrosis Foundation

Page 3: NetBioSIG2014-Talk by Salvatore Loguercio

CFTR and mucus flow

Source: http://www.flickr.com/photos/ajc1/3737955649

• Mutations cause the body to produce unusually thick, sticky mucus

• Clogs the lungs and leads to life-threatening lung infections

• Obstructs the pancreas and stops natural enzymes from helping the body break down and absorb food

Page 4: NetBioSIG2014-Talk by Salvatore Loguercio

CFTR mutations affect protein folding and export

[Figure: CFTR trafficking diagram. Labels: ER, Golgi, endosomes, lysosome (degradation), apical surface (chloride conductance). ΔF508 CFTR cannot exit the ER. Inset: SDS-PAGE of WT and ΔF508 CFTR, bands B and C. Credit: Bill Balch]

Page 5: NetBioSIG2014-Talk by Salvatore Loguercio

A systematic approach to CF correction

Cell line: CFBE

Functional: siRNA screen

ΔF508 CFTR against PN library*

368 siRNAs that significantly rescue CFTR function

*Collection of 2,500 siRNAs targeting proteins involved in protein homeostasis (‘proteostasis’)

Biochemical: MudPIT proteomics

775 differentially interacting proteins (WT/ ΔF508-CFTR)

Page 6: NetBioSIG2014-Talk by Salvatore Loguercio

A systematic approach to CF correction

Functional: siRNA screen

ΔF508 CFTR against PN library*

368 siRNAs that significantly rescue CFTR function

Biochemical: MudPIT proteomics

775 differentially interacting proteins (WT/ ΔF508-CFTR)

(368)

Page 7: NetBioSIG2014-Talk by Salvatore Loguercio

Connect Functional with Biochemical data

Page 8: NetBioSIG2014-Talk by Salvatore Loguercio

Connect siRNA hits to a target through the Human Interactome

[Figure: network diagram connecting siRNA hits to the target; labels: Target, 1, 2, 3]

I) Compute all shortest paths from siRNA hits to the target through a weighted protein interaction network (Dijkstra's algorithm).

II) Prioritize connecting proteins specific to the set of high-scoring siRNA hits considered.
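Steps I and II on this slide can be illustrated with a minimal sketch. It is not the code used in the talk: the function names, the dict-of-dicts graph layout, and the toy network are assumptions made for illustration. For brevity it keeps one shortest path per hit (ties broken arbitrarily) rather than enumerating all shortest paths, and it runs Dijkstra once from the target, which gives the same paths in an undirected network.

```python
import heapq

def dijkstra_tree(graph, target):
    """Distances and predecessors from `target` in an undirected weighted graph.

    graph: dict mapping node -> {neighbor: non-negative edge weight}.
    Node names are assumed to be mutually comparable (e.g. all strings).
    """
    dist = {target: 0.0}
    prev = {}
    heap = [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def connector_counts(graph, hits, target):
    """For each intermediate protein, count the distinct hits routed through it."""
    _, prev = dijkstra_tree(graph, target)
    counts = {}
    for hit in set(hits):
        if hit not in prev:
            continue  # hit is the target itself or cannot reach it
        node = prev[hit]
        while node != target:
            counts[node] = counts.get(node, 0) + 1
            node = prev[node]
    return counts

# Hypothetical toy network: three hits reach the target via proteins A and B.
toy = {
    "CFTR": {"A": 1.0, "B": 1.5},
    "A": {"CFTR": 1.0, "h1": 1.0, "h2": 1.0},
    "B": {"CFTR": 1.5, "h3": 1.0},
    "h1": {"A": 1.0},
    "h2": {"A": 1.0},
    "h3": {"B": 1.0},
}
toy_counts = connector_counts(toy, ["h1", "h2", "h3"], "CFTR")
```

In the toy network, protein A lies on the paths of two hits and protein B on one; these per-protein counts are the quantity the later randomization step compares against random hit sets.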

Page 9: NetBioSIG2014-Talk by Salvatore Loguercio

I. Build integrated PPI network

II. Run Shortest Path analysis

III. Control for unrelated protein hubs

Page 10: NetBioSIG2014-Talk by Salvatore Loguercio

I. Build a weighted protein interaction network – include MS data

Publicly available interaction data: from 10 source databases and 11 studies

14,796 proteins, 169,625 interactions

Quality score in [0, 1] for each interaction, based on experimental evidence*

*Source: Human Integrated Protein-Protein Interaction rEference (HIPPIE)

d = 9; average path length: 3.6

+ Experimental interactome (nodes + edges)

Updated scores, based on databases and experimental interactome:

$$S(u,v) = 2 - S_{exp} - S_{db}$$

$$S_{exp} = \begin{cases} 1 & \text{if } e(u,v) \in \text{exp} \\ 0 & \text{otherwise} \end{cases}$$
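The scoring rule S(u,v) = 2 – Sexp – Sdb can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: Sdb is taken to be the database quality score in [0, 1], edges seen only in the experimental interactome are given Sdb = 0, and the function names and input layout are hypothetical.

```python
def edge_weight(s_db, in_experiment):
    """Edge weight S(u,v) = 2 - S_exp - S_db; lower means better supported."""
    if not 0.0 <= s_db <= 1.0:
        raise ValueError("S_db is expected to lie in [0, 1]")
    s_exp = 1.0 if in_experiment else 0.0
    return 2.0 - s_exp - s_db

def build_weighted_graph(db_scores, exp_edges):
    """Merge database and experimental edges into an undirected weighted graph.

    db_scores: {(u, v): quality score in [0, 1]}
    exp_edges: iterable of (u, v) pairs from the experimental interactome
    Assumption: edges absent from the databases get S_db = 0.
    """
    exp = {frozenset(e) for e in exp_edges}
    scores = {frozenset(e): s for e, s in db_scores.items()}
    graph = {}
    for edge in exp | set(scores):
        if len(edge) != 2:
            continue  # skip self-loops
        u, v = tuple(edge)
        w = edge_weight(scores.get(edge, 0.0), edge in exp)
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph
```

With this rule, weights fall in [0, 2]: an interaction confirmed experimentally and well supported in the databases gets a weight near 0, so shortest paths preferentially run through it.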

Page 11: NetBioSIG2014-Talk by Salvatore Loguercio

[Figure: network diagram connecting siRNA hits to the target; labels: Target, 1, 2, 3]

I. Build integrated PPI network

II. Run Shortest Path analysis

Page 12: NetBioSIG2014-Talk by Salvatore Loguercio

[Figure: network diagram connecting siRNA hits to the target; labels: Target, 1, 2, 3]

I. Build integrated PPI network

II. Run Shortest Path analysis

III. Control for unrelated protein hubs

Page 13: NetBioSIG2014-Talk by Salvatore Loguercio

Randomization – select proteins specific for the set of siRNA hits

[Figure: randomization workflow. From the siRNA library, randomly select a subset of the same size as the target set; run the shortest path analysis to the target; repeat n times to obtain a randomized “hubness” for each connecting node.]

For each protein connecting siRNA hits to the target, compute:

Nsp: number of distinct siRNA hits that use the protein on their shortest path to the target

Nrnd: randomized Nsp

$$p\text{-value} = \frac{\mathrm{sum}(N_{rnd} \geq N_{sp})}{\mathrm{length}(N_{rnd})}$$

Nsp, Nrnd, and the associated p-value are used to prioritize connecting proteins specific to the set of siRNA hits considered.
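The randomization and the empirical p-value on this slide can be sketched as below. This is a hedged illustration: `count_fn` is a hypothetical callback standing in for the shortest path analysis (it takes a list of siRNA targets and returns a per-protein count), and the default number of iterations and the seed are arbitrary choices, not values from the talk.

```python
import random

def empirical_p_value(n_sp, n_rnd):
    """Fraction of randomized counts that are at least as large as the observed Nsp."""
    if not n_rnd:
        raise ValueError("n_rnd must contain at least one randomized count")
    return sum(1 for x in n_rnd if x >= n_sp) / len(n_rnd)

def randomized_runs(library, n_hits, count_fn, n_iter=1000, seed=0):
    """Repeat the analysis on random subsets of the library.

    library:  all siRNA targets that were screened
    n_hits:   size of the real hit set (random subsets match this size)
    count_fn: hypothetical callback, list of hits -> {protein: count}
    Returns one {protein: count} dict per iteration.
    """
    rng = random.Random(seed)
    pool = sorted(library)
    return [count_fn(rng.sample(pool, n_hits)) for _ in range(n_iter)]

def null_counts(runs, protein):
    """Nrnd for one protein; iterations where it is not used count as 0."""
    return [run.get(protein, 0) for run in runs]
```

For example, an observed Nsp of 3 against randomized counts [0, 1, 3, 4] gives a p-value of 2/4 = 0.5, since two of the four random draws reach a count of 3 or more.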

Page 14: NetBioSIG2014-Talk by Salvatore Loguercio

CFTR – PN connectors – first degree – real vs. randomized

Select: Nsp ≥ 3 and Nsp/Nrnd ≥ 2 (12 proteins)
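The selection rule on this slide could be applied as in the sketch below. The slide does not say how Nrnd is summarized in the ratio; taking the mean of the randomized counts is an assumption made here, as are the function name and the handling of proteins that never appear in random runs.

```python
from statistics import mean

def select_connectors(n_sp, n_rnd, min_nsp=3, min_ratio=2.0):
    """Keep proteins with Nsp >= min_nsp and Nsp / mean(Nrnd) >= min_ratio.

    n_sp:  {protein: observed Nsp}
    n_rnd: {protein: list of randomized counts}
    Assumption: a protein with a mean randomized count of 0 passes the ratio test.
    """
    selected = []
    for protein, observed in n_sp.items():
        if observed < min_nsp:
            continue
        values = n_rnd.get(protein) or [0]
        expected = mean(values)
        if expected == 0 or observed / expected >= min_ratio:
            selected.append(protein)
    return sorted(selected)
```

Both thresholds are exposed as parameters so the trade-off between specificity and coverage can be explored.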

Page 15: NetBioSIG2014-Talk by Salvatore Loguercio

Assessing candidate regulators

42 candidate regulators

• 31 previously screened: 22 (71%) previously identified as hits

• 11 novel genes: 8 (73%) validate in de novo experiments

Page 16: NetBioSIG2014-Talk by Salvatore Loguercio

Validation of predicted protein targets

[Figure: siRNA screen, CFTR rescue of function; three entries marked x]

8/11 (73%) novel candidate regulators validate

Page 17: NetBioSIG2014-Talk by Salvatore Loguercio

Validation of predicted targets - Specificity

siRNA screen, CFTR rescue of function

New condition: Vx-809 drug

X: predicted : validated

Columns: Gene Symbol; Solo vs. MudPIT; Vx809 vs. MudPIT

SRRM1 x
CDC5L x
NDKB x
TPR x
AIFM1 x
2ABB x
KPCD2 x
PLSCR1 x
MAP3K14 x
TFG x x
XRCC5 x x
CTNB1 x
XPO1 x
MCM7 x
WDR61 x
PP2AB x
H2AFX x
MYC x

Page 18: NetBioSIG2014-Talk by Salvatore Loguercio

Validation of predicted targets - Coverage

siRNA screen, CFTR rescue of function

Restrain flow through a subset of direct interactors

X: predicted : validated

Columns: Gene Symbol; Solo vs. MudPIT (partial); Solo vs. MudPIT (full); Vx809 vs. MudPIT (full)

SRRM1 x x
EIF3L x
STAU1 x
CAN2 x
SNRPA x
AUP1 x

Good specificity, sub-optimal coverage

Page 19: NetBioSIG2014-Talk by Salvatore Loguercio

Summary

• NAGA is a network-based method to integrate functional genomics data (e.g. siRNA screens) with interactomics datasets (e.g. AP-MS, MudPIT)

• Useful for prioritizing novel functional targets and for identifying relevant network modules

• It leverages publicly available information on protein-protein interactions and thus is readily applicable to many scenarios where a connection between functional and biochemical data is sought

• Good specificity, coverage to be improved

Page 20: NetBioSIG2014-Talk by Salvatore Loguercio

Contact [email protected]

@sal99k http://sulab.org

Andrew Su

Su Lab

William Balch

Darren Hutt

Daniela Roth

Chao Wang

Anita Pottekat

Sumit Chanda

Stephen Soon

Dieter Wolf

Trey Ideker

Anne Carvunis

Jean Wang

Daniel Quan

Travel funding to ISMB 2014 was generously provided by NSF and the NetBio SIG committee