NetBioSIG2014 at ISMB in Boston, MA, USA on July 11, 2014
Text of NetBioSIG2014-Talk by Salvatore Loguercio
Network-augmented Genomic Analysis (NAGA) Applied to Cystic Fibrosis Studies
Salvatore Loguercio, Ph.D.
[email protected] @sal99k http://sulab.org
July 11, 2014, Network Biology SIG, ISMB 2014
Cystic fibrosis overview
Inherited, recessive, chronic disease characterized by chest infection, lung damage, and bowel obstruction.
30,000 children and adults in the US (70,000 worldwide); 1,000 new cases diagnosed each year.
Predicted median age of survival for a person with CF: late 30s.
Primary therapy: airway clearance techniques (ACT).
Source: Cystic Fibrosis Foundation
CFTR and mucus flow
Source: http://www.flickr.com/photos/ajc1/3737955649
Mutations cause the body to produce unusually thick, sticky mucus.
The mucus clogs the lungs and leads to life-threatening lung infections.
It also obstructs the pancreas and stops natural enzymes from helping the body break down and absorb food.
CFTR mutations affect protein folding and export
Figure (credit: Bill Balch): trafficking of WT CFTR versus ΔF508 CFTR. Labels: ER, Golgi, endosomes, lysosome, apical surface, degradation, SDS-PAGE (B, C), WT chloride conductance.
ΔF508 CFTR cannot exit the ER.
A systematic approach to CF correction
Cell line: CFBE
Functional: siRNA screen of ΔF508 CFTR against the PN library*; 368 siRNAs significantly rescue CFTR function.
Biochemical: MudPIT proteomics; 775 differentially interacting proteins (WT/ΔF508 CFTR).
*Collection of 2500 siRNAs targeting proteins involved in protein homeostasis (proteostasis).
Connect Functional with Biochemical data
Connect siRNA hits to a target through the Human Interactome
I) Compute all shortest paths from siRNA hits to the target through a weighted protein interaction network (Dijkstra algorithm).
II) Prioritize connecting proteins specific to the set of high-scoring siRNA hits considered.
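Step I can be illustrated with a minimal sketch. It is not the speaker's implementation: the toy network, the node names (H1, H2, H3, A, B, T) and the conversion of a quality score into an edge cost (cost = 1 - score) are all assumptions made for illustration. Because the network is undirected, a single Dijkstra run from the target yields one cheapest path for every hit; ties between equally cheap paths are not enumerated here.

```python
import heapq


def build_graph(edges):
    """Build an undirected graph {node: {neighbor: quality_score}}."""
    graph = {}
    for u, v, score in edges:
        graph.setdefault(u, {})[v] = score
        graph.setdefault(v, {})[u] = score
    return graph


def dijkstra_from(graph, source):
    """Return (dist, prev) for the cheapest paths starting at source."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, score in graph.get(u, {}).items():
            cost = 1.0 - score  # ASSUMPTION: high-confidence edges are cheap
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def paths_to_target(graph, hits, target):
    """Map each reachable hit to one cheapest path [hit, ..., target]."""
    _, prev = dijkstra_from(graph, target)
    paths = {}
    for hit in hits:
        if hit != target and hit not in prev:
            continue  # hit is not connected to the target
        path = [hit]
        while path[-1] != target:
            path.append(prev[path[-1]])  # prev points one step closer to target
        paths[hit] = path
    return paths


# Hypothetical toy network: three hits (H1-H3), two connectors (A, B), target T.
toy_edges = [
    ("H1", "A", 0.9), ("A", "T", 0.8), ("H1", "T", 0.1),
    ("H2", "A", 0.7), ("H3", "B", 0.9), ("B", "T", 0.9),
]
toy_graph = build_graph(toy_edges)
paths = paths_to_target(toy_graph, ["H1", "H2", "H3"], "T")
```

In this toy example H1 reaches T through A rather than over the direct but low-confidence H1-T edge, which is the behavior a weighted network is meant to give.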
I. Build integrated PPI network II. Run Shortest Path analysis
III. Control for unrelated protein hubs
I. Build a weighted protein interaction network
Publicly available interaction data, from 10 source databases and 11 studies: 14796 proteins, 169625 interactions.
Quality score in [0, 1] for each interaction, based on experimental evidence*.
d = 9; average path length: 3.6.
Include MS data + experimental interactome (nodes + edges).
Updated scores, based on databases and experimental interactome: S(u,v) = 2 Sexp Sdb, where Sexp = 1 if e(u,v) is in exp, 0 otherwise.
*Source: Human Integrated Protein-Protein Interaction reference (HIPPIE)
I. Build integrated PPI network II. Run Shortest Path analysis
I. Build integrated PPI network II. Run Shortest Path analysis III. Control for unrelated protein hubs
Randomization: select proteins specific for the set of siRNA hits
From the siRNA library, randomly select a subset of the same size as the target set and run the shortest path analysis. Repeat n times to obtain a randomized hubness for each connecting node.
For each protein connecting siRNA hits to the target, compute:
Nsp: number of distinct siRNA hits that use the protein on their shortest path to the target
Nrnd: randomized Nsp
p-value = ( ) ( )
Nsp, Nrnd and the associated p-value are used to prioritize connecting proteins specific to the set of siRNA hits considered.
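The hub control described above can be sketched as follows. Several parts are assumptions and not taken from the talk: the edge cost (1 - quality score), the definition of Nrnd as the mean usage over n random draws, and the empirical p-value (1 + number of draws with usage at least Nsp) / (1 + n), since the p-value formula did not survive in the slide text. The function names and the toy network are hypothetical.

```python
import heapq
import random
from collections import Counter


def cheapest_prev(graph, target):
    """Dijkstra from target; prev[v] is the next node on v's path to target."""
    dist, prev, heap = {target: 0.0}, {}, [(0.0, target)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, score in graph[u].items():
            nd = d + (1.0 - score)  # ASSUMPTION: cost = 1 - quality score
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return prev


def usage_counts(prev, sources, target):
    """Count, per connecting protein, the distinct sources routed through it."""
    counts = Counter()
    for s in set(sources):
        if s not in prev:
            continue  # unreachable, or the target itself
        node = prev[s]
        while node != target:
            counts[node] += 1
            node = prev[node]
    return counts


def hub_control(graph, hits, library, target, n=200, seed=0):
    """Return {protein: {"Nsp", "Nrnd", "p_value"}} for connectors of the hits."""
    prev = cheapest_prev(graph, target)  # the target is fixed, so compute once
    nsp = usage_counts(prev, hits, target)
    rng = random.Random(seed)
    totals, at_least = Counter(), Counter()
    for _ in range(n):
        sample = rng.sample(library, len(hits))  # same size as the hit set
        rnd = usage_counts(prev, sample, target)
        for protein, observed in nsp.items():
            totals[protein] += rnd[protein]
            if rnd[protein] >= observed:
                at_least[protein] += 1
    return {
        p: {
            "Nsp": nsp[p],
            "Nrnd": totals[p] / n,  # ASSUMPTION: mean usage over random draws
            "p_value": (1 + at_least[p]) / (1 + n),  # ASSUMPTION: empirical
        }
        for p in nsp
    }


# Hypothetical toy network: hits H1-H3 route through A, other library
# members L1-L4 route through B, and both connectors touch the target T.
graph = {}
for u, v in [("H1", "A"), ("H2", "A"), ("H3", "A"), ("A", "T"),
             ("L1", "B"), ("L2", "B"), ("L3", "B"), ("L4", "B"), ("B", "T")]:
    graph.setdefault(u, {})[v] = 0.9
    graph.setdefault(v, {})[u] = 0.9

hits = ["H1", "H2", "H3"]
library = hits + ["L1", "L2", "L3", "L4"]
result = hub_control(graph, hits, library, "T")
```

A connector that is used by the hits much more often than by random library subsets (high Nsp, low Nrnd, small p-value) is specific to the hit set; a generic hub scores similarly in both.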
CFTR PN connectors, first degree: real vs. randomized Nsp
Select: Nsp ≥ 3 and Nsp/Nrnd ≥ 2 (12 proteins)
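The selection step can be written as a simple filter on Nsp and on the ratio Nsp/Nrnd. This is a sketch under assumptions: the comparison operators are not legible in the slide text, so "at least" is assumed for both thresholds, and the protein names and values below are made up for illustration.

```python
def select_connectors(stats, min_nsp=3, min_ratio=2.0):
    """Keep proteins with Nsp >= min_nsp and Nsp/Nrnd >= min_ratio.

    stats maps protein -> (Nsp, Nrnd). Both comparisons use "at least",
    which is an ASSUMPTION about the thresholds shown on the slide.
    """
    selected = []
    for protein, (nsp, nrnd) in stats.items():
        # A protein never used in the random draws has an unbounded ratio.
        ratio = float("inf") if nrnd == 0 else nsp / nrnd
        if nsp >= min_nsp and ratio >= min_ratio:
            selected.append(protein)
    return sorted(selected)


# Hypothetical values, not data from the talk.
toy_stats = {
    "P1": (5, 1.0),  # kept: Nsp 5, ratio 5.0
    "P2": (2, 0.5),  # dropped: Nsp below 3
    "P3": (4, 3.0),  # dropped: ratio about 1.33
    "P4": (3, 0.0),  # kept: never used in random draws
}
selected = select_connectors(toy_stats)
```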
Assessing candidate regulators
42 candidate regulators: 31 previously screened, 11 novel genes.
Of the 31 previously screened, 22 (71%) were previously identified as hits.
Of the 11 novel genes, 8 (73%) validate in de novo experiments.
Validation of predicted protein targets
siRNA screen, CFTR rescue of function: 8/11 (73%) novel candidate regulators validate.
Validation of predicted targets - Specificity
siRNA screen, CFTR rescue of function. New condition: Vx-809 drug.
X: predicted : validated
Columns: Gene Symbol; Solo vs. MudPIT; Vx809 vs. MudPIT
SRRM1 x
CDC5L x
NDKB x
TPR x
AIFM1 x
2ABB x
KPCD2 x
PLSCR1 x
MAP3K14 x
TFG x x
XRCC5 x x
CTNB1 x
XPO1 x
MCM7 x
WDR61 x
PP2AB x
H2AFX x
MYC x
Validation of predicted targets - Coverage
siRNA screen, CFTR rescue of function.
X: predicted : validated
Restrain flow through a subset of direct interactors.
Columns: Gene Symbol; Solo vs. MudPIT (partial); Solo vs. MudPIT (full); Vx809 vs. MudPIT (full)
SRRM1 x x
EIF3L x
STAU1 x
CAN2 x
SNRPA x
AUP1 x
Good specificity, sub-optimal coverage.
Summary
NAGA is a network-based method to integrate functional genomics data (e.g. siRNA screens) with interactomics datasets (e.g. AP-MS, MudPIT).
It is useful for prioritizing novel functional targets and for identifying relevant network modules.
It leverages publicly available information on protein-protein interactions and thus is readily applicable to many scenarios where a connection between functional and biochemical data is sought.
Good specificity; coverage to be improved.
Contact: [email protected] @sal99k http://sulab.org
Andrew Su, Su Lab, William Balch, Darren Hutt, Daniela Roth, Chao Wang, Anita Pottekat, Sumit Chanda, Stephen Soon, Dieter Wolf, Trey Ideker, Anne Carvunis, Jean Wang, Daniel Quan
Travel funding to ISMB 2014 was generously provided by NSF and the NetBio SIG committee.