Cytoscape Cyberinfrastructure Leveraging Microservices to the Cloud and Beyond Chapter 1 ISMB/NetBioSIG 2015 Dublin, Ireland July 10, 2015 Keiichiro Ono, Dexter Pratt & Barry Demchak Ideker Lab 1


Page 1: Cytoscape ci chapter 1

Cytoscape Cyberinfrastructure
Leveraging Microservices to the Cloud and Beyond

Chapter 1

ISMB/NetBioSIG 2015
Dublin, Ireland
July 10, 2015

Keiichiro Ono, Dexter Pratt & Barry Demchak
Ideker Lab

Page 2: Cytoscape ci chapter 1

Cytoscape’s 3 Wishes

More …
  - memory for networks
  - cores for analysis
  - code reusability
  - languages/libraries for coding

- Better browser presence
- Access to long running calculations
- Quicker/cheaper novel workflows
- Higher quality, more shareable code
- Even more vibrant NB community

Page 3: Cytoscape ci chapter 1

Cytoscape Cyberinfrastructure (CI)

Internet-based computing ecosystem that
  - Complements Cytoscape
  - Supports producers, consumers, and operators (as COIs)
  - Scales and evolves to support data acquisition, computing, storage, management, integration, mining, and visualization
  - Sharable and Testable
  - Coevolution – community w/ CI / community

Service Oriented Architecture (SOA)
  - Microservices + data bus + discovery

Page 4: Cytoscape ci chapter 1

Roadmap

- Existing Ecosystem
- Cyberinfrastructure (CI) & Network Biology
  - Use Cases
  - Strategy
- Technology
  - SOA & REST
  - CX & middleware
- CI Now & Later
- Support
- Call to Community

Page 5: Cytoscape ci chapter 1

Existing Ecosystem

- Visual Workflow Systems: Taverna & Galaxy (& MyExperiment)
- Service Repositories: BioCatalogue
- General programming languages & tools: Python, R, Java, Matlab, IPython/Jupyter
- Network Analysis & Visualization: Cytoscape, cytoscape.js, GeneMANIA
- Cytoscape CI: shared workflows, shared services, novel workflows, scalable

Page 6: Cytoscape ci chapter 1

CI & Network Biology

[Diagram: Clients and Services]
- Workflow steps: Identify network -> Add data to network -> Layout nodes -> Color nodes -> Publish
- Services: BridgeDB, New Service (x4)

Critical CI Outcomes
- Cheap services ~ innovation
- Reproducible workflows
- Interoperable tool chains
- Code & algorithm reusability
- Community Community Community
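
The workflow on this slide (identify network, add data, layout nodes, color nodes) can be pictured as a chain of small services, each taking a network and returning a network. The sketch below is only an illustration of that idea: the function names, the dictionary standing in for a network, and the placeholder logic in each step are assumptions, not part of Cytoscape CI. The Publish step is left out for brevity.

```python
from functools import reduce
from typing import Callable, Dict, List

Network = Dict[str, object]            # hypothetical in-memory stand-in for a network
Service = Callable[[Network], Network]

def identify_network(net: Network) -> Network:
    # Placeholder: pretend a query returned three nodes.
    return {**net, "nodes": ["A", "B", "C"]}

def add_data(net: Network) -> Network:
    # Placeholder: attach one made-up score per node.
    return {**net, "scores": {n: i for i, n in enumerate(net["nodes"])}}

def layout_nodes(net: Network) -> Network:
    # Placeholder: place nodes on a horizontal line.
    return {**net, "positions": {n: (100 * i, 0) for i, n in enumerate(net["nodes"])}}

def color_nodes(net: Network) -> Network:
    # Placeholder: color by score.
    return {**net, "colors": {n: ("red" if s > 0 else "gray")
                              for n, s in net["scores"].items()}}

def run_workflow(steps: List[Service], net: Network) -> Network:
    # Each step only sees the network it is handed, so any step can be swapped out.
    return reduce(lambda acc, step: step(acc), steps, net)

workflow = [identify_network, add_data, layout_nodes, color_nodes]
result = run_workflow(workflow, {})
print(result["colors"])
```

Because every step shares one signature, replacing `layout_nodes` with a different layout service would not require touching the other steps, which is the interoperability and reusability outcome the slide lists.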

Page 7: Cytoscape ci chapter 1

CI - Future of Publishing

Page 8: Cytoscape ci chapter 1

NAV – CI-based Workflow

Page 9: Cytoscape ci chapter 1

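Generic Microservices

[Diagram, with a Time axis: in the first panel a Producer calls StoreData(xxx) on a Database and receives OK; in the second panel the same StoreData(xxx) call and OK reply travel between Producer and Database through a Message Bus.]

For a service, the meaning of life: y = f(x)

Benefits
- Loose Coupling
- Late Binding
- Decentralized Governance
- Scalability
- Reusability
- Distributability
- Portability
- Composability
- Interoperability
- Testability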
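
The Producer / Message Bus / Database exchange on this slide can be sketched in a few lines. This is a toy, in-process illustration under stated assumptions: the `MessageBus` and `Database` classes and the topic name `"StoreData"` are invented for the example, and a real bus would run over a network rather than inside one process.

```python
from collections import defaultdict
from typing import Callable, Dict, List

class MessageBus:
    """Toy in-process bus: routes a message to whoever registered for its topic."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], str]]] = defaultdict(list)

    def register(self, topic: str, handler: Callable[[str], str]) -> None:
        self._handlers[topic].append(handler)

    def send(self, topic: str, payload: str) -> List[str]:
        # The producer names only the topic; binding to a handler happens here,
        # at call time (late binding).
        return [handler(payload) for handler in self._handlers[topic]]

class Database:
    """Hypothetical storage service that knows nothing about its producers."""

    def __init__(self) -> None:
        self.rows: List[str] = []

    def store_data(self, payload: str) -> str:
        self.rows.append(payload)
        return "OK"

bus = MessageBus()
db = Database()
bus.register("StoreData", db.store_data)

# The producer never holds a reference to the database (loose coupling).
replies = bus.send("StoreData", "xxx")
print(replies)
```

Swapping in a different storage service only requires a different `register` call; the producer code stays unchanged.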

Page 10: Cytoscape ci chapter 1

Cytoscape CI

[Diagram: Applications and Services layers connected by a Message Bus (Internet). Labels shown: Cytoscape Desktop, Journal Publishing, Personal Publishing, R/Python/Matlab, cyNetShare, NDEx (Store/Retrieve), NeXO, GeneMANIA, BridgeDB, MCODE, Analytics, Layout, Layouts, Data Model.]

- CX is an aspect-oriented transfer format
- CX carries networks and related data

Page 11: Cytoscape ci chapter 1

CX Transfer Format

Example Graph: nodes 1, 2, 3 and edges 4, 5

Aspect Relationships
[Diagram: nodes Aspect, edges Aspect, and cartesianLayout Aspect, linked by relationships labeled Organizes and Positions]

CX Encoding
- nodes Aspect (3 nodes): ID=1, ID=2, ID=3
- edges Aspect (2 edges): ID=4 (Source, Target), ID=5 (Source, Target)
- cartesianLayout Aspect: (ID, X=100, Y=100), (ID, X=200, Y=200), (ID, X=100, Y=200)

Benefits
- Streamable (large networks)
- Lossless (BioPAX, SGML, OpenBEL…)
- Extensible (new aspects)
- Mature parsers (JSON)
- JSON LD (RDF compatible)
- Purpose-optimized transfers (aspects)
- Community, community, community

Page 12: Cytoscape ci chapter 1

CX in Action

[
  {"nodes": [{"@id": "_:1"}, {"@id": "_:2"}]},
  {"edges": [{"source": "_:1", "@id": "_:4", "target": "_:2"}]},
  {"cartesianLayout": [{"x": "100", "node": "_:1", "y": "100"}]},
  {"cartesianLayout": [{"x": "200", "node": "_:2", "y": "300"}]},
  {"nodes": [{"@id": "_:3"}]},
  {"edges": [{"source": "_:2", "@id": "_:5", "target": "_:3"}]},
  {"cartesianLayout": [{"x": "100", "node": "_:3", "y": "200"}]}
]
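
The CX document on this slide interleaves fragments of the same aspect, so a consumer typically regroups them by aspect name. The following is a minimal sketch of that step using only the standard `json` module; the helper name `group_by_aspect` is an assumption made for this example, and for brevity it loads the whole document at once rather than streaming fragment by fragment.

```python
import json
from collections import defaultdict

cx_text = """
[
  {"nodes": [{"@id": "_:1"}, {"@id": "_:2"}]},
  {"edges": [{"source": "_:1", "@id": "_:4", "target": "_:2"}]},
  {"cartesianLayout": [{"x": "100", "node": "_:1", "y": "100"}]},
  {"cartesianLayout": [{"x": "200", "node": "_:2", "y": "300"}]},
  {"nodes": [{"@id": "_:3"}]},
  {"edges": [{"source": "_:2", "@id": "_:5", "target": "_:3"}]},
  {"cartesianLayout": [{"x": "100", "node": "_:3", "y": "200"}]}
]
"""

def group_by_aspect(text):
    """Merge fragments that share an aspect name, preserving element order."""
    aspects = defaultdict(list)
    for fragment in json.loads(text):
        for aspect_name, elements in fragment.items():
            aspects[aspect_name].extend(elements)
    return dict(aspects)

aspects = group_by_aspect(cx_text)
node_ids = [n["@id"] for n in aspects["nodes"]]
# Coordinates are strings in the example, so convert them before use.
positions = {p["node"]: (int(p["x"]), int(p["y"])) for p in aspects["cartesianLayout"]}
print(node_ids)
print(positions)
```

A client that only needs layout information could read `aspects["cartesianLayout"]` and ignore everything else, which is the purpose-optimized transfer idea behind aspects.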

Page 13: Cytoscape ci chapter 1

API Perspective - Simple

[Diagram: Client (CX Library) <-> REST <-> Service (CX Library)]
- Service call (w/CX)
- Results return (w/CX)

- Long running jobs require long running clients
- Allows only one service at a time
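
The simple pattern, in which a client posts CX over REST and blocks until CX comes back, might look like the sketch below. Everything service-specific here is an assumption for illustration: the `LayoutService` handler, the `/layout` path, and the trivial "place every node at (0, 0)" behavior are invented, and the server runs locally in a thread only so the example is self-contained.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener

class LayoutService(BaseHTTPRequestHandler):
    """Hypothetical service: receives CX, returns it with a cartesianLayout added."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        fragments = json.loads(self.rfile.read(length))
        nodes = [n for f in fragments for n in f.get("nodes", [])]
        layout = [{"node": n["@id"], "x": "0", "y": "0"} for n in nodes]
        body = json.dumps(fragments + [{"cartesianLayout": layout}]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet

def call_service(url, cx):
    # The client blocks here until the service has finished and replied.
    request = Request(url, data=json.dumps(cx).encode("utf-8"),
                      headers={"Content-Type": "application/json"}, method="POST")
    opener = build_opener(ProxyHandler({}))  # ignore any proxy settings for localhost
    with opener.open(request, timeout=10) as response:
        return json.loads(response.read())

server = HTTPServer(("127.0.0.1", 0), LayoutService)  # port 0: the OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = "http://127.0.0.1:%d/layout" % server.server_address[1]
result = call_service(url, [{"nodes": [{"@id": "_:1"}, {"@id": "_:2"}]}])
server.shutdown()
server.server_close()
print(result[-1])
```

The blocking call inside `call_service` is what the slide's first limitation refers to: if the service takes an hour, the client must stay connected for an hour.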

Page 14: Cytoscape ci chapter 1

API Perspective - Elaborated

[Diagram components]
- Client with CX Library
- Submit Agent
- Message Broker and Load Balancer
- Nodes, each running a Service with an Interface and CX Library
- Results Collector and Results Database
- Status Monitor and Monitor Database

Client calls (REST)
- Service call (w/CX) -> Service return (jobID)
- Status call (jobID) -> Status return
- Results call (jobID) -> Results return (w/CX)

Internal messages
- Service call (w/CX) over MQ
- Save results
- Query status (jobID)

Job states: Queued, Running, Complete
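
The job-ID pattern on this slide (submit, receive a jobID, check status, fetch results) can be simulated in one process. This is a sketch under loud assumptions: a `queue.Queue` stands in for the message broker, plain dictionaries stand in for the monitor and results databases, and the "service" merely counts nodes and returns them in a made-up `nodeCount` aspect. None of these names come from the CI code base.

```python
import queue
import threading
import uuid

jobs = queue.Queue()      # stands in for the message broker
status_db = {}            # stands in for the monitor database
results_db = {}           # stands in for the results database

def submit(cx):
    """Submit agent: enqueue the job and return a job ID immediately."""
    job_id = str(uuid.uuid4())
    status_db[job_id] = "Queued"
    jobs.put((job_id, cx))
    return job_id

def worker():
    """Service node: take jobs off the queue, run them, save the results."""
    while True:
        job_id, cx = jobs.get()
        status_db[job_id] = "Running"
        # Placeholder computation: count the nodes in the submitted CX.
        count = sum(len(f.get("nodes", [])) for f in cx)
        results_db[job_id] = [{"nodeCount": [{"value": count}]}]  # hypothetical aspect
        status_db[job_id] = "Complete"
        jobs.task_done()

def get_status(job_id):
    return status_db[job_id]

def get_results(job_id):
    return results_db[job_id]

threading.Thread(target=worker, daemon=True).start()

job_id = submit([{"nodes": [{"@id": "_:1"}, {"@id": "_:2"}]},
                 {"nodes": [{"@id": "_:3"}]}])
jobs.join()               # a real client would poll get_status(job_id) instead
print(get_status(job_id), get_results(job_id))
```

Because `submit` returns as soon as the job is queued, the client is free to disconnect or start other jobs, which addresses both limitations of the simple perspective.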

Page 15: Cytoscape ci chapter 1

Implementation Perspective

[Diagram components]
- Client (any language) with CX Library
- Submit Agent (Python Flask)
- Node: Service (any language), Interface (ZeroMQ), CX Library
- Results Collector (Python) and Results Database
- Status Monitor (Python) and Monitor Database
- Transports: REST, MQ, ZeroMQ

Page 16: Cytoscape ci chapter 1

CI Now

[Diagram labels]
- Cytoscape, App Store
- R / Python / Matlab / C#
- cyREST
- cyNetShare, ScienceDirect, Cyrface, NDEx, NAV (each shown with cytoscape.js)
- Network Based Stratification
- Heat Dissipation
- ID Translation (BridgeDB)
- Formats and protocols: XGMML, .cyjs, WS/SOAP

Page 17: Cytoscape ci chapter 1

CI Later

[Diagram labels]
- Cytoscape, App Store, CIAuth
- R / Python / Matlab / C#
- cyREST/CX
- cyNetShare, ScienceDirect, Cyrface, NDEx, NAV (each shown with cytoscape.js)
- Network Based Stratification
- ?DREAM?
- ?GIANT?
- Heat Dissipation
- ID Translation (BridgeDB)
- Layouts
- Clustering (?MCODE?)
- Network Prediction (?GeneMANIA?)
- Attribute Merge
- Enrichment
- ?Taverna?
- ?Galaxy?
- Format: CX

Page 18: Cytoscape ci chapter 1

CI Later w/Reuse

[Diagram labels]
- Cytoscape, App Store, CIAuth
- R / Python / Matlab / C#
- cyREST/CX
- cyNetShare, ScienceDirect, Cyrface, NDEx, NAV (each shown with cytoscape.js)
- Network Based Stratification
- ?DREAM?
- ?GIANT?
- Heat Dissipation
- ID Translation (BridgeDB)
- Layouts
- Clustering (?MCODE?)
- Network Prediction (?GeneMANIA?)
- Attribute Merge
- Enrichment
- ?Taverna?
- ?Galaxy?
- Format: CX

Page 19: Cytoscape ci chapter 1

Support

- National Resource for Network Biology (NRNB): supports software and staging hardware
- Pharma & NCI support NDEx
- Elsevier
- All sources open and on GitHub

Page 20: Cytoscape ci chapter 1

Call to Community

App authorship
- Cytoscape community thrives
- Pride of authorship, listing in App Store
- Tangible realization of useful research
- Valuable workflows for all to use
- Publishable results (e.g., F1000)

CI community inherits all of these! … but also:
- More direct path from algorithm to useful code
- Wider audience
- Easier coding & dissemination
- Better coding practices
- More resources

More [email protected]

Page 21: Cytoscape ci chapter 1

Reading List

- http://martinfowler.com/articles/microservices.html
- http://home.ndexbio.org/about-ndex-2
- http://idekerlab.github.io/cy-net-share
- Lincoln Stein. Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges. http://www.nature.com/nrg/journal/v9/n9/full/nrg2414.html
- Barry Demchak, et al. PALMS: A Modern Coevolution of Community and Computing Using Policy Driven Development. https://sosa.ucsd.edu/ResearchCentral/view.jsp?id=203
- Stephen Goff, et al. The iPlant collaborative: cyberinfrastructure for plant biology. http://journal.frontiersin.org/article/10.3389/fpls.2011.00034/pdf

Page 22: Cytoscape ci chapter 1

End of Deck

Backup slides are beyond here

Page 23: Cytoscape ci chapter 1

Existing Ecosystem

Ratings by column: Sharing/Reuse | Novel workflows | Works w/Cy | Scales

Visual Workflow Systems
  - Taverna & Galaxy (high level orchestration): ++ | - | - | ++
  - MyExperiment (sharing workflows): ++ | - | - | ++
Service Repositories
  - BioCatalogue: ++ (only one rating survives; its column is unclear)
General programming languages & tools
  - Python, R, Java, Matlab, IPython/Jupyter: + | ++ | + | -
Network Analysis & Visualization
  - Cytoscape & cytoscape.js: + | + | ++ | -
  - GeneMANIA: +, -, + (three ratings survive; their columns are unclear)
Cytoscape Cyberinfrastructure (?): ++ | ++ | ++ | ++

Page 24: Cytoscape ci chapter 1

CX Timings

- Using Human network (18K nodes, 127K edges)
- CX output around 150MB
- Timings exclude accessing Cytoscape data model – Cytoscape data model increases timings by 2-4x

Aspect      Read (ms)   Write (ms)
Nodes       6           3
Edges       97          51
NodeAttrs   77          58
EdgeAttrs   1289        1077
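
As a rough worked example, the per-aspect timings can be summed and scaled by the 2-4x data-model overhead the slide mentions. This is only back-of-the-envelope arithmetic over the four aspects listed; it assumes the overhead applies uniformly to the totals, which the slide does not state.

```python
# Per-aspect timings in milliseconds, copied from the table above.
timings_ms = {
    "Nodes":     {"read": 6,    "write": 3},
    "Edges":     {"read": 97,   "write": 51},
    "NodeAttrs": {"read": 77,   "write": 58},
    "EdgeAttrs": {"read": 1289, "write": 1077},
}

total_read = sum(t["read"] for t in timings_ms.values())
total_write = sum(t["write"] for t in timings_ms.values())

# The slide's 2-4x overhead for going through the Cytoscape data model,
# applied to the totals (an assumption of this sketch).
low, high = 2, 4
print("read:  %d ms raw, %d-%d ms with data model"
      % (total_read, low * total_read, high * total_read))
print("write: %d ms raw, %d-%d ms with data model"
      % (total_write, low * total_write, high * total_write))

# Share of read time spent on edge attributes.
edge_attr_share = timings_ms["EdgeAttrs"]["read"] / total_read
print("EdgeAttrs share of read time: %.0f%%" % (100 * edge_attr_share))
```

Edge attributes dominate both totals, so they are the natural first target for any optimization.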