F06-Cloud-Enabling NGS

Preview:

Citation preview

Enabling NGS analysis with(out) the infrastructure

Enis Afgan, the Galaxy Team, Anton Nekrutenko, James Taylor

Discovery of human heteroplasmic sitesenabled by an accessible interface to

cloud!computing infrastructure

Enis Afgan, Hiroki Goto, Ian Paul, Francesca ChiaromonteKateryna Makova, Anton Nekrutenko, James Taylor

15

CAMPAIGN EMORY BRAND GUIDELINESwww.campaign.emory.edu

Each school and unit will have its own campaign stationery, which includes the logo and marble and is immediately recognizable as part of Campaign Emory.

Stationery

N E L L H O D G S O N W O O D R U F F S C H O O L O F N U R S I N G

Emory University . 1520 Clifton Road, NE . Atlanta, Georgia 30322-4207 . 000.000.0000

C A M P A I G N

Emory UniversityNell Hodgson Woodruff School of Nursing1520 Clifton Road, NEAtlanta, Georgia 30322-4207

C A M P A I G N

AMY DORRILLChief Development Officer

Emory UniversitySchool of Nursing Development1520 Clifton Road, NEAtlanta, Georgia 30322-4207P 404.727.1234E adorrill@emory.edu

C A M P A I G N

C A M P A I G N

notecard: 7in. x 5in.

letterhead: 8.5in. x 11in.

envelope: No. 10

business card: 3.5in. x 2in. (standard size)

Permissions: you are free to blog or live-blog about this presentation as long as you attribute the work to its authors

Discovery of human heteroplasmic sitesenabled by an accessible interface to

cloud!computing infrastructure

Enis Afgan, Hiroki Goto, Ian Paul, Francesca ChiaromonteKateryna Makova, Anton Nekrutenko, James Taylor

15

CAMPAIGN EMORY BRAND GUIDELINESwww.campaign.emory.edu

Each school and unit will have its own campaign stationery, which includes the logo and marble and is immediately recognizable as part of Campaign Emory.

Stationery

N E L L H O D G S O N W O O D R U F F S C H O O L O F N U R S I N G

Emory University . 1520 Clifton Road, NE . Atlanta, Georgia 30322-4207 . 000.000.0000

C A M P A I G N

Emory UniversityNell Hodgson Woodruff School of Nursing1520 Clifton Road, NEAtlanta, Georgia 30322-4207

C A M P A I G N

AMY DORRILLChief Development Officer

Emory UniversitySchool of Nursing Development1520 Clifton Road, NEAtlanta, Georgia 30322-4207P 404.727.1234E adorrill@emory.edu

C A M P A I G N

C A M P A I G N

notecard: 7in. x 5in.

letterhead: 8.5in. x 11in.

envelope: No. 10

business card: 3.5in. x 2in. (standard size)

Permissions: you are free to blog or live-blog about this presentation as long as you attribute the work to its authors

Bioinformatics Open Source Conference, July 16, 2011, Vienna

B. Isolated Galaxy instance(s) A. Users in different labs C. Dense data center

CloudMan-as-a-Bridge

B. Isolated Galaxy instance(s) A. Users in different labs C. Dense data center

SaaS

CloudMan-as-a-Bridge

B. Isolated Galaxy instance(s) A. Users in different labs C. Dense data center

SaaS IaaS

CloudMan-as-a-Bridge

B. Isolated Galaxy instance(s) A. Users in different labs C. Dense data center

CloudMan + GalaxySaaS IaaS

CloudMan-as-a-Bridge

CloudMan Platform

A complete solution for instantiating and managing cloud resources

With automatically configured Galaxy (if desired)

Scope of tools and reference datasets exceed Galaxy Main

Deploy a (Galaxy) cluster in minutes!

CloudMan features• Deployment on Amazon Web Services Cloud

• Wizard-guided setup: requires no computational expertise, no infrastructure, no software

• Automated (thus reproducible) configuration for machine image, tools, and data

• Four modes of cluster type setup

• Dynamic persistent storage

• Elastic resource scaling: manual or automatic based on workload

• Standalone deployment, requiring no external dependencies or services

• Customizable by individual users

• Sharing of derived cluster instances -> even the customized ones!

CloudMan features• Deployment on Amazon Web Services Cloud

• Wizard-guided setup: requires no computational expertise, no infrastructure, no software

• Automated (thus reproducible) configuration for machine image, tools, and data

• Four modes of cluster type setup

• Dynamic persistent storage

• Elastic resource scaling: manual or automatic based on workload

• Standalone deployment, requiring no external dependencies or services

• Customizable by individual users

• Sharing of derived cluster instances -> even the customized ones!

Deploying a cluster on AWS

1.

Deploying a cluster on AWS

1.

2.

Deploying a cluster on AWS

1.

2.3.

Deploying a cluster on AWS

1.

2.3.

4.

CloudBioLinux + Galaxy + CloudMan =

• A lot of (NGS) tools immediately available and easily accessible

• 700GB of reference genome data

Bowtie, BWA, Samtools, MAQ, BFAST, ABySS, Velvet, MACS, Tophat, Cufflinks, MegaBLAST, BLAST, Sputnik, Taxonomy, HyPhy, Lastz, Perm, GATK, Srma, Beam, Pass, LPS, Plink, Haploview, Freebayes, Mosaik, Picard, ...

But what if your tool (or data) is missing?

1. Add it! (via automation)- CloudMan instances are self-contained

2. Save & share- With individual users or make it public

Add tools

Add data

Deployment sharing

Deployment sharing

Deployment sharing

Use CloudMan as SaaS

Use CloudMan as PaaS

It’s automated, reproducible, extensible, and transparent.

Supported by the NHGRI (HG005542, HG004909, HG005133), NSF (DBI-0850103), Penn State University, Emory University, and the Pennsylvania Department of Public Health

Dan Blankenberg Nate Coraor

Kelly Vincent

Greg von Kuster

Enis Afgan Dannon Baker

Kanwei Li

Jeremy Goecks

Anton NekrutenkoJames Taylor

Dave Clements Jennifer Jackson

Recommended