Cloud infrastructure for training in Life Sciences Manuel Corpas The Genome Analysis Centre

Cloud infrastructure for training in Life Sciences

Manuel Corpas

The Genome Analysis Centre

[egi.edu] The Genome Analysis Centre

@manuelcorpas

Bottleneck is NOT

• Production of data

• Technology

• Budget

Bottleneck IS

• TRAINING!
   – Bioinformatics

Bioinformatics Training

Mick Watson, Roslin Institute

1. Most bioinformaticians are bad scientists

2. Most biologists are bad bioinformaticians: poor computer skills, bad at maths/statistics

3. Short courses benefit no-one

Carole Goble, University of Manchester

• Students and trainers don’t like learning how to use new things

• Trainees need to be eased in by using familiar stuff

How can we bridge the gap?

Titus Brown, Michigan State University

1. Participants bring their laptops

2. Pre-installed machines

3. Cloud computing

Cloud + Bioinformatics + Training

=

Why Bioinformatics Training in the Cloud?

3 Advantages

1. Participants can use their own
   – Computers
   – Web browser

2. Graphical interaction via
   – X Windows
   – IPython
   – Knitr

3. Compute can be scaled up or down depending on what is being taught

3 Challenges

1. Institutional resistance
   – Privacy of clinically sensitive data

2. Reliable network access and servers needed
   – >30 people clicking at the same time!

3. Cost

[Diagram labels: Materials, Data, NM, Trainee, Trainer, Registry, Genomics, VMs+tools]

National eResearch Collaboration Tools and Resources (NeCTAR)

Watson-Haigh et al. 2013

MRC UK Microbial Genomics

• OpenStack

• Each VM: 32 GB RAM, 8 cores, 1 TB

• Bio-Linux

Nick Loman, University of Birmingham

Why Cloud?

• Very little technical knowledge required

• Snapshot ready for replication

• User can take instance home

Cloud + Bioinformatics + Training

=

Rafael Jiménez
