Pasteur Institute Mobyle Developers Workshop

28 September 2012

Jennifer Dommer, HPC Web Developer

Alex Levitsky, HPC Infrastructure Team Lead

NIAID OCICB Bioinformatics & Computational Biosciences Branch (BCBB)

HPC Web overview - Mobyle Workshop - September 28, 2012




Outline

What is HPC Web?

Project Goals and Background (5 min.)

HPC Web Design (10 min.)

Use of the Mobyle Framework in HPC Web (10 min.)

HPC Web Video Demos (15 min.)

BMID and BMPS Overview

HPC Web Next Steps

Questions/Discussion (10 min.)


What is HPC Web?

Web application developed by National Institute of Allergy and Infectious Diseases (NIAID) Bioinformatics and Computational Biosciences Branch (BCBB)

HPC Web Team:

• Alex Levitsky, HPC Infrastructure Team Lead

• Vivek Gopalan, Former HPC Infrastructure Team Lead

• Jennifer Dommer, Software Developer

• Jie Li, Former Software Developer

• Ramandeep Kaur, Software Developer

• Karlynn Noble, Designer/Communications

• Darrell Hurt, Mariam Quinones, Andrew Oler, Vijay Nagarajan, Xavier Ambroggio, Kurt Wollenberg, Mike Dolan, Burke Squires, Maarten Leerkes, Subject Matter Experts

• Nick Weber, Project Manager

• Tram Huyen, Project Sponsor


What is HPC Web?

Web interface to NIAID High Performance Computing (HPC) cluster

Leverages Mobyle framework for job submission, data management, and pipeline creation


NIAID HPC Cluster Configuration


Project Goals

Democratize access to high performance computing resources

• Allow non-command-line-savvy bench researchers to access sophisticated computational tools and infrastructure for their high-throughput research data

Provide capabilities to:

• Engage an interactive user community

• Access, manage, and share HPC files through an intuitive web interface

• Run, track progress, and re-run jobs using simple web forms and interfaces

• Create simple, automated analysis pipelines


Project Background

2010

• NIAID HPC infrastructure established

– Small cluster of ~5 nodes, 30 cores

• Late 2010: HPC Web v1 released

– Static content about how to use HPC resources, which applications were installed, and how to use them

– Frameworks established, including integration of Mobyle

– Simple functionality for requesting accounts and support, viewing cluster status, engaging with community, etc.

– Integrated with custom UCSC Track Manager application

2011

• HPC Web phase II development began

– Cluster had grown from 5 to nearly 40 nodes, from 30 to nearly 400 cores

– Project scope expanded to include job submission, data management, and pipeline creation from the web


Project Background (continued)

2012

• Cluster continuing to grow (now ~50 nodes, 600+ cores, GPU- and InfiniBand-enabled)

• Approximately 750 TB data, with plans in place to expand data storage and implement hierarchical storage management / archiving mechanisms to support future growth

• HPC Web Phase II released in May 2012

– ~20 applications with Mobyle interfaces, for a total of ~60 forms for job submission (including sub-packages for applications, e.g., tools within SAMtools suite)

– Limited number of standardized workflow templates

E.g., RNA-seq-single-sample-mapping, which maps RNA-seq reads to a reference genome using TopHat, then passes the alignment file to 1) Cufflinks to assemble transcripts and quantify expression and to 2) SAMtools to index the alignment file
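The RNA-seq-single-sample-mapping template can be read as a small dependency graph: TopHat runs first, and both Cufflinks and SAMtools consume its alignment file. The sketch below is only an illustration of that ordering, not the BMPS template format. The step names and the dictionary layout are assumptions; the run order is derived with Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical step names. Each step maps to the set of steps whose
# output it consumes (its predecessors in the workflow).
RNA_SEQ_SINGLE_SAMPLE_MAPPING = {
    "tophat_map_reads": set(),
    "cufflinks_assemble": {"tophat_map_reads"},
    "samtools_index": {"tophat_map_reads"},
}


def execution_order(workflow):
    """Return one valid run order in which every step follows its inputs."""
    return list(TopologicalSorter(workflow).static_order())


if __name__ == "__main__":
    print(execution_order(RNA_SEQ_SINGLE_SAMPLE_MAPPING))
```

The mapping step always comes first; the two downstream steps do not depend on each other, so either may run before the other (or both in parallel on the cluster).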


HPC Web job submission implementation schema using Mobyle

[Diagram. Labeled components: HPC Web Server, Mobyle library, DRMAA library, Authorization module, SGE submit host, SGE Compute nodes, Storage/Shared folder, /group folder, /application folder. The links between components are labeled "Apache user" (hpcwebadm).]
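The submission path in this schema (web server to SGE through a DRMAA library) can be sketched in Python. This is a minimal sketch under stated assumptions, not the HPC Web or Mobyle implementation: it uses the third-party `drmaa` Python package, which needs the SGE `libdrmaa` library at run time, and the helper names `build_job_spec` and `submit`, the working directory, and the queue name are hypothetical.

```python
def build_job_spec(command, args, working_dir, queue=None):
    """Collect the settings for one cluster job in a plain dictionary.

    The keys are attribute names of a DRMAA job template.
    """
    spec = {
        "remoteCommand": command,
        "args": [str(a) for a in args],
        "workingDirectory": working_dir,
        "joinFiles": True,  # merge stderr into stdout
    }
    if queue:
        # "-q <queue>" is the SGE option for choosing a queue.
        spec["nativeSpecification"] = "-q " + queue
    return spec


def submit(spec):
    """Submit the job through DRMAA and return the scheduler's job id."""
    import drmaa  # imported lazily: requires libdrmaa from the SGE install

    with drmaa.Session() as session:
        template = session.createJobTemplate()
        try:
            for attribute, value in spec.items():
                setattr(template, attribute, value)
            return session.runJob(template)
        finally:
            session.deleteJobTemplate(template)


if __name__ == "__main__":
    # Illustrative values only; nothing is submitted here.
    print(build_job_spec("bl2seq", ["-p", "blastn"], "/tmp/example_job", "all.q"))
```

Keeping the job description separate from the submission call means the description can be inspected or logged on the web server before anything reaches the cluster. The job runs as whichever account owns the submitting process, which matches the "Apache user" labels in the schema.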


HPC Web Mobyle Job Management Interface

Let's focus on the job bl2seq.T11045404625893


BLAST result obtained from server

Mobyle job results page for bl2seq.T11045404625893


SGE accounting details for job bl2seq.T11045404625893

qacct command for the job

Job runs using SGE

DRMAA library is used for job submission from Mobyle

Job runs as apache user

We could show any of these parameters in the HPC Web interface

• Start time

• Queue time

• End time

• CPU time
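The timing values listed above can be derived from `qacct -j` output, which prints one `key value` pair per line. The following is a rough sketch, not part of HPC Web. It assumes the timestamp layout printed by classic SGE 6.x, a plain number of seconds in the `cpu` field, and a known numeric SGE job id (the Mobyle job name is not that id). The sample record is made up for illustration.

```python
import subprocess
from datetime import datetime

# Timestamp layout assumed from classic SGE 6.x output; other versions may differ.
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

# Made-up sample record, trimmed to the fields used below.
SAMPLE = """\
==============================================================
qname        all.q
jobname      bl2seq
qsub_time    Fri Sep 28 10:15:02 2012
start_time   Fri Sep 28 10:15:10 2012
end_time     Fri Sep 28 10:15:42 2012
cpu          3.250
"""


def parse_qacct(text):
    """Turn the 'key   value' lines of qacct output into a dict."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:  # separator lines have no value column
            record[parts[0]] = parts[1].strip()
    return record


def job_timing(record):
    """Compute start, end, queue wait and CPU time from a parsed record."""
    submitted = datetime.strptime(record["qsub_time"], TIME_FORMAT)
    started = datetime.strptime(record["start_time"], TIME_FORMAT)
    ended = datetime.strptime(record["end_time"], TIME_FORMAT)
    return {
        "start_time": started,
        "end_time": ended,
        "queue_seconds": (started - submitted).total_seconds(),
        "cpu_seconds": float(record["cpu"]),
    }


def fetch_qacct(sge_job_id):
    """Run qacct for one job; only works on a host with SGE installed."""
    result = subprocess.run(
        ["qacct", "-j", str(sge_job_id)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(job_timing(parse_qacct(SAMPLE)))
```

Queue time is not a field of its own here; it is computed as the difference between `start_time` and `qsub_time`.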


HPC Web Video Demos

Navigating the HPC Web interface:

• http://www.youtube.com/watch?feature=player_embedded&v=cxxALr5PGlY

Using My File Manager in HPC Web

• http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y

Submitting jobs to Cluster from HPC Web

• http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y


BCBB Mobyle Interface Designer (BMID)

A web-based GUI for creating Mobyle XML using drag-and-drop options and wizards

Eliminates the need to manually generate XML, aiming to facilitate community creation of interfaces and minimize development “bottlenecks”
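To give a feel for what a designer tool automates, the sketch below builds a small interface description from a form-like Python structure. It is an illustration only: the element and attribute names are simplified placeholders chosen for this example and are not the validated Mobyle schema, and the function name `build_interface_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_interface_xml(name, title, command, parameters):
    """Serialize a simplified interface description to an XML string.

    Element names are placeholders, not the real Mobyle schema.
    """
    program = ET.Element("program")
    head = ET.SubElement(program, "head")
    ET.SubElement(head, "name").text = name
    ET.SubElement(head, "title").text = title
    ET.SubElement(head, "command").text = command

    params = ET.SubElement(program, "parameters")
    for spec in parameters:
        mandatory = "1" if spec.get("mandatory") else "0"
        node = ET.SubElement(params, "parameter", {"mandatory": mandatory})
        ET.SubElement(node, "name").text = spec["name"]
        ET.SubElement(node, "prompt").text = spec["prompt"]
        ET.SubElement(node, "datatype").text = spec["datatype"]
    return ET.tostring(program, encoding="unicode")


if __name__ == "__main__":
    print(build_interface_xml(
        "example_tool",
        "Example tool",
        "example_tool",
        [{"name": "infile", "prompt": "Input file",
          "datatype": "Sequence", "mandatory": True}],
    ))
```

Generating the document from structured input, rather than typing it by hand, keeps it well-formed by construction, which is one way a GUI can reduce the manual XML work described above.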


Mobyle Framework: Command-line Application to Web Application
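The core idea of turning a command-line application into a web form can be sketched as a mapping from form fields to command-line flags. This is a simplified illustration, not Mobyle's own mechanism: the `FORM` structure, the field names, and the `build_command` helper are assumptions made for the example.

```python
import shlex

# Hypothetical form definition: one entry per field shown in the web form.
FORM = [
    {"name": "first_sequence", "flag": "-i", "mandatory": True},
    {"name": "second_sequence", "flag": "-j", "mandatory": True},
    {"name": "program", "flag": "-p", "mandatory": False},
]


def build_command(executable, form, values):
    """Build a shell-safe command line from submitted form values."""
    argv = [executable]
    for field in form:
        value = values.get(field["name"])
        if value is None or value == "":
            if field.get("mandatory"):
                raise ValueError("missing mandatory field: " + field["name"])
            continue  # optional field left empty: omit its flag
        argv += [field["flag"], str(value)]
    # Quote every token so user-supplied values cannot inject shell syntax.
    return " ".join(shlex.quote(token) for token in argv)


if __name__ == "__main__":
    print(build_command("bl2seq", FORM, {
        "first_sequence": "a.fasta",
        "second_sequence": "my seq.fasta",
        "program": "blastn",
    }))
```

Quoting each token matters in a web setting because the values come from end users rather than from a trusted shell session.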


BCBB Mobyle Pipeline System (BMPS)

Leverages Mobyle framework to string applications together such that the output of one process becomes the input of the next

Simplifies analysis by automating a standard set of procedures that may have previously required manual processing

Enables sharing of useful/novel pipelines among users

Facilitates QC analysis by making it easy to iteratively tweak one or a few parameters of an application within a saved pipeline and validate results
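Two of the points above, chaining outputs to inputs and re-running a saved pipeline with one parameter changed, can be illustrated with a toy model. This is a sketch under stated assumptions and does not reflect how BMPS stores or runs pipelines: the step functions are stand-ins for real applications, and `run_pipeline` and `with_override` are hypothetical helpers.

```python
def trim_reads(reads, min_length):
    """Stand-in for a filtering tool: keep reads of at least min_length."""
    return [read for read in reads if len(read) >= min_length]


def count_reads(reads):
    """Stand-in for a summary tool: report how many reads remain."""
    return len(reads)


# A saved pipeline: ordered steps, each with its own parameters.
SAVED_PIPELINE = [
    {"name": "trim", "run": trim_reads, "params": {"min_length": 3}},
    {"name": "count", "run": count_reads, "params": {}},
]


def run_pipeline(pipeline, first_input):
    """Feed each step's output to the next step as its input."""
    data = first_input
    for step in pipeline:
        data = step["run"](data, **step["params"])
    return data


def with_override(pipeline, step_name, **changes):
    """Return a copy of the pipeline with some parameters of one step changed."""
    return [
        {**step, "params": {**step["params"], **changes}}
        if step["name"] == step_name else step
        for step in pipeline
    ]


if __name__ == "__main__":
    reads = ["ACGT", "AC", "ACGTAC"]
    print(run_pipeline(SAVED_PIPELINE, reads))
    print(run_pipeline(with_override(SAVED_PIPELINE, "trim", min_length=5), reads))
```

Because `with_override` returns a new list and leaves the saved pipeline untouched, the original and the tweaked run can be compared side by side, which is the kind of iterative QC check the slide describes.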


Example BMPS Template

Other BMPS template examples available in HPC Web:

• ChIP-seq-with-control

• Map-reads-and-index

• Fastq-quality-boxplot


Next Steps in HPC Web Development

Continued development of web forms, especially for NGS and structural biology applications

BMID interface enhancements

BMPS/Pipeline system enhancements, including additional templates

Integration with Mobyle2 framework


Feature Request Considerations

Workflow template sharing between HPC users

Data sharing with non-HPC account holders, including those outside NIH

Ability for users to create their own application interfaces using BCBB Mobyle Interface Designer (BMID), and share interfaces with others


Discussion

Comments/Questions?


Thank You!

For more information, please contact:


Reference Slides


HPC Web System Architecture


[Architecture diagram. Labels recoverable from the slide: Enterprise Storage; Client; Collab SharePoint site; GWT; GWT-DND; GWT-Incubator; Web server/SGE submit host; SGE DRMAA Library; CXF web services library; Mobyle Framework; Apache Web Server; Tomcat web server; Python; Java; Jespa library; SGE worker; Bio Applications (TopHat, Bowtie, SSAHA, etc.); SGE; LDAP server; JavaScript-enabled Browser; Ajax libraries; "(Only during development)"; JSON object; SOAP Object.]