Pasteur Institute – Mobyle Developers Workshop
28 September 2012
Jennifer Dommer, HPC Web Developer
Alex Levitsky, HPC Infrastructure Team Lead
NIAID OCICB Bioinformatics & Computational
Biosciences Branch (BCBB)
Outline
What is HPC Web?
Project Goals and Background (5 min.)
HPC Web Design (10 min.)
Use of the Mobyle Framework in HPC Web (10 min.)
HPC Web Video Demos (15 min.)
BMID and BMPS Overview
HPC Web Next Steps
Questions/Discussion (10 min.)
What is HPC Web?
Web application developed by National Institute of Allergy and Infectious Diseases (NIAID) Bioinformatics and Computational Biosciences Branch (BCBB)
HPC Web Team:
• Alex Levitsky, HPC Infrastructure Team Lead
• Vivek Gopalan, Former HPC Infrastructure Team Lead
• Jennifer Dommer, Software Developer
• Jie Li, Former Software Developer
• Ramandeep Kaur, Software Developer
• Karlynn Noble, Designer/Communications
• Darrell Hurt, Mariam Quinones, Andrew Oler, Vijay Nagarajan, Xavier Ambroggio, Kurt Wollenberg, Mike Dolan, Burke Squires, Maarten Leerkes, Subject Matter Experts
• Nick Weber, Project Manager
• Tram Huyen, Project Sponsor
What is HPC Web?
Web interface to the NIAID High Performance Computing (HPC) cluster
Leverages the Mobyle framework for job submission, data management, and pipeline creation
NIAID HPC Cluster Configuration
Project Goals
Democratize access to high performance computing resources
• Allow non-command-line-savvy bench researchers to access sophisticated computational tools and infrastructure for their high-throughput research data
Provide capabilities to:
• Engage an interactive user community
• Access, manage, and share HPC files through an intuitive web interface
• Run jobs, track their progress, and re-run them using simple web forms and interfaces
• Create simple, automated analysis pipelines
Project Background
2010
• NIAID HPC infrastructure established
– Small cluster of ~5 nodes, 30 cores
• Late 2010: HPC Web v1 released
– Static content about how to use HPC resources, which applications were installed, and how to use them
– Frameworks established, including integration of Mobyle
– Simple functionality for requesting accounts and support, viewing cluster status, engaging with community, etc.
– Integrated with custom UCSC Track Manager application
2011
• HPC Web phase II development began
– Cluster had grown from 5 to nearly 40 nodes, from 30 to nearly 400 cores
– Project scope expanded to include job submission, data management, and pipeline creation from the web
Project Background (continued)
2012
• Cluster continuing to grow (now ~50 nodes, 600+ cores, GPU- and InfiniBand-enabled)
• Approximately 750 TB of data, with plans in place to expand data storage and implement hierarchical storage management / archiving mechanisms to support future growth
• HPC Web Phase II released in May 2012
– ~20 applications with Mobyle interfaces, for a total of ~60 forms for job submission (including sub-packages for applications, e.g., tools within the SAMtools suite)
– Limited number of standardized workflow templates, e.g., RNA-seq-single-sample-mapping, which maps RNA-seq reads to a reference genome using TopHat, then passes the alignment file to 1) Cufflinks to assemble transcripts and quantify expression and to 2) SAMtools to index the alignment file
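The RNA-seq-single-sample-mapping template can be pictured as three command lines in dependency order. The sketch below is illustrative only: the function name, paths, and index name are hypothetical, and the slides do not state which options the HPC Web template passes. It uses only the basic invocation forms of the three tools (`tophat -o <dir> <index> <reads>`, `cufflinks -o <dir> <alignment>`, `samtools index <alignment>`) and builds the commands without running them.

```python
from pathlib import Path


def rnaseq_single_sample_commands(index_prefix, reads_fastq, workdir):
    """Return the template's three command lines in dependency order.

    Hypothetical helper: it only builds argument lists, it does not run them.
    """
    workdir = Path(workdir)
    tophat_out = workdir / "tophat_out"
    # TopHat writes its alignments to accepted_hits.bam inside the output directory.
    alignment = tophat_out / "accepted_hits.bam"
    return [
        # Step 1: map reads to the reference genome.
        ["tophat", "-o", str(tophat_out), index_prefix, reads_fastq],
        # Step 2a: assemble transcripts and quantify expression from the alignment.
        ["cufflinks", "-o", str(workdir / "cufflinks_out"), str(alignment)],
        # Step 2b: index the same alignment file.
        ["samtools", "index", str(alignment)],
    ]


# Example with made-up inputs; prints one command per line.
for command in rnaseq_single_sample_commands("hg19", "sample.fastq", "job1"):
    print(" ".join(command))
```

Both downstream steps read the same alignment file, so they depend only on the TopHat step and not on each other.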
HPC Web job submission implementation schema using Mobyle
[Figure: job submission diagram. Recoverable labels: HPC Web Server, Mobyle library, DRMAA library, Authorization module, SGE submit host, SGE compute nodes, Storage/Shared folder, /group folder, /application folder; connections labeled "Apache user" (hpcwebadm).]
HPC Web Mobyle Job Management Interface
Let's focus on the job bl2seq.T11045404625893
BLAST result obtained from server
Mobyle job results page for bl2seq.T11045404625893
SGE accounting details for job bl2seq.T11045404625893
qacct command for the job
Job runs using SGE
DRMAA library is used for job submission from Mobyle
Job runs as the Apache user
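As a rough illustration of DRMAA-based submission, the sketch below uses the `drmaa` Python package (assumed to be installed alongside a working SGE setup). It is not Mobyle's own submission code; the helper names and the example job name are hypothetical. The job settings are collected in a plain dictionary first, so that part can be inspected without a cluster.

```python
def build_job_spec(command, args, workdir, name):
    """Collect the settings a DRMAA job template needs (no cluster required)."""
    return {
        "remoteCommand": command,      # executable to run on the compute node
        "args": list(args),            # its command-line arguments
        "workingDirectory": workdir,   # directory the job runs in
        "jobName": name,               # name shown by the scheduler
    }


def submit(spec):
    """Submit the job through DRMAA and return the scheduler's job id."""
    import drmaa  # imported lazily so this module loads on machines without SGE

    with drmaa.Session() as session:
        template = session.createJobTemplate()
        try:
            for attribute, value in spec.items():
                setattr(template, attribute, value)
            return session.runJob(template)
        finally:
            session.deleteJobTemplate(template)


# Hypothetical job description; submit(spec) would need a live SGE installation.
spec = build_job_spec("bl2seq", ["-i", "a.fasta", "-j", "b.fasta"], "/tmp", "bl2seq_demo")
print(spec["jobName"])
```

The job is owned by whichever account calls `submit`, which matches the slide's note that jobs submitted from the web server run as the Apache user.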
We could show any of these parameters in the HPC Web interface:
• Start time
• Queue time
• End time
• CPU time
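One way to obtain these four values is to parse the text printed by `qacct -j <job>`. The sketch below assumes the usual Grid Engine layout of one `field value` pair per line and the field names `qsub_time`, `start_time`, `end_time`, and `cpu`. The sample record is invented for illustration, and reading "queue time" as the wait between submission and start is an assumption.

```python
from datetime import datetime

# Invented sample in the layout printed by `qacct -j <job>`; values are illustrative only.
SAMPLE_QACCT = """\
==============================================================
qname        all.q
hostname     node01
owner        hpcwebadm
jobname      bl2seq_demo
jobnumber    12345
qsub_time    Thu Sep 27 10:15:00 2012
start_time   Thu Sep 27 10:15:12 2012
end_time     Thu Sep 27 10:15:40 2012
exit_status  0
cpu          3.250
"""

TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_qacct(text):
    """Turn 'field   value' lines into a dict, skipping separator lines."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and not line.startswith("="):
            fields[parts[0]] = parts[1].strip()
    return fields


def job_timing(fields):
    """Extract the four values the slide proposes to display."""
    submitted = datetime.strptime(fields["qsub_time"], TIME_FORMAT)
    started = datetime.strptime(fields["start_time"], TIME_FORMAT)
    ended = datetime.strptime(fields["end_time"], TIME_FORMAT)
    return {
        "start_time": started,
        "end_time": ended,
        # Assumption: "queue time" means time spent waiting before the job started.
        "queue_seconds": (started - submitted).total_seconds(),
        "cpu_seconds": float(fields["cpu"]),
    }


timing = job_timing(parse_qacct(SAMPLE_QACCT))
print(timing["queue_seconds"], timing["cpu_seconds"])
```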
HPC Web Video Demos
Navigating the HPC Web interface:
• http://www.youtube.com/watch?feature=player_embedded&v=cxxALr5PGlY
Using My File Manager in HPC Web:
• http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y
Submitting jobs to the cluster from HPC Web:
• http://www.youtube.com/watch?feature=player_embedded&v=9K8h2l28S2Y
BCBB Mobyle Interface Designer (BMID)
A web-based GUI for creating Mobyle XML using drag-and-drop options and wizards
Eliminates the need to write XML by hand, aiming to facilitate community creation of interfaces and minimize development “bottlenecks”
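To give a feel for what such a designer automates, the sketch below generates a small XML program description from a list of parameter specifications. The element and attribute names are modeled loosely on Mobyle program descriptions and have not been validated against the Mobyle schema. The function, the parameter dictionary keys, and the example values are hypothetical, and this is not BMID's implementation.

```python
import xml.etree.ElementTree as ET


def build_program_xml(name, title, command, parameters):
    """Build a simplified, Mobyle-like program description as an XML string."""
    program = ET.Element("program")
    head = ET.SubElement(program, "head")
    ET.SubElement(head, "name").text = name
    doc = ET.SubElement(head, "doc")
    ET.SubElement(doc, "title").text = title
    ET.SubElement(head, "command").text = command

    params_el = ET.SubElement(program, "parameters")
    for spec in parameters:
        param = ET.SubElement(params_el, "parameter")
        if spec.get("mandatory"):
            param.set("ismandatory", "1")
        ET.SubElement(param, "name").text = spec["name"]
        ET.SubElement(param, "prompt", lang="en").text = spec["prompt"]
        # Nested type/datatype/class elements describe what kind of value is expected.
        type_el = ET.SubElement(param, "type")
        datatype = ET.SubElement(type_el, "datatype")
        ET.SubElement(datatype, "class").text = spec["datatype"]
    return ET.tostring(program, encoding="unicode")


# Made-up form definition, as a designer tool might collect it from a wizard.
xml_text = build_program_xml(
    "bl2seq",
    "Compare two sequences",
    "bl2seq",
    [
        {"name": "first", "prompt": "First sequence", "datatype": "Sequence", "mandatory": True},
        {"name": "evalue", "prompt": "E-value cutoff", "datatype": "Float"},
    ],
)
print(xml_text)
```

Generating the XML from structured input rather than typing it means the output is always well-formed, which is the kind of error a hand-written file can easily contain.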
Mobyle Framework: Command-line Application to Web Application
BCBB Mobyle Pipeline System (BMPS)
Leverages the Mobyle framework to string applications together such that the output of one process becomes the input of the next
Simplifies analysis by automating a standard set of procedures that may have previously required manual processing
Enables sharing of useful/novel pipelines among users
Facilitates QC analysis by making it easy to iteratively tweak one or a few parameters of an application within a saved pipeline and validate results
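The chaining and parameter-tweaking ideas above can be sketched in a few lines. This is a toy model, not the BMPS implementation: the step functions stand in for real applications, and every name in it is hypothetical.

```python
def run_pipeline(steps, first_input):
    """Run steps in order; each step's output becomes the next step's input."""
    data = first_input
    trace = []
    for step in steps:
        data = step["run"](data, **step["params"])
        trace.append((step["name"], data))
    return data, trace


def with_param(steps, step_name, **overrides):
    """Return a copy of a saved pipeline with parameters of one step changed."""
    tweaked = [dict(step, params=dict(step["params"])) for step in steps]
    for step in tweaked:
        if step["name"] == step_name:
            step["params"].update(overrides)
    return tweaked


# Toy stand-ins for real applications.
def trim(reads, min_length):
    return [read for read in reads if len(read) >= min_length]


def count(reads):
    return len(reads)


SAVED_PIPELINE = [
    {"name": "trim", "run": trim, "params": {"min_length": 4}},
    {"name": "count", "run": count, "params": {}},
]

reads = ["ACGT", "AC", "ACGTAC"]
baseline, _ = run_pipeline(SAVED_PIPELINE, reads)
stricter, _ = run_pipeline(with_param(SAVED_PIPELINE, "trim", min_length=5), reads)
print(baseline, stricter)
```

Because `with_param` copies the saved pipeline instead of modifying it, the original stays available as a baseline when comparing results across parameter settings.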
Example BMPS Template
Other BMPS template examples available in HPC Web:
• ChIP-seq-with-control
• Map-reads-and-index
• Fastq-quality-boxplot
Next Steps in HPC Web Development
Continued development of web forms, especially for NGS and structural biology applications
BMID interface enhancements
BMPS/Pipeline system enhancements, including additional templates
Integration with Mobyle2 framework
Feature Request Considerations
Workflow template sharing between HPC users
Data sharing with non-HPC account holders, including those outside NIH
Ability for users to create their own application interfaces using BCBB Mobyle Interface Designer (BMID), and share interfaces with others
Discussion
Comments/Questions?
Thank You!
For more information, please contact:
Reference Slides
HPC Web System Architecture
[Figure: system architecture diagram. Recoverable labels: Client; JavaScript-enabled browser; Ajax libraries; GWT, GWT-DND, GWT-Incubator; JSON object; SOAP object; Web server/SGE submit host; Apache Web Server; Tomcat web server; Python; Java; Mobyle framework; SGE DRMAA library; CXF web services library; Jespa library; LDAP server; SGE; SGE worker; Bio applications (TopHat, Bowtie, SSAHA, etc.); Enterprise Storage; Collab SharePoint site; "(Only during development)".]