HPC at NIBR
Nick Holway, NIBR Scientific Computing Group
Speedup 2016, September 15, 2016
Twitter: @nickholway
LinkedIn: https://ch.linkedin.com/in/nickholway
Novartis Institutes for Biomedical Research (NIBR)
Today’s talk
1. HPC at NIBR – a quick introduction
2. HPC in the cloud
3. Accelerating a “compound search engine” using HPC
4. Expediting drug discovery with GPUs
HPC at NIBR - Hardware
• x86 servers
  – Intel Xeon
  – 128-768GB RAM
  – FDR InfiniBand
  – 10GigE
• Nvidia GPUs on some servers
• Isilon storage
  – CIFS/NFS
  – 10GigE to Arista switches
• Lustre
  – Scratch
HPC at NIBR - Software
• RHEL 6.x
• Univa Grid Engine for scheduling
• Software compilation & configuration
  – EasyBuild
  – Modules
  – GCC, Intel, Nvidia compilers
• Languages: C++, Fortran, CUDA, Python, R, Matlab
HPC at NIBR - Humans
• Global team (Europe, USA, Asia)
• Complementary backgrounds and skills
  – Sysadmins
  – Mathematicians
  – Scientists
• HPCWire award winners in 2014
• NB: HPC exists elsewhere in the Company for Clinical Trial analysis, CFD etc.
HPC in the cloud
• NIBR have used Amazon EC2 for compute workloads
  – Cycle Computing
• ISVs, e.g. DNAnexus
  – Bioinformatics NGS
Docking at scale in the cloud
• Ligand-protein docking is “to predict the position and orientation of a ligand (a small molecule) when it is bound to a protein receptor or enzyme” (Wikipedia)
• Embarrassingly parallel - compute-heavy / data-light
• We used the cloud to screen 10 million molecules against a cancer target
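Because each ligand is scored independently, the screen can be cut into chunks that run with no communication between them. The sketch below illustrates that pattern only: `dock` is a hypothetical placeholder that returns a made-up score, not a real docking program, and the thread pool stands in for what would in practice be separate scheduler jobs or cloud instances.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence, Tuple


def dock(ligand: str) -> float:
    """Hypothetical stand-in for a docking program.

    Returns a placeholder score (lower is "better"); it has no chemical meaning.
    """
    return -float(len(ligand))


def chunks(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Split the ligand library into independent work units."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def score_chunk(ligands: List[str]) -> List[Tuple[str, float]]:
    """Score one chunk; needs no data from any other chunk."""
    return [(ligand, dock(ligand)) for ligand in ligands]


def screen(ligands: Sequence[str], chunk_size: int, workers: int = 4) -> List[Tuple[str, float]]:
    """Score every chunk in parallel, then merge and rank the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(score_chunk, chunks(ligands, chunk_size))
        scored = [pair for part in parts for pair in part]
    return sorted(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Toy library of SMILES-like strings, for illustration only.
    for ligand, score in screen(["CCO", "C", "CCCC"], chunk_size=2):
        print(ligand, score)
```

The only shared step is the final merge and sort, which is why this kind of workload is compute-heavy but data-light.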
How we did it
• Cycle Computing’s software (CycleServer, CycleCloud)
• Over 10,000 EC2 spot instances
  – Extensive benchmarking to select instance type
• Licence files rather than licence servers (licence servers cannot cope with the load)
• Proprietary compounds run in NIBR’s VPC, others in “public”
• See http://opensource.nibr.com/videos/aws-litster/ and http://cyclecomputing.com/novartis-taps-cloud-hpc-for-faster-drug-discovery-better-science/
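One way to read "benchmarking to select instance type" is as a cost-per-molecule comparison. The sketch below shows that arithmetic under stated assumptions: the instance names, throughputs and prices are invented placeholders, not NIBR's benchmark results or real EC2 prices, and the slides do not say which selection criterion was actually used.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Benchmark:
    """One benchmark result for a (hypothetical) instance type."""
    instance_type: str
    molecules_per_hour: float  # measured docking throughput
    price_per_hour: float      # spot price in arbitrary currency units

    @property
    def cost_per_molecule(self) -> float:
        return self.price_per_hour / self.molecules_per_hour


def cheapest(benchmarks: Iterable[Benchmark]) -> Benchmark:
    """Pick the instance type with the lowest cost per docked molecule."""
    return min(benchmarks, key=lambda b: b.cost_per_molecule)


# Placeholder numbers for illustration only.
BENCHMARKS = [
    Benchmark("type-a", molecules_per_hour=1000.0, price_per_hour=0.10),
    Benchmark("type-b", molecules_per_hour=2500.0, price_per_hour=0.20),
    Benchmark("type-c", molecules_per_hour=4000.0, price_per_hour=0.50),
]

if __name__ == "__main__":
    best = cheapest(BENCHMARKS)
    print(best.instance_type, best.cost_per_molecule)
```

With these placeholder figures the fastest instance is not the cheapest per molecule, which is the kind of trade-off such benchmarking is meant to expose.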
Where we’re going in the cloud
• “Cloud by default” for many non-HPC applications
• Clinical data (subject to “informed consent”)
• HPC where appropriate
  – InfiniBand etc. for tightly coupled parallel jobs is usually unavailable
  – Data locality is challenging
Accelerating a compound target search engine
Slides courtesy Douglas Selinger
Introduction
• There are many disparate public and private sources of information, which are hard for experts to query and almost impossible for “normal” scientists to query
• Scientists would like to ask questions like “What is the target and Mechanism of Action (MoA) of my compound?”
• MOA Central is a web-based tool
“Flow” of a query
Diagrammatic network
Example output
Glivec (Imatinib)
Impact of HPC
• Our scientists developed a tool, MOA Central, that applies graph analysis techniques and was built with well-known workflow software
• MOA Central worked so well that the server couldn’t keep up with demand
• We helped port it to Python (Pandas, SciKit etc.)
  – Large queries and data preparation can now run on the cluster
  – Version control!
• Moving from CSV files, database queries & web services to HDF5 will improve scalability
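To make "graph analysis" concrete, here is a minimal sketch of one such query: finding the shortest chain of relationships that links a compound to a downstream effect. The slides do not say which algorithms MOA Central uses, so breadth-first search is only an illustrative choice, and every node and relation name in `EDGES` is a hypothetical placeholder rather than real data.

```python
from collections import deque
from typing import Dict, List, Optional, Sequence, Tuple

Edge = Tuple[str, str, str]  # (source, relation, destination)

# Hypothetical toy network; none of these names come from MOA Central.
EDGES: List[Edge] = [
    ("compound_X", "binds", "target_A"),
    ("compound_X", "binds", "target_B"),
    ("target_A", "member_of", "pathway_P"),
    ("target_B", "member_of", "pathway_Q"),
    ("pathway_Q", "associated_with", "phenotype_Z"),
]


def build_adjacency(edges: Sequence[Edge]) -> Dict[str, List[str]]:
    """Turn an edge list into a directed adjacency map."""
    adjacency: Dict[str, List[str]] = {}
    for source, _relation, destination in edges:
        adjacency.setdefault(source, []).append(destination)
        adjacency.setdefault(destination, [])
    return adjacency


def shortest_path(adjacency: Dict[str, List[str]], start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns the first (shortest) path found, or None."""
    if start not in adjacency:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(shortest_path(graph, "compound_X", "phenotype_Z"))
```

Queries of this shape touch only the graph held in memory, so large batches of them can be spread across cluster nodes once the data preparation step has produced the edge list.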
Want to know more about MOA Central?
• Look for more information at https://www.researchgate.net/profile/Douglas_Selinger
Accelerating Motor Neuron Disease drug discovery with GPUs
Slides courtesy Imtiaz Hossain
In-vitro model for neuromuscular junctions
• Faulty junctions between motor neurons and muscle cells are implicated in MND
• We’d like to create a drug which corrects this
• Motor neurons & myotube (muscle fibre) cells were “co-cultured” in a “plate” to which drug candidates were added
• Cells were imaged in real time to measure their contractility
• This is very hard to see by eye and also hard to segment using computers
What do the cells look like?
Motion estimated with Optic Flow
Different contracting regions
Total area under contraction
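The idea of estimating motion between frames and then summing the area that moved can be sketched in a few lines. Note the substitution: this is exhaustive block matching in pure Python, a much simpler stand-in for the optical flow method named on the slide, and it is not the GPU implementation used in the project. The frame layout, block size and search radius are all assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # greyscale image as rows of pixel intensities


def sad(prev: Frame, curr: Frame, top: int, left: int, dy: int, dx: int, size: int) -> int:
    """Sum of absolute differences between a block in prev and a shifted block in curr."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(prev[top + r][left + c] - curr[top + dy + r][left + dx + c])
    return total


def block_displacement(prev: Frame, curr: Frame, top: int, left: int,
                       size: int, radius: int) -> Tuple[int, int]:
    """Find the (dy, dx) shift within the search radius that best matches the block.

    Ties keep the earlier candidate, starting from zero motion.
    """
    height, width = len(curr), len(curr[0])
    best = (0, 0)
    best_cost = sad(prev, curr, top, left, 0, 0, size)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if top + dy < 0 or top + dy + size > height:
                continue  # shifted block would leave the frame vertically
            if left + dx < 0 or left + dx + size > width:
                continue  # shifted block would leave the frame horizontally
            cost = sad(prev, curr, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def moving_area(prev: Frame, curr: Frame, size: int, radius: int) -> int:
    """Total pixel area of blocks whose best match is a non-zero shift."""
    height, width = len(prev), len(prev[0])
    moving_blocks = 0
    for top in range(0, height - size + 1, size):
        for left in range(0, width - size + 1, size):
            if block_displacement(prev, curr, top, left, size, radius) != (0, 0):
                moving_blocks += 1
    return moving_blocks * size * size


def frame_with_patch(row: int, col: int, side: int = 8) -> Frame:
    """Synthetic test frame: dark background with one bright 2x2 patch."""
    frame = [[0] * side for _ in range(side)]
    for r in range(row, row + 2):
        for c in range(col, col + 2):
            frame[r][c] = 9
    return frame


if __name__ == "__main__":
    before, after = frame_with_patch(2, 2), frame_with_patch(2, 3)
    print(block_displacement(before, after, top=2, left=2, size=2, radius=1))
    print(moving_area(before, after, size=2, radius=1))
```

Every block is processed independently of the others, which is the property that makes dense motion estimation a natural fit for GPUs.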
Impact of HPC
• A good joint project between bench scientists, lab automation experts & informaticians
• 80x increase in throughput compared with CPU
• NIBR scientists have access to new method of monitoring myotube contractility
The future
• GPUs
  – Deep learning
  – Cryo-EM
• Real time collection & processing of data from clinical trials
• Integration of “big data” technologies such as Apache Spark into HPC
Thank you
Backup: MOA Central predicting a side effect