Utility HPC: Right Systems, Right Scale, Right Science


Jason Stowe, CEO (@jasonastowe, @cyclecomputing)

I’m here to recruit you, for a cause

We believe utility access to compute power

makes impossible science, possible.

Dynamic, utility access to compute power

is as important as uptime

(that’s why coded infrastructure is critical)

Skeptical? (Photo: Flickr, Tourist on Earth)

In prior years (today?), researchers and engineers waited for computing:

for the horsepower,

for the place to put it,

for it to be configured…

(Photo: Flickr, vaxomatic)

Yesterday, high-performance engineering and science clusters were…

too small when you needed them most,

too large every other time.

The Innovation Bottleneck: researchers, scientists, and engineers are forced to size their questions to the infrastructure they have.

 

Multi-tenant systems create float capacity that is critical to innovation.

 

From centralized to decentralized, collaborative to independent, and right back again!

The 60s: Mainframes (sharing ~100%, network ~0 Mbit)
The 70s: VAX (sharing ~60%, network ~1 Mbit)
The 80s: The PC (sharing ~0%, network ~10 Mbit)
The 90s: Beowulf clusters (sharing ~40%, network ~1,000 Mbit)
The 00s-10s: Central clouds (sharing ???%, network ~10,000 Mbit)

Bigger and better, but further and further away from the scientist's lab.

The Scientific Method: Ask a Question → Hypothesize → Predict → Experiment/Test → Analyze → Final Results

The Test and Analyze stages require the most time, compute, and data.

The Scientific Method, again: any improvements to this cycle yield multiplicative benefits.

A Challenge Across Industries:
• 3 of Top 5 Insurance
• 6 of Top 8 Pharmaceutical
• 2 of Top 3 Banks
• 2 of Top 3 Genomics Sequencing
• 1 of Top 2 FPGA

Utility HPC in the News: WSJ, NYTimes, Wired, Bio-IT World, BusinessWeek

To accelerate science, we need automation

[Architecture diagram] Utility HPC Cluster:
• Scales to 50,000+ cores; massive scale based upon the workload
• Data scheduling, with data- and application-aware movement
• Workload portability via a traditional scheduler
• Built from CC1/CCG instances, EBS, S3, and a shared filesystem
• A secure HPC cluster, with HPC reporting and audit for the user
• Orchestrated by management software
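The "massive scale based upon the workload" point is the heart of utility HPC: cluster size follows queue depth, not a fixed hardware footprint. A minimal sketch of that idea, assuming boto3 and made-up AMI, instance-type, and jobs-per-instance values (CycleCloud's real logic is far richer):

# Scale-to-workload sketch: size the cluster from the queue, not the hardware.
import boto3

def scale_for_queue(queued_jobs, jobs_per_instance=8, max_instances=1000):
    ec2 = boto3.client("ec2", region_name="us-east-1")
    needed = min(max_instances, -(-queued_jobs // jobs_per_instance))  # ceiling division
    if needed > 0:
        ec2.run_instances(
            ImageId="ami-0123456789abcdef0",  # hypothetical cluster node image
            InstanceType="c5.4xlarge",        # hypothetical instance type
            MinCount=needed,
            MaxCount=needed,
        )

scale_for_queue(queued_jobs=4000)  # requests 500 instances to drain 4,000 jobs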

ChefConf 2012: a 50,000-core CycleCloud cluster, using Chef and AWS.
ChefConf 2013: a 10,600-instance cluster against a cancer target.
• Created in 2 hours
• Configured with Chef Search and with Data Bags
• Driven by one Chef 11 server (a sketch of the pattern follows)
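Roughly what "configured with Search and Data Bags" means: shared settings live in a data bag, and nodes discover one another by querying the Chef server, so hand-edited host lists never exist. A minimal sketch using the community PyChef client; the bag, item, and role names are hypothetical:

# Node-discovery sketch with PyChef (hypothetical bag/item/role names).
from chef import autoconfigure, Search, DataBagItem

api = autoconfigure()  # reads knife.rb / client.rb for server URL and key

# Cluster-wide settings come from a data bag, e.g. the scheduler address.
settings = DataBagItem("cluster-config", "defaults", api=api)
scheduler = settings["scheduler_host"]

# Chef search lists every registered compute node, so freshly launched
# instances self-assemble into the cluster as they converge.
for row in Search("node", "role:compute", api=api):
    print(row.object.name, "->", scheduler)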

We make software tools to easily orchestrate complex workloads and data access across Utility HPC

Today is a survey of use cases…
• Life Science: 10,600-instance molecular modeling
• Manufacturing: 600-core nuclear power plant safety simulation
• Genomics: RNA analysis for stem cells

Dynamic, utility access to compute power

is as important as uptime

Why?

#1: “Better” science = answer the question we actually want to ask, not one constrained to what fits on local compute power.

#2: “Faster” science = run this “better” science, which would have taken months or years, in hours or days.

Survey of Use Cases: ☑ Drug Design ☑ CAD/CAM ☑ Genomics …

[Chart: Life Sciences & Compute, plotting compute needs against data/bandwidth needs for genomics, molecular modeling, CAD/CAM, all-sample analysis, proteomics, biomarker/image analysis, and sensor data import.]

(Creating fake charts, with fake data.)

Why is this important?
~2 million Type 2 diabetics, ~200k Type 1 (W.H.O./Globocan 2008).
Every day is crucial and costly.

The process for drug design: initial coarse screen → higher-quality analysis → best quality.

Before: a trade-off of compute time vs. accuracy.
Now: accurate analysis, fewer false negatives, faster.

Big 10 Pharma: built a 10,600-instance cluster ($44M in equivalent hardware) in 2 hours, and ran 40 years of science in 11 hours for $4,372.
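A quick back-of-envelope check on those numbers (my arithmetic, assuming a 365-day year and that "40 years of science" means 40 serial compute-years):

# Sanity check: 40 years of science in 11 hours on 10,600 instances.
HOURS_PER_YEAR = 24 * 365                # 8,760
serial_hours = 40 * HOURS_PER_YEAR       # 350,400 compute-hours of work
speedup = serial_hours / 11              # ~31,855x effective parallelism
cores_per_instance = speedup / 10_600    # ~3.0 cores per instance on average
cost_per_hour = 4_372 / serial_hours     # ~$0.0125 per compute-hour
print(round(speedup), round(cores_per_instance, 1), round(cost_per_hour, 4))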

Most recent utility supercomputer server count:
[Screenshots: the AWS Console view and Cycle's view of the same cluster, all driven by one Chef 11 server.]

Earlier drug design: Novartis, discussed at BioIT 2012
• Needed: a push-button utility supercomputer for molecular modeling
• Created: a 30,000-core run across US/EU cloud (AWS)
• 10 years of compute in 8 hours, for $10,000
• Found 3 compounds, now in the wet lab as a result

Lessons learned
• Capacity is no longer an issue
• Hardware = software
• Testing (error handling, unit testing, etc.)
• e.g., Cycle spent ~$1M on AWS over 5 years
• The only way to do this is to automate

Servers are not house plants.

Servers are wheat.

 

Survey of Use Cases: ☑ Drug Design ☑ CAD/CAM ☑ Genomics …

Nuclear Power Plant simulation

We don't know what they're running, but it has "Safety".

600-core CAD/CAM: a wait of 3 quarters of a year became 3 weeks.

[Diagram: an engineer and site data behind the corporate firewall; scheduled data movement out to a secure ~600-CPU HPC cluster with a TB-scale shared filesystem in an external cloud. 3 weeks instead of 3 quarters.]

Survey of Use Cases: ☑ Drug Design ☑ CAD/CAM ☑ Genomics …

Gene Expression Analysis: Morgridge Institute for Research

Run a holistic comparison across all 78 terabytes of stem cell RNA samples to build a unique gene expression database,
making it easier to replicate disease in petri dishes with induced stem cells.

1 million compute hours: 115 years of computing in 1 week, for $19,555.
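Those figures are self-consistent; a quick check (my arithmetic, assuming a 365-day year):

# Consistency check: 115 years of computing in 1 week for $19,555.
HOURS_PER_YEAR = 24 * 365
total_hours = 115 * HOURS_PER_YEAR         # ~1,007,400: the "1 million compute hours"
concurrent_cores = total_hours / (7 * 24)  # ~5,996: fits the 5,000-10,000 core cluster
cost_per_hour = 19_555 / total_hours       # ~$0.019 per core-hour
print(round(concurrent_cores), round(cost_per_hour, 3))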

Gene Expression Analysis: Morgridge Institute for Research

Cluster details:
• 5,000 to 10,000 cores for a week
• Very long individual analyses were checkpointed, which made Spot instance usage possible (a sketch of the pattern follows)
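Checkpointing is what makes Spot economics safe: if an instance is reclaimed, only the work since the last checkpoint is lost. A minimal sketch of the pattern, with a stand-in workload and a hypothetical checkpoint path (the deck does not describe Morgridge's actual implementation):

# Checkpoint/resume sketch for Spot-friendly long-running analyses.
import os
import pickle

CHECKPOINT = "analysis.ckpt"  # hypothetical path; ideally on S3 or a shared FS

def load_state():
    # Resume from the last checkpoint if one exists, else start fresh.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "partial_results": []}

def save_state(state):
    # Write to a temp file, then atomically swap, so a mid-write kill is safe.
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 1_000):
    state["partial_results"].append(step * step)  # stand-in for real work
    state["step"] = step + 1
    if state["step"] % 100 == 0:  # checkpoint every 100 steps
        save_state(state)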

Survey of Use Cases: ☑ Drug Design ☑ CAD/CAM ☑ Genomics …

Code can accelerate Science

The Scientific Method on Utility HPC: Ask a Question → Hypothesize → Predict → Experiment/Test → Analyze → Final Results

Yields “better”, “faster” research for less $.

Dynamic, utility access to compute power

is as important as uptime

I’m here to recruit you, for a cause

Contribute to Chef. Make the community better.

And you will help Cycle make impossible science,

possible.

2013 BigScience Challenge

$10,000 of free computing to science benefitting humanity

2012 winner: the 115-year genomic analysis

Enter at: http://cyclecomputing.com/big-science-challenge/enter

Thank You! Questions?
