
HPC Bursting with Google Cloud Platform
Karan Bhatia, Google
Jonathon LeFaive, University of Michigan

Cloud Deployment Manager


● Repeatable

● Declarative

● Template Driven

Cloud Deployment Manager


% gcloud deployment-manager deployments create mycluster --config condor-cluster.yaml

% gcloud compute ssh condor-submit
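The talk doesn't show the template itself, but a Deployment Manager config like condor-cluster.yaml typically just imports a template and sets a few properties. A minimal sketch, assuming a hypothetical condor-cluster.jinja template and property names (not the actual template from the talk):

    # condor-cluster.yaml -- minimal Deployment Manager config sketch.
    # condor-cluster.jinja and its properties are assumptions for illustration.
    imports:
    - path: condor-cluster.jinja

    resources:
    - name: condor-cluster
      type: condor-cluster.jinja
      properties:
        zone: us-central1-f
        machineType: n1-standard-8
        workerCount: 100

Because the config is declarative, rerunning the deployment with changed properties updates the cluster rather than rebuilding it by hand.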


Cores from Google


Not interested in Infrastructure? Just want to run jobs and get results...


MIT Research w/ VMs

Products used: Google Compute Engine, Cloud Storage, Cloud Datastore

● 220,000 cores on preemptible VMs

● 2,250 32-core instances; 60 CPU-years of computation in a single afternoon

● Answers in hours vs. months

580,000 cores

Broad FireCloud: WDL, Cromwell, and Google Genomics

WDL: an external DSL used by computational biologists to express analytical pipelines

Cromwell: a scalable, robust engine for executing WDL against pluggable backends, including local, Docker, Grid Engine, or …

Google Genomics Pipelines API: co-developed by Broad and Google Genomics, a scalable Docker-as-a-Service with data scheduling

Pipeline definition

{
  "name": "samtools index",
  "description": "Run samtools index to generate a BAM index file",
  "inputParameters": [
    {
      "name": "inputFile",
      "localCopy": {
        "disk": "data",
        "path": "input.bam"
      }
    }
  ],
  "outputParameters": [
    {
      "name": "outputFile",
      "localCopy": {
        "disk": "data",
        "path": "output.bam.bai"
      }
    }
  ],
  "resources": {
    "minimumCpuCores": 1,
    "minimumRamGb": 1,
    "disks": [
      {
        "name": "data",
        "type": "PERSISTENT_HDD",
        "sizeGb": 200,
        "mountPoint": "/mnt/data"
      }
    ]
  },
  "docker": {
    "imageName": "quay.io/cancercollaboratory/dockstore-tool-samtools-index",
    "cmd": "samtools index /mnt/data/input.bam /mnt/data/output.bam.bai"
  }
}

Create, run, monitor, and kill pipelines

Create
$ gcloud alpha genomics pipelines create --pipeline-json-file samtools_index.json
Created samtools index, id: PIPELINE-ID

Run
$ gcloud alpha genomics pipelines run --pipeline-id PIPELINE-ID \
    --logging gs://YOUR-BUCKET/YOUR-DIRECTORY/logs \
    --inputs inputFile=gs://genomics-public-data/gatk-examples/example1/NA12878_chr22.bam \
    --outputs outputFile=gs://YOUR-BUCKET/YOUR-DIRECTORY/output/NA12878_chr22.bam.bai
Running: operations/OPERATION-ID

Status
$ gcloud alpha genomics operations describe OPERATION-ID

Kill
$ gcloud alpha genomics operations cancel OPERATION-ID

dsub (Google Genomics Pipelines)
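As a rough illustration of dsub (not a command from the talk), something like the following submits the same samtools index task; the project, zones, bucket paths, and variable names are placeholders:

    $ dsub \
        --project YOUR-PROJECT \
        --zones "us-central1-*" \
        --logging gs://YOUR-BUCKET/logs \
        --input INPUT_BAM=gs://genomics-public-data/gatk-examples/example1/NA12878_chr22.bam \
        --output OUTPUT_BAI=gs://YOUR-BUCKET/output/NA12878_chr22.bam.bai \
        --image quay.io/cancercollaboratory/dockstore-tool-samtools-index \
        --command 'samtools index ${INPUT_BAM} ${OUTPUT_BAI}'

dsub localizes each --input object to a path exposed in the container as an environment variable, and delocalizes the --output path back to GCS when the command exits.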

Task Tailored Resources

Preemptible VM Instances

● What Preemptible VMs are
○ Up to 80% cheaper than regular VMs (~$0.01 per core hour)
○ Very easy to use: just flip one switch in the UI, API, or command line (see the command below)
○ Many of our biggest customers run huge clusters (10k+ cores) with great success and savings

● Things to keep in mind
○ Same great disk, OS images, and network
○ Google Compute Engine can preempt (i.e. shut down / take away) the VM with 30 seconds of notice
○ Maximum 24 hours of uptime
○ No SLAs or guarantees of any kind, but we historically see preemption rates of 5-15%
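That one switch is the --preemptible flag; a minimal sketch, with a placeholder instance name:

    $ gcloud compute instances create pvm-worker \
        --machine-type n1-standard-8 \
        --preemptible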

Intel Skylake

● Significant "per core" performance improvements
● Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
○ 2x flops/second
● Accelerated IO with Intel® Omni-Path Architecture (Fabric)
● Integrated Intel® QuickAssist Technology (crypto & compression offload)
● Intel® Resource Director Technology (Intel® RDT) for efficiency & TCO

Hardware Accelerated

● Available today: NVIDIA K80 GPUs, P100s
● Coming soon: Tensor Processing Unit (TPU)
○ Custom ASIC built and optimized for TensorFlow
○ Used in production at Google for over 16 months
○ 7 years ahead of GPU performance per watt

Tensor Processing Units (TPU): 180 TFLOPS

Where do I store data for my batch jobs?

Type         Product                  Good for
Object       Cloud Storage            Input & output binary/object data (BLOB); long-term storage. Such as: images, videos, etc.
NoSQL        Cloud Bigtable           Heavy read + write, events
NoSQL        Cloud Datastore          Hierarchical metadata
Relational   Cloud SQL                Metadata & other related data
Warehouse    BigQuery                 Data warehouse; analytics, dashboards
Block        Persistent Disk (GCE)    Local VM file storage & work space

Block storage: reliable, high-performance block storage for any GCE VM instance

Local SSD (fastest, attached, ephemeral)
- Target scenarios: high-performance scratch space; frequently accessed data. Excellent for scientific workloads, especially when combined with fast compute VMs like GPU instances.
- Features: ephemeral storage; highest performance ($0.218/GB); IOPS: 680k read / 360k write
- Encryption; 375 GB per partition, up to 8 partitions (3 TB)

Persistent Disk: SSD (fast, persistent, durable, remote)
- Target scenarios: latency-sensitive applications and files; high-performance database and enterprise applications; databases
- Features: persistent storage; performance-sensitive ($0.17/GB); IOPS: up to 40k read / 30k write
- Encryption, snapshots; up to 64 TB; disk size sets performance (attach larger VMs for max SSD performance)

Persistent Disk: HDD (cheapest, persistent, durable, remote)
- Target scenarios: large data-processing workloads; latency-insensitive tasks with lots of data (genomics processing, video transcoding in GCE)
- Features: persistent storage; cost-sensitive ($0.04/GB); IOPS: 3k read / 15k write
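A minimal sketch of provisioning and attaching a PD-SSD volume; the disk name, instance name, and zone are placeholders:

    $ gcloud compute disks create scratch-disk \
        --size 200GB \
        --type pd-ssd \
        --zone us-central1-f
    $ gcloud compute instances attach-disk condor-submit \
        --disk scratch-disk \
        --zone us-central1-f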

GCS: Object/Blob store

● Google Cloud Storage is a scalable object storage service suitable for all kinds of unstructured data
● Cloud Storage vs. Persistent Disk:
○ Scales to exabytes
○ Accessible from anywhere; REST interface
○ Higher latency than PD
○ Write semantics: insert and whole-file overwrite only
○ Offers versioning
○ Cheaper: put your data here until you need it
● Lots of guidelines on picking storage on our site
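For staging batch inputs and outputs in GCS, gsutil is the usual route; a minimal sketch with a placeholder bucket and local directory:

    $ gsutil mb gs://YOUR-BUCKET                          # create a bucket
    $ gsutil -m cp -r ./genomes gs://YOUR-BUCKET/input/   # parallel recursive upload
    $ gsutil ls gs://YOUR-BUCKET/input/                   # list the uploaded objects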

Human Genetics on Google Cloud Platform

Jonathon LeFaive, University of Michigan

Center for Statistical Genetics

Human Genetics, Sample Sizes over Time

TOPMed Sequencing as of April 25, 2017 (http://nhlbi.sph.umich.edu/)

● 71,713 genomes
○ 70,497 pass quality checks (98.3%)
○ 823 flagged for low coverage (1.2%)
○ 393 fail quality checks (0.5%)

● Mean depth: 38.3x
● Genome covered: 98.7%
● Contamination: 0.27%

9 × 10^15 sequenced bases

On the same scale as the number of grains of sand on a small beach. (Image: Wikimedia Commons)

What Does Alignment Mean?

Image: Wikimedia Commons

Compute Workload
● 70,000 genomes
● ~20 GiB per genome
● 456 core hours per genome

1.3 PiB highly compressed input data (70,000 × 20 GiB)

3,644 core-years (70,000 × 456 ≈ 31.9M core-hours)

Local Cluster Limitations
● Not enough CPU cores
● NFS saturation
● Shared resources

GCE Price Table

Machine type     vCPUs   Memory    Price (USD/hr)   Preemptible price (USD/hr)
n1-standard-1    1       3.75 GB   $0.0475          $0.0100
n1-standard-2    2       7.5 GB    $0.0950          $0.0200
n1-standard-4    4       15 GB     $0.1900          $0.0400
n1-standard-8    8       30 GB     $0.3800          $0.0800

PVMs are approximately ⅕ the cost.

$0.0475 − $0.0100 = $0.0375 saved per core-hour
$0.0375 × 456 core-hours = $17.10

Savings of $17.10 per genome.

$17.10 × 70,000 genomes ≈ $1.2M

Total savings of $1.2M.

PVMs are important… so how do we utilize them?

● Change nothing
● Process checkpointing (CRIU, BLCR, etc.)
● Linux "suspend to disk" (swsusp)
● Proprietary solution

Pipeline Flow
● Pre-process step outputs chunked files
● Main processing runs on chunked files using preemptible instances
● Chunks are merged before post-processing

Application Design (a worker-loop sketch follows this list)
● Checks out job from master database
● Downloads input from bucket
● Processes input
● Uploads output to bucket
● Uploads log to bucket
● Checks in job to master database
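To make that loop concrete, here is a minimal shell sketch; checkout_job, checkin_job, and process_chunk are hypothetical stand-ins for the master-database client and the analysis step, and the bucket paths are placeholders:

    #!/bin/bash
    # Hypothetical worker loop: checkout_job/checkin_job stand in for the
    # client that talks to the master job database.
    while job_id=$(checkout_job); do
      gsutil cp "gs://YOUR-BUCKET/input/${job_id}.bam" .                   # download input
      process_chunk "${job_id}.bam" > "${job_id}.out" 2> "${job_id}.log"   # process input
      gsutil cp "${job_id}.out" gs://YOUR-BUCKET/output/                   # upload output
      gsutil cp "${job_id}.log" gs://YOUR-BUCKET/logs/                     # upload log
      checkin_job "$job_id"                                                # check job back in
    done

Because each step is idempotent against the bucket, a preempted worker can simply be replaced and the job checked out again.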

Advice for New Users
● Use Google Cloud Storage
● Create a custom network
● Exponential backoff retries (see the sketch below)
● Cater to the majority when provisioning resources
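A minimal shell sketch of exponential backoff, assuming a flaky upload to a placeholder bucket:

    # Retry with exponential backoff; the bucket path is a placeholder.
    attempt=0
    until gsutil cp output.bam.bai gs://YOUR-BUCKET/output/; do
      attempt=$((attempt + 1))
      if [ "$attempt" -ge 8 ]; then
        echo "giving up after $attempt attempts" >&2
        exit 1
      fi
      sleep $((2 ** attempt))   # wait 2s, 4s, 8s, ... between retries
    done

Backing off exponentially keeps thousands of workers from hammering the same service in lockstep after a transient failure.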

What Catching Up Looks Like

Thank You!

Acknowledgments: Chris Scheller, Hyun M. Kang, Gonçalo Abecasis, GCP Team
