Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind WEBINAR April 13, 2016, 11:00 AM ET

Page 1: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

WEBINAR

April 13, 2016, 11:00 AM ET

Page 2: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Housekeeping

• Audio help
• Attachments
• Questions
• Rating

Page 3: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Today’s Speakers

Rick Friedman
Vice President, Solution Development
Cycle Computing

Scott Jeschonek
Director of Product Management, Cloud
Avere Systems

Page 4: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Agenda

• Discuss the current state of HPC
• Clouds and their impact on your HPC world
• Reasons why you aren’t 100% cloud-based already
• The Hybrid Cloud and HPC
• Possible implementations
• Delivering File Systems using Avere Systems
• Orchestration using Cycle Computing

Page 5: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

HPC Today (and Yesterday, and Tomorrow)

Page 6: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

What Drives Today’s Needs

• Data
  – Who, what, when, how much, where?
• Datacenter limitations
  – Can I defy physics?
• User expectations
  – Can we even do that?
• Technology shifts
  – What is the “best practice”?

Page 7: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Big Compute Workloads: How are they handled?

[Figure: Compute Demand vs. Cluster Size. Plotted: Compute Demand and Cluster Size. Regions labeled: Missed Opportunity and Wasted Resources.]

• Internal infrastructure has huge value and some limitations

• Access, not capacity, is the barrier to continued growth

• Perception limits scale of problem solving

• Public cloud = cost-effective, readily available resources to users with problems & deadlines.

• Financial services, manufacturing and life sciences are leading the way.
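The gap between compute demand and a fixed cluster size can be put into numbers. The sketch below is illustrative only: the function name, the hourly sampling, and the demand figures are all assumptions, not data from the webinar. It adds up the core-hours of demand a fixed cluster could not serve ("missed opportunity") and the core-hours of capacity that sat idle ("wasted resources").

```python
from dataclasses import dataclass


@dataclass
class CapacityGap:
    missed_core_hours: float  # demand the fixed cluster could not serve
    wasted_core_hours: float  # capacity that sat idle


def capacity_gap(demand_cores, cluster_cores, hours_per_sample=1.0):
    """Compare a demand series (in cores) against a fixed cluster size."""
    missed = 0.0
    wasted = 0.0
    for demand in demand_cores:
        if demand > cluster_cores:
            missed += (demand - cluster_cores) * hours_per_sample
        else:
            wasted += (cluster_cores - demand) * hours_per_sample
    return CapacityGap(missed, wasted)


# Hypothetical hourly demand (in cores) against a fixed 1,000-core cluster.
hourly_demand = [200, 400, 1500, 3000, 900, 100]
gap = capacity_gap(hourly_demand, cluster_cores=1000)
print(gap)
```

With elastic capacity, both totals would in principle shrink toward zero, because the cluster size follows demand instead of staying fixed.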

Page 8: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Basic HPC Environment Requirements

Resource Manager

Jobs Manager / Scheduler

Workload

NAS Storage

Lots of compute resources (“Grid”)

Page 9: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Advantages of Clouds

Significantly reduce infrastructure management costs both in money and time

Maintain operational flexibility during scale-out jobs…let the provider deal with scale challenges

Page 10: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Why the Cloud for Big Compute?

• Scientist / Engineer User perspective
  – Zero queue times, capacity in minutes
  – Scale compute to problem size, not vice versa
  – Try / support new computational approaches and software quickly
• SysArchitect perspective
  – Dynamically adjust workloads to “lowest cost/impact” provider
  – Focus on computational excellence, not hardware management
  – Support a wide range of user types efficiently
• Organizational perspective
  – Match spending to actual consumption
  – Increase responsiveness to business dynamics
  – Grow user base without hardware limitations
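Sending a workload to the “lowest cost/impact” provider can be sketched as a simple selection problem. The example below is a minimal illustration, not how any particular orchestration product works: the `Provider` type, the provider names, the prices, and the free-core counts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Provider:
    name: str
    price_per_core_hour: float  # hypothetical price, in dollars
    free_cores: int             # cores available right now


def place_job(cores_needed, core_hours, providers) -> Optional[Tuple[str, float]]:
    """Pick the cheapest provider that can run the job now.

    Returns (provider name, estimated cost), or None if nothing fits.
    """
    candidates = [p for p in providers if p.free_cores >= cores_needed]
    if not candidates:
        return None
    cheapest = min(candidates, key=lambda p: p.price_per_core_hour)
    return cheapest.name, cheapest.price_per_core_hour * core_hours


# Hypothetical providers: a small internal cluster and two public clouds.
providers = [
    Provider("on-prem", price_per_core_hour=0.02, free_cores=64),
    Provider("cloud-a", price_per_core_hour=0.05, free_cores=10_000),
    Provider("cloud-b", price_per_core_hour=0.04, free_cores=5_000),
]

small_job = place_job(cores_needed=32, core_hours=64, providers=providers)
large_job = place_job(cores_needed=512, core_hours=2048, providers=providers)
print(small_job, large_job)
```

In this sketch the small job stays on the internal cluster because it fits there at the lowest price, while the large job goes to the cheaper of the two clouds. A real placement policy would also weigh data location, deadlines, and policy constraints.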

Page 11: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Clouds Have Awesome New Capabilities

• Big Data
  – Analytics Tools
  – Massively scalable NoSQL
  – Data warehousing
• Machine Learning
  – Voice/Vision/Speech
  – Early days

Page 12: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

So…why isn’t everything in the cloud?

• Current infrastructure investment (capex)
• Cloud costs not yet completely in line
• Software infrastructure in place
  – Costs to refactor, dependencies to consider
• Data environment in one or more data centers
• Orchestration and management of cloud clusters is hard
• Network bandwidth / latency concerns
• Business Continuity

Page 13: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Other Reasons You’re Not 100% in the Cloud

• Corporate budgets
• Corporate policies
• Corporate politics
• Education / awareness
• Government regulations
• Interest groups
• Vendor relationships

Page 14: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Near Future, Hybrid Cloud

• Adoption of one or more cloud providers
• > 1 hedge on price and SLA
• Mix of on-prem and cloud resources
• Regulatory, proprietary and/or security characteristics will likely keep data in the DC

[Diagram labels: Analysts in the Tokyo, London, NYC and Hong Kong offices; Primary DC with NAS; Secondary DC with NAS; Cloud Provider 1; Cloud Provider 2; Submit Jobs.]

Page 15: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

HPC in the Cloud

[Diagram labels: On-Premises Data Center, Analysts, Jobs, Scheduler, NAS Storage, Data, Cloud Compute Environment, Cloud Compute API, Scheduler.]

Page 16: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

HPC in the Cloud, “Grids on Demand”

[Diagram labels: On-Premises Data Center, Analysts, Jobs, Scheduler, NAS Storage, Data, Cloud Compute Environment, Cloud Compute API, Scheduler1, Scheduler2, Scheduler3, Scheduler4.]

Page 17: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Challenges with HPC in the Cloud

• How do you get the data close to your compute nodes?

• How do you orchestrate on-demand clusters/grids of compute nodes?

• How does this all come together??

Page 18: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Data Access Layer

• File System
• Caching Layer
• Only load necessary blocks of files
• Opaque to compute nodes

[Diagram labels: On-Premises Data Center, Analysts, Jobs, Scheduler, NAS Storage, Data, Data Access Layer, Cloud Compute Environment, Cloud Compute API, Scheduler1, Scheduler2, Scheduler3, Scheduler4.]
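A caching layer that loads only the necessary blocks of a file can be sketched as a read-through block cache. This is a minimal illustration of the general technique, not Avere's implementation: the `BlockCache` class, the 4-byte block size, the in-memory `NAS` dictionary standing in for on-prem storage, and the file path are all assumptions made for the example.

```python
from collections import OrderedDict


class BlockCache:
    """Read-through LRU cache that fetches fixed-size blocks on demand."""

    def __init__(self, fetch, block_size=4, capacity_blocks=1024):
        self._fetch = fetch              # callable(path, start, length) -> bytes
        self.block_size = block_size
        self.capacity_blocks = capacity_blocks
        self._blocks = OrderedDict()     # (path, block_index) -> bytes
        self.origin_reads = 0            # how many blocks came from the origin

    def _get_block(self, path, index):
        key = (path, index)
        if key in self._blocks:
            self._blocks.move_to_end(key)        # mark as recently used
            return self._blocks[key]
        data = self._fetch(path, index * self.block_size, self.block_size)
        self.origin_reads += 1
        self._blocks[key] = data
        if len(self._blocks) > self.capacity_blocks:
            self._blocks.popitem(last=False)     # evict least recently used
        return data

    def read(self, path, offset, length):
        """Return `length` bytes at `offset`, touching only the blocks needed."""
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        data = b"".join(self._get_block(path, i) for i in range(first, last + 1))
        start = offset - first * self.block_size
        return data[start:start + length]


# Hypothetical origin: a dict standing in for the on-premises NAS.
NAS = {"/data/input.bin": b"ABCDEFGHIJKLMNOP"}


def fetch_from_nas(path, start, length):
    return NAS[path][start:start + length]


cache = BlockCache(fetch_from_nas, block_size=4)
first_read = cache.read("/data/input.bin", 5, 2)   # pulls one block from the NAS
second_read = cache.read("/data/input.bin", 4, 4)  # same block, served from cache
print(first_read, second_read, cache.origin_reads)
```

The compute node simply calls `read`; whether a block came from the cache or from the origin is not visible to it, which is the property the slide describes.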

Page 19: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Advantages of Data Access / Cache Layer

• Keep your data on prem!
  – Data in cloud is only there while the compute nodes work the jobs.
  – Reduce the security objections, simplify the move to cloud
• Increase cloud compute performance
  – Using file system caching, most of the data will be in RAM, close to the nodes
  – Avoids ingest latencies and slashes transit latency after first read
• Scale out
  – Using solution that facilitates 10s of 1000s of core file system connections
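The effect of caching on transit latency after the first read can be estimated with a weighted average. The sketch below is a back-of-the-envelope model; the function name and the latency figures (0.5 ms from cache, 20 ms to the on-prem NAS) are hypothetical values chosen for illustration, not measurements.

```python
def expected_read_latency_ms(hit_ratio, cache_latency_ms, origin_latency_ms):
    """Average read latency for a cache in front of a remote origin."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    miss_ratio = 1.0 - hit_ratio
    return hit_ratio * cache_latency_ms + miss_ratio * origin_latency_ms


# Hypothetical numbers: 0.5 ms from the cache, 20 ms to the on-prem NAS.
cold = expected_read_latency_ms(0.0, 0.5, 20.0)  # first read, nothing cached
warm = expected_read_latency_ms(0.9, 0.5, 20.0)  # 90% of reads hit the cache
print(f"cold: {cold:.2f} ms, warm: {warm:.2f} ms")
```

The model ignores queueing and bandwidth limits, so it only shows the direction of the effect: the higher the share of reads served from cache, the closer the average gets to the cache latency.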

Page 20: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Typical File Access in Hadoop Cluster

Caching files will work for certain types of jobs, where a typical file is accessed by multiple clients.

source: http://blog.cloudera.com/blog/2012/09/what-do-real-life-hadoop-workloads-look-like/

Page 21: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Hybrid Cloud using Avere FXT and vFXT Edge Filers

[Diagram labels: Cloud Compute, On-Prem Compute, Cloud Storage, On-Prem Storage, NAS, Object, Bucket 1, Bucket 2, Bucket n, Virtual Compute Farm, Virtual FXT, Physical FXT, File Storage for Private Object, NAS Optimization, Cloud NAS, Cloud Bursting, Cloud Storage Gateway.]

The “Edge” = locating your data close to your compute without truly moving it from your NAS environment

Page 22: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Avere Building Blocks

“Avere is uniquely positioned to offer scale across tens of thousands of cloud compute cores while leaving the data where it originates, on premises, with its global file system and caching capabilities.”

- Unnamed CTO

[Diagram labels: Cloud, Cloud Compute, Virtual FXT, On-Premises, NAS, Object, Physical FXT, File Acceleration, 12-20ms, Encrypted.]

Page 23: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Orchestration and Management Layer

[Diagram labels: On-Premises Data Center, Analysts, Jobs, Scheduler, NAS Storage, Data, Cloud Compute Environment, Cloud Compute API, Scheduler1, Scheduler2, Scheduler3, Scheduler4.]

Page 24: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Complete Multi-Cloud Workflow Control

Scope, Configure, Run on Cloud, Optimize

Cluster Configuration
• Multi-cloud, without changes
• Pre-set or User-defined “types”
• Abstraction for all cluster data, attributes (roles, OS, etc.)

Provisioning
• Workload placement
• Optimal scale
• Cost optimization
• Data scheduling

Monitoring
• Auto-scaling
• Usage tracking
• Error Handling
• Reporting

Optimization
• Benchmark instances
• Make Workflow UI
• Human workflow

[Diagram labels: Admin; User; Internal; File: Declarative Cluster Definition; Packages, Installers, Containers, Data.]
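A declarative cluster definition with roles, OS and auto-scaling can be illustrated with plain data types. The sketch below is hypothetical throughout: the `NodeType` and `ClusterDefinition` classes, their field names, and the provider labels are invented for this example and do not reflect Cycle Computing's actual file format or API.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeType:
    role: str             # e.g. "scheduler" or "execute"
    os: str
    cores_per_node: int
    max_nodes: int        # upper bound for auto-scaling


@dataclass(frozen=True)
class ClusterDefinition:
    name: str
    provider: str         # can be swapped without touching the node types
    node_types: tuple


def nodes_needed(node_type, queued_cores):
    """Nodes required to run all queued cores, capped at max_nodes."""
    wanted = math.ceil(max(queued_cores, 0) / node_type.cores_per_node)
    return min(wanted, node_type.max_nodes)


execute = NodeType(role="execute", os="linux", cores_per_node=16, max_nodes=100)
cluster = ClusterDefinition(name="render-grid", provider="cloud-a",
                            node_types=(execute,))

# Same definition, different provider: only one attribute changes.
moved = replace(cluster, provider="cloud-b")

print(nodes_needed(execute, 250), nodes_needed(execute, 5000), moved.provider)
```

Keeping the definition as data, separate from the provider, is one way to get the "multi-cloud, without changes" property: the scaling rule reads only the node type, so it behaves the same wherever the cluster runs.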

Page 25: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

How Cycle Makes Cloud Productive

• Scientist / Engineer productivity:
  – Simple workflows
  – Zero queue time
  – Auto-scaling
• SysAdmin productivity:
  – Instant access to additional resources
  – Workflows linking internal and multiple clouds
  – Simple reliable tools to enable apps with special requirements
• Organizational productivity:
  – Secure, consistent cloud access
  – Usage tracking
  – Ability to leverage multiple providers

[Diagram labels: User; Web UI, API, CMDLine; Job & Data Workflow; Automated Job Placement, Cost optimization; Auto-scaling, Benchmarking, Compliance, Reporting tools; Multi-cloud Without Changes; Internal Cluster.]

Page 26: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Big Data w/o Disrupting Production

• Challenge
  – Estimate the carbon stored in Saharan biomass
  – Rapidly establish a baseline for later research using large amounts of high-resolution remote sensing data
  – Existing internal compute resources fully committed
  – Limited window to complete processing
• Cycle solution
  – Full workflow including data management between internal data capture and cloud processing
  – Leverage spot pricing to minimize cost while maximizing computation
• Results
  – Linearly scalable and predictable, enabling a plan for next steps
  – Science being done that could not be done otherwise
  – 1 month start to initial runs

Page 27: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

Overall Architecture – Data In-House

[Diagram labels: Workload, Scheduler, NAS Storage, Avere FXT, Cloud API, Cloud Compute, Scheduler, Avere FXT Edge Filer, Cloud Storage.]

Page 28: Deliver Best-in-Class HPC Cloud Solutions Without Losing Your Mind

What We Covered…

• The Current State of HPC
• Clouds and Their Impact on Your HPC World
• Reasons Why You Aren’t 100% Cloud-based Already
• The Hybrid Cloud and HPC
• Possible Implementations
• Delivering File Systems Using Avere Systems
• Orchestration Using Cycle Computing