38
Cloud Computing for Science June 2009 21st International Conference on Scientific and Statistical Database Management Kate Keahey [email protected] Nimbus project lead University of Chicago Argonne National Laboratory

Cloud Computing for Science

Embed Size (px)

Citation preview

Page 1: Cloud Computing for Science

Cloud Computing for ScienceJune 2009

21st International Conference on

Scientific and Statistical Database Management

Kate Keahey

[email protected]

Nimbus project lead

University of Chicago

Argonne National Laboratory

Page 2: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Cloud Computing is in the news…

…is it good news for Science?

Page 3: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Cloud Computing for Science

Complex codes

Need for control

Page 4: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Grid ComputingAssumption: control over the manner in which resources

are used stays with the site

R

RRR

R

Site A

Site B

VO-A

R

Site-specific environment and mode of access

Site-driven prioritization

But: site control -> rapid adoption

Page 5: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Cloud Computing

Enabling factors: virtualization and isolation Challenges our notion of a site Lends itself to more explicit service level negotiation But: slow adoption

Change of assumption: control over the resource is turned over to the user

R

RRR

R

Site A

Site B

VO-A

RR

RR

RR

R

Page 6: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Grids to Clouds: a PersonalPerspective

“A Case for Grid Computingon VMs”

In-Vigo, VIOLIN, DVEs,Dynamic accounts

Policy-driven negotiation

Xen released

First WSRFWorkspace Service

release

EC2 gatewayavailable

Support for EC2 interfaces

2003 20092006

EC2 goes online

First STAR productionrun on EC2

Nimbus Cloudcomes online

Context Brokerrelease

Page 7: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Benefits to Consumers

Eliminate expense of acquiring, managing and operating

hardware

Elastic computing Pay-as-you-go model

capital expense operational expense

Page 8: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Benefits to Providers

Economies of scale to amortize the costs of buying and operating

resources

Avoid cost and complexity of managing multiple customer-specific

environments and applications

Streamline and specialize

Page 9: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Unclouding the Cloud

Infrastructure-as-a-Service (IaaS)

Platform-as-a-Service (PaaS)

Software-as-a-Service (SaaS)

Community-specific applicationsand portals

Page 10: Cloud Computing for Science

IaaS Infrastructure

Page 11: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Nimbus: Cloud Computing Software

Nimbus goals: Allow providers to build clouds

Private clouds (privacy, expense considerations) Workspace Service: open source EC2 implementation

Allow users to use cloud computing Do whatever it takes to enable scientists to use IaaS Context Broker: turnkey virtual clusters

Allow developers to experiment with Nimbus For research or usability/performance improvements

Community extensions and contributions

Page 12: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

VWSService

The Workspace Service

Page 13: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

The Workspace Service

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

Poolnode

The workspace service publishesinformation about each workspace

Users can find outinformation about theirworkspace (e.g. what IP

the workspace wasbound to)

Users can interactdirectly with their

workspaces the sameway the would with a

physical machine.

VWSService

Page 14: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

User Environments

Cloud Computing Ecosystem

Appliance ProvidersMarketplaces, commercial providers,

Virtual OrganizationsAppliance management software

Deployment Orchestrator

VMM/DataCenter/IaaS User EnvironmentsVMM/DataCenter/IaaS

Page 15: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

MPIMPI

Turnkey Virtual Clusters

Turnkey, tightly-coupled cluster Shared trust/security context Shared configuration/context information

IP1IP1 HK1HK1

IP1IP1

IP2IP2

IP3IP3

HK1HK1

HK2HK2

HK3HK3

Context BrokerContext Broker

IP2IP2 HK2HK2

IP1IP1

IP2IP2

IP3IP3

HK1HK1

HK2HK2

HK3HK3

IP3IP3 HK3HK3

IP1IP1

IP2IP2

IP3IP3

HK1HK1

HK2HK2

HK3HK3

Page 16: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

What’s in the Box?

workspacecontrol

workspaceresourcemanager

workspacepilot

workspaceservice

workspaceclient

cloudclient

IaaSgateway c

onte

xt b

roke

r

contextclient

EC2potentially other providers

storageservice

EC2

WSRF

Page 17: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Open Source IaaSImplementations

OpenNebula Open source datacenter implementation University of Madrid, I. Llorente & team, 03/2008

Eucalyptus Open source implementation of EC2 UCSB, R. Wolski & team, 06/2008

Cloud-enabled Nimrod-G Open source implementation of EC2 Monash University, MeSsAGE Lab, 01/2009

Industry efforts openQRM, Enomalism

Page 18: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Nimbus: Extensions

Nimbus core: Tim Freeman & David LaBissoniere(UC/ANL team)

Nimbus monitoring: Ian Gable & team (UVICATLAS)

Cumulus: Raj Kettimuthu and John Bresnahan(ANL)

EBS: Marlon Pierce, Xiaoming Gao, Mike Lowe (IU) Others:

OpenNebula project (University of Madrid) Descher et al (Technical U of Vienna): privacy

extensions

Page 19: Cloud Computing for Science

Scientific Cloud Resources andApplications

Page 20: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Science Clouds Goals

Enable experimentation with IaaS

Evolve software in response to user needs

Exploration of cloud interoperability issues

Participants University of Chicago (since 03/08, 16 nodes), University

of Florida (05/08, 16-32 nodes, access via VPN), MasarykUniversity, Brno, Czech Republic (08/08), Wispy @Purdue (09/08)

In progress: Grid5K, IU, Vrije

Using EC2 for large runs

Science Clouds Marketplace: OSG cluster, Hadoop, etc.

Come and run: http://workspace.globus.org/clouds

Page 21: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

100+DNs

projectsrangingacrossScience,CS,education,build&test…

Who Runs on Nimbus?

Page 22: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

STAR experiment

STAR: a nuclear physicsexperiment at BrookhavenNational Laboratory

Studies fundamentalproperties of nuclearmatter

Problem: computationsrequire complex andconsistently configuredenvironments that arehard to find in existinggrids

Page 23: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

STAR Virtual Clusters

Virtual resources A virtual OSG STAR cluster: OSG headnode (gridmapfiles,

host certificates, NFS, Torque), worker nodes: SL4 + STAR One-click virtual cluster deployment via Nimbus Context

Broker

From Science Clouds to EC2 runs Running production codes since 2007 Work by Jerome Lauret, Leve Hajdu, Lidia Didenko

(BNL), Doug Olson (LBNL) The Quark Matter run: producing just-in-time results for

a conference: http://www.isgtw.org/?pid=1001735

Page 24: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Infrastructure-as-a-Service

Gateway/Context Broker

STAR Quark Matter Run

Page 25: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

STAR Quark Matter Run (2)

Application stats: Processed 1.2 M events Moved ~1TB of data over duration (small I/O needs)

Run facts: 300+ nodes over ~10 days Instances, 32-bit, 1.7 GB memory:

EC2 default: 1 EC2 CPU unit High-CPU Medium Instances: 5 EC2 CPU units (2 cores)

Cost: Comp: ~ $6,000: ~ $1,7 K (default) + ~ $3,9K (medium) Data: ~ $150

Page 26: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

A Large Ion ColliderExperiment (ALICE)

Heavy ion simulationsat CERN

Problem: integrateelastic computing intocurrent infrastructure

Collaboration withCernVM project

With ArtemHarutyunyan andPredrag Buncic

Page 27: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Elastic Provisioning for ALICE HEP

Infrastructure-as-a-Service

queue sensor AliEn

Context Broker

ALICE queue

Page 28: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

AliceHEPExperimentatCERN

CHEP09 paper, Harutyunyan et al.

Can we elastically extend a local scheduler?

Page 29: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Sky Computing

Enabling factors: cloud computing and virtual networks Network leases would help Instead of a bunch of disonnected domains, one domain

overlapping the Internet

Change of assumption: we can now trust remote resources

R

RRR

R

Site A

Site B

VO-A

RR

RR

RR

R

Page 30: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Sky Computing Environment

U of FloridaU of Chicago

ViNErouter

ViNErouter

ViNErouter

Purdue

Work by A. Matsunaga, M. Tsugawa, University of Florida

Page 31: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Hadoop in the Science Clouds

Papers: “Sky Computing”, by K. Keahey, A. Matsunaga, M. Tsugawa, J.

Fortes. Submitted to IEEE Internet Computing. “CloudBLAST: Combining MapReduce and Virtualization on

Distributed Resources for Bioinformatics Applications” by A.Matsunaga, M. Tsugawa and J. Fortes. eScience 2008.

U of FloridaU of Chicago

Purdue

Hadoop cloud

Page 32: Cloud Computing for Science

Cloud Computing for Science:Issues and Challenges

Page 33: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Building the Ecosystem

Configuring and maintaining appliances Not just VMs, a variety of formats CernVM, rBuilder (rPath)

Licenses Still vendor-specific approaches

Getting used to dynamic sites Host certificates and keys, community

visibility, failure processing, etc.

Infrastructure and leveraging

Page 34: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Security and Privacy Issues

Leaks in isolation, improper use, newtechnology issues

Lack of features Fine-grained authorization Paper: Palankar et al., Amazon S3 for Science Grids: a

Viable Solution?

Data privacy Paper: Descher et al., Retaining Data Control in

Infrastructure Clouds, ARES (the InternationalDependability Conference), 2009.

Page 35: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Performance

Difficult to track in a virtualized environment I/O can be an issue Tradeoffs between CPU power and throughput New networking solutions

Several studies of cloud performance One paper: Walker, Benchmarking Amazon EC2 for

high-performance scientific computing Low bandwidth from existing providers:

2-5 MB/sec, 17/21 MB/sec, 30MB/sec

Generally speaking, the existing cloud providers donot offer a very high-end computer

Page 36: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Price

Price for what? Experimenting with business models Estimating the cost is hard

Price of Base Services for AWS: Computation / EC2

On-demand: starting at $0.1 per hour Reserved: starting at $325 per year for $0.03 per hour

Data / S3 Storage: $0.15 per GB/month, Transfer: $0.17 per GB AWS import/export for bulk

Hosting Scientific datasets for free Free on AWS for frequently used datasets

Page 37: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Service Levels

Service levels Computation: immediate, advance

reservations, best-effort

Data: durability, high/low availability,access performance

Cross-cutting concern: security and privacy

Different price points for differentavailability

Page 38: Cloud Computing for Science

6/5/09 The Nimbus Toolkit: http//workspace.globus.org

Parting Thoughts

IaaS cloud computing is science-driven

Scientific applications are successfullyusing the existing infrastructure forproduction runs

Many more could be using it, butchallenges exist…

Project for the next few years: solve them!