24
Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo! http://opencirrus.org 6/22/2009 For more info, contact Dave O’Hallaron, [email protected] Intel Open Cirrus team: David O’Hallaron, Michael Kozuch, Michael Ryan, Richard Gass, James Gurganus, Milan Milenkovic, Eric Gayles, Virginia Meade

Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo! 6/22/2009 For more info, contact Dave

Embed Size (px)

Citation preview

Page 1: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

Open Cirrus™ Cloud Computing Testbed

A joint initiative sponsored by HP, Intel, and Yahoo!

http://opencirrus.org

6/22/2009

For more info, contact Dave O’Hallaron, [email protected]

Intel Open Cirrus team: David O’Hallaron, Michael Kozuch, Michael Ryan, Richard Gass, James Gurganus, Milan Milenkovic, Eric Gayles, Virginia Meade

Page 2: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

2 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus™ Cloud Computing Testbed

Shared: research, applications, infrastructure (11K cores), data sets

Global services: sign on, monitoring, store. Open src stack (prs, tashi, hadoop)

Sponsored by HP, Intel, and Yahoo! (with additional support from NSF)

• 9 sites currently, target of around 20 in the next two years.

2

Page 3: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

3 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Goals

• Goals— Foster new systems and services research around cloud computing— Catalyze open-source stack and APIs for the cloud

• How are we unique?— Support for systems research and applications research— Federation of heterogeneous datacenters

Page 4: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

4 Dave O’Hallaron – Open Cirrus™ Overview

Process

• Central Management Office, oversees Open Cirrus— Currently owned by HP

• Governance model— Research team — Technical team — New site additions — Support (legal (export, privacy), IT, etc.)

• Each site — Runs its own research and technical teams — Contributes individual technologies — Operates some of the global services

• E.g. — HP site supports portal and PRS— Intel site developing and supporting Tashi— Yahoo! contributes to Hadoop

Page 5: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

5 Dave O’Hallaron – Open Cirrus™ Overview

45 Mb/s T3 to Internet

Rack of 5 3u storage

nodes

Node: 12TB disk (12x1TB)

Rack of 40 blade

compute/storage nodes

Node: 8 Core2 cores (4x2), 8GB RAM, 0.3TB

disk (2x150GB)

1 Gb/s (x4 p2p)

Switch48 Gb/s

1 Gb/s (x5 p2p)

Rack of 15 2u

compute/storage nodes

Node: 8 Core2 cores (2x4), 8GB RAM, 6TB

disk (6x1TB)

x3

Switch48 Gb/s

Rack of 40 blade

compute/storage nodes

1 Gb/s (x4 p2p)

Switch48 Gb/s

10 nodes: 8 Core2 cores (4x2), 8GB RAM, 0.3TB disk (2x150GB)20 nodes: 1 Xeon core, 6GB RAM, 366GB disk (36+300GB)10 nodes: 4 Xeon cores (2x2), 4GB RAM, 150 GB disk (2x75GB)

1 Gb/s (x15 p2p)

Rack of 15 1u

compute/storage nodes

Node: 8 Core2 cores (4x2), 8GB RAM, 2TB

disk (2x1TB)

x2

Switch48 Gb/s

1 Gb/s (x4)

1 Gb/s (x4)

1 Gb/s (x4)

1 Gb/s (x4)

Nodes/cores: 40/140 40/320 30/240 45/360 [155/1060]RAM (GB): 240 320 240 360 [1160]Storage (TB) 11 12 60 270 60 [413]Spindles: 80 80 60 270 60 [550]

1 Gb/s (x15 p2p)

Switch48 Gb/s

[totals]

Intel BigData Open Cirrus sitehttp://opencirrus.intel-research.net

Page 6: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

6 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Sites

SiteCharacteristics

#Cores#Srvrs Public Memory Storage Spindles Network Focus

HP 1,024 256 178 3.3TB 632TB 115210G internal1Gb/s x-rack

Hadoop, Cells, PRS, scheduling

IDA 2,400 300 100 4.8TB43TB+

16TB SAN600 1Gb/s

Apps based onHadoop, Pig

Intel 1,060 155 145 1.16TB353TB local 60TB attach

550 1Gb/sTashi, PRS, MPI,

Hadoop

KIT 2,048 256 128 10TB 1PB 192 1Gb/sApps with high

throughput

UIUC 1,024 128 64 2TB ~500TB 288 1Gb/sDatasets, cloud infrastructure

CMU 1,024 128 64 2TB -- -- 1 Gb/s Storage, Tashi

Yahoo(M45)

3,200 480 400 2.4TB 1.2PB 1600 1Gb/sHadoop on

demand

11,780 1,703 1,029 25 TB 2.6 PBTotal

Page 7: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

7 Dave O’Hallaron – Open Cirrus™ Overview

Testbeds

Open Cirrus

IBM/Google TeraGrid PlanetLab EmuLabOpen Cloud Consortium

Amazon EC2

LANL/NSF cluster

Type of research

Systems & services

Data-intensive

applications research

Scientific applications

Systems and

servicesSystems

Interoperab. across clouds using open

APIs

Commer. use

Systems

Approach

Federation of hetero-geneous

data centers

A cluster supported by Google

and IBM

Multi-site hetero

clusters super comp

A few 100 nodes

hosted by research

instit.

A single-site cluster with

flexible control

Multi-site heteros

clusters, focus on network

Raw access to

virtual machines

Re-use of LANL’s retiring clusters

Participants

HP, Intel, IDA, KIT, UIUC, Yahoo!CMU

IBM, Google, Stanford, U.Wash,

MIT

Many schools

and orgs

Many schools

and orgs

University of Utah

4 centers AmazonCMU, LANL,

NSF

Distribution

6 sites1,703 nodes11,780 cores

1 site11

partners in US

> 700 nodes

world-wide

>300 nodes univ@Utah

480 cores, distributed in four locations

1 site

1 site1000s of older, still

useful nodes

Testbed Comparison

Page 8: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

8 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

Compute + network + storage resources

Power + cooling

Management and control subsystem

Physical Resource set (PRS) service

Credit: John Wilkes (HP)

Page 9: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

9 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

PRS clients, each with theirown “physical data center”

Page 10: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

10 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

Virtual clusters (e.g., Tashi)

Page 11: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

11 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

BigData App

Hadoop

1. Application running2. On Hadoop3. On Tashi virtual cluster4. On a PRS5. On real hardware

Page 12: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

12 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

BigData app

Hadoop

Experiment/save/restore

Page 13: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

13 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

BigData App

Hadoop

Experiment/save/restore

Platform services

Page 14: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

14 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS service

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

BigData App

Hadoop

Experiment/save/restore

Platform services

User services

Page 15: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

15 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack

PRS

Research Tashi NFS storage service

HDFS storageservice

Virtual cluster Virtual cluster

BigData App

Hadoop

Experiment/save/restore

Platform services

User services

Page 16: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

16 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus stack - PRS

• PRS service goals— Provide mini-datacenters to researchers— Isolate experiments from each other— Stable base for other research

• PRS service approach— Allocate sets of physical co-located nodes, isolated inside VLANs.

• PRS code from HP being merged into Tashi Apache project.— Running on HP site— Being ported to Intel site— Will eventually run on all sites

Page 17: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

17 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Stack - Tashi

• An open source Apache Software Foundation project sponsored by Intel (with CMU, Yahoo, HP)

— Infrastructure for cloud computing on Big Data — http://incubator.apache.org/projects/tashi

• Research focus: — Location-aware co-scheduling of VMs, storage,

and power.— Seamless physical/virtual migration.

• Joint with Greg Ganger (CMU), Mor Harchol-Balter (CMU), Milan Milenkovic (CTG)

Page 18: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

18 Dave O’Hallaron – Open Cirrus™ Overview

ClusterManager

Tashi High-Level Design

Node NodeNodeNodeNode

Storage Service

Virtualization Service

Node

Scheduler

Cluster nodes are assumed to be commodity machines

Services are instantiated through virtual machines

Data location and power informationis exposed to scheduler and services

CM maintains databasesand routes messages;decision logic is limited

Most decisions happen inthe scheduler; manages compute/storage/power

in concert

The storage service aggregates thecapacity of the commodity nodes to house Big Data repositories.

Page 19: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

19 Dave O’Hallaron – Open Cirrus™ Overview19

Open Cirrus Stack - Hadoop

• An open-source Apache Software Foundation project sponsored by Yahoo!

— http://wiki.apache.org/hadoop/ProjectDescription

• Provides a parallel programming model (MapReduce), a distributed file system, and a parallel database (HDFS)

Page 20: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

20 Dave O’Hallaron – Open Cirrus™ Overview

How do users get access to Open Cirrus sites?

• Project PIs apply to each site separately.

• Contact names, email addresses, and web links for applications to each site will be available on the Open Cirrus Web site (which goes live Q209)— http://opencirrus.org

• Each Open Cirrus site decides which users and projects get access to its site.

• Developing a global sign on for all sites (Q2 09)— Users will be able to login to each Open Cirrus site for which they

are authorized using the same login and password.

Page 21: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

21 Dave O’Hallaron – Open Cirrus™ Overview

What kinds of research projects are Open Cirrus sites looking for?

• Open Cirrus is seeking research in the following areas (different centers will weight these differently):

— Datacenter federation— Datacenter management— Web services— Data-intensive applications and systems

• The following kinds of projects are generally not of interest:— Traditional HPC application development.— Production applications that just need lots of cycles.— Closed source system development.

Page 22: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

22 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Systems Research at Intel

• Tashi— Open source software infrastructure for cloud computing on big

data (IRP, with Apache Software Foundation, CMU, Yahoo, HP) • http://incubator.apache.org/projects/tashi

— Research focus: — Location-aware co-scheduling of VMs, storage, and power — Seamless physical/virtual migration

— Joint with Greg Ganger (CMU), Mor Harchol-Balter (CMU), Milan Milenkovic (CTG)

• Sprout— Software infrastructure for parallel video stream processing (IRP,

IRS, ESP project)— Central parallelization infrastructure for ESP and SLIPStream

projects

Page 23: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

23 Dave O’Hallaron – Open Cirrus™ Overview

Open Cirrus Application Research at Intel

• ESP (Everyday Sensing and Perception) — Detection of everyday activities from video

• SLIPStream (with CMU)— Parallel event and object detection in massive videos— Robot sensing and perception— Gesture based game and computer interfaces— Food Recognizing Interactive Electronic Nutrition Display (FRIEND)

• NeuroSys - Parallel brain activity analysis— Interactive functional MRI (with CMU)

• Parallel machine learning (with CMU)— Parallel belief propagation on massive graphical models— Automatically converting movies from 2D to 3D

• Stem cell tracking (with UPitt/CMU)

• Parallel dynamic physical rendering (with CMU)

• Log-based architecture (with CMU)

• Autolab autograding handin service for the world (with CMU)

Page 24: Open Cirrus™ Cloud Computing Testbed A joint initiative sponsored by HP, Intel, and Yahoo!  6/22/2009 For more info, contact Dave

24 Dave O’Hallaron – Open Cirrus™ Overview

Summary

• Intel is collaborating with HP and Yahoo! to provide a cloud computing testbed for the research community

• Primary goals are to — Foster new systems research around cloud computing— Catalyze open-source stack and APIs for the cloud

• Opportunities for Intel and Intel customers