Davide Salomoni INFN-CNAF EPIKH School June 13, 2011 The WNoDeS Project A Grid/Cloud Integration Framework http://web.infn.it/wnodes



EPIKH School - June 13, 2011 - D. Salomoni, INFN-CNAF - WNoDeS

Background to this work

• INFN is the Italian National Institute for Nuclear Physics. It is engaged in many international physics experiments and in developing, delivering and supporting computing services for them.
• I am the manager of computing services at the INFN National Computing Center (CNAF), located in Bologna, Italy.
  • CNAF currently hosts about 8,500 computing cores, 8 PB of disk space and 10 PB of tape space.
  • Each day, about 40,000 jobs are executed at CNAF.
  • We support about 20 international scientific experiments.
  • CNAF is the Italian Tier-1 for the CERN-based LHC experiments and a Tier-0/1 for several others.


The compulsory slide on definitions: Grids vs. Clouds

Cloud Computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
(NIST Working Definition of Cloud Computing)

The essence of the [definition] can be captured in a simple checklist, according to which a Grid is a system that:
✓ coordinates resources that are not subject to centralized control…
✓ … using standard, open, general-purpose protocols and interfaces…
✓ … to deliver nontrivial qualities of service.
(I. Foster, What is the Grid? A Three Point Checklist, 2002)


More pragmatically…

• Observation: Grids never found the right business model in the industry – so they failed to get industry uptake.
• What will happen when we want to federate clouds?
  • Does anybody want to? How about authentication, authorization, accounting, brokering?
  • To state the obvious: different organizations – industrial, scientific & governmental – will have different requirements and perceived risks for cloud computing.
• “Cloud is a seamless extension of the Grid” (Dan Reed, Microsoft VP of Technology Policy and Strategy, @ OGF30)
• Clouds are about provisioning resources; Grids are about federating resources.


New requirements to existing Grid infrastructures

• While Grid interfaces are widely used among large scientific communities, Cloud computing offers significant advantages for many reasons (e.g., pay-as-you-go models, simplified access).
• Ideally, though, one would like to adopt Cloud services so that:
  • Resources are shared between access interfaces (Grid, Cloud, or else).
  • Scalability is ensured.
  • Existing services and agreements are not required to change substantially.
  • Resource center policies are honored and know-how is preserved.
  • New services can attract both existing and new customers.


Examples of services requested today

• Some of the typical new service requests we receive:
  • Customer-definable software environments.
  • Setting up dynamic pools of virtual servers (to be used e.g. as front-end pools, or as personal compute nodes).
  • Instantiating pre-packaged, ready-to-go services.
  • Truly distributed, on-demand Cloud storage.
  • Not everybody “speaks Grid”: providing access to distributed, traditional Grid infrastructures as if they were not Grids. This might be offered to non-traditional users as well (e.g. public administrations, or private companies).
• The key problem is one of integration between resources and multiple access interfaces (Grid, Cloud, or else).


A Grid/Cloud integration example: WNoDeS

• The Worker Nodes on Demand Service (WNoDeS) is a framework developed by INFN. It is built around a tight integration with an LRMS (a “batch system”) and has been running in production at CNAF since November 2009.
• The WNoDeS focus is on making resource polymorphism easy; its main architectural characteristics are:
  • Integration with existing scheduling, policing, monitoring and accounting workflows
  • On-demand virtual resource provisioning and VLAN support
    • Using Linux KVM as VM manager
  • Support for users to select and access WNoDeS-based resources through Grid, Cloud interfaces, or through direct job submissions
    • Either via command line, or using a Web portal
  • Scalability
    • WNoDeS currently handles about 2,000 VMs at CNAF
• No concept of “Cloud over Grid” or “Grid over Cloud”


WNoDeS


WNoDeS VM instantiation


WNoDeS Grid and local access

All WNoDeS-based resources may be accessed transparently by users through:
• the local batch system
  • For example, we have set up some batch queues so that jobs submitted to them run on automatically created (on-demand) VMs
• the Grid
  • Jobs of some VOs are automatically directed to VMs without user intervention
  • Grid users may explicitly specify the VM they want to use, adding a CE requirement statement in the JDL for their jobs
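To make the last point concrete, here is a minimal sketch of how a job description carrying a VM selection could be generated. The slides do not show the actual requirement statement, so the `CERequirements` attribute name and the `WNODES_VM_IMAGE` key are placeholders chosen for illustration, and the image name is hypothetical. Only `Executable`, `Arguments`, `StdOutput`, `StdError` and `OutputSandbox` are standard JDL attributes.

```python
# Hypothetical key used to select a VM image; the real WNoDeS key is not
# given in the slides.
VM_ATTRIBUTE = "WNODES_VM_IMAGE"


def build_jdl(executable: str, arguments: str, vm_image: str) -> str:
    """Return JDL text for a simple job that asks for a specific VM image."""
    lines = [
        "[",
        f'  Executable = "{executable}";',
        f'  Arguments = "{arguments}";',
        '  StdOutput = "std.out";',
        '  StdError = "std.err";',
        '  OutputSandbox = {"std.out", "std.err"};',
        # Assumed form of the CE requirement statement (placeholder names).
        f'  CERequirements = "{VM_ATTRIBUTE} == \\"{vm_image}\\"";',
        "]",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_jdl("/bin/hostname", "-f", "sl5-example"))
```

Jobs without such a statement would fall under the previous bullet: the site decides, per VO, whether they land on a VM.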


On-demand local provisioning: VIP

• WNoDeS integrates the so-called Virtual Interactive Pools (VIP) interface
  • A way for a local user to dynamically and autonomously instantiate a compute node for interactive access
  • Characteristics: no root access, mount of external storage system, environment identical to the one locally found on other “static” and shared systems
  • User can specify RAM, number of CPUs, etc.
  • Resource usage billed to the user's group
  • Can be used e.g. for software development, to submit jobs, etc.
• Being tested right now by the local CMS group
  • In a Tier-3 environment, hosted at CNAF and integrated in the bigger Tier-1


WNoDeS Cloud provisioning

• WNoDeS allows generic instantiation of compute cores
  • Allocated to the user requesting them; not released until the user explicitly says so
  • User chooses resource size and operating system, and gets root access to the allocated resources
  • Can be used to develop, install, test new software, run services, create small ad-hoc farms, etc.
• “Cloud” resources are taken from the same resource pool used e.g. for “Grid” resources
  • This is key for optimal resource usage


Current “Cloud” instances:
➡ Small: 1 core, 1.7 GB RAM, 50 GB HD
➡ Medium: 2 cores, 3.5 GB RAM, 100 GB HD
➡ Large: 4 cores, 7 GB RAM, 200 GB HD
➡ Extra-large: 8 cores, 14 GB RAM, 400 GB HD

Current “Grid” VM images normally fall into the “Small” instance above.

Two further options foreseen, initially for Grid and VIP jobs:
➡ Whole-node, hard: all hardware cores, (1.7 * num. cores) GB RAM, (50 * num. cores) GB HD
➡ Whole-node, soft: all available cores (with a minimum), (1.7 * num. cores) GB RAM, (50 * num. cores) GB HD

Note: network and distributed storage not considered here.
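The sizing rules above can be written down as a small table plus the whole-node formula. This is only an illustrative sketch: the class and function names are invented here, and the disk figures for Medium, Large and Extra-large assume that the second value listed for each size is disk space (50 GB per core), following the pattern of the Small instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSize:
    cores: int
    ram_gb: float
    disk_gb: int


# Fixed "Cloud" instance sizes as listed on the slide.
CLOUD_SIZES = {
    "small": InstanceSize(cores=1, ram_gb=1.7, disk_gb=50),
    "medium": InstanceSize(cores=2, ram_gb=3.5, disk_gb=100),
    "large": InstanceSize(cores=4, ram_gb=7.0, disk_gb=200),
    "extra-large": InstanceSize(cores=8, ram_gb=14.0, disk_gb=400),
}

RAM_PER_CORE_GB = 1.7
DISK_PER_CORE_GB = 50


def whole_node(num_cores: int) -> InstanceSize:
    """Size of a whole-node allocation: 1.7 GB RAM and 50 GB disk per core."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    return InstanceSize(
        cores=num_cores,
        ram_gb=round(RAM_PER_CORE_GB * num_cores, 1),
        disk_gb=DISK_PER_CORE_GB * num_cores,
    )


if __name__ == "__main__":
    print(whole_node(16))
```

For the "hard" variant `num_cores` would be the number of hardware cores of the node; for the "soft" variant it would be the number of cores available at allocation time, subject to a minimum that the slide does not specify.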


WNoDeS Cloud access

• WNoDeS supports the Open Cloud Computing Interface (OCCI) standard
  • But this is rarely used directly by users
• Cloud instantiations typically happen through a Web-based portal
  • Integrating VOMS / gLite Argus support for authentication and authorization
  • Allowing flexible policies to determine who you are, and what you are allowed to do
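For readers who have not seen OCCI, the sketch below builds the HTTP headers of a request asking for a compute resource. It assumes the OCCI 1.1 `text/occi` header rendering; the slides do not say which OCCI version or rendering WNoDeS accepts, so this illustrates the standard rather than the WNoDeS interface. The function name is invented for illustration.

```python
# Scheme identifying the OCCI infrastructure "compute" kind (OCCI 1.1).
OCCI_INFRA_SCHEME = "http://schemas.ogf.org/occi/infrastructure#"


def occi_compute_headers(cores: int, memory_gb: float) -> dict:
    """Headers for a POST that creates a compute resource (text/occi rendering)."""
    return {
        "Content-Type": "text/occi",
        "Category": f'compute; scheme="{OCCI_INFRA_SCHEME}"; class="kind"',
        # Several attributes may share one header, separated by commas.
        "X-OCCI-Attribute": (
            f"occi.compute.cores={cores}, occi.compute.memory={memory_gb}"
        ),
    }


if __name__ == "__main__":
    for name, value in occi_compute_headers(2, 3.5).items():
        print(f"{name}: {value}")
```

In practice the portal would issue such a request on the user's behalf, which is why users rarely deal with OCCI directly.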


The WNoDeS Authentication Framework

Main goals:
• Letting Cloud users access existing Grid resources
  • Easy access to distributed resources for Cloud users, allowing exploitation of previous investments
• Letting Grid users access resources “the Cloud way”
  • Support new use models for existing users
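As a toy illustration of these two goals, the sketch below decides whether an identity may perform an action using a first-match rule list with default deny. Everything in it is hypothetical: the rule format, the identity strings and the action names are invented for this example and are not the policy language or API of any real authorization service.

```python
from fnmatch import fnmatchcase

# (identity pattern, action pattern, permit?) evaluated top to bottom.
# All identities and actions below are made up for illustration.
POLICY = [
    ("/examplevo/Role=admin", "*", True),
    ("/examplevo/*", "submit_grid_job", True),
    ("/examplevo/*", "instantiate_cloud_vm", True),  # Grid users, "the Cloud way"
    ("cloud:*", "instantiate_cloud_vm", True),
    ("cloud:*", "submit_grid_job", True),  # Cloud users on Grid resources
]


def is_allowed(identity: str, action: str) -> bool:
    """Return the decision of the first matching rule; deny if none matches."""
    for identity_pattern, action_pattern, permit in POLICY:
        if fnmatchcase(identity, identity_pattern) and fnmatchcase(
            action, action_pattern
        ):
            return permit
    return False


if __name__ == "__main__":
    print(is_allowed("/examplevo/Role=user", "instantiate_cloud_vm"))
    print(is_allowed("cloud:alice", "submit_grid_job"))
    print(is_allowed("anonymous", "submit_grid_job"))
```

The point of the sketch is only that both kinds of user go through the same decision step, whichever interface they arrived from.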


Next: the Integrated Grid/Cloud Portal

Key goals:
• Web-based access for the submission of Grid jobs (w/ JDL customizations)
• Allowing “Grid users” to instantiate Cloud resources
• Allowing “Cloud users” to exploit Grid resources
• Integration of accounting visualization (HLRmon portlet)

Image courtesy M. Bencivenni, INFN-CNAF


Conclusions

• WNoDeS is an example of a framework targeted at integrating Grid and Cloud resources, running in production mode at a sizable computing center
  • WNoDeS 2 is targeted for public release in September 2011
  • Supporting (among other things) PBS/Torque and Platform LSF as batch systems
• Performance and scalability (which we have not discussed in any detail here) are always an important topic
  • Just “provisioning and running VMs” (even hundreds of them) is relatively easy
  • Many issues often only come up with real use cases involving hundreds or thousands of VMs, associated with massive storage systems
  • Network and storage virtualization are fields not often considered, but essential with Cloud-related assignments
• Interconnection of Grids is a reality today. WNoDeS is being developed so that some of the Grid developments of the past decade in that area can be reused, to achieve interconnection of multiple Clouds.


That’s it

1. [Cloud computing is] nothing more than a faddish term for the established concept of computers linked by networks. A cloud is water vapor. (Larry Ellison, co-founder and CEO, Oracle Corporation, September 2009)

2. Q: What is Oracle’s Cloud Computing strategy? A: Oracle has two cloud computing objectives. The first is to ensure that [it] is fully enterprise-grade to enable enterprise adoption. [...] The second [is] to support both public and private cloud computing to give customers choice. (Oracle Cloud Computing FAQ, October 2010)

3. The truth is rarely pure and never simple. (Oscar Wilde, The Importance of Being Earnest, 1895)

Thanks! e-mail: [email protected]
