Grid Computing: Concepts, Applications, and Technologies
Ian Foster
Mathematics and Computer Science Division, Argonne National Laboratory, and Department of Computer Science, The University of Chicago
http://www.mcs.anl.gov/~foster
Grid Computing in Canada Workshop, University of Alberta, May 1, 2002



Page 1

Grid Computing: Concepts, Applications, and Technologies

Ian Foster

Mathematics and Computer Science Division

Argonne National Laboratory

and

Department of Computer Science

The University of Chicago

http://www.mcs.anl.gov/~foster

Grid Computing in Canada Workshop, University of Alberta, May 1, 2002

Page 2

[email protected] ARGONNE CHICAGO

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 3

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 4

Living in an Exponential World:

(1) Computing & Sensors

Moore's Law: transistor count doubles every 18 months

[Simulation images: magnetohydrodynamics; star formation]

Page 5

Living in an Exponential World: (2) Storage

Storage density doubles every 12 months
Dramatic growth in online data (1 petabyte = 1000 terabytes = 1,000,000 gigabytes):
– 2000: ~0.5 petabyte
– 2005: ~10 petabytes
– 2010: ~100 petabytes
– 2015: ~1000 petabytes?
Transforming entire disciplines in the physical and, increasingly, biological sciences; humanities next?
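The growth projections above can be sanity-checked against the "doubles every 12 months" claim. A minimal Python sketch (volumes taken from the slide) computes the implied annual growth factor and doubling time:

```python
import math

# Projected online data volumes from the slide, in petabytes.
volumes = {2000: 0.5, 2005: 10, 2010: 100, 2015: 1000}

# Implied annual growth factor over 2000-2005 (a 20x increase in 5 years).
factor = (volumes[2005] / volumes[2000]) ** (1 / 5)

# Corresponding doubling time in months, close to the slide's 12-month claim.
doubling_months = 12 * math.log(2) / math.log(factor)

print(round(factor, 2), round(doubling_months, 1))  # 1.82 13.9
```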

Page 6

Data Intensive Physical Sciences

High energy & nuclear physics
– Including new experiments at CERN
Gravity wave searches
– LIGO, GEO, VIRGO
Time-dependent 3-D systems (simulation, data)
– Earth observation, climate modeling
– Geophysics, earthquake modeling
– Fluids, aerodynamic design
– Pollutant dispersal scenarios
Astronomy: digital sky surveys

Page 7

Ongoing Astronomical Mega-Surveys

Large number of new surveys
– Multi-TB in size, 100M objects or larger
– In databases
– Individual archives planned and under way
Multi-wavelength view of the sky
– >13-wavelength coverage within 5 years
Impressive early discoveries
– Finding exotic objects by unusual colors
> L, T dwarfs; high-redshift quasars
– Finding objects by time variability
> Gravitational micro-lensing

Surveys include: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...

Page 8

Crab Nebula in 4 Spectral Regions

[Image panels: X-ray; optical; infrared; radio]

Page 9

Coming Floods of Astronomy Data

The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!
– All-sky survey every few days, so will have fine-grain time series for the first time

Page 10

Data Intensive Biology and Medicine

Medical data
– X-ray, mammography data, etc. (many petabytes)
– Digitizing patient records (ditto)
X-ray crystallography
Molecular genomics and related disciplines
– Human Genome, other genome databases
– Proteomics (protein structure, activities, ...)
– Protein interactions, drug delivery
Virtual Population Laboratory (proposed)
– Simulate likely spread of disease outbreaks
Brain scans (3-D, time dependent)

Page 11

A Brain is a Lot of Data! (Mark Ellisman, UCSD)

We need to get to one micron to know the location of every cell. We're just now starting to get to 10 microns – Grids will help get us there and further.

And comparisons must be made among many brains.

Page 12

An Exponential World: (3) Networks

(Or, Coefficients Matter ...)

Network vs. computer performance
– Computer speed doubles every 18 months
– Network speed doubles every 9 months
– Difference = order of magnitude per 5 years
1986 to 2000
– Computers: x 500
– Networks: x 340,000
2001 to 2010
– Computers: x 60
– Networks: x 4,000

Moore's Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.
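The "order of magnitude per 5 years" claim follows directly from the two doubling periods; a short Python check makes the arithmetic explicit:

```python
# Doubling periods from the slide: computers every 18 months,
# networks every 9 months. Compound growth over a 5-year window:
years = 5
computer_growth = 2 ** (years * 12 / 18)  # ~10x
network_growth = 2 ** (years * 12 / 9)    # ~100x
gap = network_growth / computer_growth    # ~10x: one order of magnitude per 5 years

print(round(computer_growth), round(network_growth), round(gap))  # 10 102 10
```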

Page 13

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 14

Evolution of the Scientific Process

Pre-electronic
– Theorize and/or experiment, alone or in small teams; publish paper
Post-electronic
– Construct and mine very large databases of observational or simulation data
– Develop computer simulations & analyses
– Exchange information quasi-instantaneously within large, distributed, multidisciplinary teams

Page 15

Evolution of Business

Pre-Internet
– Central corporate data processing facility
– Business processes not compute-oriented
Post-Internet
– Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
– Outsourcing becomes feasible => service providers of various sorts
– Business processes increasingly computing- and data-rich

Page 16

The Grid

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

Page 17

An Example Virtual Organization: CERN’s Large Hadron Collider

1800 Physicists, 150 Institutes, 32 Countries

100 PB of data by 2010; 50,000 CPUs?

Page 18

Grid Communities & Applications:
Data Grids for High Energy Physics

[Tiered data grid diagram:]
Tier 0 – CERN Computer Centre: the Online System feeds the physics data cache at ~PBytes/sec; an Offline Processor Farm (~20 TIPS) connects at ~100 MBytes/sec, and processed data leaves CERN at ~100 MBytes/sec.
Tier 1 – Regional centres: FermiLab (~4 TIPS) and centres in France, Italy, and Germany, linked at ~622 Mbits/sec (or air freight, deprecated).
Tier 2 – Centres of ~1 TIPS each (e.g., Caltech), linked at ~622 Mbits/sec.
Tier 4 – Institutes (~0.25 TIPS each) and physicist workstations, at ~1 MBytes/sec.

There is a "bunch crossing" every 25 nsecs; there are 100 "triggers" per second; each triggered event is ~1 MByte in size.
Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
1 TIPS is approximately 25,000 SpecInt95 equivalents.

www.griphyn.org www.ppdg.net www.eu-datagrid.org
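The diagram's link rates follow from the trigger numbers; a quick Python check (the seconds-per-year figure is an assumption, not stated on the slide):

```python
# Event rate and volume implied by the slide's numbers.
triggers_per_second = 100
event_size_mbytes = 1.0

# Matches the ~100 MBytes/sec link on the diagram.
rate_mbytes_per_sec = triggers_per_second * event_size_mbytes

# Assuming ~1e7 seconds of live running per year (a common rule of
# thumb for accelerators, not given on the slide), yearly volume is
# on the order of a petabyte per experiment.
seconds_per_year = 1e7
petabytes_per_year = rate_mbytes_per_sec * seconds_per_year / 1e9

print(rate_mbytes_per_sec, petabytes_per_year)  # 100.0 1.0
```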

Page 19

Data Integration and Mining (credit: Sara Graves)

From Global Information to Local Knowledge

Precision Agriculture

Emergency Response

Weather Prediction

Urban Environments

Page 20

Intelligent Infrastructure: Distributed Servers and Services

Page 21

Grid Computing

Page 22

The Grid: A Brief History

Early 90s
– Gigabit testbeds, metacomputing
Mid to late 90s
– Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments
2002
– Dozens of application communities & projects
– Major infrastructure deployments
– Significant technology base (esp. Globus Toolkit™)
– Growing industrial interest
– Global Grid Forum: ~500 people, 20+ countries

Page 23

The Grid World: Current Status

Dozens of major Grid projects in scientific & technical computing/research & education
– www.mcs.anl.gov/~foster/grid-projects
Considerable consensus on key concepts and technologies
– Open source Globus Toolkit™ a de facto standard for major protocols & services
Industrial interest emerging rapidly
– IBM, Platform, Microsoft, Sun, Compaq, ...
Opportunity: convergence of eScience and eBusiness requirements & technologies

Page 24

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 25

Grid Technologies: Resource Sharing Mechanisms That ...

Address security and policy concerns of resource owners and users

Are flexible enough to deal with many resource types and sharing modalities

Scale to large numbers of resources, many participants, many program components

Operate efficiently when dealing with large amounts of data & computation

Page 26

Aspects of the Problem

1) Need for interoperability when different groups want to share resources
– Diverse components, policies, mechanisms
– E.g., standard notions of identity, means of communication, resource descriptions
2) Need for shared infrastructure services to avoid repeated development, installation
– E.g., one port/service/protocol for remote access to computing, not one per tool/application
– E.g., Certificate Authorities: expensive to run
A common need for protocols & services

Page 27

The Hourglass Model

Focus on architecture issues
– Propose set of core services as basic infrastructure
– Use to construct high-level, domain-specific solutions
Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
– "IP hourglass" model

[Hourglass figure: applications and diverse global services at the top, a narrow neck of core services, local OS at the base]

Page 28

Layered Grid Architecture
(By Analogy to Internet Architecture)

Application
Collective: "Coordinating multiple resources": ubiquitous infrastructure services, app-specific distributed services
Resource: "Sharing single resources": negotiating access, controlling use
Connectivity: "Talking to things": communication (Internet protocols) & security
Fabric: "Controlling things locally": access to, & control of, resources

(Internet Protocol Architecture analogy: Application / Transport / Internet / Link)

Page 29

Globus Toolkit™

A software toolkit addressing key technical problems in the development of Grid-enabled tools, services, and applications
– Offer a modular set of orthogonal services
– Enable incremental development of grid-enabled tools and applications
– Implement standard Grid protocols and APIs
– Available under liberal open source license
– Large community of developers & users
– Commercial support

Page 30

General Approach

Define Grid protocols & APIs
– Protocol-mediated access to remote resources
– Integrate and extend existing standards
– "On the Grid" = speak "Intergrid" protocols
Develop a reference implementation
– Open source Globus Toolkit
– Client and server SDKs, services, tools, etc.
Grid-enable wide variety of tools
– Globus Toolkit, FTP, SSH, Condor, SRB, MPI, ...
Learn through deployment and applications

Page 31

Key Protocols

The Globus Toolkit™ centers around four key protocols
– Connectivity layer:
> Security: Grid Security Infrastructure (GSI)
– Resource layer:
> Resource Management: Grid Resource Allocation Management (GRAM)
> Information Services: Grid Resource Information Protocol (GRIP) and Index Information Protocol (GIIP)
> Data Transfer: Grid File Transfer Protocol (GridFTP)
Also key collective layer protocols
– Info Services, Replica Management, etc.

Page 32

Globus Toolkit Structure

[Architecture figure: a compute resource and a data resource each run GSI-secured services (GRAM with job managers for computation, GridFTP for data transfer), with MDS providing information about each; other services or applications follow the same GSI-secured pattern. Common service behaviors: reliable invocation, soft-state management, notification, service naming.]

Page 33

Connectivity Layer Protocols & Services

Communication
– Internet protocols: IP, DNS, routing, etc.
Security: Grid Security Infrastructure (GSI)
– Uniform authentication, authorization, and message protection mechanisms in a multi-institutional setting
– Single sign-on, delegation, identity mapping
– Public key technology, SSL, X.509, GSS-API
– Supporting infrastructure: Certificate Authorities, certificate & key management, ...

GSI: www.gridforum.org/security/gsi

Page 34

Why Grid Security is Hard

Resources being used may be extremely valuable & the problems being solved extremely sensitive
Resources are often located in distinct administrative domains
– Each resource may have own policies & procedures
The set of resources used by a single computation may be large, dynamic, and/or unpredictable
– Not just client/server
It must be broadly available & applicable
– Standard, well-tested, well-understood protocols
– Integration with wide variety of tools

Page 35

Grid Security Requirements

User view:
1) Easy to use
2) Single sign-on
3) Run applications: ftp, ssh, MPI, Condor, Web, ...
4) User-based trust model
5) Proxies/agents (delegation)

Resource owner view:
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration with local system: Kerberos, AFS, license manager
4) Protection from compromised resources

Developer view:
API/SDK with authentication, flexible message protection, flexible communication, delegation, ...
Direct calls to various security functions (e.g., GSS-API), or security integrated into higher-level SDKs: e.g., GlobusIO, Condor-G, MPICH-G2, HDF5, etc.

Page 36

Grid Security Infrastructure (GSI)

Extensions to existing standard protocols & APIs
– Standards: SSL/TLS, X.509 & CA, GSS-API
– Extensions for single sign-on and delegation
Globus Toolkit reference implementation of GSI
– SSLeay/OpenSSL + GSS-API + delegation
– Tools and services to interface to local security
> Simple ACLs; SSLK5 & PKINIT for access to K5, AFS, etc.
– Tools for credential management
> Login, logout, etc.
> Smartcards
> MyProxy: Web portal login and delegation
> K5cert: automatic X.509 certificate creation
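The single sign-on and delegation model can be illustrated with a toy Python sketch (hypothetical names and structures, not the GSI implementation): a long-lived user credential signs a short-lived proxy, the proxy delegates further, and verification walks the issuer chain.

```python
from dataclasses import dataclass

@dataclass
class Credential:
    subject: str
    issuer: str
    lifetime_hours: float

def delegate(parent: Credential, lifetime_hours: float) -> Credential:
    """Issue a proxy credential signed by the parent; a proxy must not
    outlive the credential that issued it."""
    lifetime = min(lifetime_hours, parent.lifetime_hours)
    return Credential(parent.subject + "/proxy", parent.subject, lifetime)

def chain_valid(chain):
    """Each credential must name the previous subject as its issuer."""
    return all(c.issuer == p.subject for p, c in zip(chain, chain[1:]))

# Single sign-on: the user credential signs a 12-hour proxy, which is
# in turn delegated to a remote job.
user = Credential("/C=US/O=Grid/CN=Alice", "CA", 24 * 365)
proxy = delegate(user, 12)
remote = delegate(proxy, 24)  # capped at the proxy's 12 hours

print(chain_valid([user, proxy, remote]), remote.lifetime_hours)  # True 12
```

The key property the sketch captures is that delegation never extends privilege: lifetimes only shrink down the chain, and a break in the issuer chain is detectable.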

Page 37

GSI in Action: "Create Processes at A and B that Communicate & Access Files at C"

[Scenario figure:]
User: single sign-on via "grid-id" & generation of proxy credential (or retrieval of a proxy credential from an online repository) creates a user proxy holding the proxy credential.
Site A (Kerberos): the user proxy sends a remote process creation request* to a GSI-enabled GRAM server, which authorizes, maps to a local id, creates the process, and generates credentials; the process runs with a restricted proxy and a Kerberos ticket.
Site B (Unix): ditto, via a GSI-enabled GRAM server; the process runs with a restricted proxy under a local id.
Site C (Kerberos): the processes at A and B communicate* and issue remote file access requests* to a GSI-enabled FTP server on a storage system, which authorizes, maps to a local id, and accesses the file.

* With mutual authentication

Page 38

GSI Working Group Documents

Grid Security Infrastructure (GSI) Roadmap
– Informational draft overview of working group activities and documents
Grid Security Protocols & Syntax
– X.509 Proxy Certificates
– X.509 Proxy Delegation Protocol
– The GSI GSS-API Mechanism
Grid Security APIs
– GSS-API Extensions for the Grid
– GSI Shell API

Page 39

GSI Futures

Scalability in numbers of users & resources
– Credential management
– Online credential repositories ("MyProxy")
– Account management
Authorization
– Policy languages
– Community authorization
Protection against compromised resources
– Restricted delegation, smartcards

Page 40

Community Authorization

[Community Authorization Service (CAS) flow:]
1. User sends a CAS request, with resource names and operations. The CAS checks: does the collective policy authorize this request for this user? (using user/group membership, resource/collective membership, and collective policy information)
2. CAS reply, with capability and resource CA info.
3. User sends a resource request to the resource, authenticated with the capability. The resource checks: is this request authorized for the CAS? Is this request authorized by the capability? (using local policy information)
4. Resource reply.

(Laura Pearlman, Steve Tuecke, Von Welch, others)
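The four-step flow above can be sketched as a toy Python model (an illustrative policy representation, not the CAS protocol itself): the CAS checks collective policy before issuing a capability, and the resource independently checks both the CAS's local standing and the capability's scope.

```python
# Collective policy held by the CAS: (user, operation, resource) triples.
community_policy = {("alice", "read", "dataset1")}
# Local policy at the resource: which operations it lets the CAS authorize.
local_policy = {("CAS", "read", "dataset1")}

def cas_request(user, op, resource):
    """Steps 1-2: CAS checks collective policy; on success, returns a capability."""
    if (user, op, resource) in community_policy:
        return {"issuer": "CAS", "user": user, "op": op, "resource": resource}
    return None  # not authorized by collective policy

def resource_request(capability, op, resource):
    """Steps 3-4: resource checks the CAS is authorized locally AND the
    capability itself covers this exact request."""
    return (capability is not None
            and (capability["issuer"], op, resource) in local_policy
            and capability["op"] == op
            and capability["resource"] == resource)

cap = cas_request("alice", "read", "dataset1")
print(resource_request(cap, "read", "dataset1"))   # True
print(resource_request(cap, "write", "dataset1"))  # False: not covered
```

The point of the design survives the simplification: neither party trusts the other alone, since the resource enforces its own local policy even on CAS-issued capabilities.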

Page 41

Resource Layer Protocols & Services

Grid Resource Allocation Management (GRAM)
– Remote allocation, reservation, monitoring, control of compute resources
GridFTP protocol (FTP extensions)
– High-performance data access & transport
Grid Resource Information Service (GRIS)
– Access to structure & state information
Others emerging: catalog access, code repository access, accounting, etc.
All built on connectivity layer: GSI & IP

GRAM, GridFTP, GRIS: www.globus.org

Page 42

Resource Management

The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started and managed on remote resources, despite local heterogeneity
The Resource Specification Language (RSL) is used to communicate requirements
A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services
– Integrated with Condor, PBS, MPICH-G2, ...
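As an illustration of the sort of request RSL expresses, a job submission might look like the following (the attribute values are hypothetical, and exact attribute names vary across GRAM versions):

```
& (executable = /bin/myapp)
  (arguments = -in data.txt)
  (count = 4)
  (maxWallTime = 60)
  (queue = batch)
```

A broker rewrites such an abstract specification into "ground" RSL naming specific resources, which is then handed to GRAM.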

Page 43

Resource Management Architecture

[Architecture figure: an application submits RSL to a broker; the broker specializes the RSL into ground RSL, using queries to and information from an Information Service; a co-allocator decomposes the ground RSL into simple ground RSL, passed to GRAM servers, which drive local resource managers such as LSF, Condor, and NQE.]

Page 44

Data Access & Transfer

GridFTP: extended version of the popular FTP protocol for Grid data access and transfer
Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:
– Third-party data transfers, partial file transfers
– Parallelism, striping (e.g., on PVFS)
– Reliable, recoverable data transfers
Reference implementations
– Existing clients and servers: wuftpd, ncftp
– Flexible, extensible libraries in Globus Toolkit
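As a usage sketch, the Globus Toolkit's globus-url-copy client can request such a transfer; here a parallel, third-party transfer between two GridFTP servers (hostnames and paths are hypothetical, and the available flags depend on the toolkit version):

```
# Third-party transfer between two GridFTP servers, with 4 parallel streams
globus-url-copy -p 4 \
    gsiftp://source.example.org/data/run42.dat \
    gsiftp://dest.example.org/data/run42.dat
```

Because both endpoints are gsiftp:// URLs, the data moves directly between the two servers rather than through the client.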

Page 45

The Grid Information Problem

Large numbers of distributed “sensors” with different properties

Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type

Page 46

The Globus Toolkit Solution: MDS-2

Registration & enquiry protocols, information models, query languages– Provides standard interfaces to sensors

– Supports different “directory” structures supporting various discovery/access strategies
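The registration-and-enquiry pattern can be sketched as a toy Python directory (illustrative only; MDS-2's actual registration and enquiry mechanisms are LDAP-based protocols, not this API):

```python
class Directory:
    """Toy index service: providers register descriptions, clients query."""
    def __init__(self):
        self.entries = {}

    def register(self, name, attributes):
        # Registration: a sensor/service pushes its description into the index.
        self.entries[name] = attributes

    def search(self, **filters):
        # Enquiry: return names of entries matching all attribute filters.
        return sorted(name for name, attrs in self.entries.items()
                      if all(attrs.get(k) == v for k, v in filters.items()))

giis = Directory()
giis.register("cluster1", {"type": "compute", "cpus": 64})
giis.register("cluster2", {"type": "compute", "cpus": 128})
giis.register("store1", {"type": "storage", "capacity_tb": 10})

print(giis.search(type="compute"))  # ['cluster1', 'cluster2']
```

Different "views" then correspond to different directories aggregating different subsets of registrations.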

Page 47

Globus Applications and Deployments

Application projects include
– GriPhyN, PPDG, NEES, EU DataGrid, ESG, Fusion Collaboratory, etc.
Infrastructure deployments include
– DISCOM, NASA IPG, NSF TeraGrid, DOE Science Grid, EU DataGrid, etc.
– UK Grid Center, U.S. GRIDS Center
Technology projects include
– Data Grids, Access Grid, Portals, CORBA, MPICH-G2, Condor-G, GrADS, etc.

Page 48

Globus Futures

Numerous large projects are pushing hard on production deployment & application
– Much will be learned in next 2 years!
Active R&D program, focused for example on
– Security & policy for resource sharing
– Flexible, high-performance, scalable data sharing
– Integration with Web Services, etc.
– Programming models and tools
Community code development producing a true Open Grid Architecture

Page 49

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 50

Important Grid Applications

Data-intensive
Distributed computing (metacomputing)
Collaborative
Remote access to, and computer enhancement of, experimental facilities

Page 51

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 52

Data Intensive Science: 2000-2015

Scientific discovery increasingly driven by IT
– Computationally intensive analyses
– Massive data collections
– Data distributed across networks of varying capability
– Geographically distributed collaboration
Dominant factor: data growth (1 petabyte = 1000 TB)
– 2000: ~0.5 petabyte
– 2005: ~10 petabytes
– 2010: ~100 petabytes
– 2015: ~1000 petabytes?
How to collect, manage, access, and interpret this quantity of data?
Drives demand for "Data Grids" to handle an additional dimension of data access & movement

Page 53

Data Grid Projects

Particle Physics Data Grid (US, DOE)
– Data Grid applications for HENP experiments
GriPhyN (US, NSF)
– Petascale Virtual-Data Grids
iVDGL (US, NSF)
– Global Grid lab
TeraGrid (US, NSF)
– Distributed supercomputing resources (13 TFlops)
European Data Grid (EU, EC)
– Data Grid technologies, EU deployment
CrossGrid (EU, EC)
– Data Grid technologies, EU emphasis
DataTAG (EU, EC)
– Transatlantic network, Grid applications
Japanese Grid Projects (APGrid) (Japan)
– Grid deployment throughout Japan

Common themes: collaborations of application scientists & computer scientists; infrastructure development & deployment; Globus based.

Page 54

Grid Communities & Applications:
Data Grids for High Energy Physics

(Repeat of the Tier 0-4 data grid diagram from Page 18.)

www.griphyn.org www.ppdg.net www.eu-datagrid.org

Page 55: Grid Computing: Concepts, Appplications, and Technologies Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department

[email protected] ARGONNE CHICAGO

67 Biomedical Informatics Research Network (BIRN)

Evolving reference set of brains provides essential data for developing therapies for neurological disorders (multiple sclerosis, Alzheimer’s, etc.)

Today
– One lab, small patient base
– 4 TB collection

Tomorrow
– 10s of collaborating labs
– Larger population sample
– 400 TB data collection: more brains, higher resolution
– Multiple-scale data integration and analysis

Page 56

68

Digital Radiology (Hollebeek, U. Pennsylvania)

Hospital digital data
– Very large data sources: great clinical value to digital storage and manipulation, and significant cost savings
– 7 Terabytes per hospital per year
– Dominated by digital images: mammograms, X-rays, MRI, CAT scans, endoscopies, ...

Why mammography
– Clinical need for film recall & computer analysis
– Large volume (4,000 GB/year) (57% of total)
– Storage and records standards exist
– Great clinical value

Page 57

69 Earth System Grid (ANL, LBNL, LLNL, NCAR, ISI, ORNL)

Enable a distributed community [of thousands] to perform computationally intensive analyses on large climate datasets

Via
– Creation of a Data Grid supporting secure, high-performance remote access
– “Smart data servers” supporting reduction and analyses
– Integration with environmental data analysis systems, protocols, and thin clients

www.earthsystemgrid.org (soon)

Page 58

70

Earth System Grid Architecture

[Diagram: an application sends an attribute specification to the Metadata Catalog, obtaining a logical collection and logical file name; the Replica Catalog resolves these to multiple locations; Replica Selection uses performance information & predictions from NWS and MDS to pick the selected replica; GridFTP commands then move data among tape library, disk cache, and disk array at replica locations 1, 2, and 3]

Page 59

71

Data Grid Toolkit Architecture

Data Movement Service: optimized endpoint management
(bulk parallel transfer, rate-limited transfer, disk/network scheduling)

Data Transfer Service: end-to-end file transfer
(link optimization, performance guarantees, admission control)

Collective Data Management Service: collective file movement
(collection mgmt., priority, fault recovery, replication, resource selection)

Page 60

72

A Universal Access/Transport Protocol

Suite of communication libraries and related tools that support
– GSI security
– Third-party transfers
– Parameter set/negotiate
– Partial file access
– Reliability/restart
– Logging/audit trail
– Integrated instrumentation
– Parallel transfers
– Striping (cf. DPSS)
– Policy-based access control
– Server-side computation [later]

All based on a standard, widely deployed protocol

Page 61

73

And the Universal Protocol is … GridFTP

Why FTP?
– Ubiquity enables interoperation with many commodity tools
– Already supports many desired features, easily extended to support others

We use the term GridFTP to refer to
– Transfer protocol which meets requirements
– Family of tools which implement the protocol

Note GridFTP > FTP

Note that despite the name, GridFTP is not restricted to file transfer!

Page 62

74

GridFTP: Basic Approach

FTP is defined by several IETF RFCs

Start with most commonly used subset
– Standard FTP: get/put etc., 3rd-party transfer

Implement RFCed but often unused features
– GSS binding, extended directory listing, simple restart

Extend in various ways, while preserving interoperability with existing servers
– Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress and extended restart
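The parallel-stream extension sketched above can be illustrated in a few lines. This is a toy model (plain Python threads standing in for GridFTP data channels), not the Globus implementation: each "stream" moves one byte range independently, which is why partial-file access and parallel transfer compose naturally.

```python
# Illustrative sketch, not GridFTP itself: N independent streams, each
# responsible for one contiguous byte range of the file being moved.
from concurrent.futures import ThreadPoolExecutor

def split_ranges(size, n_streams):
    """Divide [0, size) into up to n_streams contiguous (offset, length) ranges."""
    base, extra = divmod(size, n_streams)
    ranges, offset = [], 0
    for i in range(n_streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges

def parallel_transfer(source: bytes, n_streams: int = 4) -> bytes:
    """'Transfer' source over n_streams partial-file reads and reassemble."""
    dest = bytearray(len(source))

    def move(rng):
        off, length = rng
        # One stream's partial-file GET: copy its byte range only.
        dest[off:off + length] = source[off:off + length]
        return length

    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        moved = sum(pool.map(move, split_ranges(len(source), n_streams)))
    assert moved == len(source)
    return bytes(dest)
```

In the real protocol each range travels over its own TCP connection, so a single connection's congestion window no longer limits aggregate bandwidth.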

Page 63

75

The GridFTP Family of Tools

Patches to existing FTP code
– GSI-enabled versions of existing FTP client and server, for high-quality production code

Custom-developed libraries
– Implement full GridFTP protocol, targeting custom use, high performance

Custom-developed tools
– E.g., high-performance striped FTP server

Page 64

76

High-Performance Data Transfer

[Diagram: a control interface with scheduling modules and enquiry (GRIP, GridFTP) coordinates two endpoints; each endpoint’s data channel implements a bulk transfer protocol and a TCP transfer protocol behind a rate-limiting interface, with resource management of disk and NIC via GRAM; endpoints interoperate over TCP, BTP, …]

Page 65

77

GridFTP for Efficient WAN Transfer

Transfer Tb+ datasets
– Highly secure authentication
– Parallel transfer for speed

Parallel transfer fully utilizes bandwidth of the network interface on single nodes; striped transfer fully utilizes bandwidth of a Gb+ WAN using multiple nodes backed by a parallel filesystem at each end.

LLNL -> Chicago transfer (slow site network interfaces): [chart: bandwidth (Mb/s, 0-80) vs. number of parallel streams (0-35) for GridFTP (globus-url-copy)]

FUTURE: Integrate striped GridFTP with parallel storage systems, e.g., HPSS

Page 66

78

GridFTP for User-Friendly Visualization Setup

High-res visualization is too large for display on a single system
– Needs to be tiled, 24-bit -> 16-bit depth
– Needs to be staged to display units

GridFTP/ActiveMural integration application performs tiling, data reduction, and staging in a single operation
– PVFS/MPI-IO on server
– MPI process group transforms data as needed before transfer
– Performance is currently bounded by 100 Mb/s NICs on display nodes

Page 67

79 Distributed Computing + Visualization

[Diagram of the pipeline:]
– Job submission: simulation code submitted to remote center for execution on 1000s of nodes
– Remote center generates Tb+ datasets from simulation code
– LAN/WAN transfer: FLASH data transferred to ANL for visualization; GridFTP parallelism utilizes high bandwidth (capable of utilizing >Gb/s WAN links)
– Chiba City: visualization code constructs and stores high-resolution visualization frames for display on many devices
– WAN transfer: user-friendly striped GridFTP application tiles the frames and stages tiles onto display nodes
– ActiveMural Display: displays very high resolution large-screen dataset animations

FUTURE (1-5 yrs)
• 10s Gb/s LANs, WANs
• End-to-end QoS
• Automated replica management
• Server-side data reduction & analysis
• Interactive portals

Page 68

80

SC’2001 Experiment: Simulation of HEP Tier 1 Site

Tiered (hierarchical) site structure
– All data generated at lower tiers must be forwarded to the higher tiers
– Tier 1 sites may have many sites transmitting to them simultaneously and will need to sink a substantial amount of bandwidth
– We demonstrated the ability of GridFTP to support this at SC 2001 in the Bandwidth Challenge
– 16 sites, with 27 hosts, pushed a peak of 2.8 Gb/s to the show floor in Denver, with a sustained bandwidth of nearly 2 Gb/s

Page 69

81 Visualization of Network Traffic During the Bandwidth Challenge

Page 70

82

The Replica Management Problem

Maintain a mapping between logical names for files and collections and one or more physical locations

Important for many applications

Example: CERN high-level trigger data
– Multiple petabytes of data per year
– Copy of everything at CERN (Tier 0)
– Subsets at national centers (Tier 1)
– Smaller regional centers (Tier 2)
– Individual researchers will have copies

Page 71

83

Our Approach to Replica Management

Identify replica cataloging and reliable replication as two fundamental services
– Layer on other Grid services: GSI, transport, information service
– Use as a building block for other tools

Advantage
– These services can be used in a wide variety of situations

Page 72

84

Replica Catalog Structure: A Climate Modeling Example

[Diagram: the Replica Catalog holds logical collections “C02 measurements 1998” and “C02 measurements 1999”. The 1998 collection lists logical files (Filename: Jan 1998, Feb 1998, …; e.g., logical file Feb 1998 has Size: 1468762) and two locations:
– Location jupiter.isi.edu: Filename: Mar 1998, Jun 1998, Oct 1998; Protocol: gsiftp; UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
– Location sprite.llnl.gov: Filename: Jan 1998 … Dec 1998; Protocol: ftp; UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi]
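The catalog structure on this slide can be modeled in a few lines. The dictionary layout and helper below are invented for illustration, with values taken from the slide; a physical URL is formed by joining a location's UrlConstructor with the logical file name.

```python
# Hypothetical minimal model of the replica catalog on this slide.
catalog = {
    "C02 measurements 1998": {
        # logical file name -> known size (None if unrecorded)
        "logical_files": {"Jan 1998": None, "Feb 1998": 1468762},
        "locations": [
            {"host": "jupiter.isi.edu", "protocol": "gsiftp",
             "url_prefix": "gsiftp://jupiter.isi.edu/nfs/v6/climate/",
             "files": {"Mar 1998", "Jun 1998", "Oct 1998"}},
            {"host": "sprite.llnl.gov", "protocol": "ftp",
             "url_prefix": "ftp://sprite.llnl.gov/pub/pcmdi/",
             "files": {"Jan 1998", "Feb 1998", "Dec 1998"}},
        ],
    },
}

def replicas(collection: str, logical_file: str) -> list:
    """Return every physical URL holding a copy of logical_file."""
    entry = catalog[collection]
    return [loc["url_prefix"] + logical_file
            for loc in entry["locations"] if logical_file in loc["files"]]
```

The point of the indirection is that applications name data logically; the catalog decides, per request, which physical copies exist and how to reach them.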

Page 73

85

Giggle: A Scalable Replica Location Service

Local replica catalogs (LRCs) maintain definitive information about replicas

Publish (perhaps approximate) information using soft-state techniques

Variety of indexing strategies possible

[Charts: (1) operation time (ms, 0-10) vs. number of LFNs (’000, 1-1000) for Create, Add, Delete, and Query on an LRC, and Query on an RLI; (2) time for soft-state update (sec, 1-10000, log scale) vs. number of LFNs (’000) for 1 LRC and 2 LRCs]
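The soft-state scheme can be sketched as below. Class and method names are invented, and the real Giggle design also supports compressed summaries (e.g., Bloom filters) and multiple index topologies; the property shown here is that index entries decay unless refreshed, so the index's answers are hints to be verified at the definitive LRC.

```python
# Toy sketch of soft-state replica indexing (names invented).
class LocalReplicaCatalog:
    """Definitive LFN -> physical-URL mappings at one site."""
    def __init__(self, name):
        self.name = name
        self.mappings = {}                  # lfn -> set of physical URLs

    def add(self, lfn, url):
        self.mappings.setdefault(lfn, set()).add(url)

    def publish_to(self, rli, now, ttl=30.0):
        # Periodically push (a summary of) our LFN set to the index.
        rli.receive(self.name, set(self.mappings), now, ttl)

class ReplicaLocationIndex:
    """Possibly stale, expiring index of which LRCs hold which LFNs."""
    def __init__(self):
        self.index = {}                     # lfn -> {lrc_name: expiry}

    def receive(self, lrc_name, lfns, now, ttl):
        for lfn in lfns:
            self.index.setdefault(lfn, {})[lrc_name] = now + ttl

    def lookup(self, lfn, now):
        # Entries not refreshed within their TTL silently disappear.
        return {n for n, exp in self.index.get(lfn, {}).items() if exp > now}
```

Soft state is what makes the index self-cleaning: a crashed or partitioned LRC simply stops refreshing, and its stale entries age out without any explicit delete protocol.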

Page 74

86

GriPhyN = App. Science + CS + Grids

GriPhyN = Grid Physics Network
– US-CMS High Energy Physics
– US-ATLAS High Energy Physics
– LIGO/LSC gravity-wave research
– SDSS Sloan Digital Sky Survey
– Strong partnership with computer scientists

Design and implement production-scale grids
– Develop common infrastructure, tools and services
– Integration into the 4 experiments
– Application to other sciences via “Virtual Data Toolkit”

Multi-year project
– R&D for grid architecture (funded at $11.9M + $1.6M)
– Integrate Grid infrastructure into experiments through VDT

Page 75

87

GriPhyN Institutions

– U Florida

– U Chicago

– Boston U

– Caltech

– U Wisconsin, Madison

– USC/ISI

– Harvard

– Indiana

– Johns Hopkins

– Northwestern

– Stanford

– U Illinois at Chicago

– U Penn

– U Texas, Brownsville

– U Wisconsin, Milwaukee

– UC Berkeley

– UC San Diego

– San Diego Supercomputer Center

– Lawrence Berkeley Lab

– Argonne

– Fermilab

– Brookhaven

Page 76

88

GriPhyN: PetaScale Virtual Data Grids

[Diagram: production teams, individual investigators, and workgroups use interactive user tools built over virtual data tools, request planning & scheduling tools, and request execution & management tools; these layer over resource management services, security and policy services, and other Grid services; transforms and raw data sources feed distributed resources (code, storage, CPUs, networks) at ~1 Petaop/s and ~100 Petabytes]

Page 77

89

GriPhyN/PPDG Data Grid Architecture

[Diagram: an Application passes a DAG to a Planner, which passes a DAG to an Executor (DAGMan, Kangaroo); both consult Catalog Services (MCAT; GriPhyN catalogs), Info Services (MDS), Policy/Security (GSI, CAS), Monitoring (MDS), and Replica Management (GDMP); the Executor drives a Reliable Transfer Service, compute resources (GRAM), and storage resources (GridFTP; GRAM; SRM), all built on Globus; marked components indicate an initial solution is operational]

Ewa Deelman, Mike Wilde

Page 78

90 GriPhyN Research Agenda

Virtual Data technologies
– Derived data, calculable via algorithm
– Instantiated 0, 1, or many times (e.g., caches)
– “Fetch value” vs. “execute algorithm”
– Potentially complex (versions, cost calculation, etc.)
– E.g., LIGO: “Get gravitational strain for 2 minutes around 200 gamma-ray bursts over last year”

For each requested data value, need to
– Locate item materialization, location, and algorithm
– Determine costs of fetching vs. calculating
– Plan data movements, computations to obtain results
– Execute the plan
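The fetch-vs-calculate decision above reduces to a per-value cost comparison. The sketch below is a hedged toy with invented names, not the GriPhyN planner: it chooses the cheaper action and records any new materialization so later requests can fetch instead of recompute.

```python
# Hedged sketch of the "fetch value" vs. "execute algorithm" choice.
def satisfy(lfn, materialized, fetch_cost, compute_cost, cache=True):
    """Decide how to obtain one virtual data product.

    materialized : set of LFNs that already have an instantiation
    fetch_cost / compute_cost : lfn -> estimated cost (e.g., seconds)
    Returns (action, cost); may add lfn to materialized (a product can
    be instantiated 0, 1, or many times).
    """
    if lfn in materialized and fetch_cost(lfn) <= compute_cost(lfn):
        return ("fetch", fetch_cost(lfn))
    if cache:
        materialized.add(lfn)   # new instantiation, reusable later
    return ("compute", compute_cost(lfn))
```

A real planner would also place the computation (local vs. remote) and schedule the data movements; the essential asymmetry, that derived data need not exist until someone asks, is already visible here.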

Page 79

91

Virtual Data in Action

Data request may
– Compute locally
– Compute remotely
– Access local data
– Access remote data

Scheduling based on
– Local policies
– Global policies
– Cost

[Diagram: a “fetch item” request flows from local facilities and caches through regional facilities and caches to major facilities and archives]

Page 80

92

GriPhyN Research Agenda (cont.)

Execution management
– Co-allocation (CPU, storage, network transfers)
– Fault tolerance, error reporting
– Interaction, feedback to planning

Performance analysis (with PPDG)
– Instrumentation, measurement of all components
– Understand and optimize grid performance

Virtual Data Toolkit (VDT)
– VDT = virtual data services + virtual data tools
– One of the primary deliverables of R&D effort
– Technology transfer to other scientific domains

Page 81

93

Programs as Community Resources: Data Derivation and Provenance

Most scientific data are not simple “measurements”; essentially all are:
– Computationally corrected/reconstructed
– And/or produced by numerical simulation

And thus, as data and computers become ever larger and more expensive:
– Programs are significant community resources
– So are the executions of those programs

Page 82

94

[Diagram: a data item is created-by a derivation, which is an execution-of a transformation; data is consumed-by/generated-by derivations]

“I’ve detected a calibration error in an instrument and want to know which derived data to recompute.”

“I’ve come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes.”

“I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won’t have to write one from scratch.”

“I want to apply an astronomical analysis program to millions of objects. If the results already exist, I’ll save weeks of computation.”

Page 83

95

The Chimera Virtual Data System (GriPhyN Project)

Virtual data catalog
– Transformations, derivations, data

Virtual data language
– Data definition + query

Applications include browsers and data analysis applications

[Diagram: Virtual Data Applications use the Virtual Data Language (definition and query); a VDL Interpreter (manipulates derivations and transformations) issues SQL to the Virtual Data Catalog (implements the Chimera Virtual Data Schema) and produces task graphs (compute and data movement tasks, with dependencies) for Data Grid resources (distributed execution and data management)]

Page 84

96 SDSS Galaxy Cluster Finding

Page 85

97

Cluster-Finding Data Pipeline

[Diagram: a five-stage pipeline: tsObj files (stage 1) feed field and brg nodes (stage 2), which feed core nodes (stage 3), then cluster (stage 4), producing the final catalog (stage 5)]

Page 86

98

Virtual Data in CMS

Virtual Data Long Term Vision of CMS: CMS Note 2001/047, GRIPHYN 2001-16

Page 87

99

Early GriPhyN Challenge Problem: CMS Data Reconstruction

[Diagram: a master Condor job runs at Caltech (Caltech workstation); a secondary Condor job runs on the Wisconsin (WI) pool; NCSA hosts a Linux cluster and UniTree, a GridFTP-enabled FTP server. Steps:]
2) Launch secondary job on WI pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on cluster
7) GridFTP fetches data from UniTree
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master

Scott Koranda, Miron Livny, others

Page 88

100

Trace of a Condor-G Physics Run

[Chart: job count (0-120) over time for pre/simulation/post jobs (UW Condor), ooHits at NCSA, and ooDigis at NCSA; a delay due to a script error is visible]

Page 89

101

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 90

102

Distributed Computing

Aggregate computing resources & codes
– Multidisciplinary simulation
– Metacomputing/distributed simulation
– High-throughput/parameter studies

Challenges
– Heterogeneous compute & network capabilities, latencies, dynamic behaviors

Example tools
– MPICH-G2: Grid-aware MPI
– Condor-G, Nimrod-G: parameter studies

Page 91

103

Multidisciplinary Simulations: Aviation Safety

Whole-system simulations are produced by coupling all of the sub-system simulations:
– Wing models: lift capabilities, drag capabilities, responsiveness
– Stabilizer models: deflection capabilities, responsiveness
– Engine models: thrust performance, reverse-thrust performance, responsiveness, fuel consumption
– Landing gear models: braking performance, steering capabilities, traction, dampening capabilities
– Human models: crew capabilities (accuracy, perception, stamina, reaction times, SOPs)
– Airframe models

Page 92

104

MPICH-G2: A Grid-Enabled MPI

A complete implementation of the Message Passing Interface (MPI) for heterogeneous, wide area environments
– Based on the Argonne MPICH implementation of MPI (Gropp and Lusk)

Requires services for authentication, resource allocation, executable staging, output, etc.

Programs run in wide area without change
– Modulo accommodating heterogeneous communication performance

See also: MetaMPI, PACX, STAMPI, MAGPIE

www.globus.org/mpi

Page 93

105

Grid-based Computation: Challenges

Locate “suitable” computers
Authenticate with appropriate sites
Allocate resources on those computers
Initiate computation on those computers
Configure those computations
Select “appropriate” communication methods
Compute with “suitable” algorithms
Access data files, return output
Respond “appropriately” to resource changes

Page 94

106 MPICH-G2 Use of Grid Services

[Diagram: mpirun generates a resource specification and hands it to globusrun, which authenticates, submits multiple jobs, uses MDS to locate hosts and GASS to stage executables, and relies on DUROC to coordinate startup; a GRAM at each site initiates the job under fork, LSF, or LoadLeveler (processes P1, P2), monitors/controls it, and detects termination; processes communicate via vendor MPI and TCP/IP (globus-io)]

% grid-proxy-init
% mpirun -np 256 myprog

Page 95

107

Cactus (Allen, Dramlitsch, Seidel, Shalf, Radke)

Modular, portable framework for parallel, multidimensional simulations

Construct codes by linking
– Small core (flesh): mgmt services
– Selected modules (thorns): numerical methods, grids & domain decomps, visualization and steering, etc.

Custom linking/configuration tools

Developed for astrophysics, but not astrophysics-specific

[Diagram: thorns plug into the Cactus “flesh”]

www.cactuscode.org

Page 96

108 Cactus: An Application Framework for Dynamic Grid Computing

Cactus thorns for active management of application behavior and resource use

Heterogeneous resources, e.g.:
– Irregular decompositions
– Variable halo for managing message size
– Msg compression (comp/comm tradeoff)
– Comms scheduling for comp/comm overlap

Dynamic resource behaviors/demands, e.g.:
– Perf monitoring, contract violation detection
– Dynamic resource discovery & migration
– User notification and steering

Page 97

109 Cactus Example: Terascale Computing

[Diagram: SDSC IBM SP (1024 procs, 5x12x17 = 1020 domain decomposition) and NCSA Origin Array (256+128+128 procs, 5x12x(4+2+2) = 480), connected by Gig-E at 100 MB/sec over an OC-12 line (but only 2.5 MB/sec achieved)]

Solved EEs for gravitational waves (real code)
– Tightly coupled, communications required through derivatives
– Must communicate 30 MB/step between machines
– Time step takes 1.6 sec

Used 10 ghost zones along direction of machines: communicate every 10 steps

Compression/decompression on all data passed in this direction

Achieved 70-80% scaling, ~200 GF (only 14% scaling without tricks)

Page 98

110

Cactus Example (2): Migration in Action

[Chart: iterations/second (0 to 1.4) vs. clock time (1-33): running at UIUC, load applied, 3 successive contract violations, resource discovery & migration (migration time not to scale), then running at UC]

Page 99

111

IPG Milestone 3: Large Computing Node (completed 12/2000)

OVERFLOW on IPG using Globus and MPICH-G2 for intra-problem, wide area communication

[Diagram: NASA sites Ames (Moffett Field, CA), Glenn (Cleveland, OH), and Langley (Hampton, VA); systems Lomax (512-node SGI Origin 2000), Whitcomb, and Sharp; application: high-lift subsonic wind tunnel model]

Application POC: Mohammad J. Djomehri

Slide courtesy Bill Johnston, LBNL & NASA

Page 100

112

High-Throughput Computing:Condor

High-throughput computing platform for mapping many tasks to idle computers

Three major components
– Scheduler manages pool(s) of [distributively owned or dedicated] computers
– DAGMan manages user task pools
– Matchmaker schedules tasks to computers

Parameter studies, data analysis

Condor-G extensions support wide area execution in Grid environment

www.cs.wisc.edu/condor

Page 101

113

Defining a DAG

A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

  # diamond.dag
  Job A a.sub
  Job B b.sub
  Job C c.sub
  Job D d.sub
  Parent A Child B C
  Parent B C Child D

Each node runs the Condor job specified by its accompanying Condor submit file

[Diagram: diamond DAG with Job A at the top, Job B and Job C in the middle, Job D at the bottom]
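The .dag file above is easy to interpret mechanically. The sketch below (illustrative, not DAGMan itself) parses the diamond example and produces one legal execution order via Kahn's topological sort:

```python
# Toy parser and scheduler for the .dag syntax shown above.
from collections import deque

def parse_dag(text):
    """Return (job_names, parent->child edges) from .dag-style lines."""
    jobs, edges = [], []
    for line in text.splitlines():
        tok = line.split()
        if not tok or tok[0].startswith("#"):
            continue                        # skip blanks and comments
        if tok[0] == "Job":
            jobs.append(tok[1])
        elif tok[0] == "Parent":
            i = tok.index("Child")          # parents before, children after
            edges += [(p, c) for p in tok[1:i] for c in tok[i + 1:]]
    return jobs, edges

def run_order(jobs, edges):
    """One execution order respecting all Parent/Child constraints."""
    indeg = {j: 0 for j in jobs}
    succ = {j: [] for j in jobs}
    for p, c in edges:
        succ[p].append(c)
        indeg[c] += 1
    ready = deque(j for j in jobs if indeg[j] == 0)
    order = []
    while ready:
        j = ready.popleft()                 # a node whose parents all ran
        order.append(j)
        for c in succ[j]:
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return order
```

DAGMan additionally handles per-node retries, rescue DAGs, and throttling; the core dependency logic is just this ready-queue loop.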

Page 102

114

High-Throughput Computing: Mathematicians Solve NUG30

Looking for the solution to the NUG30 quadratic assignment problem

An informal collaboration of mathematicians and computer scientists

Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)

14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin

Page 103

115 Grid Application Development Software (GrADS) Project

hipersoft.rice.edu/grads

Page 104

116

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 105

117 Access Grid

High-end group work and collaboration technology

Grid services being used for discovery, configuration, authentication

O(50) systems deployed worldwide

Basis for SC’2001 SC Global event in November 2001
– www.scglobal.org

[Diagram: room node with ambient mic (tabletop), presenter mic, presenter camera, and audience camera]

www.accessgrid.org

Page 106

118

Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions

Page 107

119

Grid-Enabled Research Facilities: Leverage Investments

Research instruments, satellites, particle accelerators, MRI machines, etc., cost a great deal

Data from those devices can be accessed and analyzed by many more scientists
– Not just the team that gathered the data

More productive use of instruments
– Calibration, data sampling during a run, via on-demand real-time processing

Page 108

120

Telemicroscopy & Grid-Based Computing

[Diagram: imaging instruments connected over the network to computational resources, large-scale databases, data acquisition/processing/analysis, and advanced visualization]


APAN Trans-Pacific Telemicroscopy Collaboration, Osaka-U, UCSD, ISI
[Diagram: remote operation of the UHVEM at Osaka, Japan from NCMIR at San Diego (UCSD/SDSC), over TransPAC and Tokyo XP via STAR TAP (Chicago) and the vBNS, with CRL/MPT participation; 1st and 2nd demonstration paths shown, using Globus]
(slide courtesy Mark Ellisman@UCSD)


Network for Earthquake Engineering Simulation

NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other

On-demand access to experiments, data streams, computing, archives, collaboration

Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org


“Experimental Facilities” Can Include Field Sites

Remotely controlled sensor grids for field studies, e.g., in seismology and biology
– Wireless/satellite communications

– Sensor net technology for low-cost communications


Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions


Nature and Role of Grid Infrastructure

Persistent Grid infrastructure is critical to the success of many eScience projects
– High-speed networks, certainly

– Remotely accessible compute & storage

– Persistent, standard services: PKI, directories, reservation, …

– Operational & support procedures
Many projects are creating such infrastructures

– Production operation is the goal, but much to learn about how to create & operate


A National Grid Infrastructure
[Diagram: a national backbone interconnecting regional centers, each regional center aggregating multiple sites]


Example Grid Infrastructure Projects

I-WAY (1995): 17 U.S. sites for one week
GUSTO (1998): 80 sites worldwide, experimental
NASA Information Power Grid (since 1999)
– Production Grid linking NASA laboratories
INFN Grid, EU DataGrid, iVDGL, … (2001+)
– Grids for data-intensive science
TeraGrid, DOE Science Grid (2002+)
– Production Grids linking supercomputer centers
U.S. GRIDS Center
– Software packaging, deployment, support


The 13.6 TF TeraGrid: Computing at 40 Gb/s
[Diagram: four sites, NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne, linked at 40 Gb/s, with HPSS and UniTree archival storage, external network connections, and local site resources]
NCSA, SDSC, Caltech, Argonne – www.teragrid.org


TeraGrid (Details)
[Diagram: detailed TeraGrid architecture. Compute: NCSA 500 nodes (8 TF, 4 TB memory, 240 TB disk), SDSC 256 nodes (4.1 TF, 2 TB memory, 225 TB disk), Caltech 32 nodes (0.5 TF, 0.4 TB memory, 86 TB disk), Argonne 64 nodes (1 TF, 0.25 TB memory, 25 TB disk), built from quad-processor McKinley (IA-64) and IA-32 servers with Myrinet Clos spines and FibreChannel storage. Interconnect: Chicago & LA DTF core Cisco 65xx Catalyst switch/routers (256 Gb/s crossbar), Juniper M40/M160 border routers, 10 GbE site uplinks, and OC-3/OC-12/OC-48 connections to vBNS, Abilene, MREN, ESnet, HSCC, Calren, NTON, and Starlight. Archives: HPSS and UniTree. Adjacent site resources include the 1176p IBM SP Blue Horizon, Sun E10K and Starcat, 1500p and 128p Origins, the 574p IA-32 Chiba City cluster, HP X-Class and V2500 systems, and HR display & VR facilities]


Targeted StarLight Optical Network Connections
[Map: StarLight hub at NW Univ (Chicago) cross-connecting UIC, Ill Inst of Tech, Univ of Chicago, NCSA/UIUC, and ANL via I-WIRE, with domestic links to Seattle, Portland, San Francisco, Los Angeles, San Diego (SDSC), Vancouver, NYC, Atlanta, PSC, IU, U Wisconsin, the St Louis GigaPoP, and Indianapolis (Abilene NOC), and international connections to SURFnet, CA*net4, Asia-Pacific, AMPATH, and CERN; DTF 40Gb and NTON links shown]
www.startap.net


CA*net 4 Architecture
[Map: CANARIE GigaPOPs at Victoria, Vancouver, Calgary, Edmonton, Saskatoon, Regina, Winnipeg, Thunder Bay, Windsor, Toronto, Ottawa, Montreal, Quebec, Fredericton, Charlottetown, Halifax, and St. John’s, linked by ORAN and carrier DWDM, with U.S. peerings at Seattle, Chicago, New York, and Boston; both current and possible future CA*net 4 nodes shown]


APAN Network Topology
[Map, status 2001.9.3 with 2001 plan: links among Japan, Korea, China, Hong Kong, Philippines, Vietnam, Thailand, Malaysia, Singapore, Indonesia, Sri Lanka, and Australia, with 622 Mbps x 2 connectivity to STAR TAP (USA) and links to Europe]


iVDGL: A Global Grid Laboratory

International Virtual-Data Grid Laboratory
– A global Grid laboratory (US, Europe, Asia, South America, …)
– A place to conduct Data Grid tests “at scale”
– A mechanism to create common Grid infrastructure
– A laboratory for other disciplines to perform Data Grid tests
– A focus of outreach efforts to small institutions
U.S. part funded by NSF (2001-2006)
– $13.7M (NSF) + $2M (matching)
“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.” – From NSF proposal, 2001


Initial US-iVDGL Data Grid

[Map: Tier1 (FNAL), proto-Tier2, and Tier3 university sites, including Fermilab, BNL, Wisconsin, Indiana, BU, Caltech, JHU, PSU, Hampton, SKC, Brownsville, Florida, and UCSD; other sites to be added in 2002]


iVDGL: International Virtual Data Grid Laboratory
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay – www.ivdgl.org
[World map: Tier0/1, Tier2, and Tier3 facilities connected by 10 Gbps, 2.5 Gbps, 622 Mbps, and other links]


iVDGL Architecture (from proposal)
[Diagram]


US iVDGL Interoperability
US-iVDGL-1 Milestone (August 02)
[Diagram: the iGOC coordinating paired ATLAS, CMS, LIGO, and SDSS/NVO testbed sites for the US-iVDGL-1 milestone, Aug 2002]


Transatlantic Interoperability
iVDGL-2 Milestone (November 02)
[Diagram: the iVDGL-2 milestone (Nov 2002) links ATLAS, CMS, LIGO, and SDSS/NVO testbeds across ANL, BNL, BU, IU, UM, OU, UTA, HU, LBL, CIT, UCSD, UF, FNAL, JHU, UTB, PSU, UWM, and UC, plus CS research sites (ANL, UCB, UC, IU, NU, UW, ISI, CIT), the iGOC, outreach efforts, and, via DataTAG, CERN, INFN, UK PPARC, and U of A]


Another Example: INFN Grid in Italy
20 sites, ~200 persons, ~90 FTEs, ~20 IT
Preliminary budget for 3 years: 9 M Euros
Activities organized around
– S/w development with DataGrid, DataTAG, …
– Testbeds (financed by INFN) for DataGrid, DataTAG, US-EU Intergrid
– Experiments applications
– Tier1..Tiern prototype infrastructure
Large scale testbeds provided by LHC experiments, Virgo, …


U.S. GRIDS Center

NSF Middleware Infrastructure Program

GRIDS = Grid Research, Integration, Deployment, & Support

NSF-funded center to provide
– State-of-the-art middleware infrastructure to support national-scale collaborative science and engineering
– Integration platform for experimental middleware technologies
ISI, NCSA, SDSC, UC, UW
NMI software release one: May 2002

www.grids-center.org


Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions


Globus Toolkit: Evaluation (+)

Good technical solutions for key problems, e.g.
– Authentication and authorization

– Resource discovery and monitoring

– Reliable remote service invocation

– High-performance remote data access
This & good engineering is enabling progress
– Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
– Growing community code base built on tools


Globus Toolkit: Evaluation (-)
Protocol deficiencies, e.g.
– Heterogeneous basis: HTTP, LDAP, FTP
– No standard means of invocation, notification, error propagation, authorization, termination, …
Significant missing functionality, e.g.
– Databases, sensors, instruments, workflow, …
– Virtualization of end systems (hosting envs.)
Little work on total system properties, e.g.
– Dependability, end-to-end QoS, …
– Reasoning about system properties


Globus Toolkit Structure
[Diagram: GRAM job managers front compute resources and GridFTP fronts data resources, with other services or applications alongside; each stack pairs its protocol with MDS for discovery and GSI for security, and reliable invocation, soft-state management, notification, and service naming appear as cross-cutting mechanisms]
Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems


Open Grid Services Architecture
Service orientation to virtualize resources
Define fundamental Grid service behaviors
– Core set required, others optional
A unifying framework for interoperability & establishment of total system properties
Integration with Web services and hosting environment technologies
Leverage tremendous commercial base
Standard IDL accelerates community code
Delivery via open source Globus Toolkit 3.0
Leverage GT experience, code, mindshare


“Web Services”
Increasingly popular standards-based framework for accessing network applications
– W3C standardization; Microsoft, IBM, Sun, others
WSDL: Web Services Description Language
– Interface Definition Language for Web services
SOAP: Simple Object Access Protocol
– XML-based RPC protocol; common WSDL target
WS-Inspection
– Conventions for locating service descriptions
UDDI: Universal Description, Discovery, & Integration
– Directory for Web services


Web Services Example: Database Service
WSDL definition for “DBaccess” portType defines operations and bindings, e.g.:
– Query(QueryLanguage, Query, Result)
– SOAP protocol
Client C, Java, Python, etc., APIs can then be generated
[Diagram: a DBaccess service endpoint]
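The generated-stub idea can be sketched in Python. This is a hypothetical illustration, not actual generated code: the `DBaccessStub` class and its in-memory transport are assumptions; a real stub would marshal the Query operation into a SOAP message over HTTP as described above.

```python
# Hypothetical sketch of a client stub "generated" from a DBaccess-style
# WSDL description. A real stub would serialize calls into SOAP envelopes;
# here a plain dict stands in for the wire message.

class DBaccessStub:
    """Client-side proxy for the DBaccess portType's Query operation."""

    def __init__(self, endpoint, transport):
        self.endpoint = endpoint      # service URL from the WSDL binding
        self.transport = transport    # callable that "sends" the message

    def query(self, query_language, query):
        # Marshal the operation's input parts, named as in the WSDL.
        message = {"operation": "Query",
                   "QueryLanguage": query_language,
                   "Query": query}
        reply = self.transport(self.endpoint, message)
        return reply["Result"]        # the operation's output part


def fake_soap_transport(endpoint, message):
    # Stand-in for an HTTP/SOAP round trip: answers one canned query.
    assert message["operation"] == "Query"
    if message["QueryLanguage"] == "SQL":
        return {"Result": [("ecoli", "metabolism")]}
    return {"Result": []}


stub = DBaccessStub("http://example.org/DBaccess", fake_soap_transport)
rows = stub.query("SQL", "SELECT * FROM pathways")
print(rows)
```

The point is the division of labor: the application calls an ordinary method, and the stub, not the application, knows the operation name, message parts, and binding.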


Transient Service Instances
“Web services” address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
In Grids, must also support transient service instances, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conferencing, distributed data analysis
Significant implications for how services are managed, named, discovered, and used
– In fact, much of our work is concerned with the management of service instances


The Grid Service = Interfaces + Service Data
[Diagram: a Grid service exposes the GridService interface plus other interfaces (service data access, explicit destruction, soft-state lifetime, notification, authorization, service creation, service registry, manageability, concurrency), with reliable invocation and authentication as binding properties; behind the interfaces sit service data elements and an implementation in a hosting environment/runtime (“C”, J2EE, .NET, …)]


Open Grid Services Architecture: Fundamental Structure
1) WSDL conventions and extensions for describing and structuring services
– Useful independent of “Grid” computing
2) Standard WSDL interfaces & behaviors for core service activities
– portTypes and operations => protocols


WSDL Conventions & Extensions
portType (standard WSDL)
– Define an interface: a set of related operations
serviceType (extensibility element)
– List of port types: enables aggregation
serviceImplementation (extensibility element)
– Represents actual code
service (standard WSDL)
– instanceOf extension: map description -> instance
compatibilityAssertion (extensibility element)
– portType, serviceType, serviceImplementation


Structure of a Grid Service
[Diagram: the service description layer contains portType, serviceType, and serviceImplementation elements related by compatibilityAssertions; the service instantiation layer contains service instances (standard WSDL), each tied to a serviceImplementation by the instanceOf extension]


Standard Interfaces & Behaviors: Four Interrelated Concepts
Naming and bindings
– Every service instance has a unique name, from which one can discover supported bindings
Information model
– Service data associated with Grid service instances, operations for accessing this info
Lifecycle
– Service instances created by factories
– Destroyed explicitly or via soft state
Notification
– Interfaces for registering interest and delivering notifications


OGSA Interfaces and Operations Defined to Date
GridService (required)
– FindServiceData
– Destroy
– SetTerminationTime
NotificationSource
– SubscribeToNotificationTopic
– UnsubscribeToNotificationTopic
NotificationSink
– DeliverNotification
Factory
– CreateService
PrimaryKey
– FindByPrimaryKey
– DestroyByPrimaryKey
Registry
– RegisterService
– UnregisterService
HandleMap
– FindByHandle
Authentication, reliability are binding properties
Manageability, concurrency, etc., to be defined


Service Data
A Grid service instance maintains a set of service data elements
– XML fragments encapsulated in standard <name, type, TTL-info> containers
– Includes basic introspection information, interface-specific data, and application data
FindServiceData operation (GridService interface) queries this information
– Extensible query language support
See also notification interfaces
– Allows notification of service existence and changes in service data
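A minimal Python sketch of the service data idea. The class and method names are assumptions chosen to mirror the slide's vocabulary, and the extensible query language is reduced to a simple lookup by element name.

```python
import time

class ServiceDataElement:
    """A <name, type, TTL-info> container for one fragment of service data."""
    def __init__(self, name, dtype, value, ttl_seconds):
        self.name, self.dtype, self.value = name, dtype, value
        self.expires_at = time.time() + ttl_seconds

    def fresh(self, now=None):
        # TTL-info: the element is only valid until its expiry time.
        return (now or time.time()) < self.expires_at

class GridServiceInstance:
    def __init__(self):
        self.service_data = {}

    def set_service_data(self, element):
        self.service_data[element.name] = element

    def find_service_data(self, name):
        """Toy FindServiceData: query by element name, honoring TTL."""
        elem = self.service_data.get(name)
        return elem.value if elem and elem.fresh() else None

# Interface-specific data for a hypothetical database service.
db = GridServiceInstance()
db.set_service_data(ServiceDataElement("dbType", "string", "relational", 3600))
db.set_service_data(ServiceDataElement("currentLoad", "float", 0.42, 5))
print(db.find_service_data("dbType"))
```

The TTL on each element is what lets cached copies of service data (e.g., in a registry) age out gracefully rather than go silently stale.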


Grid Service Example: Database Service
A DBaccess Grid service will support at least two portTypes
– GridService
– DBaccess
Each has service data
– GridService: basic introspection information, lifetime, …
– DBaccess: database type, query languages supported, current load, …
[Diagram: one service exposing GridService and DBaccess portTypes, with service data for name, lifetime, and DB info]


Lifetime Management
GS instances created by factory or manually; destroyed explicitly or via soft state
– Negotiation of initial lifetime with a factory (= service supporting Factory interface)
GridService interface supports
– Destroy operation for explicit destruction
– SetTerminationTime operation for keepalive
Soft state lifetime management avoids
– Explicit client teardown of complex state
– Resource “leaks” in hosting environments
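The soft-state pattern can be sketched in Python. The class and method names here are illustrative assumptions; only the behavior (explicit Destroy, keepalive via SetTerminationTime, and expiry when keepalives stop) follows the slide.

```python
class SoftStateService:
    """A service instance that lives only as long as clients keep it alive."""
    def __init__(self, name, created_at, initial_lifetime):
        self.name = name
        # Initial lifetime is negotiated with the factory at creation time.
        self.termination_time = created_at + initial_lifetime
        self.destroyed = False

    def set_termination_time(self, new_time):
        # Keepalive: a client extends (or shortens) the scheduled termination.
        self.termination_time = new_time

    def destroy(self):
        # Explicit destruction via the GridService interface.
        self.destroyed = True

    def alive(self, now):
        return not self.destroyed and now < self.termination_time

def sweep(services, now):
    """Hosting environment reclaims instances whose lifetime has lapsed."""
    return [s for s in services if s.alive(now)]

miner = SoftStateService("miner", created_at=0, initial_lifetime=10)
db = SoftStateService("db", created_at=0, initial_lifetime=1000)
miner.set_termination_time(30)   # keepalive extends the miner's lifetime
print([s.name for s in sweep([miner, db], now=20)])
```

Because reclamation is driven by the clock rather than by a teardown message, a crashed or disconnected client cannot leak resources: its instances simply expire.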


Factory
Factory interface’s CreateService operation creates a new Grid service instance
– Reliable creation (once-and-only-once)
CreateService operation can be extended to accept service-specific creation parameters
Returns a Grid Service Handle (GSH)
– A globally unique URL
– Uniquely identifies the instance for all time
– Based on name of a home handleMap service
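A hedged sketch of the factory/handleMap relationship. Class names, the handle URL scheme, and the sequence-number naming are assumptions for illustration; the essential points are that CreateService returns a handle, and the handle resolves through the home handleMap.

```python
import itertools

class HandleMap:
    """Resolves a Grid Service Handle (GSH) to the live instance."""
    def __init__(self, base_url):
        self.base_url = base_url   # handles are minted under this URL
        self._entries = {}

    def register(self, handle, instance):
        self._entries[handle] = instance

    def find_by_handle(self, handle):
        return self._entries.get(handle)

class Factory:
    """Implements a toy CreateService with service-specific parameters."""
    def __init__(self, handle_map, service_cls):
        self.handle_map = handle_map
        self.service_cls = service_cls
        self._seq = itertools.count(1)   # unique suffix per instance

    def create_service(self, **params):
        instance = self.service_cls(**params)
        handle = f"{self.handle_map.base_url}/instance-{next(self._seq)}"
        self.handle_map.register(handle, instance)
        return handle   # the GSH, a globally unique URL

class DBService:
    def __init__(self, lifetime=60):
        self.lifetime = lifetime

hm = HandleMap("http://example.org/handles")
factory = Factory(hm, DBService)
gsh = factory.create_service(lifetime=1000)
print(gsh, hm.find_by_handle(gsh).lifetime)
```

A real GSH outlives the instance's current location; clients would re-resolve it through the handleMap rather than cache the binding forever.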


Transient Database Services
[Diagram: a Registry answers “What database services exist?”; a DBaccessFactory answers “What services can you create?” and handles “Create a database service” requests; each resulting instance exposes GridService and DBaccess portTypes with name, lifetime, and DB info service data]


Example: Data Mining for Bioinformatics
[Diagram sequence: a User Application works with a Community Registry, a Compute Service Provider hosting a Mining Factory, and a Storage Service Provider hosting a Database Factory; Database Services front BioDB 1..n. The user’s goal: “I want to create a personal database containing data on e.coli metabolism.”
1. The user asks the Community Registry: “Find me a data mining service, and somewhere to store data.”
2. The Registry returns GSHs for the Mining and Database factories.
3. The user asks the Mining Factory to “create a data mining service with initial lifetime 10” and the Database Factory to “create a database with initial lifetime 1000.”
4. The factories instantiate a Miner and a Database.
5. The Miner queries the Database Services (BioDB 1..n) while the user sends keepalives to both new instances.
6. Results flow from the Miner into the new Database and back to the user.
7. When the mining task completes, keepalives to the Miner stop and it expires via soft state; the Database persists under continued keepalives.]
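The walk-through can be condensed into one hedged Python sketch. All names, handle strings, and numeric lifetimes are illustrative assumptions; the point is the flow of discovery, creation with negotiated lifetimes, and selective keepalive.

```python
class Instance:
    """A transient service instance with a soft-state expiry."""
    def __init__(self, kind, lifetime):
        self.kind, self.expires = kind, lifetime

    def keepalive(self, new_expiry):
        self.expires = new_expiry

class Factory:
    def __init__(self, kind):
        self.kind = kind

    def create_service(self, initial_lifetime):
        return Instance(self.kind, initial_lifetime)

class Registry:
    """Community registry mapping factory names to their handles."""
    def __init__(self):
        self.factories = {}

    def register(self, name, gsh):
        self.factories[name] = gsh

    def find(self, name):
        return self.factories[name]

registry = Registry()
factories = {"mining": Factory("miner"), "database": Factory("database")}
registry.register("mining", "gsh://compute-provider/mining-factory")
registry.register("database", "gsh://storage-provider/db-factory")

# Steps 1-4: discover the factories, then create instances with lifetimes.
miner = factories["mining"].create_service(initial_lifetime=10)
db = factories["database"].create_service(initial_lifetime=1000)

# Steps 5-7: the database is kept alive; the miner is allowed to expire.
db.keepalive(2000)
now = 100
survivors = [s.kind for s in (miner, db) if now < s.expires]
print(survivors)
```

Note how no one ever sends the miner a shutdown message; its disappearance is a side effect of the client simply losing interest.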


Notification Interfaces
NotificationSource for client subscription
– One or more notification generators
> Generates notification messages of a specific type
> Typed interest statements: e.g., filters, topics, …
> Supports messaging services, 3rd party filter services, …
– Soft state subscription to a generator
NotificationSink for asynchronous delivery of notification messages
A wide variety of uses are possible
– E.g. dynamic discovery/registry services, monitoring, application error notification, …
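The source/sink split can be sketched in Python. Method names echo the slide's operation names (SubscribeToNotificationTopic, DeliverNotification) in Python casing; the synchronous publish loop is a stand-in for genuinely asynchronous delivery, and topic strings stand in for typed interest statements.

```python
class NotificationSource:
    """Delivers typed notification messages to subscribed sinks."""
    def __init__(self):
        self.subscriptions = {}   # topic -> list of subscribed sinks

    def subscribe_to_notification_topic(self, topic, sink):
        self.subscriptions.setdefault(topic, []).append(sink)

    def unsubscribe_to_notification_topic(self, topic, sink):
        self.subscriptions.get(topic, []).remove(sink)

    def publish(self, topic, message):
        # Only sinks whose interest statement (topic) matches are notified.
        for sink in self.subscriptions.get(topic, []):
            sink.deliver_notification(topic, message)

class NotificationSink:
    """Receives asynchronous notification messages."""
    def __init__(self):
        self.received = []

    def deliver_notification(self, topic, message):
        self.received.append((topic, message))

source = NotificationSource()
sink = NotificationSink()
source.subscribe_to_notification_topic("membrane-proteins", sink)
source.publish("membrane-proteins", "new data available")
source.publish("metabolism", "not delivered to this sink")
print(sink.received)
```

In the real design the subscription itself is soft state, so a sink that stops refreshing its interest is eventually dropped by the generator.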


Notification Example
Notifications can be associated with any (authorized) service data elements
[Diagram sequence: a NotificationSink subscribes to a DBaccess service’s NotificationSource (“Notify me of new data about membrane proteins”); the subscription is soft state, refreshed by keepalives; when matching service data changes, the source delivers a “New data” notification to the sink]


Open Grid Services Architecture: Summary
Service orientation to virtualize resources
– Everything is a service
From Web services
– Standard interface definition mechanisms: multiple protocol bindings, local/remote transparency
From Grids
– Service semantics, reliability and security models
– Lifecycle management, discovery, other services
Multiple “hosting environments”
– C, J2EE, .NET, …


Recap: The Grid Service
[Diagram, repeated from earlier: the GridService interface plus other interfaces (service data access, explicit destruction, soft-state lifetime, notification, authorization, service creation, service registry, manageability, concurrency), reliable invocation and authentication as binding properties, service data elements, and an implementation in a hosting environment/runtime (“C”, J2EE, .NET, …)]


OGSA and the Globus Toolkit
Technically, OGSA enables
– Refactoring of protocols (GRAM, MDS-2, etc.) while preserving all GT concepts/features!
– Integration with hosting environments: simplifying components, distribution, etc.
– Greatly expanded standard service set
Pragmatically, we are proceeding as follows
– Develop open source OGSA implementation
> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
– Partnerships for service development
– Also expect commercial value-adds



GT3: An Open Source OGSA-Compliant Globus Toolkit

GT3 Core
– Implements Grid service interfaces & behaviors
– Reference implementation of evolving standard
– Java first, C soon, C#?

GT3 Base Services

– Evolution of current Globus Toolkit capabilities
– Backward compatible

Many other Grid services

[Figure: layered view of GT3: Other Grid Services and GT3 Data Services build on GT3 Base Services, which build on GT3 Core.]



Hmm, Isn’t This Just Another Object Model?

Well, yes, in a sense
– Strong encapsulation
– We (can) profit greatly from experiences of previous object-based systems

But
– Focus on encapsulation, not inheritance

– Does not require OO implementations

– Value lies in specific behaviors: lifetime, notification, authorization, …, …

– Document-centric not type-centric
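The "document-centric not type-centric" distinction can be illustrated with a sketch in which service state is published and queried as named XML fragments rather than through per-type getter methods. The class and element names below are hypothetical:

```python
import xml.etree.ElementTree as ET

class ServiceData:
    """State exposed as named XML documents, queried uniformly by name,
    rather than as typed getters on an object."""

    def __init__(self) -> None:
        self.elements = {}

    def publish(self, name: str, xml_text: str) -> None:
        """The service publishes a document describing part of its state."""
        self.elements[name] = ET.fromstring(xml_text)

    def find(self, name: str):
        """Any client can query any service's state the same way; there
        is no per-type interface to compile against."""
        return self.elements.get(name)
```

A client that understands the query operation can inspect services whose document schemas it has never seen, which is the interoperability payoff of the document-centric style.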



Grids and OGSA: Research Challenges

Grids pose profound problems, e.g.
– Management of virtual organizations

– Delivery of multiple qualities of service

– Autonomic management of infrastructure

– Software and system evolution

OGSA provides a foundation for tackling these problems in a rigorous fashion?
– Structured establishment/maintenance of global properties
– Reasoning about total system properties



Summary

OGSA represents refactoring of current Globus Toolkit protocols and integration with Web services technologies

Several desirable features
– Significant evolution of functionality

– Uniform IDL facilitates code sharing

– Allows for alignment of potentially divergent directions (e.g., info service, service registry, monitoring)



Evolution

This is not happening all at once
– We have an early prototype of Core (alpha release May?)

– Next we will work on Base, others

– Full release by end of 2002??

– Establishing partnerships for other services

Backward compatibility

– API level seems straightforward

– Protocol level: gateways?

– We need input on best strategies



For More Information

OGSA architecture and overview
– “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, at www.globus.org/ogsa

Grid service specification
– At www.globus.org/ogsa

Open Grid Services Infrastructure WG, GGF
– www.gridforum.org/ogsi (?), soon

Globus Toolkit OGSA prototype
– www.globus.org/ogsa



Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions



GGF Objectives

An open process for development of standards
– Grid “Recommendations” process modeled after Internet Standards Process (IETF)

A forum for information exchange
– Experiences, patterns, structures

A regular gathering to encourage shared effort

– In code development: libraries, tools…

– Via resource sharing: shared Grids

– In infrastructure: consensus standards



GGF Groups

Working Groups
– Tightly focused on development of a spec or set of related specs
> Protocol, API, etc.
– Finite set of objectives and schedule of milestones

Research Groups
– More exploratory than Working Groups

– Focused on understanding requirements, taxonomies, models, methods for solving a particular set of related problems

– May be open-ended but with a definite set of objectives and milestones to drive progress

Groups are approved and evaluated by a GGF Steering Group (GFSG) based on written charters. Among the criteria for group formation:
– Is this work better done (or already being done) elsewhere, e.g., IETF, W3C?
– Are the leaders involved and/or in touch with relevant efforts elsewhere?


Current GGF Groups (Out-of-date List, Sorry…)

Working and Research Groups by area:
– Grid Information Services: Grid Object Specification; Grid Notification Framework; Metacomputing Directory Services; Relational Database Information Services
– Scheduling and Resource Management: Advanced Reservation; Scheduling Dictionary; Scheduler Attributes
– Security: Grid Security Infrastructure; Grid Certificate Policy
– Performance: Grid Performance Monitoring Architecture
– Architectures: JINI; NPI Architecture; Grid Protocol Architecture; Accounting Models
– Data: GridFTP; Data Replication
– Applications, Programming Models, and User Environments: Applications; Grid User Services; Grid Computing Env.; Adv Programming Models; Adv Collaboration Env


Proposed GGF Groups (Again, Out of Date…)

Proposed Working and Research Groups by area:
– Scheduling and Resource Management: Scheduling Command Line API; Distributed Resource Mgmt Applic API; Grid Resource Management Protocol; Scheduling Optimization
– Performance: Network Monitoring/Measurement; Sensor Management; Grid Event Service
– Architectures: Open Grid Services Architecture; Grid Economies
– Data: Archiving Command Line API; Persistent Archives; DataGrid Schema; Application Metadata; Network Storage
– Area TBD: Open Source Software Licensing; Cluster Standardization; High-Performance Networks for Grids



Getting Involved

Participate in a GGF Meeting
– 3x/year; last one had 500 people
– July 21-24, 2002 in Edinburgh (with HPDC)
– October 15-17, 2002 in Chicago

Join a working group or research group

– Electronic participation via mailing lists (see www.gridforum.org)



Grid Events

Global Grid Forum: working meeting
– Meets 3 times/year, alternates U.S.-Europe, with July meeting as major event

HPDC: major academic conference
– HPDC’11 in Scotland with GGF’5, July 2002

Other meetings with Grid content include

– SC’XY, CCGrid, Globus Retreat

www.gridforum.org, www.hpdc.org



Outline

The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions



Summary

The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
– Real application communities emerging

– Significant infrastructure deployments

– Substantial open source/architecture technology base: Globus Toolkit

– Pathway defined to industrial adoption, via open source + OGSA

– Rich set of intellectual challenges


Major Application Communities are Emerging

Intellectual buy-in, commitment

– Earthquake engineering: NEESgrid

– Exp. physics, etc.: GriPhyN, PPDG, EU Data Grid

– Simulation: Earth System Grid, Astrophysical Sim. Collaboratory

– Collaboration: Access Grid

Emerging, e.g.

– Bioinformatics Grids

– National Virtual Observatory


Major Infrastructure Deployments are Underway

For example:

– NSF “National Technology Grid”

– NASA “Information Power Grid”

– DOE ASCI DISCOM Grid

– DOE Science Grid

– EU DataGrid

– iVDGL

– NSF Distributed Terascale Facility (“TeraGrid”)

– DOD MOD Grid



A Rich Technology Base has been Constructed

6+ years of R&D have produced a substantial code base based on open architecture principles: esp. the Globus Toolkit, including
– Grid Security Infrastructure

– Resource directory and discovery services

– Secure remote resource access

– Data Grid protocols, services, and tools

Essentially all projects have adopted this as a common suite of protocols & services

Enabling a wide range of higher-level services
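The directory and discovery services in the list above rely on soft-state registration. A toy sketch of the idea (not the actual MDS protocol or API; all names invented) looks like this:

```python
class ResourceDirectory:
    """Toy soft-state directory in the spirit of MDS: resources register
    with a time-to-live and must re-register to remain discoverable."""

    def __init__(self) -> None:
        self.entries = {}  # name -> (attributes, expires_at)

    def register(self, name: str, attributes: dict,
                 ttl: float, now: float) -> None:
        """(Re-)registration refreshes the entry's expiry time."""
        self.entries[name] = (attributes, now + ttl)

    def discover(self, predicate, now: float):
        """Return names of unexpired resources matching the predicate;
        entries whose owners stop re-registering simply age out."""
        return [name for name, (attrs, expires) in self.entries.items()
                if expires > now and predicate(attrs)]
```

Because stale entries expire on their own, the directory stays roughly consistent with reality even when resources crash without deregistering, which is why soft state suits wide-area, multi-institutional deployments.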



Pathway Defined to Industrial Adoption

Industry need
– eScience applications, service provider models, need to integrate internal infrastructures, collaborative computing in general

Technical capability
– Maturing open source technology base

– Open Grid Services Architecture enables integration with industry standards

Result likely to be exponential industrial uptake



Rich Set of Intellectual Challenges

Transforming the Internet into a robust, usable computational platform

Delivering (multi-dimensional) qualities of service within large systems

Community dynamics and collaboration modalities

Program development methodologies and tools for Internet-scale applications

Etc., etc., etc.



New Programs

U.K. eScience program

EU 6th Framework
U.S. Committee on Cyberinfrastructure
Japanese Grid initiative


U.S. Cyberinfrastructure: Draft Recommendations

New INITIATIVE to revolutionize science and engineering research at NSF and worldwide, to capitalize on new computing and communications opportunities

21st Century Cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources
– Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE
– Budget estimate: incremental $650 M/year (continuing)

An INITIATIVE OFFICE with a highly placed, credible leader, empowered to
– Initiate competitive, discipline-driven, path-breaking applications of cyberinfrastructure within NSF which contribute to the shared goals of the INITIATIVE

– Coordinate policy and allocations across fields and projects, with participants across NSF directorates, Federal agencies, and international e-science

– Develop high quality middleware and other software that is essential and special to scientific research

– Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide.



For More Information

The Globus Project™
– www.globus.org

Grid concepts, projects
– www.mcs.anl.gov/~foster

Open Grid Services Architecture
– www.globus.org/ogsa

Global Grid Forum
– www.gridforum.org

GriPhyN project
– www.griphyn.org