The Grid: Essential Infrastructure for DOE Science Ian Foster Argonne National Laboratory University of Chicago Globus Alliance

Abstract: The term "Grid" tends to generate enthusiasm, skepticism, or perplexity. Enthusiasts speak of the potential for integrating services and resources

Grid—Are You:

An enthusiast?
◊ “The best thing since the FFT”

A skeptic?
◊ “An overhyped funding concept”

Or perplexed?
◊ “Should be refined around shocks”
◊ “Makes high-end computing obsolete”
◊ “What NSF does; not relevant to DOE”

Science Today is a Team Sport

Particularly within DOE

Map legend: SC Laboratories; SC User Facilities; Institutions that Use SC Facilities
Facility types: Physics Accelerators; Synchrotron Light Sources; Neutron Sources; Special Purpose Facilities; Large Fusion Experiments

Lawrence Berkeley National Lab
• Advanced Light Source
• National Center for Electron Microscopy
• National Energy Research Scientific Computing Facility

Los Alamos Neutron Science Center

Univ. of IL
• Electron Microscopy Center for Materials Research
• Center for Microanalysis of Materials

MIT
• Bates Accelerator Center
• Plasma Science & Fusion Center

Fermi National Accelerator Lab
• Tevatron

Stanford Linear Accelerator Center
• B-Factory
• Stanford Synchrotron Radiation Laboratory

Princeton Plasma Physics Lab

General Atomics
• DIII-D Tokamak

Pacific Northwest National Lab
• Environmental Molecular Sciences Lab

Argonne National Lab
• Intense Pulsed Neutron Source
• Advanced Photon Source
• Argonne Tandem Linac Accelerator System

Brookhaven National Lab
• Relativistic Heavy Ion Collider
• National Synchrotron Light Source

Oak Ridge National Lab
• High-Flux Isotope Reactor
• Surface Modification & Characterization Center
• Spallation Neutron Source (under construction)

Thomas Jefferson National Accelerator Facility
• Continuous Electron Beam Accelerator Facility

Sandia Combustion Research Facility

James R. MacDonald Laboratory

Challenges for 21st Century DOE Science

Scientific excellence in the context of:
◊ Unique facilities
◊ Complex problems
◊ Enormous data
◊ Distributed teams
◊ Multidisciplinary research
◊ International reach

We Must be able to Assemble Required Expertise & Resources When Needed!

Transform DOE resources into on-demand services accessible to any individual or team

A Unifying Concept: The Grid

“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”

1. Enable integration of distributed resources

2. Using general-purpose protocols & infrastructure

3. To achieve better-than-best-effort service

The Dubious Power Grid Analogy

Must we travel to the power source?

Or can we ship power to where we want to work?

Enable on-demand access to, and integration of, diverse resources & services, regardless of location

Example Grid Capabilities

Engage via telepresence in an experiment at a remote facility

Integrate data from multiple sources in support of global change research

Harness computers across sites to process data from a physics experiment

Discover & access a genome analysis service (running on high-end computer)

Grid as Mechanism, Infrastructure, & Community

Resources
◊ Computing, storage, data

Communities
◊ Operational procedures, …

Services
◊ Authentication, discovery, …

Connectivity
◊ Reduce tyranny of distance

Technologies
◊ Build services & applications

Building a DOE Grid: What Must be Done

Create, deploy, & operate infrastructure
◊ Local & global services for security, access, discovery, data mgmt, data mediation, etc.

Establish policies
◊ Reconcile diverse local, global, & community security, accounting, auditing, etc., policies

Develop applications
◊ Expand DOE Grid to new communities and increase its utility for existing communities

Expand research
◊ Produce the technical advances needed for the Grid of 2010

DOE Collaboratories and Network Programs Have Made Much Progress

Basic machinery, e.g.
◊ PKI & CA/RAs used by collaboratories
◊ GridFTP data transfer, GRAM job mgmt

Higher-level tools, e.g.
◊ Storage management & data movement
◊ Workflow & computation management
◊ Access Grid collaboration

Application successes
◊ Close coupling of CS/math & application teams
◊ PPDG Grid2003, FusionGrid, Earth System Grid, Chemical Sciences Collaboratory

Grid Technologies for Resource Integration & Management

Figure labels:
Grid Applications
User-level Middleware and Tools: DB Access, PDB portal, App Scheduler, PSE, portal
System-level Common Infrastructure
Grid Resources
Side labels: integration, interoperability

Grid Services Architecture

Agree on interfaces, services
◊ Common infrastructure services act like a “grid OS”
◊ Users interact with the Grid through higher-level, user-friendly middleware layer

Figure layers:
Grid Applications
User-focused middleware & tools (commercial opportunities): DB Federation, PDB portal, App Scheduler, PSE, Chem portal
Common infrastructure services (many open source): authentication, information, resource access, resource mgmt, negotiation, scheduling, monitoring, data transfer, etc.
Grid Resources
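The layering on this slide can be illustrated with a small sketch. It is not an actual Grid or Globus API: every name in it (`InfrastructureService`, `InMemoryInfrastructure`, `ChemPortal`, `run_analysis`) is hypothetical and chosen only to show a user-facing portal calling down into a shared infrastructure interface.

```python
from typing import Protocol


class InfrastructureService(Protocol):
    """Hypothetical interface for the common infrastructure ("grid OS") layer."""

    def authenticate(self, user: str) -> bool: ...
    def submit(self, user: str, job: str) -> str: ...


class InMemoryInfrastructure:
    """Toy stand-in for real services; nothing here talks to a network."""

    def __init__(self, members: set[str]) -> None:
        self.members = members
        self.jobs: list[str] = []

    def authenticate(self, user: str) -> bool:
        # A real deployment would check credentials; this only checks membership.
        return user in self.members

    def submit(self, user: str, job: str) -> str:
        self.jobs.append(job)
        return f"job-{len(self.jobs)}"


class ChemPortal:
    """Hypothetical user-focused middleware hiding the infrastructure calls."""

    def __init__(self, infra: InfrastructureService) -> None:
        self.infra = infra

    def run_analysis(self, user: str, molecule: str) -> str:
        if not self.infra.authenticate(user):
            raise PermissionError(f"{user} is not a member of this virtual organization")
        return self.infra.submit(user, f"analyze {molecule}")


portal = ChemPortal(InMemoryInfrastructure({"alice"}))
job_id = portal.run_analysis("alice", "H2O")
print(job_id)
```

Because the portal depends only on the interface, a different infrastructure implementation could be swapped in without changing the user-facing layer, which is the point of agreeing on interfaces and services.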

Reliable Data Replication between BNL & LBNL for STAR Experiment

5 TB/week quasi-automated transfer
SRM and GridFTP
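To put the 5 TB/week figure in network terms, the short calculation below converts a weekly volume into an average sustained rate. It assumes decimal units (1 TB = 10^12 bytes) and a transfer spread evenly over the week; neither assumption is stated on the slide, and real transfers would be burstier. Under those assumptions the average comes to roughly 8 MB/s, or about 66 Mbit/s.

```python
# Average sustained throughput implied by a weekly transfer volume.
# Assumption (not stated on the slide): decimal units, 1 TB = 10**12 bytes.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def average_rate(terabytes_per_week: float) -> tuple[float, float]:
    """Return (megabytes per second, megabits per second)."""
    bytes_per_second = terabytes_per_week * 10**12 / SECONDS_PER_WEEK
    return bytes_per_second / 10**6, bytes_per_second * 8 / 10**6


mb_per_s, mbit_per_s = average_rate(5)
print(f"{mb_per_s:.1f} MB/s, {mbit_per_s:.0f} Mbit/s")
```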

Grid2003: An Operational Grid

28 sites (2100-2800 CPUs) & growing
400-1300 concurrent jobs
7 substantial applications + CS experiments
Running since October 2003

Korea

http://www.ivdgl.org/grid2003

Grid2003 Components

Computers & storage at 28 sites (to date)
◊ 2800+ CPUs

Uniform service environment at each site
◊ Globus Toolkit provides basic authentication, execution management, data movement
◊ Pacman installation system enables installation of numerous other VDT and application services

Global & virtual organization services
◊ Certification & registration authorities, VO membership services, monitoring services

Client-side tools for data access & analysis
◊ Virtual data, execution planning, DAG management, execution management, monitoring

IGOC: iVDGL Grid Operations Center

Grid2003 Metrics

Metric                                     Target     Achieved
Number of CPUs                             400        2762 (28 sites)
Number of users                            > 10       102 (16)
Number of applications                     > 4        10 (+CS)
Number of sites running concurrent apps    > 10       17
Peak number of concurrent jobs             1000       1100
Data transfer per day                      > 2-3 TB   4.4 TB max
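One way to read the table is as a ratio of achieved value to target. The sketch below computes those ratios from the numbers on the slide. Two simplifications are assumptions made here, not part of the original: "greater than" targets are treated as plain thresholds, and the "> 2-3 TB" target is taken at its upper bound of 3 TB. On that reading every metric exceeded its target, from about 1.1x for peak concurrent jobs to about 10x for the number of users.

```python
# Grid2003 targets vs. achieved values, copied from the metrics table.
# For "> 2-3 TB" per day the upper bound (3 TB) is used as the target,
# an assumption made here to keep the comparison conservative.
metrics = [
    ("Number of CPUs", 400, 2762),
    ("Number of users", 10, 102),
    ("Number of applications", 4, 10),
    ("Sites running concurrent apps", 10, 17),
    ("Peak concurrent jobs", 1000, 1100),
    ("Data transfer per day (TB)", 3, 4.4),
]

# Ratio of achieved value to target for each metric.
ratios = {name: achieved / target for name, target, achieved in metrics}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x target")
```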

Grid2003 Applications To Date

CMS proton-proton collision simulation
ATLAS proton-proton collision simulation
LIGO gravitational wave search
SDSS galaxy cluster detection
ATLAS interactive analysis
BTeV proton-antiproton collision simulation
SnB biomolecular analysis
GADU/Gnare genome analysis
Various computer science experiments

www.ivdgl.org/grid2003/applications

Earth System Grid

Goal: address technical obstacles to the sharing & analysis of high-volume data from advanced earth system models

Under the Covers of ESG

FusionGrid

Remote control room demo @ SC’03
Screen labels: Shared Application; Real Time Data Display; Video & Audio; Between Pulse Data; Shot Cycle Status

TransP production service: 1662 runs in FY03

Building a DOE Grid: Critical Next Steps

Institutionalize Grid infrastructure
◊ Broad deployment & support at sites
◊ Software as infrastructure
◊ Legitimate (& challenging) security concerns

Expand range of resource sharing modalities
◊ Research aimed at federating not just data & computers, but workflow and semantics
◊ Scale data size, community sizes, etc., etc.

Reach new application domains
◊ Sustain current collaboratory pilots, and start new ones of similar or greater ambition

Grid Complements DOE’s High-End Computing Program

We need specialized supercomputers that tightly integrate computing & storage
◊ Bandwidth is still a scarce commodity
◊ Many algorithms are latency intolerant

And, in addition:
◊ Economies of scale are possible (sometimes)

Grid can allow for far more effective HPC
◊ Focus high-end systems on high-end apps without compromising service to others
◊ Link high-end apps into larger workflows

And also support other collaborative scenarios

Grid Services: secure and uniform access and management for distributed resources

Science Portals: collaboration and problem solving

Web Services and Application building services

Supercomputing and Large-Scale Storage

ESnet: High Speed Networking

Spallation Neutron Source

High Energy Physics

Advanced Photon Source

Macromolecular Crystallography

Computing and Storage of Scientific Groups

Supernova Observatory

Advanced Chemistry

Magnetic Fusion

Europe

Asia-Pacific

Credit: W. Johnston

Grid is Not a DOE vs. NSF Issue

1) DOE science needs a DOE Grid
◊ We can’t just borrow our neighbor’s Grid when we want to collaborate or compute

2) DOE has unique expertise
◊ Large fraction of extant human capital resides at DOE labs

3) Not clear that NSF is going to produce required advances
◊ BRP report argues $1B/yr needed to meet national needs

(Recap) Grid—Are You:

An enthusiast?
◊ “The best thing since the FFT”

A skeptic?
◊ “An overhyped funding concept”

Or perplexed?
◊ “Should be refined around shocks”
◊ “Makes high-end computing obsolete”
◊ “What NSF does; not relevant to DOE”

For More Information

DOE Science Grid
◊ www.doesciencegrid.org

Global Grid Forum
◊ www.ggf.org

The Globus Alliance®
◊ www.globus.org

Background information
◊ www.mcs.anl.gov/~foster

2nd Edition: Just Out