Upload
gina
View
39
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Grid Computing & Web Services: A Natural Partnership. Dave Angulo Department of Computer Science The University of Chicago and Mathematics and Computer Science Division Argonne National Laboratory. Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and - PowerPoint PPT Presentation
Citation preview
Grid Computing & Web Services:A Natural Partnership
Ian FosterMathematics and Computer Science Division
Argonne National Laboratoryand
Department of Computer ScienceThe University of Chicago
Address of Poznan Supercomputing & Networking Center Poznan, Poland February 7, 2002
Dave AnguloDepartment of Computer Science
The University of Chicagoand
Mathematics and Computer Science Division
Argonne National Laboratory
[email protected] University of Chicago
Partial Acknowledgements Open Grid Services Architecture work is performed
by– Ian Foster, Globus Co-PI @ Argonne/UofC– Carl Kesselman, Globus Co-PI @ USC/ISI– Steve Tuecke, Globus Toolkit Architect @ANL– Jeff Nick, Steve Graham, Jeff Frey @ IBM
Globus Toolkit R&D involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
Strong collaborations with many outstanding EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft
[email protected] University of Chicago
Partial Acknowledgements
Globus ToolkitTM
– R&D involves> many fine scientists & engineers at ANL/UofC, USC/ISI, and
elsewhere (see www.globus.org)– Led by
> Ian Foster @ Argonne/UofC> Carl Kesselman @ USC/ISI
Open Grid Services Architecture work performed by– Ian Foster, Globus Co-PI @ Argonne/UofC– Carl Kesselman, Globus Co-PI @ USC/ISI– Steve Tuecke, Globus Toolkit Architect @ANL– Jeff Nick, Steve Graham, Jeff Frey @ IBM
Strong collaborations with many outstanding EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft, IBM
[email protected] University of Chicago
Grid Computing
[email protected] University of Chicago
The Grid Problem Resource sharing & coordinated problem
solving in dynamic, multi-institutional virtual organizations
[email protected] University of Chicago
Why Grids? A biochemist exploits 10,000 computers to screen
100,000 compounds in an hour 1,000 physicists worldwide pool resources for
petaflop analyses of petabytes of data Civil engineers collaborate to design, execute, &
analyze shake table experiments Climate scientists visualize, annotate, & analyze
terabyte simulation datasets A home user invokes architectural design functions
at an application service provider– An application service provider purchases cycles
from compute cycle providers
[email protected] University of Chicago
Elements of the Problem Resource sharing
– Computers, storage, sensors, networks, …– Sharing always conditional: issues of trust, policy,
payment, … Coordinated problem solving
– Beyond client-server: distributed data analysis, computation, …
Dynamic, multi-institutional virtual orgs– Community overlays on classic org structures– Large or small, static or dynamic
[email protected] University of Chicago
Grids: Why Now? Moore’s law improvements in computing
produce highly functional end systems The Internet and burgeoning wired and
wireless provide universal connectivity Network exponentials produce dramatic
changes in geometry and geography
[email protected] University of Chicago
Grids: Why Now? Moore’s law improvements in computing
produce highly functional endsystems The Internet and burgeoning wired and
wireless provide universal connectivity Network exponentials produce dramatic
changes in geometry and geography– 9-month doubling: double Moore’s law!– 1986-2001: x340,000; 2001-2010: x4000?
[email protected] University of Chicago
The Grid World: Current Status Dozens of major Grid projects in scientific &
technical computing/research & education– Deployment, application, technology
Considerable consensus on key concepts and technologies– Globus Toolkit™ has emerged as de facto
standard for major protocols & services Global Grid Forum has emerged as a significant
force– And first “Grid” proposals at IETF
[email protected] University of Chicago
Selected Major Grid ProjectsName URL &
SponsorsFocus
Access Grid www.mcs.anl.gov/FL/accessgrid; DOE, NSF
Create & deploy group collaboration systems using commodity technologies
BlueGrid IBM Grid testbed linking IBM laboratoriesDISCOM www.cs.sandia.gov/
discomDOE Defense Programs
Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid
sciencegrid.orgDOE Office of Science
Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG)
earthsystemgrid.orgDOE Office of Science
Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid
eu-datagrid.orgEuropean Union
Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics
ggg
g
g
g
New
New
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/
SponsorFocus
EuroGrid, Grid Interoperability (GRIP)
eurogrid.orgEuropean Union
Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory
fusiongrid.orgDOE Off. Science
Create a national computational collaboratory for fusion research
Globus Project globus.orgDARPA, DOE, NSF, NASA, Msoft
Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab gridlab.orgEuropean Union
Grid technologies and applications
GridPP gridpp.ac.ukU.K. eScience
Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center
grids-center.orgNSF
Integration, deployment, support of the NSF Middleware Infrastructure for research & education
g
g
g
g
g
g
New
New
New
New
New
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/Sponsor Focus
Grid Application Dev. Software
hipersoft.rice.edu/grads; NSF
Research into program development technologies for Grid applications
Grid Physics Network
griphyn.orgNSF
Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid
ipg.nasa.govNASA
Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory
ivdgl.orgNSF
Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid
neesgrid.orgNSF
Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid
ppdg.netDOE Science
Create and apply production Grids for data analysis in high energy and nuclear physics experiments
g
g
g
g
gNew
New
g
[email protected] University of Chicago
Selected Major Grid ProjectsName URL/Sponsor Focus
TeraGrid teragrid.orgNSF
U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid grid-support.ac.ukU.K. eScience
Support center for Grid projects within the U.K.
Unicore BMBFT Technologies for remote access to supercomputers
g
gNew
New
Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS
See also www.gridforum.org
[email protected] University of Chicago
Grid Communities & Applications:Data Grids for High Energy Physics
Tier2 Centre ~1 TIPS
Online System
Offline Processor Farm ~20 TIPS
CERN Computer Centre
FermiLab ~4 TIPSFrance Regional Centre
Italy Regional Centre
Germany Regional Centre
InstituteInstituteInstituteInstitute ~0.25TIPS
Physicist workstations
~100 MBytes/sec
~100 MBytes/sec
~622 Mbits/sec
~1 MBytes/sec
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second
Each triggered event is ~1 MByte in size
Physicists work on analysis “channels”.
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
Physics data cache
~PBytes/sec
~622 Mbits/sec or Air Freight (deprecated)
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Tier2 Centre ~1 TIPS
Caltech ~1 TIPS
~622 Mbits/sec
Tier 0Tier 0
Tier 1Tier 1
Tier 2Tier 2
Tier 4Tier 4
1 TIPS is approximately 25,000
SpecInt95 equivalents
www.griphyn.org www.ppdg.net www.eu-datagrid.org
[email protected] University of Chicago
Grid Communities and Applications:Mathematicians Solve NUG30
Community=an informal collaboration of mathematicians and computer scientists
Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
Solves NUG30 quadratic assignment problem
14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
www.mcs.anl.gov/metaneos: Argonne, Iowa, NWU, Wisconsin
[email protected] University of Chicago
Grid Communities and Applications:Network for Earthquake Eng. Simulation
NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org
[email protected] University of Chicago
The 13.6 TF TeraGrid:Computing at 40 Gb/s
26
24
8
4 HPSS
5
HPSS
HPSS UniTree
External Networks
External NetworksExternal
Networks
External Networks
Site Resources Site Resources
Site ResourcesSite ResourcesNCSA/PACI8 TF240 TB
SDSC4.1 TF225 TB
Caltech Argonne
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org
[email protected] University of Chicago
Intl. Virtual Data Grid Lab.
Tier0/1 facilityTier2 facility
10+ Gbps link
2.5 Gbps link
622 Mbps link
Other link
Tier3 facility
www.ivdgl.org
[email protected] University of Chicago
Access Grid Collaborative work
among large groups ~50 sites worldwide Use Grid services for
discovery, security www.scglobal.org
Ambient mic(tabletop)
Presentermic
Presentercamera
Audience camera
Access Grid: Argonne, others www.accessgrid.org
[email protected] University of Chicago
Grid Architecture & Globus Toolkit™ The question:
– What is needed for resource sharing & coordinated problem solving in dynamic virtual organizations (VOs)?
The answer:– Major issues identified: membership, resource
discovery & access, …, …– Grid architecture captures core elements,
emphasizing pre-eminent role of protocols– Globus Toolkit™ has emerged as de facto standard
for major protocols & services
[email protected] University of Chicago
The Critical Role of Protocols Need for interoperability when different groups
want to share resources– E.g., IP lets me talk to your computer, but how do we
establish & maintain sharing?– How do I discover, authenticate, authorize, describe
what I want to do, etc., etc.? Need for shared infrastructure services to avoid
repeated development, installation, e.g.– One port/service for remote access to computing, not
one per tool/application– X.509 enables sharing of Certificate Authorities
[email protected] University of Chicago
Grid Architecture
Application
Fabric“Controlling things locally”: Access to, & control of, resources
Connectivity“Talking to things”: communication (Internet protocols) & security
Resource“Sharing single resources”: negotiating access, controlling use
Collective“Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
InternetTransport
Application
Link
Internet Protocol Architecture
For more info: www.globus.org/research/papers/anatomy.pdf
[email protected] University of Chicago
Globus Project and Toolkit Globus Project™
– R&D project at ANL, U.Chicago, USC/ISI– Emphasis on identifying and defining core
protocols and services– O(40) researchers & developers
Globus Toolkit™– A major product of the Globus Project– Open source software: reference implementation
of core protocols & services– Growing open source developer community
[email protected] University of Chicago
Globus Toolkit: Evaluation (1) Good technical solutions for key problems, e.g.
– Authentication and authorization– Resource discovery and monitoring– Reliable remote service invocation– High-performance remote data access
This + good engineering is enabling progress– Good quality reference implementation, multi-
language support, interfaces to many systems, large user base, industrial support
– Growing community code base built on tools
[email protected] University of Chicago
Globus Toolkit: Evaluation (2) Protocol deficiencies, e.g.
– Heterogeneous basis: HTTP, LDAP, FTP– No standard means of error propagation
Significant missing functionality, e.g.– Databases, sensors, instruments– Programming tools: workflow, …– Virtualization of end systems (hosting envs.)
Little work on total system properties, e.g. – Dependability, end-to-end QoS, …– Reasoning about system properties
[email protected] University of Chicago
“Web Services” Increasingly popular standards-based framework for
accessing network applications– W3C standardization; Microsoft, IBM, Sun, others
WSDL: Web Services Description Language– Interface Definition Language for Web services
SOAP: Simple Object Access Protocol– XML-based RPC protocol; common WSDL target
WS-Inspection (WSIL)– Conventions for locating service descriptions
UDDI: Universal Desc., Discovery, & Integration – Directory for Web services
[email protected] University of Chicago
Transient Service Instances “Web services” address discovery & invocation of
persistent services In Grids, must also support transient service
instances, created/destroyed dynamically– E.g., to manage eBusiness workflow, video
conference, or distributed data analysis Significant implications for how services are
managed, named, discovered, and used– In fact, much of our work is concerned with the
management of service instances
[email protected] University of Chicago
Open Grid Services Architecture Service orientation to virtualize resources From Web services:
– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
Building on Globus Toolkit:– The Grid service defines standard semantics for service
interactions– Factory, registry, and mapper services– Reliable and secure transport
Multiple hosting targets: J2EE, .NET, “C”, etc.
[email protected] University of Chicago
OGSA Service Model System comprises (a typically few) persistent services &
(potentially many) transient services All services adhere to specified Grid service interfaces
and behaviors– Reliable invocation, lifetime management, discovery,
authorization, notification, upgradeability, concurrency, manageability
Interfaces for managing Grid service instances– Factory, registry, mapper
Heavily leverage Globus Toolkit technology=> Reliable secure mgmt of distributed state
[email protected] University of Chicago
The Grid Service A (potentially transient) Web service with specified
interfaces & behaviors, including– Creation (Factory)– Global naming (GSH) & references (GSR)– Lifetime management– Registration & Discovery– Authorization– Notification– Concurrency– Manageability
[email protected] University of Chicago
Factory A Grid service with Factory interface can be
requested to create a new Grid service instance– Reliable creation (once-and-only-once)
Create operation can be extended to accept Grid-service-specific creation parameters
Returns a Grid Service Handle (GSH)– A globally unique URL– Uniquely identifies the instance for all time– Based on name of a home mapper service
[email protected] University of Chicago
Mapper A GSH is a stable name for a Grid service, but does not
allow client to actually communicate with the Grid service
A Grid Service Reference (GSR) is a WSDL document that describes how to communicate with the Grid service– Contains protocol binding, network address, …– May expire (I.e. GSR information may change)
The Mapper interface allows a client to map from a GSH to a GSR– http get on GSH also returns a GSR
[email protected] University of Chicago
Lifetime Management GS instances created by factory or manually; destroyed
explicitly or via soft state– Negotiation of initial lifetime with Factory
SoftStateDestruction interface supports– GetTerminationTime message for inquiry
>Notification interface also allows for lifetime notification– SetTerminationTime message for keepalive
Soft state lifetime management avoids– Explicit client teardown of complex state– Resource “leaks” in hosting environments
ExplicitDestruction interface also available
[email protected] University of Chicago
Discovery A Grid service instance may maintain a set of
service information– XML fragments encapsulated in standard <name,
type, TTL-info> containers Discovery interface allows clients to query the Grid
service instance for this information– Query operation, plus supporting operations
>Extensible query language support See also Notification interfaces
– Allows notification of service existence and about service information
[email protected] University of Chicago
Registry The Registry interface may be used to discover a set
of Grid service instances– Returns a WS-Inspection document containing the
GSHs of a set of Grid services– Also returns policy associated with the set– Also available through Discovery interface
The RegistryManagement interface allows for soft-state registration of a Grid service– A set of Grid services can periodically register their
GSHs into a registry service, to allow for discovery of services in that set
[email protected] University of Chicago
Authorization Protocol binding handles authentication during
invocation of Grid service operation– Gives service URI for authenticated subject
Grid service instance should apply authorization policy on all operations– May be site-, service-, instance-, etc., specific
OGSA defines standard interfaces for remote management of access control policy– OperationAuthorizationManagement– SubjectEquivalency
[email protected] University of Chicago
Notification Interfaces NotificationSource for client subscription
– One or more notification generators> Generates notification message of a specific type> Typed interest statements: E.g., Filters, topics, …> Supports messaging services, 3rd party filter services, …
– Soft state subscription to a generator NotificationSink for asynchronous delivery of
notification messages A wide variety of uses are possible
– E.g. Dynamic discovery/registry services, monitoring, application error notification, …
[email protected] University of Chicago
Use of Web Services (1) A Grid service interface is a WSDL portType A Grid service definition is a WSDL extension
(serviceType) containing:– A set of one or more portTypes supported by
the service– portType & serviceType compatibility
statements, to support upgradability> For discovery of compatible services when interfaces are
upgraded– Implementation version information
[email protected] University of Chicago
Use of Web Services (2) A GSR is a WSDL document with extensions:
– Extension to service element to reference serviceType– Service element extensions to carry the GSH, and the
expiration time of the GSR A GSH is an URL, with the following properties:
– Globally unique for all time– http get on GSH + “.wsdl” returns GSR– Can derive GSH to Mapper from it
Registry returns WS-Inspection documents
[email protected] University of Chicago
Using OGSAto Construct Grid Environments
Factory RegistryService
FactoryH2R
Mapper
ServiceService Service ...
...
(a) Simple HostingEnvironment
Factory RegistryService
FactoryH2R
Mapper
ServiceService Service ...
...
F R
F M
SS S
F R
F M
SS S
(b) Virtual HostingEnvironment
E2EFactory
E2E Reg
E2E H2RMapper
...
F1
RM
SS S
F2
RM
SS S
E2E S E2E S E2E S
(c) Compound Services
In each case, Registry handle is effectively the uniquename for the virtual organization.
[email protected] University of Chicago
OGSA and the Globus Toolkit Technically, OGSA enables
– Refactoring of protocols (GRAM, MDS-2, etc.)—while preserving all GT concepts/features!
– Integration with hosting environments: simplifying components, distribution, etc.
– Greatly expanded standard service set Pragmatically, we are proceeding as follows
– Develop open source OGSA implementation> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
– Partnerships for service development– Also expect commercial value-adds
[email protected] University of Chicago
Globus Toolkit Refactoring Grid Security Infrastructure (GSI)
– Used in Grid service network protocol bindings Meta Directory Service 2 (MDS-2)
– Native part of each Grid service:> Discovery, Registry, RegistryManagement, Notification
Grid Resource Allocation & Mngt (GRAM)– Gatekeeper -> Factory for job mgr instances
GridFTP– Refactor control channel protocol
Other services refactored to used Grid services
[email protected] University of Chicago
Timeline Summer 2002 – Alpha releases of high-
level Grid Services Late 2002, Early 2003 – Alpha release of
new core Grid Services (MDS, GRAM, GridFTP)
[email protected] University of Chicago
Migration Paths Globus ToolkitTM evolutionary in nature
– Toolkit implementation may change– Underlying model of Grid Computing remains the
same– Capabilities of future Toolkits will be superset of
today’s Toolkit New implementations integrate better with existing
commodity technologies In cases of radical departure from current
implementations, migration paths will be provided– possibly maintain compatible APIs– possibly create gateways to today’s protocols
[email protected] University of Chicago
Summary:Evolution of Grid Technologies
Initial exploration (1996-1999; Globus 1.0)– Extensive appln experiments; core protocols
Data Grids (1999-??; Globus 2.0+)– Large-scale data management and analysis
Open Grid Services Architecture (2001-??, Globus 3.0)– Integration w/ Web services, hosting environments,
resource virtualization– Databases, higher-level services
Radically scalable systems (2003-??)– Sensors, wireless, ubiquitous computing
[email protected] University of Chicago
Summary The Grid problem: Resource sharing & coordinated
problem solving in dynamic, multi-institutional virtual organizations
Grid architecture: Protocol, service definition for interoperability & resource sharing
Globus Toolkit a source of protocol and API definitions—and reference implementations– And many projects applying Grid concepts (& Globus
technologies) to important problems Open Grid Services Architecture represents (we
hope!) next step in evolution
[email protected] University of Chicago
For More Information The Globus Project™
– www.globus.org Grid architecture
– www.globus.org/research/papers/anatomy.pdf
Open Grid Services Architecture– www.globus.org/research/
papers/ogsa.pdf– www.globus.org/research/
papers/gsspec.pdf