An Overview of the Globus Toolkit and the Open Grid Services Architecture Mike Wilde Mathematics and Computer Science Division Argonne National Laboratory LISHEP 2004 – UERJ, Rio De Janeiro – Feb 2004


An Overview of the Globus Toolkit

and the Open Grid Services Architecture

Mike Wilde

Mathematics and Computer Science Division

Argonne National Laboratory

LISHEP 2004 – UERJ, Rio De Janeiro – Feb 2004


Grids – What & Why?

What is a Grid?

Three key criteria
– Coordinates distributed resources …
– using standard, open, general-purpose protocols and interfaces …
– to deliver non-trivial qualities of service.

What is not a Grid?
– A cluster, a network attached storage device, a scientific instrument, a network, etc.
– Each may be an important component of a Grid, but by itself does not constitute a Grid

The Grid

“Resource sharing & coordinated problem solving in dynamic … virtual organizations”

1. Enable integration of distributed service & resources

2. Using general-purpose protocols & infrastructure

3. To achieve useful qualities of service

“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

Grid Applications

• Authenticate once
• Submit a grid computation (code, resources, data, …)
• Locate resources
• Negotiate authorization, acceptable use, etc.
• Select and acquire resources
• Initiate data transfers, computation
• Monitor progress
• Steer computation
• Store and distribute results
• Account for usage

[Figure: LIGO in Louisiana]

Grid Communities & Applications: Data Grids for High Energy Physics

[Figure: tiered computing model for high energy physics. Image courtesy Harvey Newman, Caltech]

Components shown:
– Online System; Offline Processor Farm (~20 TIPS)
– Tier 0: CERN Computer Centre
– Tier 1: FermiLab (~4 TIPS); France, Italy, and Germany Regional Centres
– Tier 2: Caltech (~1 TIPS) and other Tier2 Centres (~1 TIPS each)
– Institutes (~0.25 TIPS), each with a physics data cache
– Tier 4: Physicist workstations

Link labels: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec; ~622 Mbits/sec or Air Freight (deprecated); ~1 MBytes/sec

Notes:
– There is a “bunch crossing” every 25 nsecs.
– There are 100 “triggers” per second.
– Each triggered event is ~1 MByte in size.
– Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
– 1 TIPS is approximately 25,000 SpecInt95 equivalents.
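The rates quoted on this slide can be cross-checked with a little arithmetic. The sketch below uses only the slide's own numbers (25 ns bunch spacing, 100 triggers per second, ~1 MByte per event); the variable names are illustrative, and the result is a rough estimate rather than a detector specification.

```python
# Back-of-envelope check of the figures quoted on the slide.
BUNCH_CROSSING_INTERVAL_NS = 25   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100         # events kept per second
EVENT_SIZE_MBYTES = 1             # ~1 MByte per triggered event

NS_PER_SECOND = 1_000_000_000

# How many bunch crossings happen each second.
crossings_per_second = NS_PER_SECOND // BUNCH_CROSSING_INTERVAL_NS

# Data rate leaving the trigger, in MBytes per second.
triggered_rate_mbytes_per_s = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Roughly how many crossings occur for each event that is kept.
crossings_per_trigger = crossings_per_second // TRIGGERS_PER_SECOND

print(f"{crossings_per_second:,} crossings/s")
print(f"{triggered_rate_mbytes_per_s} MBytes/s after triggering")
print(f"1 event kept per {crossings_per_trigger:,} crossings")
```

The 100 MBytes/s result is consistent with the ~100 MBytes/sec link label in the figure.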

Grid2003: Towards a Persistent U.S. Open Science Grid

Status on 11/19/03 (http://www.ivdgl.org/grid2003)

Examples of Production Grid Deployments

“Persistent deployment of Grid services in support of a diverse user community”

Grid3/iVDGL (US)
– 22 sites, O(3000) CPUs, 2 countries
LHC Computing Grid
– 25 sites, international
– High energy physics
NorduGrid
– 24 clusters, 724 CPUs, 6 countries; physics
NASA IPG
– 4 sites, O(3000) CPUs
– Aeronautics
NEESgrid (prod. 2004)
– Instruments, data, compute, collaborative
– Earthquake eng.
TeraGrid (prod. Jan 04)
– 5 sites, expanding

[Slide markers distinguish pre-WS (P) and Web Services (W) deployments]

Resource Integration as a Fundamental Challenge

– Discovery: many sources of data, services, computation; registries (R) organize services of interest to a community
– Access: data integration activities may require access to, & exploration/analysis of, data at many locations
– Exploration & analysis may involve complex, multi-step workflows
– Resource management (RM) is needed to ensure progress & arbitrate competing demands
– Security & policy (security service, policy service) must underlie access & management decisions


Globus Toolkit 2.4

Grid Security Infrastructure - Secure Communication

Certificates

By checking the signature, one can determine that a public key belongs to a given user.

Certificate fields: Name, Issuer, Public Key, Signature

[Figure: the hash of the certificate contents is compared (=?) with the signature decrypted using the public key from the issuer]
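The check in the figure (hash the certificate fields, decrypt the signature with the issuer's public key, compare) can be sketched in a few lines. This is a toy illustration, not GSI or X.509 code: it uses textbook RSA without padding, a key built from two Mersenne primes, and invented field values, all of which are assumptions made for the example.

```python
import hashlib
import json

# Toy issuer key pair (assumption for illustration only). Real CAs use
# vetted crypto libraries, padding schemes, and much larger keys.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(fields: dict) -> int:
    """Hash the certificate fields to an integer smaller than N."""
    canonical = json.dumps(fields, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(canonical).digest(), "big") % N


def sign(fields: dict) -> int:
    """Issuer side: transform the hash with the issuer's private key."""
    return pow(digest(fields), D, N)


def verify(fields: dict, signature: int) -> bool:
    """Relying party: decrypt the signature with the issuer's public key
    and compare it with a freshly computed hash."""
    return pow(signature, E, N) == digest(fields)


# Hypothetical certificate contents.
cert = {
    "name": "/O=Grid/CN=Alice",
    "issuer": "/O=Grid/CN=Example CA",
    "public_key": "alice-public-key",
}
signature = sign(cert)
ok = verify(cert, signature)

# Swapping the public key invalidates the issuer's signature.
tampered = dict(cert, public_key="mallory-public-key")
tampered_ok = verify(tampered, signature)
```

Only the issuer's public values (`N`, `E`) are needed to verify, which is why a relying party can trust the name-to-key binding without contacting the user.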

Grid Security Infrastructure (GSI)

GSI is:
– PKI (CAs and certificates) for credentials
– SSL/TLS for authentication and message protection
– Proxies and delegation (GSI extensions) for secure single sign-on

GSI in Action: “Processes at A and B Communicate & Access Files at C”

– User: single sign-on via “grid-id” & generation of proxy cred., or retrieval of proxy cred. from online repository; the result is a user proxy holding a proxy credential
– Remote process creation requests* go from the user proxy to GSI-enabled GRAM servers on computers at Site A (Kerberos) and Site B (Unix); each server will authorize, map to local id, create process, and generate credentials
– Each process runs under a local id with a restricted proxy (plus a Kerberos ticket at Site A); the two processes use communication* between them
– Remote file access request* goes to the GSI-enabled FTP server in front of the storage system at Site C (Kerberos), which will authorize, map to local id, and access file

* With mutual authentication

Resource discovery and status information


The Grid Information Problem

Large numbers of distributed “sensors” with different properties

Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type

Globus Toolkit Solution: MDS-2

Registration & enquiry protocols, information models, query languages
– Provides standard interfaces to sensors
– Supports different “directory” structures supporting various discovery/access strategies

Karl Czajkowski, Steve Fitzgerald, others

MDS-2 Components

Grid Resource Information Service (GRIS)
– Provides resource description
– Modular content gateway
Grid Index Information Service (GIIS)
– Provides aggregate directory
– Hierarchical groups of resources
Lightweight Dir. Access Protocol (LDAP)
– Standard with many client implementations
– Used for GRIP (and GRRP currently)

GRIS Host Objects

hn=hostname
  dev group=CPUs
    dev=cpu 0
    dev=cpu 1
  dev group=memory
    dev=RAM
    dev=VM
  dev group=disk
    dev=/scratch1
  dev group=net
    dev=eth0
  software=OS

Object types shown: CPU, RAM, VM, DISK, NET, OS

MDS Architecture

– Resources run a standard information service (GRIS) which speaks LDAP and provides information about the resource (no searching).
– GIIS provides a “caching” service much like a web search engine. Resources register with GIIS and GIIS pulls information from them when requested by a client and the cache has expired.
– GIIS provides the collective-level indexing/searching function.

[Figure: Resource A and Resource B each run a GRIS. Clients 1 and 2 request info directly from resources. Client 3 uses GIIS for searching collective information. GIIS requests information from GRIS services as needed; its cache contains info from A and B.]
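The register-then-pull-on-expiry behavior described above can be sketched as a small cache. The classes below are hypothetical stand-ins, not the MDS-2 code or its LDAP interface; the time-to-live value and the attribute names are assumptions chosen for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class GRIS:
    """Stand-in for a per-resource information service."""

    def __init__(self, name: str, info: dict):
        self.name = name
        self.info = info
        self.queries = 0  # how many times this resource was asked

    def query(self) -> dict:
        self.queries += 1
        return dict(self.info)


class GIIS:
    """Stand-in for the aggregate directory: registration plus a TTL cache."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.registered: Dict[str, GRIS] = {}
        self.cache: Dict[str, Tuple[float, dict]] = {}

    def register(self, gris: GRIS) -> None:
        self.registered[gris.name] = gris

    def lookup(self, name: str) -> dict:
        """Return cached info; pull from the GRIS only if the entry expired."""
        now = self.clock()
        entry = self.cache.get(name)
        if entry is None or now - entry[0] >= self.ttl:
            entry = (now, self.registered[name].query())
            self.cache[name] = entry
        return entry[1]

    def search(self, predicate: Callable[[dict], bool]) -> List[str]:
        """Collective-level search across all registered resources."""
        return sorted(n for n in self.registered if predicate(self.lookup(n)))


# Usage with a fake clock so the expiry is deterministic.
fake_now = [0.0]
giis = GIIS(ttl_seconds=30.0, clock=lambda: fake_now[0])
res_a = GRIS("A", {"cpus": 4})
res_b = GRIS("B", {"cpus": 64})
giis.register(res_a)
giis.register(res_b)

big = giis.search(lambda info: info["cpus"] >= 16)  # pulls from A and B
giis.lookup("A")                                    # served from the cache
fake_now[0] = 31.0
giis.lookup("A")                                    # expired, pulled again
```

Pulling lazily keeps the index cheap to run: a resource is only queried when a client asks and the cached copy is stale.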


Resource Management and Job Execution:

GRAM, Condor-G

GRAM Components

[Figure: GRAM components on either side of the site boundary]

– Client: MDS client API calls to locate resources (MDS: Grid Index Info Server); MDS client API calls to get resource info (MDS: Grid Resource Info Server); GRAM client API calls to request resource allocation and process creation; GRAM client API state change callbacks
– Gatekeeper (Globus Security Infrastructure): creates the Job Manager
– Job Manager: parses the request with the RSL Library; requests that the Local Resource Manager allocate & create processes; monitors & controls the processes
– MDS: Grid Resource Info Server: queries current status of resource


Resource Management Review

Resource Specification Language (RSL) is used to communicate requirements

The Globus Resource Allocation and Management (GRAM) API allows programs to be started on remote resources, despite local heterogeneity

A layered architecture allows application-specific resource brokers and co-allocators (e.g. DUROC) to be defined in terms of GRAM services
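To make the role of RSL concrete, the sketch below renders a job description as an RSL conjunction of the form `&(attribute=value)...`. The helper `to_rsl` is hypothetical and not part of the Globus Toolkit; the attribute names (`executable`, `directory`, `count`) are common RSL attributes, and the example deliberately rejects values it does not know how to quote.

```python
def to_rsl(attributes: dict) -> str:
    """Render a flat attribute dict as an RSL conjunction.

    Strings are double-quoted and integers are written bare. Values that
    contain a double quote are rejected rather than escaped, to keep the
    sketch simple.
    """
    parts = []
    for name, value in attributes.items():
        if isinstance(value, int):
            parts.append(f"({name}={value})")
        else:
            text = str(value)
            if '"' in text:
                raise ValueError(f"cannot quote value for {name!r}: {text!r}")
            parts.append(f'({name}="{text}")')
    return "&" + "".join(parts)


# Hypothetical job request: run /bin/date once in /tmp.
job = {"executable": "/bin/date", "directory": "/tmp", "count": 1}
rsl = to_rsl(job)
print(rsl)
```

A broker or co-allocator layered on GRAM would generate strings like this and hand them to the GRAM client API, which is what lets the same request run on sites with different local schedulers.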

Condor-G: Job Submission Client

Use Condor to run jobs on the Grid
Uses Globus Toolkit
– GRAM (submit a remote job)
– GASS (transfer job’s files)
Run a job on a Grid resource
Features
– Job management
– Fault tolerance
– Credential management

How It Works

[Figure: on the Condor-G side, the Schedd and GridManager submit 600 Globus jobs to the Grid Resource, where a JobManager passes them to LSF]


Data Management – GridFTP, Replica Location Service

GridFTP

Data-intensive grid applications need to transfer and replicate large data sets
– Terabytes to petabytes
– between any two sites in the Grid
GridFTP Features:
– Uses Grid security
– Third party (client mediated) transfer
– Parallel transfers
– Striped transfers
– TCP buffer optimizations


Command line tool: globus-url-copy

This is the GridFTP client tool provided with the Globus Toolkit

It takes a source URL and destination URL and will do protocol conversion for http, https, FTP, gsiftp, and file (file must be local).

globus-url-copy sourceURL destURL

globus-url-copy gsiftp://sourceHostName:port/dir1/dir2/file17 gsiftp://destHostName:port/dirX/dirY/fileA
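When transfers are scripted, the command line can be assembled and checked before it is run. The sketch below only builds the argument list in the `globus-url-copy sourceURL destURL` form shown above and validates the URL schemes listed on this slide; the helper name, the host names, and port 2811 are assumptions for illustration, and no options are passed because none are described here.

```python
from urllib.parse import urlparse

# Schemes named on the slide as supported by globus-url-copy.
SUPPORTED_SCHEMES = {"http", "https", "ftp", "gsiftp", "file"}


def build_copy_command(source_url: str, dest_url: str) -> list:
    """Return the argv list for a globus-url-copy invocation.

    Raises ValueError if either URL uses a scheme not listed above.
    """
    for url in (source_url, dest_url):
        scheme = urlparse(url).scheme
        if scheme not in SUPPORTED_SCHEMES:
            raise ValueError(f"unsupported scheme {scheme!r} in {url}")
    return ["globus-url-copy", source_url, dest_url]


# Hypothetical third-party transfer between two GridFTP servers.
cmd = build_copy_command(
    "gsiftp://source.example.org:2811/dir1/dir2/file17",
    "gsiftp://dest.example.org:2811/dirX/dirY/fileA",
)
print(" ".join(cmd))
```

On a machine with the toolkit installed and a valid proxy credential, the list could be handed to `subprocess.run(cmd, check=True)`.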

Striped GridFTP Server

[Figure: a GridFTP client connects over the GridFTP control channel (control socket) to the GridFTP server master, which starts the GridFTP server parallel backend via mpirun. Each backend node has a Plug-in and a Control component; the nodes coordinate over MPI (Comm_World and a Sub-Comm), use MPI-IO on a parallel file system (e.g. PVFS, PFS, etc.), and send data over the GridFTP data channels to the client or another striped GridFTP server.]

GridFTP Development For GT3

Major redesign planned
Part 1: Replace existing globus_io libraries with XIO libraries (under development)
– Pluggable protocol stack
– TCP, reliable UDP, HTTP, GSI
Part 2: GridFTP OGSA Service (?)
– Based on redesign of GRAM job submission, service level agreements
– Data transfer is just another type of job to be executed


RLS

Replica Management in Grids

Data intensive applications
– Produce Terabytes or Petabytes of data
Replicate data at multiple locations
– Fault tolerance
– Performance: avoid wide area data transfer latencies, achieve load balancing
Issues:
– Locating replicas of desired files
– Creating new replicas
– Scalability
– Reliability

A Replica Location Service

A Replica Location Service (RLS) is a distributed registry service that records the locations of data copies and allows discovery of replicas

Maintains mappings between logical identifiers and target names
– Physical targets: Map to exact locations of replicated data
– Logical targets: Map to another layer of logical names, allowing storage systems to move data without informing the RLS

RLS was designed and implemented in a collaboration between the Globus project and the DataGrid project

[Figure: Replica Location Indexes (RLIs) layered above Local Replica Catalogs (LRCs)]

• LRCs contain consistent information about logical-to-target mappings on a site
• RLI nodes aggregate information about LRCs
• Soft state updates from LRCs to RLIs: relaxed consistency of index information, used to rebuild index after failures
• Arbitrary levels of RLI hierarchy
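The division of labor between catalogs and indexes can be sketched as two small classes. These are hypothetical stand-ins rather than the RLS implementation or its API: the LRC holds exact mappings for one site, while the RLI only remembers which sites recently reported a logical name and forgets reports older than an assumed timeout.

```python
class LocalReplicaCatalog:
    """Per-site catalog of logical-name to target-name mappings."""

    def __init__(self, site: str):
        self.site = site
        self.mappings = {}  # logical name -> set of target names

    def add(self, logical_name: str, target: str) -> None:
        self.mappings.setdefault(logical_name, set()).add(target)

    def lookup(self, logical_name: str) -> list:
        return sorted(self.mappings.get(logical_name, ()))

    def soft_state_update(self) -> set:
        """Summary sent periodically to an index: just the logical names."""
        return set(self.mappings)


class ReplicaLocationIndex:
    """Index of which sites hold which logical names, with expiring state."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.entries = {}  # site -> (time received, set of logical names)

    def receive_update(self, lrc: LocalReplicaCatalog, now: float) -> None:
        self.entries[lrc.site] = (now, lrc.soft_state_update())

    def sites_for(self, logical_name: str, now: float) -> list:
        """Sites whose most recent update is still fresh and lists the name."""
        return sorted(
            site
            for site, (received, names) in self.entries.items()
            if now - received < self.timeout and logical_name in names
        )


# Hypothetical sites and file names.
lrc_a = LocalReplicaCatalog("A")
lrc_b = LocalReplicaCatalog("B")
lrc_a.add("lfn://run42", "gsiftp://a.example.org/data/run42")
lrc_b.add("lfn://run42", "gsiftp://b.example.org/store/run42")

rli = ReplicaLocationIndex(timeout=60.0)
rli.receive_update(lrc_a, now=0.0)
rli.receive_update(lrc_b, now=50.0)

fresh = rli.sites_for("lfn://run42", now=55.0)
after_a_expires = rli.sites_for("lfn://run42", now=70.0)
```

Because index entries simply age out, a restarted RLI can rebuild itself from the next round of updates, at the cost of briefly stale answers.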

OGSA – the Evolution of Grid Architecture

Overview

– Grid background
– Open Grid Services Architecture
– Open Grid Services Infrastructure
– Beyond OGSI: other OGSA services
– Globus Toolkit v3 implementation
– Early GT3 performance results
– Scientific and commercial perspectives
– Summary

Why Open Standards Matter

Ubiquitous adoption demands open, standard protocols
– Standard protocols enable interoperability
– Avoid product/vendor lock-in
– Enables innovation/competition on end points
Further aided by open, standard APIs
– Standard APIs enable portability
– Allow implementations to port to different vendor platforms
Internet and Web as exemplars

The Emergence of Open Grid Standards

[Figure: timeline from 1990 to 2010 (1990, 1995, 2000, 2005, 2010); vertical axis: increased functionality, standardization]

Stages shown, in order:
– Custom solutions
– Globus Toolkit: de facto standard, single implementation; built on Internet standards
– Open Grid Services Arch: real standards, multiple implementations; built on Web services, etc.
– Managed shared virtual systems

Underlying band: computer science research

Open Grid Services Architecture

Service orientation to virtualize resources
– Everything is a service
From Web services
– Standard interface definition mechanisms
– Evolving set of other standards: security, etc.
From Grids (Globus Toolkit)
– Service semantics, reliability & security models
– Lifecycle management, discovery, other services
A framework for the definition & management of composable, interoperable services

“The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, Foster, Kesselman, Nick, Tuecke, 2002

Globus Toolkit: A Story of Evolution

Definition of Grid problem has been stable since original Globus Project proposal in 1995
– Though we’ve gotten better at articulating it
But our approach to its solution has evolved:
– From APIs and custom protocols…
– to standard protocols…
– to Grid services (OGSA)
Driven by experience implementing and deploying the Globus Toolkit, and building real applications with it

Globus Toolkit® v3.0

All of the GT v2.4 services and clients
Complete Java implementation of OGSI v1.0
– Rich, container-based implementation
– Built on Apache Axis
Globus “proprietary” services built on OGSI
– Managed Jobs (akin to GT2 GRAM)
– Reliable File Transfer (RFT)
– Index Services (akin to GT2 GIIS)
Some services not yet OGSI-fied:
– GridFTP, Replica Location Services (RLS)

GT3 Distribution

[Figure: the GT3 distribution comprises the GT2 components, RLS, and the GT-OGSA Grid Service Infrastructure, which is the focus of this presentation]

Web Services

XML-based distributed computing technology
Web service = a server process that exposes typed ports to the network
Described by the Web Services Description Language, an XML document that contains
– Type of message(s) the service understands & types of responses & exceptions it returns
– “Methods” bound together as “port types”
– Port types bound to protocols as “ports”
A WSDL document completely defines a service and how to access it

WSDL Example

<wsdl:definitions targetNamespace="…">
  <wsdl:types>
    <schema>
      <xsd:element name="fooInput" …/>
      <xsd:element name="fooOutput" …/>
    </schema>
  </wsdl:types>
  <wsdl:message name="fooInputMessage">
    <part name="parameters" element="fooInput"/>
  </wsdl:message>
  <wsdl:message name="fooOutputMessage">
    <part name="parameters" element="fooOutput"/>
  </wsdl:message>
  <wsdl:portType name="fooInterface">
    <wsdl:operation name="foo">
      <input message="fooInputMessage"/>
      <output message="fooOutputMessage"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>
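Because WSDL is ordinary XML, a client toolkit can read the interface straight from the document. The sketch below parses a self-contained variant of the example above and lists each port type's operations with their input and output messages; the `urn:example:foo` namespace and the trimmed document are assumptions added so that the snippet parses on its own.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace, needed to match the prefixed element names.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Cut-down, self-contained version of the slide's example (types omitted).
EXAMPLE_WSDL = """\
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  targetNamespace="urn:example:foo">
  <wsdl:message name="fooInputMessage"/>
  <wsdl:message name="fooOutputMessage"/>
  <wsdl:portType name="fooInterface">
    <wsdl:operation name="foo">
      <wsdl:input message="fooInputMessage"/>
      <wsdl:output message="fooOutputMessage"/>
    </wsdl:operation>
  </wsdl:portType>
</wsdl:definitions>
"""


def list_operations(wsdl_text: str) -> dict:
    """Map each portType name to {operation name: (input msg, output msg)}."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        operations = {}
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            input_msg = op.find("wsdl:input", WSDL_NS).get("message")
            output_msg = op.find("wsdl:output", WSDL_NS).get("message")
            operations[op.get("name")] = (input_msg, output_msg)
        result[port_type.get("name")] = operations
    return result


interfaces = list_operations(EXAMPLE_WSDL)
print(interfaces)
```

This is the same information a stub generator consumes: the port type plays the role of a class and each operation the role of a method.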

Web Services: Mode of Operation

Definition in a meta-language
– CORBA: Interface Definition Language (IDL)
– WS: Web Services Description Language (WSDL)
Stubs:
– Serialize/deserialize or marshal/unmarshal
– Implement interaction based on a protocol such as IIOP or SOAP

[Figure: a meta-language definition of a procedure, object or service yields the client’s stub (proxy) and the server’s stub (skeleton); the client’s implementation and the server’s implementation interact through these stubs]
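The stub and skeleton roles can be illustrated without any middleware. In the sketch below JSON stands in for the wire format (real systems would use IIOP or SOAP, as the slide says), and the transport is a direct function call instead of a network connection; the operation `add` and all class and function names are invented for the example.

```python
import json


def add(a, b):
    """Server's implementation of the operation."""
    return a + b


# Operations the server exposes, keyed by name.
SERVER_OPERATIONS = {"add": add}


def server_skeleton(request_bytes: bytes) -> bytes:
    """Server's stub: unmarshal the request, call the implementation,
    and marshal the response."""
    request = json.loads(request_bytes)
    result = SERVER_OPERATIONS[request["operation"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Client's stub (proxy): looks like a local object but marshals each
    call into a message and sends it over the given transport."""

    def __init__(self, transport):
        self.transport = transport  # callable: request bytes -> response bytes

    def add(self, a, b):
        request = json.dumps({"operation": "add", "args": [a, b]})
        response = self.transport(request.encode("utf-8"))
        return json.loads(response)["result"]


# Client's implementation: uses the stub as if it were the service itself.
stub = ClientStub(transport=server_skeleton)
total = stub.add(2, 3)
```

In practice both stubs are generated from the IDL or WSDL definition, so client and server agree on message shapes without sharing code.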

WSDL Document Structure

WSDL: Web Services Description Language
Document structure:
– Service Description
– Implementation Details
Service Description
– Elements
> PortType (~ class)
> Operations (~ method)
> Messages, message parts (~ parameters)
> Types (type definitions)
– Used for
> Generating stubs and skeletons
> Service discovery

WSDL Document Structure (cntd)

Implementation Details
– Binding
> Messaging protocol (e.g. SOAP)
> Message interpretation (e.g. RPC or document)
> Data-encoding model (e.g. SOAP or literal encoding)
> Transport protocol (e.g. HTTP or FTP)
– Port: describes service endpoint
– Service Element: groups Port elements together
Others:
– Definitions: root element of a WSDL document

Database: Service Description

<types>
  <schema targetNamespace="http://samples.ogsa.globus.org/database/database.xsd"
          xmlns="http://www.w3.org/2001/XMLSchema">
    <complexType name="query">
      <sequence>
        <element name="send_query" type="string"/>
      </sequence>
    </complexType>
  </schema>
</types>
<message name="myDatabaseQuery">
  <part name="query_parameter" type="query"/>
</message>
<message name="myDatabaseResponse">
  <part name="response_parameter" type="string"/>
</message>
<portType name="Database_PortType">
  <operation name="databaseQueryOperation">
    <input message="tns:myDatabaseQuery"/>
    <output message="tns:myDatabaseResponse"/>
  </operation>
</portType>

Callouts: portType = “class”; operation = “method”; message = “parameter”; types = “parameter” type

Database: Implementation

<binding name="Database_Binding" type="tns:Database_PortType">
  <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="databaseQueryOperation">
    <soap:operation soapAction="do_databaseQueryOperation"/>
    <input>
      <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                 use="encoded" namespace="http://samples.ogsa.globus.org/database"/>
    </input>
    <output>
      <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                 use="encoded" namespace="http://samples.ogsa.globus.org/database"/>
    </output>
  </operation>
</binding>
<service name="Database_Service">
  <port name="Database_Port" binding="tns:Database_Binding">
    <soap:address location="http://ept.mcs.anl.edu:8080/axis/services/Database_Port"/>
  </port>
</service>

Callouts: soap:binding = use SOAP; style="rpc" = interpret as RPC call; transport = use http for transport; use="encoded" = use SOAP encoding; soap:address = the service is located here

Transient Service Instances

“Web services” address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
In Grids, must also support transient services, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conf., dist. data analysis
Significant implications for how services are managed, named, discovered, and used
– In fact, much of our work is concerned with the management of services

OGSA Structure

A standard substrate: the Grid service
– Standard interfaces and behaviors that address key distributed system issues: naming, service state, lifetime, notification
– A Grid service is a Web service
… supports standard service specifications
– Agreement, data access & integration, workflow, security, policy, diagnostics, etc.
– Target of current & planned GGF efforts
… and arbitrary application-specific services based on these & other definitions

Overview

– Grid background
– Open Grid Services Architecture
– Open Grid Services Infrastructure
– Beyond OGSI: other OGSA services
– Globus Toolkit v3 implementation
– Early GT3 performance results
– Scientific and commercial perspectives
– Summary

OGSI Specification

Defines WSDL conventions and extensions
– For describing and naming services
– Working with W3C WSDL working group to drive OGSI extensions into WSDL 1.2
Defines fundamental interfaces (using extended WSDL) and behaviors that define a Grid Service
– A unifying framework for interoperability & establishment of total system properties
http://www.ggf.org/ogsi-wg

Fundamental Interfaces & Behaviors

OGSI defines basic patterns of interaction, which can be combined with each other and with custom patterns in a myriad of ways
OGSI Specification focuses on:
– Atomic, composable patterns in the form of portTypes/interfaces
> Define operations & associated service data elements
– A model for how these are composed
> Compatible with WSDL 1.2
Complete service descriptions are left to other groups that are defining real services

OGSI: Standard Web Services Interfaces & Behaviors

Naming and bindings (basis for virtualization)
– Every service instance has a unique name, from which one can discover supported bindings
Lifecycle (basis for fault resilient state management)
– Service instances created by factories
– Destroyed explicitly or via soft state
Information model (basis for monitoring & discovery)
– Service data (attributes) associated with GS instances
– Operations for querying and setting this info
– Asynchronous notification of changes to service data
Service Groups (basis for registries & collective svcs)
– Group membership rules & membership management
Base Fault type
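The lifecycle bullets (created by factories, destroyed explicitly or via soft state) can be sketched as follows. This is a minimal model under stated assumptions, not the OGSI interfaces or the GT3 classes: the class and method names are invented, time is passed in explicitly, and a keep-alive is modeled as pushing the termination time later.

```python
class ServiceInstance:
    """A transient service with a handle and a termination time."""

    def __init__(self, handle: str, termination_time: float):
        self.handle = handle
        self.termination_time = termination_time
        self.destroyed = False

    def keep_alive(self, until: float) -> None:
        """Soft state: a client that still cares extends the lifetime."""
        self.termination_time = max(self.termination_time, until)

    def destroy(self) -> None:
        """Explicit destruction requested by a client."""
        self.destroyed = True


class Factory:
    """Creates instances and reclaims those destroyed or past their lifetime."""

    def __init__(self):
        self.instances = {}
        self._next_id = 0

    def create_service(self, now: float, lifetime: float) -> ServiceInstance:
        handle = f"instance-{self._next_id}"
        self._next_id += 1
        instance = ServiceInstance(handle, now + lifetime)
        self.instances[handle] = instance
        return instance

    def sweep(self, now: float) -> list:
        """Remove destroyed or expired instances; return their handles."""
        gone = sorted(
            handle
            for handle, inst in self.instances.items()
            if inst.destroyed or now >= inst.termination_time
        )
        for handle in gone:
            del self.instances[handle]
        return gone


factory = Factory()
first = factory.create_service(now=0.0, lifetime=10.0)
second = factory.create_service(now=0.0, lifetime=10.0)

first.keep_alive(until=30.0)             # client of `first` is still alive
expired = factory.sweep(now=15.0)        # `second` was never refreshed

first.destroy()                          # explicit destruction
destroyed = factory.sweep(now=16.0)
```

The soft-state path is what makes this fault resilient: if a client crashes and stops sending keep-alives, its service instances are reclaimed without any explicit cleanup message.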

OGSI Service Data

Attributes: Publicly visible state of the service
Want to bring full power of XML to attributes
– getXXX/setXXX is too limiting
> How to get/set multiple?
> Want richer queries across attributes (e.g., join)
– Use XML Schema, XPath, XQuery, XSLT, etc.
– OGSI service data:
> Attributes defined using XML Schema
> Attributes combined into a single (logical) document within the service
> Rich pull/push/set operations against service data document
Should declare attributes in WSDL interface
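The advantage of one logical document over per-attribute getters can be shown with a path query. The sketch below uses the limited XPath subset in Python's standard library; the `serviceData` and `entry` element names and the attribute values are invented for illustration and do not follow the OGSI schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical service data document: all attributes in one place.
SERVICE_DATA = """\
<serviceData>
  <entry name="cpuCount" type="int">64</entry>
  <entry name="queueLength" type="int">7</entry>
  <entry name="hostName" type="string">node.example.org</entry>
</serviceData>
"""


def find_service_data(document: str, path: str) -> list:
    """Run one path query and return (name, value) pairs for the matches."""
    root = ET.fromstring(document)
    return [(entry.get("name"), entry.text) for entry in root.findall(path)]


# One query returns several attributes at once, which a getXXX-style
# interface would need one call per attribute to do.
int_attributes = find_service_data(SERVICE_DATA, "./entry[@type='int']")
host = find_service_data(SERVICE_DATA, "./entry[@name='hostName']")
```

A full XPath or XQuery engine would allow richer selections (such as the joins mentioned above) against the same document.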

Open Grid Services Infrastructure

[Figure: anatomy of a Grid service]
– Client: holds a Grid Service Handle; handle resolution yields a Grid Service Reference
– GridService (required) interface: data access to the service data elements; lifetime management (explicit destruction, soft-state lifetime); introspection (What port types? What policy? What state?)
– Other standard interfaces: factory, notification, collections
– Implementation
– Hosting environment/runtime (“C”, J2EE, .NET, …)

Page 58

Open Grid Services Infrastructure (OGSI)

[Figure: a service requestor (e.g. user application), a service registry, a service factory, and service instances. Labelled interactions: Service discovery, Create Service, Grid Service Handle, Resource allocation, Register Service, and, between requestor and instances, Service data, Keep-alives, Notifications, Service invocation.]

Interactions standardized using WSDL

Authentication & authorization are applied to all requests

Page 59

GT-OGSA Grid Service Infrastructure

Security Infrastructure

System-Level Services

Base Services

User-Defined Services

Grid Service Container

Hosting Environment

Web Service Engine

OGSI Spec Implementation

Page 60

OGSI Implementation

GT3 includes a set of primitives that fully implement the interfaces and behaviors defined in the OGSI Specification

– Defines how entities can create, discover and interact with a Grid service

The OGSI Specification defines a protocol: GT3 provides a programming model for that protocol

Page 61

Implementation of the OGSI Spec

Implementation of the GridService portType: GridServiceImpl and PersistentGridServiceImpl

destroy()
setServiceData()
findServiceData()
requestTerminationAfter()
requestTerminationBefore()
addOperationProvider()

(See docs for complete set of methods)

Additional functionality can be added to a Grid Service using OperationProviders

Deployment descriptor

<parameter name="operationProviders" value="<className>">

Page 65

Building an OGSI-Compliant Grid Service using GT3

Write service-specific logic that also implements the GT3 OperationProvider interface

Combine with one of the two GT3 implementations of base GridService functionality: GridServiceImpl or PersistentGridServiceImpl

Result: an OGSI-compliant grid service

OperationProviders are configured at deployment time or added at runtime

Page 66

A Grid Service Can be Composed of Multiple OperationProviders

OPs can be designed as atomic bits of functionality to facilitate reuse

OP approach eases the task of bringing legacy code into OGSI-compliance

OPs allow Grid Services to be formed dynamically (in contrast to the inheritance approach)
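The contrast between composing providers and inheriting from a base class can be sketched as follows. This is a Python illustration of the delegation idea only. GT3's real OperationProvider is a Java interface, and every class and method name below (`GridServiceBase`, `add_operation_provider`, `operations`, and the two providers) is hypothetical.

```python
class GridServiceBase:
    """Stand-in for base GridService functionality (hypothetical)."""

    def __init__(self):
        self._operations = {}

    def add_operation_provider(self, provider):
        # Providers may be attached at deployment time or later, at runtime.
        for name in provider.operations():
            self._operations[name] = getattr(provider, name)

    def invoke(self, name, *args):
        if name not in self._operations:
            raise LookupError(f"no provider offers {name!r}")
        return self._operations[name](*args)


class CounterProvider:
    """A small, reusable unit of functionality."""

    def __init__(self):
        self.value = 0

    def operations(self):
        return ["add"]

    def add(self, n):
        self.value += n
        return self.value


class LegacyCodeProvider:
    """Wraps existing code without changing its class hierarchy."""

    def operations(self):
        return ["upper"]

    def upper(self, text):
        return text.upper()  # str.upper stands in for the legacy routine


service = GridServiceBase()
service.add_operation_provider(CounterProvider())
service.add_operation_provider(LegacyCodeProvider())  # added dynamically
```

Because the service only delegates, neither provider needs to inherit from the base class, which is what makes wrapping legacy code cheap.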

Page 67

Several OperationProviders are Included in the GT3 Distribution

NotificationSourceProvider

HandleResolverProvider

ServiceGroupRegistrationProvider

ServiceGroupProvider

FactoryProvider

Page 68

GWSDL

OGSI requires interface extension/composition

We worked within the W3C WSDL working group to define a standard interface extension in WSDL 1.2 that meets OGSI requirements

But could not wait for WSDL 1.2

So defined gwsdl:portType, which extends the WSDL 1.1 portType with:

– WSDL 1.2 portType extension

– WSDL 1.2 open content model

Define mappings from GWSDL to WSDL 1.1 & 1.2

Page 69

GWSDL Example

<wsdl:definitions>
  <wsdl:types>…</wsdl:types>
  <wsdl:message>…</wsdl:message>
  …
  <gwsdl:portType name="foo" extends="ns:bar ogsi:GridService">
    <wsdl:operation name="op1">…</wsdl:operation>
    <wsdl:operation name="op2">…</wsdl:operation>
    <ogsi:serviceData … />
  </gwsdl:portType>
  …
</wsdl:definitions>
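One way to read the `extends` attribute in the example is that `foo` ends up offering its own operations plus those of every portType it extends. The sketch below flattens such a hierarchy. It assumes that union-of-operations reading, invents a single operation `barOp` for `ns:bar`, and takes the `ogsi:GridService` operation names from the GridService method list given earlier in this deck.

```python
# Hypothetical in-memory view of the portTypes named in the GWSDL example.
PORT_TYPES = {
    "ogsi:GridService": {
        "extends": [],
        "operations": ["findServiceData", "setServiceData",
                       "requestTerminationAfter", "requestTerminationBefore",
                       "destroy"],
    },
    "ns:bar": {"extends": [], "operations": ["barOp"]},  # invented operation
    "foo": {"extends": ["ns:bar", "ogsi:GridService"],
            "operations": ["op1", "op2"]},
}


def flatten(name, table):
    """Return the operations of a portType plus those of all it extends."""
    seen, ops = set(), []

    def visit(port_type):
        if port_type in seen:
            return  # guards against diamonds and cycles
        seen.add(port_type)
        for parent in table[port_type]["extends"]:
            visit(parent)
        for op in table[port_type]["operations"]:
            if op not in ops:
                ops.append(op)

    visit(name)
    return ops


foo_ops = flatten("foo", PORT_TYPES)
```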

Page 70

Resource Management

GRAM Architecture rendered in OGSA

The MMJFS runs as an unprivileged user, with a small highly-constrained setuid executable behind it

Individual user environments are created using virtual hosting

MMJFS: Master Managed Job Factory Service

MJS: Managed Job Service

[Figure: an MMJFS under the Master User, and User Hosting Environments for User 1, User 2 and User 3, each holding MJS instances.]

Page 75

GRAM Job Submission Scenario

Participants shown: Client, Index Service, MMJFS, MJS

1. From an index service, the client chooses an MMJFS

2. The client calls the createService operation on the factory and supplies RSL

3. The factory creates a Managed Job Service

4. The factory returns a locator

5. The client subscribes to the MJS' status SDE and retrieves output
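The five steps of the scenario can be walked through with a small in-process simulation. Nothing here is GRAM code: the classes, the dictionary used as an index service, the status names, and the RSL string are all placeholders chosen for the example.

```python
class ManagedJobService:
    """Toy MJS (hypothetical): holds job state and notifies subscribers."""

    def __init__(self, rsl):
        self.rsl = rsl
        self.status = "Pending"
        self.output = None
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def run(self):
        for status in ("Active", "Done"):
            self.status = status
            if status == "Done":
                self.output = f"ran: {self.rsl}"
            for notify in self._subscribers:
                notify(status)


class ManagedJobFactory:
    """Toy factory standing in for the MMJFS."""

    def __init__(self):
        self.jobs = {}

    def create_service(self, rsl):
        locator = f"mjs-{len(self.jobs) + 1}"
        self.jobs[locator] = ManagedJobService(rsl)  # step 3: create the MJS
        return locator                               # step 4: return a locator


index_service = {"cluster-a": ManagedJobFactory()}
factory = index_service["cluster-a"]                         # step 1: choose a factory
locator = factory.create_service("&(executable=/bin/date)")  # step 2: supply RSL
job = factory.jobs[locator]
seen = []
job.subscribe(seen.append)                                   # step 5: subscribe to status
job.run()
```

After `job.run()`, `seen` holds the status changes the client was notified of, and `job.output` stands in for the retrieved output.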

Page 76

Information Services

Index service as caching aggregator

– Caches service data from other Grid services

Index service as provider framework

– Serves as a host for service data providers that live outside of a Grid service to publish data
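A caching aggregator can be sketched as a lookup that only contacts the underlying service when its cached copy is missing or stale. The slide does not say how the Index Service decides freshness, so the time-to-live policy below is an assumption, as are the class and method names.

```python
class IndexService:
    """Toy caching aggregator (hypothetical names, assumed TTL policy)."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._sources = {}  # name -> callable returning service data
        self._cache = {}    # name -> (fetched_at, data)

    def register(self, name, fetch):
        self._sources[name] = fetch

    def query(self, name, now):
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self.ttl:
            return cached[1]          # fresh enough: serve from the cache
        data = self._sources[name]()  # stale or missing: ask the source
        self._cache[name] = (now, data)
        return data


calls = []


def cluster_load():
    # Stands in for querying service data on a remote Grid service.
    calls.append("fetch")
    return {"freeCpus": 12}


index = IndexService(ttl=30)
index.register("cluster-a", cluster_load)
first = index.query("cluster-a", now=0)
second = index.query("cluster-a", now=10)  # within the TTL: served from cache
third = index.query("cluster-a", now=45)   # TTL elapsed: fetched again
```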

Page 77

Reliable File Transfer

OGSI-compliant service exposing GridFTP control channel functionality

– 3rd-party transfer between GridFTP servers

Recoverable Grid service

– Automatically restarts interrupted transfers from the last checkpoint

Progress and restart monitoring

[Figure: RFT, with a JDBC link, controlling a transfer between GridFTP Server 1 and GridFTP Server 2.]
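Restarting from the last checkpoint amounts to recording progress durably after each block and resuming from that offset after a failure. The sketch below shows the idea with in-memory bytes. The `transfer` function, the `fail_at` fault injection, and the dictionary standing in for durable state are all assumptions, not RFT's implementation.

```python
def transfer(source, dest, checkpoint, chunk=4, fail_at=None):
    """Copy source into dest starting at checkpoint['offset']."""
    offset = checkpoint["offset"]
    while offset < len(source):
        if fail_at is not None and offset >= fail_at:
            raise ConnectionError("transfer interrupted")  # injected fault
        block = source[offset:offset + chunk]
        dest.extend(block)
        offset += len(block)
        checkpoint["offset"] = offset  # record progress after each block


source = bytes(range(10))
dest = bytearray()
checkpoint = {"offset": 0}  # stands in for state kept in durable storage

try:
    transfer(source, dest, checkpoint, fail_at=8)
except ConnectionError:
    resumed_from = checkpoint["offset"]
    transfer(source, dest, checkpoint)  # resume rather than start over
```

Only the bytes after the recorded offset are sent on the second attempt, so an interruption costs at most one block of repeated work.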

Page 78

Example: Reliable File Transfer Service

[Figure: a File Transfer service with internal state. Interfaces: GridService, Notf'n Source, Policy. Service data elements: Pending, Performance, Policy, Faults. Clients, a Fault Monitor and a Perf. Monitor query and/or subscribe to service data. A client requests and manages file transfer operations; the service carries out data transfer operations.]

Page 79

OGSI Implementations

Globus Toolkit version 3.0 (Java, C client)

U Virginia OGSI.NET (.NET)

LBNL pyGlobus (Python)

U Edinburgh (.NET)

U Manchester (PERL)

Fujitsu Unicore (Java)

Page 80

Overview

Grid background

Open Grid Services Architecture

Open Grid Services Infrastructure

Beyond OGSI: other OGSA services

Globus Toolkit v3 implementation

Early GT3 performance results

Scientific and commercial perspectives

Summary

Page 81

[Figure: Virtual Integration Architecture. Layers shown: Users in Problem Domain X; Applications in Problem Domain X; Application & Integration Technology for Problem Domain X; Generic Virtual Service Access and Integration Layer; OGSA (Open Grid Services Architecture); OGSI: Interface to Grid Infrastructure; Web Services: Basic Functionality; Distributed Compute, Data & Storage Resources. Services shown: Structured Data Integration, Structured Data Access, Structured Data (Relational, XML, Semi-structured), Transformation, Registry, Job Submission, Data Transport, Resource Usage, Banking, Brokering, Workflow, Authorisation.]

Page 82

OGSA Standardization & Implementation

OGSI defines core interfaces and behaviors for manageable services

– Supported by strong open source technology & major commercial vendors

Efforts are underway within GGF, OASIS, and other bodies to define standards for

– Agreement negotiation

– Common management model

– Data access and integration

– Security and policy

– Etc., etc., etc.