Grid Technology A Web Services Globus OGSA & Grid Architecture CERN Geneva April 1-3 2003 Geoffrey Fox Community Grids Lab Indiana University [email protected]


Page 1: Grid Technology A: Web Services Globus OGSA

Grid Technology A: Web Services, Globus, OGSA & Grid Architecture

CERN Geneva, April 1-3 2003

Geoffrey Fox, Community Grids Lab, Indiana University [email protected]

Page 2: Grid Technology A: Web Services Globus OGSA

With Thanks to
• Tony Hey my co-speaker and
• I adapted presentations from
– Marlon Pierce
– Dennis Gannon
– Globus
– Malcolm Atkinson
– David de Roure

Page 3: Grid Technology A: Web Services Globus OGSA

Fermilab Experiments 1975-1980 (E260, E350)

[Figure: Hadron Jets in 1977 compared to Feynman Field (Fox) Model; Regge Theory 1978. Plot labels: π0X0, π0X, η0X, η0X0; -t; 200 GeV hp]

Page 4: Grid Technology A: Web Services Globus OGSA

Caltech Hypercube

[Figure: Hypercube as a cube, with nodes labelled 000, 001, 010, 011, 100, 101, 110, 111. Chuck Seitz 1983; JPL Mark II 1985]

Page 5: Grid Technology A: Web Services Globus OGSA

History New York Times 1984
• One of today's fastest computers is the Cray 1, which can do 20 million to 80 million operations a second. But at $5 million, they are expensive and few scientists have the resources to tie one up for days or weeks to solve a problem.
• “Poor old Cray and Cyber (another super computer) don't have much of a chance of getting any significant increase in speed,” Fox said. “Our ultimate machines are expected to be at least 1,000 times faster than the current fastest computers.” (80 gigaflops predicted. Earth Simulator is 40,000 gflops)
• But not everyone in the field is as impressed with Caltech's Cosmic Cube as its inventors are. The machine is nothing more nor less than 64 standard, off-the-shelf microprocessors wired together, not much different than the innards of 64 IBM personal computers working as a unit.
• The Caltech Hypercube was “just a cluster of PC’s”!

Page 6: Grid Technology A: Web Services Globus OGSA

History New York Times 1984
• “We are using the same technology used in PCs (personal computers) and Pacmans,” Seitz said. The technology is an 8086 microprocessor capable of doing 1/20th of a million operations a second with 1/8th of a megabyte of primary storage. Sixty-four of them together will do 3 million operations a second with 8 megabytes of storage.
• Computer scientists have known how to make such a computer for years but have thought it too pedestrian to bother with.
• “It could have been done many years ago,” said Jack B. Dennis, a computer scientist at the Massachusetts Institute of Technology who is working on a more radical and ambitious approach to parallel processing than Seitz and Fox.
• “There's nothing particularly difficult about putting together 64 of these processors,” he said. “But many people don't see that sort of machine as on the path to a profitable result.”
• So clusters are a trivial architecture (1984) ……
• So architecture is unchanged; unfortunately after 20 years research, programming model is also the same (message passing)

Page 7: Grid Technology A: Web Services Globus OGSA

What is a Grid I?
• Collaborative Environment (Ch2.2,18)
• Combining powerful resources, federated computing and a security structure (Ch38.2)
• Coordinated resource sharing and problem solving in dynamic multi-institutional virtual organizations (Ch6)
• Data Grids as Managed Distributed Systems for Global Virtual Organizations (Ch39)
• Distributed Computing or distributed systems (Ch2.2,10)
• Enabling Scalable Virtual Organizations (Ch6)
• Enabling use of enterprise-wide systems, and someday nationwide systems, that consist of workstations, vector supercomputers, and parallel supercomputers connected by local and wide area networks. Users will be presented the illusion of a single, very powerful computer, rather than a collection of disparate machines. The system will schedule application components on processors, manage data transfer, and provide communication and synchronization in such a manner as to dramatically improve application performance. Further, boundaries between computers will be invisible, as will the location of data and the failure of processors. (Ch10)

Page 8: Grid Technology A: Web Services Globus OGSA

What is a Grid II?
• Supporting e-Science representing increasing global collaborations of people and of shared resources that will be needed to solve the new problems of Science and Engineering (Ch36)
• As infrastructure that will provide us with the ability to dynamically link together resources as an ensemble to support the execution of large-scale, resource-intensive, and distributed applications. (Ch1)
• Makes high-performance computers superfluous (Ch6)
• Metasystems or metacomputing systems (Ch10,37)
• Middleware as the services needed to support a common set of applications in a distributed network environment (Ch6)
• Next Generation Internet (Ch6)
• Peer-to-peer Network (Ch10, 18)
• Realizing thirty year dream of science fiction writers that have spun yarns featuring worldwide networks of interconnected computers that behave as a single entity. (Ch10)

Page 9: Grid Technology A: Web Services Globus OGSA

What is Grid Technology?
• Grids support distributed collaboratories or virtual organizations integrating concepts from
– The Web
– Distributed Objects (CORBA, Java/Jini, COM)
– Globus, Legion, Condor, NetSolve, Ninf and other High Performance Computing activities
– Peer-to-peer Networks
• With perhaps the Web being the most important for “Information Grids” and Globus for “Compute Grids”
• Use Information Grids and not usual Data Grids as “distributed file systems” (holding lots of data!) are handled in Compute Grids

Page 10: Grid Technology A: Web Services Globus OGSA

PPPH: Paradigms Protocols Platforms and Hosting I

• We will start from the Web view and assert that basic paradigm is

• Meta-data rich Web Services communicating via messages

• These have some basic support from some runtime such as .NET, Jini (pure Java), Apache Tomcat+Axis (Web Service toolkit), Enterprise JavaBeans, WebSphere (IBM) or GT3 (Globus Toolkit 3)
– These are the distributed equivalent of operating system functions as in UNIX Shell

• Called Hosting Environment or platform

Page 11: Grid Technology A: Web Services Globus OGSA

Some Basic Observations
• Grids manage and share asynchronous resources in a rather centralized fashion
• Peer-to-peer networks are “just like” Grids with different implementations of services like registration and look-up
• Web Services interact with messages
– Everything (including applications like PowerPoint) will be a WS? See later short discussion
• Computers are fast and getting faster. One can afford many strategies that used to be unrealistic
– All messages can be publish/subscribe
– Software message routing
• XML will be used for most interesting data and meta-data
– One will store/consider data and meta-data separately but often use same technology to manage both of them.
• Need Synchronous and Asynchronous Resource Sharing
– Integrate Grid and Collaboration technology
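
The "all messages can be publish/subscribe" and "software message routing" points can be illustrated with a minimal in-process sketch. This is only an assumption-laden toy: the `Broker` class, the topic name and the XML payload are hypothetical and do not correspond to any messaging system named in the talk.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy publish/subscribe router: messages are routed in software by topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        # Register a handler; the publisher never needs to know who listens.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        # Deliver the message to every handler on the topic and report the count.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


received: List[str] = []
broker = Broker()
broker.subscribe("job/status", received.append)
delivered = broker.publish("job/status", "<status job='42'>done</status>")
```

The design point is the indirection: source and destination are matched by the broker, not by the services themselves.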

Page 12: Grid Technology A: Web Services Globus OGSA

Classic Grid Architecture

[Figure: Classic Grid Architecture. Labels: Resources; Database, Database, Netsolve, Computing, Security, Collaboration, Composition, Content Access; Middle Tier (Brokers, Service Providers); Clients (Users and Devices). Caption: Middle Tier becomes Web Services]

Page 13: Grid Technology A: Web Services Globus OGSA

What is a Web Service I
• A web service is a computer program running on either the local or remote machine with a set of well defined interfaces (ports) specified in XML (WSDL)
• In principle, computer program can be in any language (Fortran .. Java .. Perl .. Python) and the interfaces can be implemented in any way whatsoever
– Interfaces can be method calls, Java RMI Messages, CGI Web invocations, totally compiled away (inlining) but
• The simplest implementations involve XML messages (SOAP) and programs written in net friendly languages like Java and Python
• Web Services separate the meaning of a port (message) interface from its implementation
• Enhances/Enables Re-usable component model of ANY electronic resource

Page 14: Grid Technology A: Web Services Globus OGSA

[Figure: Web Services (WS) linked by XML WS to WS Interfaces. Labels: Raw Data, Raw Resources; (Virtual) XML Data Interface; Web Service (WS); (Virtual) XML Knowledge (User) Interface; Render to XML Display Format; (Virtual) XML Rendering Interface; Clients; etc.]

Page 15: Grid Technology A: Web Services Globus OGSA

What is a Web Service II
• Web Services have important implication that ALL interfaces are XML message based. In contrast
• Most Windows programs have interfaces defined as interrupts due to user inputs
• Most software has interfaces defined as methods which might be implemented as a message but this is often NOT explicit

[Figure: Security, Catalog, Payment (Credit Card) and Warehouse (shipping) services connected through WSDL interfaces]

Page 16: Grid Technology A: Web Services Globus OGSA

What is a Web Service III
• “Everything electronic” is a resource
– Computers; Programs; People
– Data (from sensors to this presentation to email to databases)
• “Everything electronic” is a distributed object
• All resources have interfaces which are defined in XML for both properties (data-structure) and methods (service, function, subroutine) (Resources are Services)
– We can assume that a data-structure property has getproperty() and setproperty(value) methods to act as interface
• All resources are linked by messages with structure, which must be specifiable in XML
• All resources have a URI such as unique://a/b/c …….

Page 17: Grid Technology A: Web Services Globus OGSA

WSDL Abstractions
• WSDL abstracts a program as an entity that does something given one or more inputs with its results defined by streams on one or more outputs.
• Functions are defined by method name and parameters: methodname(parm1,parm2, … parmN)
– Where parameters are “Input” “Output” or both
• In WSDL, we will have a Web Service which, like a Java or CORBA program, can be thought of as a (distributed) object with many methods
– Instead of a function call, the “calling routine” sends an XML message to the Web Service specifying methodname and values of the parameters
– Note name of function is just another parameter
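
To make "the name of the function is just another parameter" concrete, here is a small sketch using only the Python standard library. The element names (`call`, `method`, `param`) and the `add` method are made up for illustration; this is not SOAP and not any real toolkit's wire format.

```python
import xml.etree.ElementTree as ET


def encode_call(method_name, **params):
    """Serialize a call as an XML message; the method name travels as data."""
    call = ET.Element("call")
    ET.SubElement(call, "method").text = method_name
    for name, value in params.items():
        ET.SubElement(call, "param", name=name).text = str(value)
    return ET.tostring(call, encoding="unicode")


def dispatch(xml_text, methods):
    """Service side: read the method name from the message and invoke it."""
    call = ET.fromstring(xml_text)
    name = call.findtext("method")
    args = {p.get("name"): p.text for p in call.findall("param")}
    return methods[name](**args)


# Hypothetical service exposing one method; arguments arrive as strings.
methods = {"add": lambda a, b: int(a) + int(b)}
message = encode_call("add", a=2, b=3)
result = dispatch(message, methods)
```

The caller never links against `add`; it only needs to know the message structure, which is the role WSDL plays for real Web Services.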

Page 18: Grid Technology A: Web Services Globus OGSA

Details of WSDL Protocol Stack
• UDDI finds where programs are
– remote (distributed) programs are just Web Services
– (not a great success)
• WSFL links programs together (under revision as BPEL4WS)
• WSDL defines interface (methods, parameters, data formats)
• SOAP defines structure of message including serialization of information
• HTTP is negotiation/transport protocol
• TCP/IP is layers 3-4 of OSI
• Physical Network is layer 1 of OSI

[Figure: protocol stack, top to bottom]
UDDI or WSIL
WSFL
WSDL
SOAP or RMI
HTTP or SMTP or IIOP or RMTP
TCP/IP
Physical Network

Page 19: Grid Technology A: Web Services Globus OGSA

Education as a Web Service
• Can link to Science as a Web Service and substitute educational modules
• “Learning Object” XML standards already exist from IMS/ADL http://www.adlnet.org – need to update architecture
• Web Services for virtual university include:
– Registration
– Performance (grading)
– Authoring of Curriculum
– Online laboratories for real and virtual instruments
– Homework submission
– Quizzes of various types (multiple choice, random parameters)
– Assessment data access and analysis
– Synchronous Delivery of Curricula
– Scheduling of courses and mentoring sessions
– Asynchronous access, data-mining and knowledge discovery
– Learning Plan agents to guide students and teachers

Page 20: Grid Technology A: Web Services Globus OGSA

What are System and Application Services?
• There are generic Grid system services: security, collaboration, persistent storage, universal access
– OGSA (Open Grid Service Architecture) is implementing these as extended Web Services
• An Application Web Service is a capability used either by another service or by a user
– It has input and output ports – data is from sensors or other services
• Consider Satellite-based Sensor Operations as a Web Service
– Satellite management (with a web front end)
– Each tracking station is a service
– Image Processing is a pipeline of filters – which can be grouped into different services
– Data storage is an important system service
– Big services built hierarchically from “basic” services
• Portals are the user (web browser) interfaces to Web services

Page 21: Grid Technology A: Web Services Globus OGSA

Application Web Services
• Note Service model integrates sensors, sensor analysis, simulations and people
• An Application Web Service is a capability used either by another service or by a user
– It has input and output ports – data is from users, sensors or other services
– Big services built hierarchically from “basic” services

[Figure: Sensor Data as a Web service (WS), Sensor Management WS, Data Analysis WS, Simulation WS and Visualization WS. Data Analysis WS: build as multiple Filter Web Services (Filter1 WS, Filter2 WS, Filter3 WS). Simulation WS: build as multiple interdisciplinary Programs (Prog1 WS, Prog2 WS)]

Page 22: Grid Technology A: Web Services Globus OGSA

The Application Service Model
• As bandwidth of communication (between) services increases one can support smaller services
• A service “is a component” and is a replacement for a library in case where performance allows
• Services (components) are a sustainable model of software development – each service has documented capability with standards compliant interfaces
– XML defines interfaces at several levels
– WSDL at Service interface level and XSIL or equivalent for scientific data format
• A service can be written as Perl, Python, Java Servlet, Enterprise JavaBean, CORBA (C++ or Fortran) Object …
• Communication protocol can be RMI (Java), IIOP (CORBA) or SOAP (HTTP, XML) ……

Page 23: Grid Technology A: Web Services Globus OGSA

Application as a Web service

[Figure: Application with W3C DOM Structure as a Web Service, shown as MVC (M: Model, C: Control, V: View). Labels: Application Model; Data; Control; User Facing Ports; Resource Facing Ports; Events as Messages; Rendering as Messages; W3C DOM User Interface; W3C DOM Raw (UI) Events; Selected W3C DOM Semantic Events; Remaining W3C DOM Semantic Events; Application View and Selected Control; View]

Page 24: Grid Technology A: Web Services Globus OGSA

7 Primitives in WSDL
• types: which provides data type definitions used to describe the messages exchanged.
• message: which represents an abstract definition of the data being transmitted. A message consists of logical parts, each of which is associated with a definition within some type system.
• operation: an abstract description of an action supported by the service.
• portType: which is a set of abstract operations. Each operation refers to an input message and output messages.
• binding: which specifies concrete protocol and data format specifications for the operations and messages defined by a particular portType.
• port: which specifies an address for a binding, thus defining a single communication endpoint.
• service: which is used to aggregate a set of related ports

Page 25: Grid Technology A: Web Services Globus OGSA

[Figure: Browser Interface, linked by HTTP(S) to User Interface Server + Client Stubs, linked by SOAP/HTTP(S) to Server plus Service Implementations, linked by local invocation, JDBC connection or Grid Protocol to Backend Resources]

• UI Server has stubs for all services (database access, job submission, file transfer, etc.)
• A particular server has several service implementations.
• Backend is a database, application code plus operating system.

Page 26: Grid Technology A: Web Services Globus OGSA

<?xml version="1.0" encoding="UTF-8"?>
<!-- Abbreviated listing: child elements collapsed on the slide are shown as empty elements. -->
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  xmlns:wsdlsoap="http://schemas.xmlsoap.org/wsdl/soap/">
  <wsdl:message name="execLocalCommandResponse"/>
  <wsdl:message name="execLocalCommandRequest"/>
  <wsdl:portType name="SJwsImp">
    <wsdl:operation name="execLocalCommand" parameterOrder="in0">
      <wsdl:input message="impl:execLocalCommandRequest" name="execLocalCommandRequest"/>
      <wsdl:output message="impl:execLocalCommandResponse" name="execLocalCommandResponse"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="SubmitjobSoapBinding" type="impl:SJwsImp">
    <wsdlsoap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="execLocalCommand">
      <wsdlsoap:operation soapAction=""/>
      <wsdl:input name="execLocalCommandRequest"/>
      <wsdl:output name="execLocalCommandResponse"/>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="SJwsImpService">
    <wsdl:port binding="impl:SubmitjobSoapBinding" name="Submitjob"/>
  </wsdl:service>
</wsdl:definitions>

Page 27: Grid Technology A: Web Services Globus OGSA

Discussion of 7 WSDL Primitives
• types specify data-structures which are equivalent to arguments of methods
• message specifies collections of types and is equivalent to set of arguments in a method call. Note that it is an “abstract method” in Java terminology
• operation is a collection of input, output and fault messages; there are 4 types of operation: one-way (service just receives a message), request-response (RPC), solicit-response, notification (service pushes out a message)
• portType represents a single channel that can support multiple operations. It is “abstract” as specified as a set of operations. It is equivalent to an “interface or abstract class” in Java
• binding tells you transport and message format for a portType (which can have multiple bindings to reflect say performance-portability trade-offs)
• port combines a binding and an endpoint network address (URL) and is like a “class instance”
• service consists of multiple ports and is equivalent to a “program” in Java
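
The containment relations between the primitives can be sketched as plain Python data classes. This is an illustrative model only, not a WSDL parser. The names `execLocalCommand`, `SJwsImp`, `SubmitjobSoapBinding`, `Submitjob` and `SJwsImpService` come from the WSDL listing earlier in the deck; the part types, the response part name and the endpoint URL are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    name: str
    parts: Dict[str, str]  # part name -> type name (the "types" primitive)


@dataclass
class Operation:
    name: str
    input_message: Message
    output_message: Message


@dataclass
class PortType:  # abstract: a named set of operations
    name: str
    operations: List[Operation]


@dataclass
class Binding:  # concrete protocol and data format for one portType
    name: str
    port_type: PortType
    protocol: str


@dataclass
class Port:  # a binding plus a network address
    name: str
    binding: Binding
    address: str


@dataclass
class Service:  # aggregates related ports
    name: str
    ports: List[Port] = field(default_factory=list)


# Part types, the response part name and the URL below are hypothetical.
request = Message("execLocalCommandRequest", {"in0": "xsd:string"})
response = Message("execLocalCommandResponse", {"result": "xsd:string"})
operation = Operation("execLocalCommand", request, response)
port_type = PortType("SJwsImp", [operation])
binding = Binding("SubmitjobSoapBinding", port_type, "SOAP over HTTP, rpc style")
port = Port("Submitjob", binding, "http://example.org/services/Submitjob")
service = Service("SJwsImpService", [port])
```

Reading the chain `service -> port -> binding -> portType -> operation -> message` mirrors the order in which the abstract description becomes a concrete endpoint.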

Page 28: Grid Technology A: Web Services Globus OGSA

OGSA OGSI & Hosting Environments

[Figure: OGSA Platform layers. Labels: Hosting Environment, Hosting Environment, Transport, Protocol (Host. Env. & Protocol Bindings); OGSI; OGSA Platform services: registry, authorization, monitoring, data access, etc., etc.; Models for resources & other entities; More specialized & domain-specific services; Other models; Domain-specific profiles; Environment-specific profiles; OGSA Platform]

• Start with Web Services in a hosting environment
• Add OGSI to get a Grid service and a component model
• Add OGSA to get Interoperable Grid “correcting” differences in base platform and adding key functionalities

Page 29: Grid Technology A: Web Services Globus OGSA

Functional Level above OGSA
• Systems Management and Automation
• Workload / Performance Management
• Security
• Availability / Service Management
• Logical Resource Management
• Clustering Services
• Connectivity Management
• Physical Resource Management
• Perhaps Data Access belongs here

Page 30: Grid Technology A: Web Services Globus OGSA

Two-level Programming I
• The paradigm implicitly assumes a two-level Programming Model
• We make a Service (same as a “distributed object” or “computer program” running on a remote computer) using conventional technologies
– C++, Java or Fortran Monte Carlo module
– Data streaming from a sensor or Satellite
– Specialized (JDBC) database access
• Such nuggets accept and produce data from users, files and databases
• The Grid is built by coordinating such nuggets assuming we have solved problem of programming the nugget

[Figure: Nugget, Data]

Page 31: Grid Technology A: Web Services Globus OGSA

Two-level Programming II
• The Grid is discussing the linkage and distribution of the nuggets, with the only addition being runtime interfaces to the Grid as opposed to UNIX data streams
• Familiar from use of UNIX Shell, PERL or Python scripts to produce real applications from core programs
• Such interpretative environments are the single processor analog of Grid Programming
• Some projects like GrADS from Rice University are looking at integration between nugget levels but dominant effort looks at each level separately

[Figure: Nugget1, Nugget2, Nugget3, Nugget4]

Page 32: Grid Technology A: Web Services Globus OGSA

Why we can dream of using HTTP and that slow stuff

• We have at least three tiers in computing environment
– Client (user portal discussed Thursday)
– “Middle Tier” (Web Servers/brokers)
– Back end (databases, files, computers etc.)
• In Grid programming, we use HTTP (and used to use CORBA and Java RMI) in middle tier ONLY to manipulate a proxy for real job
– Proxy holds metadata
– Control communication in middle tier only uses metadata
– “Real” (data transfer) high performance communication in back end
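
A minimal sketch of the proxy idea, under stated assumptions: the `JobProxy` fields, the job identifier and the storage host are hypothetical. The point it illustrates is that the middle-tier control message carries only metadata (including where the bulk data lives), so its size does not grow with the data.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class JobProxy:
    """Middle-tier stand-in for a real job: metadata only, no bulk data."""
    job_id: str
    executable: str
    input_url: str    # where the data sits in the back end (hypothetical host)
    input_bytes: int  # size is recorded, the bytes themselves are not carried


def control_message(proxy: JobProxy) -> str:
    # This is what would travel over a slow protocol such as HTTP.
    return json.dumps(asdict(proxy))


proxy = JobProxy(
    job_id="job-1",
    executable="/bin/ls",
    input_url="gsiftp://storage.example.org/data/run1",
    input_bytes=10**12,
)
message = control_message(proxy)
```

Here a terabyte of input is described by a message of roughly a hundred characters; the transfer itself would happen between back-end systems.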

Page 33: Grid Technology A: Web Services Globus OGSA

[Figure: Grid Computing Environments. Labels: Portal Services; User Services; System Services (several instances); Application Service; Application Metadata; Actual Application; Middleware; “Core” Grid; Database; Raw (HPC) Resources]

Page 35: Grid Technology A: Web Services Globus OGSA

PPPH: Paradigms Protocols Platforms and Hosting II

• Self-describing programs/interfaces are key to scaling
– Minimize amount of work system has to do
– Hide as much as possible in services and applications
• Protocols describe (in “principle” at least) those rules that system obeys and uses to deliver information between services (processes)
• Interfaces tell the service what to do to interpret the results of communication
• HTTP is the dominant transport protocol of the Web
• HTML is the “interface” telling browser how to render
• But you can extend interface to allow PDF, multimedia, PowerPoint using “helper applications” which (with more or less convenience) are “automatically” downloaded if not already available
– “Mime types” essentially “self-describe” each interface

Page 36: Grid Technology A: Web Services Globus OGSA

Analogy with Web II
• HTTP and HTML are the analogies on the client side
• A “Web Service” generalizes a CGI Script on server side
– CGI is essentially a Distributed Object technology allowing server to access an arbitrary program labeled by a URL plus an ugly syntax to specify name and parameters of program to run
• Roughly WSDL (Web Service Description Language) is a better way to specify program name and its parameters
• Web uses other protocols – HTTPS for secure links and RTP etc. for multimedia (UDP) streams
– These again are required to integrate system – codecs like MPEG are interfaces interpreted by client
– There are further protocols like H323 and SIP which will be replaced (IMHO) by HTTP plus RTP etc. We should minimize number of protocols to get maintainable systems
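
As a small illustration of the CGI style referred to above, where a URL names the program and a query string carries its parameters, the following sketch parses such a URL with the Python standard library. The URL, the program name `submit` and its parameters are invented for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical CGI invocation: program "submit" with two parameters.
url = "http://example.org/cgi-bin/submit?command=ls&dir=%2Ftmp"

parts = urlsplit(url)
program = parts.path.rsplit("/", 1)[-1]
# parse_qs returns a list per key; keep the first value of each parameter.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}
```

Nothing in the URL says which parameters exist or what types they take; a WSDL description states exactly that in a machine-readable form.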

Page 37: Grid Technology A: Web Services Globus OGSA

PPPH: Paradigms Protocols Platforms and Hosting III

• There is a set of system capabilities which cannot be captured as standalone services and permeate the Grid
• Meta-data rich Message-linked Web Services is permeating paradigm
• Component Model such as “Enterprise JavaBean (EJB)” or OGSI describes the formal structure of services – EJB if used lives inside OGSI in our Grids
• Invocation Framework describes how you interact with system
• Security in fine grain fashion to provide selective authorization (Globus and EDG WP6)
• Policy context describes rules for this particular Grid
• Transport mechanisms abstract concepts like ports and Quality of Service
• Messaging abstracts destination and customization of content
• Network (monitoring, performance) EDG WP7
• Fabric (resources) EDG WP4

Page 38: Grid Technology A: Web Services Globus OGSA

Architecture in Pictures I

[Figure: Labels: Network; Resources; Services; Messaging; Services; Messaging; Abstract Model OGSI; Invocation Framework. Caption: Hosting Environment determines physical model]

Page 39: Grid Technology A: Web Services Globus OGSA

Architecture in Pictures II: OGSA Interoperable Grid

[Figure: Labels: Network; Resources; OGSI Grid Services; OGSA Interfaces; Messaging; Network Monitoring and Scheduling]

Page 40: Grid Technology A: Web Services Globus OGSA

Architecture in Pictures III: OGSA Federated Grid

[Figure: Labels: Network; Resources; Native Services; Messaging; Network Monitoring and Scheduling; Mediation Service (two instances) converting between OGSA and “native” services]

Page 41: Grid Technology A: Web Services Globus OGSA

Virtualization
• The Grid could and sometimes does virtualize various concepts
• Location: URI (Uniform Resource Identifier) virtualizes URL
• Replica management (caching) virtualizes file location generalized by GriPhyn virtual data concept
• Protocol: message transport and WSDL bindings virtualize transport protocol as a QoS request
• P2P or Publish-subscribe messaging virtualizes matching of source and destination services
• Semantic Grid virtualizes Knowledge as a meta-data query
• Brokering virtualizes resource allocation
• Virtualization implies references can be indirect
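
"References can be indirect" can be illustrated with a toy replica catalog that maps a logical file name to physical locations. This is a sketch under assumptions: the `ReplicaCatalog` class, the `lfn:` naming and the hosts are hypothetical and do not reproduce any Globus or GriPhyn interface.

```python
from typing import Dict, List, Optional
from urllib.parse import urlsplit


class ReplicaCatalog:
    """Toy catalog: one logical name, possibly many physical replicas."""

    def __init__(self) -> None:
        self._replicas: Dict[str, List[str]] = {}

    def register(self, logical_name: str, physical_url: str) -> None:
        self._replicas.setdefault(logical_name, []).append(physical_url)

    def resolve(self, logical_name: str, preferred_host: Optional[str] = None) -> str:
        # Prefer a replica on the requested host, else fall back to the first one.
        replicas = self._replicas[logical_name]
        for url in replicas:
            if urlsplit(url).hostname == preferred_host:
                return url
        return replicas[0]


catalog = ReplicaCatalog()
catalog.register("lfn:run1.dat", "gsiftp://cern.example.org/data/run1.dat")
catalog.register("lfn:run1.dat", "gsiftp://iu.example.org/store/run1.dat")
nearby = catalog.resolve("lfn:run1.dat", preferred_host="iu.example.org")
default = catalog.resolve("lfn:run1.dat")
```

Applications hold only the logical name; which copy is actually read can change without touching them.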

Page 42: Grid Technology A: Web Services Globus OGSA

IFS: Interfaces and Functionality and Semantics I
• The Grid platform tries to minimize detail in protocols and maximize detail in interfaces to enhance scaling
• However rich meta-data and semantics are critical for correct and interesting operation
– Put as much semantic interpretation as you can into specific services
– Lack of Semantic interoperation is in fact main weakness of today’s Grids and Web services
• Everything becomes a service (See example of education) whether system or application level
• There are some very important “Global Services”
– Discovery (look up) and Registration of service metadata
– Workflow
– MetaSchedulers

Page 43: Grid Technology A: Web Services Globus OGSA

IFS: Interfaces and Functionality and Semantics II
• There are many other generally important services
• OGSA-DAI The Database Service
• Portal Service linked to by WSRP (Web services for Remote Portals)
• Notification of events
• Job submission
• Provenance – interpret meta-data about history of data
• File Interfaces
• Sensor service – satellites …
• Visualization
• Basic brokering/scheduling

Page 44: Grid Technology A: Web Services Globus OGSA

Globus in a Nutshell from IPG
• GT2 (or Globus Toolkit 2) is original (non web service based) version which is basis of EDG (European Data Grid) work
• C programs and libraries
• See Chapter 5 of book with background in chapters 2-4 and 37
• http://www.ipg.nasa.gov/ipgusers/globus/
• http://www.globusworld.org/globusworld_web/jw2_program_tut.htm

Page 45: Grid Technology A: Web Services Globus OGSA

Globus GT2 from IPG
• The goal of the Globus GT2 is to provide dependable, consistent, pervasive access to high-end resources.
– This is original Grid “start” generalized recently to virtual organizations and data grids
• The Globus Project offers the most widely used computing grid middleware. The Globus Project is a joint effort of Argonne National Laboratory and the Information Sciences Institute of the University of Southern California, in collaboration with numerous other organizations including NCSA, NPACI, UCSD, and NASA. See http://www.globus.org/ for history, goals, release and usage notes, software distributions, and research papers.

Page 46: Grid Technology A: Web Services Globus OGSA

Globus GT2 II
• Grid Fabric: Layer One
The fabric of the Grid comprises the underlying systems, computers, operating systems, networks, storage systems, and routers: the building blocks.
• Grid Services: Layer Two
Grid services integrate the components of the Grid fabric. Examples of the services that are provided by Globus Toolkit 2:
• GRAM
The Globus Resource Allocation Manager, GRAM, is a basic library service that provides capabilities to do remote-submission job start up. GRAM unites Grid machines, providing a common user interface so that you can submit a job to multiple machines on the Grid fabric. GRAM is a general, ubiquitous service, with specific application toolkit commands built on top of it
• MDS
The Monitoring and Discovery Service, also known as GIS, the Grid Information Service, provides information service. You query MDS to discover the properties of the machines, computers and networks that you want to use: how many processors are available at this moment? What bandwidth is provided? Is the storage on tape or disk? Is the visualization device an immersive desk or CAVE? Using an LDAP (Lightweight Directory Access Protocol) server, MDS provides middleware information in a common interface to put a unifying picture on top of disparate equipment.
• Contd …

Page 47: Grid Technology A: Web Services Globus OGSA

Globus GT2 III
• GSI: gss-api library for adding authentication to a program. GSI provides programs, such as grid-proxy-init, to facilitate login to a variety of sites, while each site has its own flavor of security measures. That is, on the fabric layer, the various machines you want to use might be governed by disparate security policies; GSI provides a means of simplifying multiple remote logins. The standard installation is based on a PKI security system; the Kerberos installation of Globus is less standard. (Some installations with DoE and DoD insist on Kerberos)
• GridFTP: A new (in Globus 2.0) protocol for file transfer over a grid. This is a Global Grid Forum standard
• GASS: Globus Access to Secondary Storage, provides command-line tools and C APIs for remotely accessing data. GASS integrates GridFTP, HTTP, and local file I/O to enable secure transfers using any combination of these protocols.

Page 48: Grid Technology A: Web Services Globus OGSA

Globus GT2 IV
• Application Toolkits: Layer Three
Application toolkits use Grid Services to provide higher-level capabilities, often targeted to specific classes of application.
• For example, the Globus development team has created a set of Grid service tools and a toolkit of programs for running remotely distributed jobs. These include remote job submission commands (globusrun, globus-job-submit, globus-job-run), built on top of the GRAM service, and MPICH-G2, a Grid-enabled implementation of the Message Passing Interface (MPI).
• A more modern interface is through CoG Kits (Commodity Grid) to different languages – Perl Python Java – see chapter 26 of Book
• The Java CoG kit provides a natural way to link GT2 to a Web service framework
• Globus Toolkit 3 (GT3) effectively integrated CoG Kit interface with core Globus by wrapping all Globus Services as Web services

Page 49: Grid Technology A: Web Services Globus OGSA

Job Submission in Globus
• Very similar to UNIX Shell – build Portal Web Interfaces to specific or general Shell commands. Some example commands
• globusrun: Runs a single executable on a remote site with an RSL specification.
• globus-job-cancel: Cancels a job previously started using globus-job-submit.
• globus-job-run: Allows you to run a job at one or several remote resources. It translates the program arguments to an RSL request and uses globusrun to submit the job.
• globus-job-clean: Kills the job if it is still running and cleans the information concerning the job.
• globus-job-status: Display the status of the job. See also globus-get-output to check the standard output or standard error of your job.
• These are all controlled by metadata specified by the Globus Resource Specification Language (RSL) which provides a common language to describe jobs and the resources required to run them.
• http://www.globus.org/gram/gram_rsl_parameters.html
• The simplest RSL expression looks something like the following.

(executable=/bin/ls)
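
As a hedged illustration of how such job metadata might be assembled by a portal, the sketch below builds RSL strings from attribute/value pairs. The `rsl` helper is hypothetical and not part of Globus; it assumes simple values with no spaces and skips the quoting and validation a real tool would need.

```python
def rsl(**attributes):
    """Build an RSL string from attribute=value relations (simple values only)."""
    relations = "".join(f"({name}={value})" for name, value in attributes.items())
    # A single relation stands alone; several are joined as a conjunction with '&'.
    return relations if len(attributes) == 1 else "&" + relations


simplest = rsl(executable="/bin/ls")
with_count = rsl(executable="/bin/ls", count=2)
```

Under these assumptions `simplest` reproduces the expression shown above, and `with_count` adds a second relation asking for two instances of the executable.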

Page 50: Grid Technology A: Web Services Globus OGSA

Virtual Data Toolkit VDT from GriPhyN
• http://www.lsc-group.phys.uwm.edu/vdt/
• Trillium (PPDG from DoE, GriPhyN and iVDGL from NSF) is a major US effort building Grid application software with a strong particle physics emphasis
• VDT is their major software release and its heart is Condor and GT2.
  – There is some “virtual data” software as well, but it is not clear if this is of interest in production use (interesting research area)
• Condor (Chapter 11 of Book) is a powerful job scheduler for clusters and “cycle scavenging”
  – It has a well developed interface (ClassAds) for defining requirements of jobs and matching to compute capabilities
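The matching idea can be illustrated with a small sketch: a job carries a requirements predicate and a rank function, and the matchmaker keeps the machines that satisfy the first and orders them by the second. This is plain Python rather than ClassAd syntax, it matches in one direction only, and all machine names and attributes are made up for illustration.

```python
def matchmake(job, machines):
    """Return the machines that satisfy the job, best ranked first."""
    candidates = [m for m in machines if job["requirements"](m)]
    return sorted(candidates, key=job["rank"], reverse=True)


# Hypothetical machine advertisements.
machines = [
    {"name": "node-a", "os": "LINUX", "memory_mb": 256, "mips": 900},
    {"name": "node-b", "os": "LINUX", "memory_mb": 128, "mips": 1200},
    {"name": "node-c", "os": "SOLARIS", "memory_mb": 512, "mips": 700},
]

# Hypothetical job advertisement: needs Linux and 128 MB, prefers fast CPUs.
job = {
    "requirements": lambda m: m["os"] == "LINUX" and m["memory_mb"] >= 128,
    "rank": lambda m: m["mips"],
}

ranked = [m["name"] for m in matchmake(job, machines)]
print(ranked)
```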


OGSA/OGSI Top Level View

• OGSA is the set of “core” Grid services
  – Stuff you can’t live without
  – If you built a Grid you would need to invent these things

[Figure: OGSA/OGSI layered view. Labels: Transport, Protocol and Hosting Environment (host. env. & protocol bindings); OGSI; models for resources & other entities; broadly applicable services (registry, authorization, monitoring, data access, etc.); more specialized services (data replication, workflow, etc.); domain-specific services; other models]

Chapters 7 to 9 of Book
http://www.gridforum.org/Meetings/ggf7/docs/default.htm

http://www.globusworld.org/globusworld_web/jw2_program_tut.htm


OGSI Open Grid Service Interface
• http://www.gridforum.org/ogsi-wg
• It is a “component model” for web services.
• It defines a set of behavior patterns that each OGSI service must exhibit.
• Every “Grid Service” portType extends a common base type.
  – Defines an introspection model for the service
  – You can query it (in a standard way) to discover
    • What methods/messages a port understands
    • What other port types the service provides
    • If the service is “stateful”, what the current state is
• A set of standard portTypes for
  – Message subscription and notification
  – Service collections
• Each service is identified by a URI called the “Grid Service Handle” (GSH)
• GSHs are bound dynamically to Grid Service References (GSRs, typically WSDL docs)
  – A GSR may be transient. GSHs are fixed.
  – Handle map services translate GSHs into GSRs.
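The handle/reference split can be pictured as a lookup table whose keys never change while the values may be rebound. The sketch below is a minimal in-memory illustration; the class name HandleMap, its methods and the example URIs are hypothetical and do not come from the OGSI specification.

```python
class HandleMap:
    """Toy handle map: fixed GSH (a URI) -> current GSR (binding details)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, gsh, gsr):
        # Rebinding replaces the reference; the handle itself stays the same.
        self._bindings[gsh] = gsr

    def resolve(self, gsh):
        try:
            return self._bindings[gsh]
        except KeyError:
            raise LookupError("no reference bound to handle " + gsh)


handle_map = HandleMap()
gsh = "http://example.org/ogsi/handles/job-42"

handle_map.bind(gsh, {"endpoint": "http://host-a.example.org:8080/job",
                      "wsdl": "job.wsdl"})
first = handle_map.resolve(gsh)

# The service moves to another host: the GSR changes, the GSH does not.
handle_map.bind(gsh, {"endpoint": "http://host-b.example.org:8080/job",
                      "wsdl": "job.wsdl"})
second = handle_map.resolve(gsh)
print(first["endpoint"], "->", second["endpoint"])
```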


OGSI and Stateful Services
• Sometimes you can send a message to a service, get a result and that’s the end
  – This is a state-free service
• However most non-trivial services need state to allow persistent asynchronous interactions
• OGSI is designed to support Stateful services through two mechanisms
  – Information Port: where you can query for SDEs (Service Data Elements)
  – “Factories” that allow one to view a Service as a “class” (in an object-oriented language sense) and create separate instances for each Service invocation
• There are several interesting issues here
  – Difference between Stateful interactions and Stateful services
  – System or Service managed instances
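Both mechanisms can be sketched in a few lines: a factory plays the role of the “class” and hands out a handle per instance, and each instance exposes its state through a query method standing in for the information port. All names below (CounterFactory, CounterInstance, service_data, the example URI) are hypothetical and only illustrate the pattern.

```python
import itertools


class CounterInstance:
    """One stateful service instance created by the factory."""

    def __init__(self, handle):
        self.handle = handle
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

    def service_data(self):
        # Stand-in for querying the instance's service data elements.
        return {"handle": self.handle, "count": self.count}


class CounterFactory:
    """Explicit factory: creates instances and remembers them by handle."""

    def __init__(self, base_uri):
        self._base_uri = base_uri
        self._ids = itertools.count(1)
        self._instances = {}

    def create(self):
        handle = "{}/{}".format(self._base_uri, next(self._ids))
        self._instances[handle] = CounterInstance(handle)
        return handle

    def lookup(self, handle):
        return self._instances[handle]


factory = CounterFactory("http://example.org/counter")
a = factory.create()
b = factory.create()
factory.lookup(a).increment()
factory.lookup(a).increment()
factory.lookup(b).increment()
print(factory.lookup(a).service_data(), factory.lookup(b).service_data())
```

Because the state lives in the instance rather than in any one exchange of messages, it can be queried outside a particular interaction.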


Factories and OGSI
• Stateful interactions are typified by amazon.com, where messages carry correlation information allowing multiple messages to be linked together
  – Amazon preserves state in this fashion, which is in fact preserved in its database permanently
• Stateful services have state that can be queried outside a particular interaction
• Also note difference between implicit and explicit factories
  – Some claim that implicit factories scale as each service manages its own instances and so do not need to worry about registering instances and lifetime management
• See WS-Addressing, largely from IBM and Microsoft: http://msdn.microsoft.com/webservices/default.aspx?pull=/library/en-us/dnglobspec/html/ws-addressing.asp

[Figure: two panels, “Explicit Factory” and “Implicit Factory”, each showing a FACTORY with service instances 1 to 4]


Open Grid Service Architecture
• OGSA-WG chaired by
  – Ian Foster, ANL and Univ. of Chicago
  – Jeff Nick, IBM
  – Dennis Gannon, IU
• Active Members from
  – IBM, Fujitsu, NEC, SUN, Hitachi, Avaki
  – Univ. of Mich, Chicago, Indiana (not much academic involvement)


OGSA Core Services I

• Registries, and namespace bindings
  – Registry is a collection of services indexed by service metadata.
    • “find me a service with property X.”
  – Directory is a map from a namespace to GSHs.
  – A namespace is a human understandable version of a Grid Handle
• Queues
  – For building schedulers and resource brokers
  – Jobs and other requests are in queues
  – This is high-level messaging
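The registry/directory distinction can be made concrete with a toy example: the registry answers metadata queries (“find me a service with property X”), while the directory maps a readable name to a handle. The Registry class, the metadata keys and the URIs are hypothetical.

```python
class Registry:
    """Toy registry: service handles indexed by free-form metadata."""

    def __init__(self):
        self._entries = []

    def register(self, gsh, **metadata):
        self._entries.append((gsh, metadata))

    def find(self, **wanted):
        # Return every handle whose metadata contains all wanted properties.
        return [gsh for gsh, metadata in self._entries
                if all(metadata.get(key) == value
                       for key, value in wanted.items())]


registry = Registry()
registry.register("http://example.org/handles/blast-1",
                  kind="application", name="blast", site="IU")
registry.register("http://example.org/handles/blast-2",
                  kind="application", name="blast", site="ANL")
registry.register("http://example.org/handles/queue-1",
                  kind="queue", site="IU")

# Directory: a human-readable namespace mapped onto handles.
directory = {"/apps/blast/iu": "http://example.org/handles/blast-1"}

blast_services = registry.find(kind="application", name="blast")
iu_services = registry.find(site="IU")
print(blast_services, iu_services, directory["/apps/blast/iu"])
```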


Security
• Base this on Web Services Security
• Authentication
  – 2-way. Who are you and who am I?
• Authorization
  – What am I authorized to use/see/modify
• Accounting/Billing
  – (not really security – see monitoring)
• Privacy
• Group Access
  – Easily create a group to share access to a virtual Grid.
• Very complex issues related to services and message delivery.


Common Resource Model

• Every resource on the grid that is manageable is represented by a service instance
  – CRM is the Schema hierarchy that defines each resource (with its meta-data)
  – Service for a resource presents its management interface to authorized parties.


Policy Management
• Policy management services
  – Mechanism to publish policy and the services it applies to.
  – Policy life-cycle mgmt.
• Policy languages exist for routing, security, resource use

[Figure: policy management architecture. Labels: Policy Service Manager; Policy Service Agent; Policy Enforcement Point; Admin GUI / Autonomic Manager; XML Repository with Canonical Policies; Policy Service Core (Policy Transformation Service, Policy Validation Service, Policy Resolution Service); Common Resource Model in front of each Device / Resource; Non-Canonical; Producer of Policies; Consumer of Policies]

Policy Component Requirements:
• A management control point for policy lifecycle (PSM)
• A canonical way to express policies (AC 4-tuple)
• A distribution point for policy dissemination (PSA)
• A way to express that a service is “policy aware” (PEP)
• A way to effect change on a resource (CRM)


Grid Service Orchestration

• Creating new services by composing other services

• Two types of Orchestration
  – Composition in space
    • One service directly invokes another
  – Composition in time
    • Managing the workflow
      – First do this.
      – Then do this and that
      – When that is done do this
        » If something goes wrong do this
      – And so on…
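The two kinds of composition can be contrasted in a short sketch, with plain Python functions standing in for services. The function names and the error-handling convention are assumptions for illustration; a real workflow engine would add parallel branches, persistence and so on.

```python
def compose_in_space(outer, inner):
    """Composition in space: one service directly invokes another."""
    def composed(request):
        return outer(inner(request))
    return composed


def run_workflow(steps, on_error, data):
    """Composition in time: run steps in order, divert to on_error on failure."""
    for step in steps:
        try:
            data = step(data)
        except Exception as error:
            return on_error(error, data)
    return data


# Hypothetical services, each taking and returning a list of log entries.
def stage_input(log):
    return log + ["staged"]

def run_job(log):
    return log + ["ran"]

def failing_job(log):
    raise RuntimeError("node unavailable")

def clean_up(error, log):
    return log + ["cleaned up after: " + str(error)]


stage_then_run = compose_in_space(run_job, stage_input)
ok = run_workflow([stage_input, run_job], clean_up, [])
failed = run_workflow([stage_input, failing_job, run_job], clean_up, [])
print(stage_then_run([]), ok, failed)
```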


Data Services

• Distributed Data Access
• Data Caching
• Data Replication Services
• Metadata Catalog Services
• Storage Services


Metering Resource Consumption

• At what granularity do services report resource consumption?
• How do they report it?
• How are services metered?

[Figure: metering and accounting components. Labels: Resource Instrumentation; Metering Handler; Meter event adaption; Logging Service; Aggregation and Correlation; Usage Information; Rating; Rate Packages; Billable Record Listener; Accounting; Accounts; Contract Service; Billing; ASPIC CBI]


Transactions

• Two threads/workflows must synchronize and agree they have done so before moving on.
  – Usually involves modification to two or more persistent states
  – WS-Transaction has been “proposed”.


Messaging, Events, Logging

• Messaging
  – Delivery Model
  – Queuing and Pub/Sub message delivery (not clear to me why these are different, as publish/subscribe is implemented as topic-labeled queues)
• Events
  – Time stamped messages
  – Standard XML schemas
• Standard Logging
• MQSeries (IBM), JMS (Java Message Service) and NaradaBrokering (Indiana) provide this, but most naturally at the level of “platform/hosting environment”
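The remark that publish/subscribe is just topic-labeled queues can be illustrated directly: the toy broker below keeps one queue per (topic, subscriber) pair and treats an event as a time-stamped message. It is an in-memory sketch with hypothetical names and is not the API of JMS, MQSeries or NaradaBrokering.

```python
import time
from collections import defaultdict, deque


class TopicBroker:
    """Toy broker: publish/subscribe built from topic-labeled queues."""

    def __init__(self):
        # topic -> {subscriber name -> queue of pending events}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, body, timestamp=None):
        # An event is a time-stamped message.
        event = {"topic": topic, "body": body,
                 "timestamp": time.time() if timestamp is None else timestamp}
        for queue in self._queues[topic].values():
            queue.append(event)
        return event

    def receive(self, topic, subscriber):
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


broker = TopicBroker()
broker.subscribe("job/status", "portal")
broker.subscribe("job/status", "logger")
broker.publish("job/status", "RUNNING", timestamp=1.0)
broker.publish("job/status", "DONE", timestamp=2.0)

portal_events = [broker.receive("job/status", "portal") for _ in range(3)]
logger_first = broker.receive("job/status", "logger")
print(portal_events, logger_first)
```

With a single subscriber per topic this reduces to an ordinary point-to-point queue, which is the equivalence the slide alludes to.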


Where should Messaging be?
• One can define messaging at the OGSA level “above the hosting environment” but that makes it difficult to virtualize messaging and support network performance
  – Publish-subscribe or better queued messaging naturally supports optimized routing based on network performance
• One can naturally support collaborative Web services in the same fashion, in a way that is MUCH easier than what Groove Networks and other collaborative environments (WebEx, Placeware (Microsoft)) do, as long as every application is a Web service
• OGSA location of messages is fine for low volume logging or notification events
  – Not good for events in a “video” application where each frame is an update event


[Figure: two panels, Master Client and Participating Client. Each shows an Application as a Web service with Events, Rendering, User Interface and W3C DOM Events. Master panel labels: To Collaborative Clients; From Collaboration As a WS. Participating panel labels: From Master; From Collaboration As a WS]


Collaboration: Shared Display
Sharing can be done at any point on “object” or Web Service pipeline
[Figure: object pipeline. Labels: Object, Object’, Object’’, Object Display, Object Viewer, Event (Message) Service, Master, Shared Display, Shared Web Service, Shared Event, Shared Export]
• Shared Display shares framebuffer with events corresponding to changed pixels in master client.
• As long as pipeline uses messages, easy to make collaborative
• Windows framebuffers and in fact most applications do NOT expose a message based update interface


Shared Input Port (Replicated WS) Collaboration
[Figure: Master and Other Participants, each with its own copy of the Web Service, WS Viewer and WS Display, linked through an Event (Message) Service. Labels: Collaboration as a WS; Set up Session with XGSP]


Shared Output Port Collaboration
[Figure: a single Application or Content source exposed through WSDL as a Web Service; a Web Service Message Interceptor feeds an Event (Message) Service, which serves WS Viewer and WS Display for the Master and Other Participants. Labels: Collaboration as a WS; Set up Session with XGSP; Text Chat, Whiteboard, Multiple masters]


NaradaBrokering
Based on a network of cooperating broker nodes
• Cluster based architecture allows system to scale to arbitrary size
Originally designed to provide uniform software multicast to support real-time collaboration linked to publish-subscribe for asynchronous systems.
Now has four major core functions
• Message transport (based on performance measurement) in heterogeneous multi-link fashion
• General publish-subscribe including JMS & JXTA and support for RTP-based audio/video conferencing
• Filtering for heterogeneous clients
• Federation of multiple instances of Grid services


Role of Event/Message Brokers
We will use events and messages interchangeably
• An event is a time stamped message
Our systems are built from clients, servers and “event brokers”
• These are logical functions – a given computer can have one or more of these functions
• In P2P networks, computers typically multifunction; in Grids one tends to have separate function computers
• Event Brokers “just” provide message/event services; servers provide traditional distributed object services as Web services
There are functionalities that only depend on event itself and perhaps the data format; they do not depend on details of application and can be shared among several applications
• NaradaBrokering is designed to provide these functionalities
• MPI provided such functionalities for all parallel computing


Engineering Issues Addressed by Event / Messaging Service
Application level Quality of Service – e.g. give audio highest priority
Tunnel through firewalls & proxies
Filter messages to slow (collaborative/real-time) clients
Choose Hardware or Software multicast
Scaling of software multicast
• Efficient calculation of destinations and routes.
Integrate synchronous and asynchronous collaboration with same messaging, control, archiving for all functions
Transparently replace single server JMS systems with a distributed solution.
Provides reliable inter-peer group messaging for JXTA
Open Source (high quality) messaging
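Application-level quality of service of the “give audio highest priority” kind can be sketched with a priority queue in front of the outgoing link. The priority table and the class name below are assumptions for illustration and do not describe how NaradaBrokering implements it.

```python
import heapq
import itertools

# Assumed priority table: a lower number is sent first.
PRIORITY = {"audio": 0, "video": 1, "data": 2}


class PriorityOutbox:
    """Outgoing message queue that always releases audio before video or data."""

    def __init__(self):
        self._heap = []
        # Arrival counter keeps first-in, first-out order within one priority.
        self._order = itertools.count()

    def put(self, kind, payload):
        heapq.heappush(self._heap,
                       (PRIORITY[kind], next(self._order), kind, payload))

    def get(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload

    def __len__(self):
        return len(self._heap)


outbox = PriorityOutbox()
outbox.put("data", "slide-7.png")
outbox.put("video", "frame-1")
outbox.put("audio", "packet-1")
outbox.put("audio", "packet-2")

sent = [outbox.get() for _ in range(len(outbox))]
print(sent)
```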


NaradaBrokering implements an Event Service

Filter is mapping to PDA or slow communication channel (universal access) – see our PDA adaptor
Workflow implements message process
Routing illustrated by JXTA and includes firewall
Destination-Source matching illustrated by JMS using Publish-Subscribe mechanism
These use Security model (being implemented) based on WS-Security
[Figure: Web Service 1 and Web Service 2 attached through their WSDL Ports to a Broker with a (Virtual) Queue. Broker function labels: Destination Source Matching, Filter, Routing, workflow]


Narada Broker Network

[Figure: a network of Brokers providing software multicast for a message/events service, connecting (P2P) Communities, a Database and a Resource]
Hypercube topology for brokers?
Tree for distance education with teacher at root


NaradaBrokering Communication
Applications interface to NaradaBrokering through UserChannels which NB constructs as a set of links between NB Broker waystations which may need to be dynamically instantiated
UserChannels have publish/subscribe semantics with XML topics
Links implement a single conventional “data” protocol.
• Interface to add new transport protocols within the Framework
• Administrative channel negotiates the best available communication protocol for each link
Different links can have different underlying transport implementations
• Implementations in the current release include support for TCP, UDP, Multicast, SSL and RTP. HTTP, HTTPS support will be available in Feb 2003 release.
• Supports communication through proxies such as iPlanet, Netscape and Apache.
• Supports communication through firewalls such as Microsoft ISA, Checkpoint.


Performance/Routing in Message-based Architecture

In traveling from cities A to B (say 3 separate passengers), one chooses between and changes transport mechanism at waystations to optimize cost, time, comfort, scenic beauty …
Waystations are now NB brokers where one chooses transport protocol (individual or collective)
• Able to choose between car, type of car, plane, train etc
• Able to dynamically create waystations to cope with problems and act as hubs for multicast messages
• Knows about traffic jams and can assign the “HOV lane”
[Figure: routes from A to B1, B2 and B3. Link labels: Satellite UDP, Firewall HTTP, Dial-up Filter, Hand-Held Protocol, Fast Link, Software Multicast]


Note on Optimization
Note in parallel computing, couldn’t do much dynamic optimization as aiming at microsecond latency
• Natural to use hardware routing
In Grid, time scales are different
• 100 millisecond quite normal network latency
• 30 millisecond typical packet time sensitivity (this is one audio or video frame) but even here can buffer 10-100 frames on client (conferencing to streaming)
• 1 millisecond is time for a Java server to “think”
Jitter in latency (transit time) due to routing, processing (in NB) or packet loss recovery is important property
Grid needs and can tolerate significant dynamic optimization


[Figure: Transit delay for message samples in NaradaBrokering, different communication hops, internal machines. x-axis: Message Payload Size (Bytes), 1000 to 5000; y-axis: Transit Delay (Milliseconds), 1 to 9; series: hop-2, hop-3, hop-5, hop-7. Sender/receiver/broker: Pentium-3, 1 GHz, 256 MB RAM; 100 Mbps LAN; JDK-1.3; Red Hat Linux 7.3]


[Figure: Standard deviation for message samples in NaradaBrokering, different communication hops, internal machines. x-axis: Message Payload Size (Bytes), 1000 to 5000; y-axis: Standard Deviation (Milliseconds), 0 to 0.8; series: hop-2, hop-3, hop-5, hop-7]


[Figure: Average delays/packet for 12 (of the 400 total) video clients. NaradaBrokering Avg=80.76 ms, JMF Avg=229.23 ms. x-axis: Packet Number, 0 to 2000; y-axis: Delay (Milliseconds), 0 to 450; series: NaradaBrokering-RTP, JMF-RTP]


[Figure: Average jitter/packet for 12 (of the 400 total) video clients. NaradaBrokering Avg=13.38 ms, JMF Avg=15.55 ms. x-axis: Packet Number, 0 to 2000; y-axis: Jitter (Milliseconds), 0 to 25; series: NaradaBrokering-RTP, JMF-RTP]
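Per-packet delay and jitter of the kind plotted for the video clients can be estimated from send and receive timestamps. The slides do not say how the jitter was computed; the sketch below assumes the running interarrival-jitter estimator from the RTP specification (RFC 3550), synchronized sender and receiver clocks, and times in milliseconds. The function name is hypothetical.

```python
def delay_and_jitter(send_ms, receive_ms):
    """Return (mean transit delay, RFC 3550 style jitter estimate) in ms."""
    transits = [received - sent for sent, received in zip(send_ms, receive_ms)]
    mean_delay = sum(transits) / len(transits)

    jitter = 0.0
    for previous, current in zip(transits, transits[1:]):
        # Smooth the change in transit time with gain 1/16, as in RFC 3550.
        jitter += (abs(current - previous) - jitter) / 16.0
    return mean_delay, jitter


# Packets sent every 20 ms; the second packet arrives 16 ms late.
steady = delay_and_jitter([0, 20, 40], [100, 120, 140])
bumpy = delay_and_jitter([0, 20, 40], [100, 136, 140])
print(steady, bumpy)
```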


Narada Performance Web Service
Performance measurements are used by Links in
• Reconfiguring Connectivity between nodes
• Deciding underlying transport protocol
• Determining possible filtering
Each node determines performance of links of which it is endpoint
Individual node web services are aggregated as another Web Service
Factors measured include Transit delays, bandwidth, Jitter, Receiving rates.
Performance measurements are
• Spaced out at increasing intervals for healthy channels.
• Factors selectively measured for unhealthy channels.
• No repeated measurements of bandwidth for example.
• Injected into Narada network as XML events
[Figure: Administrative Interface]
Probably should replace by a more sophisticated measurement package


The Overall Architecture
• The Grid is defined by a collection of distributed Services
  – For many users the primary interaction with the Grid will be through a portal
[Figure: Portal Server linked to MyProxy Server, Metadata Directory Service(s), Directory & index Services, Application Factory Services, Messaging and group collaboration, Event and logging Services]


Application Portal in a Minute (box)
Systems like Unicore, GPDK, Gridport (HotPage), Gateway, Legion provide “Grid or GCE Shell” interfaces to users (user portals)
• Run a job; find its status; manipulate files
• Basic UNIX Shell-like capabilities
Application Portals (Problem Solving Environments) are often built on top of “Shell Portals” but this can be quite time consuming
• Application Portal = Shell Portal Web Service + Application (factory) Web service


Application Web service
Application Web Service is ONLY metadata
• Application is NOT touched
Application Web service defined by two sets of schema:
• First set defines the abstract state of the application
  – What are my options for invoking myapp?
  – Dub these to be “abstract descriptors”
• Second set defines a specific instance of the application
  – I want to use myapp with input1.dat on solar.uits.indiana.edu.
  – Dub these to be “instance descriptors”.
Each descriptor group consists of
• Application descriptor schema
• Host (resource) descriptor schema
• Execution environment (queue or shell) descriptor schema
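The abstract/instance split can be pictured with two small record types: the abstract descriptor lists what may be chosen, and the instance descriptor records one concrete choice. The slides define these as XML schemas; the Python field names and the instantiate helper below are hypothetical stand-ins, and only myapp, input1.dat and solar.uits.indiana.edu come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class AbstractDescriptor:
    """What are my options for invoking the application?"""
    application: str
    options: dict        # option name -> human-readable description
    hosts: list          # hosts the application is installed on
    environments: list   # execution environments, e.g. "queue" or "shell"


@dataclass
class InstanceDescriptor:
    """One concrete run: which host, which environment, which inputs."""
    application: str
    host: str
    environment: str
    arguments: dict = field(default_factory=dict)


def instantiate(abstract, host, environment, **arguments):
    """Check a requested run against the abstract descriptor."""
    if host not in abstract.hosts:
        raise ValueError("unknown host: " + host)
    if environment not in abstract.environments:
        raise ValueError("unknown environment: " + environment)
    unknown = set(arguments) - set(abstract.options)
    if unknown:
        raise ValueError("unknown options: " + ", ".join(sorted(unknown)))
    return InstanceDescriptor(abstract.application, host, environment, arguments)


myapp = AbstractDescriptor(
    application="myapp",
    options={"input": "path to the input file"},
    hosts=["solar.uits.indiana.edu"],
    environments=["shell", "queue"],
)
run = instantiate(myapp, "solar.uits.indiana.edu", "shell", input="input1.dat")
print(run)
```

Nothing here touches the application itself; both records are metadata only, which is the point the slide makes.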


Web Services as a Portlet
• Each Web Service naturally has a user interface specified as “just another port”
  – Customizable for universal access
• This gives each Web Service a Portlet view specified (in XML as always) by WSRP (Web services for Remote Portals)
• So component model for resources “automatically” gives a component model for user interfaces
  – When you build your application, you define portlet at same time
[Figure: Application or Content source wrapped by WSDL as a Web Service. Application as a WS: General Application Ports interface with other Web Services. User Face of Web Service: WSRP Ports define WS as a Portlet]
Web Services have other ports (Grid Service) to be OGSI compliant


Online Knowledge Center built from Portlets

• Web Services provide a component model for the middleware (see large “common component architecture” effort in Dept. of Energy)

• Should match each WSDL component with a corresponding user interface component

• Thus one “must use” a component model for the portal with again an XML specification (portalML) of portal component

A set of UIComponents


[Figure: Jetspeed Architecture. Labels: Turbine Servlet; Screen Manager; JSP template; PSML; PortletController; PortletControl; Portlets with their Data sources (XML: RSS, OCS, or other, local or remote; HTML: local files; JSP or VM: local templates; WebPage: remote HTML; Portlets user implemented using Portal API); ECS elements; ECS Root to HTML]


Portlets and Portal Stacks

• User interfaces to Portal services (Code Submission, Job Monitoring, File Management for Host X) are all managed as portlets.

• Users, administrators can customize their portal interfaces to just precisely the services they want.

[Figure: portal stack. Labels: Core Grid Services; Application Grid Web Services; User facing Web Service Ports; Aggregation Portals (Jetspeed); Message Security, Information Services]


Jetspeed Computing Portal: Choose Portlets

4 available portlets linking to Web Services
I choose two


Choose Portlet Layout

Choose 1-column Layout

Original 2-column Layout


Lists user files on selected host, noahsark.
File operations include Upload, download, Copy, rename, crossload
Tabs indicate available portlet interfaces.
File management


Sample page with several portlets: proxy credential manager, submission, monitoring


Provide information about application and host parameters
Select application to edit
Administer Grid Portal