
GAT - The Grid Application Toolkit

Abstracting the Grid for Application Programmers

Hartmut Kaiser (hkaiser@aei.mpg.de, hartmut.kaiser@aei.mpg.de)

MPI for Gravitational Physics, Albert-Einstein-Institut, Golm

Tom Goodale, Thilo Kielmann, Gabrielle Allen, Edward Seidel, André Merzky, Kelly Davis

The GridLab Project, www.gridlab.org

COST D23 Tutorial on Grid Computing, 24th/25th October 2004, Manno

Outline

• Introduction
  • Why another Grid-API?
  • What does it offer/solve?
  • Simple samples
  • Architecture / Implementation
• In depth course through API groups
  • File Management, File Stream Management, Logical File Management
  • Advert Service Management
  • Job Management
• Conclusions


Why another Grid-API?

The situation today:
• Grids: everywhere. Supposedly; at least there are many projects.
• Grid applications: nowhere. Almost; at least it is our experience (and that of the GGF APPS group) that this is difficult.

Why is this? Application programmers accept the Grid as a computing paradigm only very slowly. The problems are manifold and often cited; among others:
• Interfaces are NOT simple (see next slides...): typical Globus code... ahem...
• Different and evolving interfaces to the 'Grid': versions, new services, new implementations. WSDL does not solve all of these problems.
• The environment changes in many ways: Globus, grid members, services, network, applications, ...


Copy a File: Globus GASS

int RemoteFile::GetFile (char const* source_URL, char const* target) {
    globus_url_t source_url;
    globus_io_handle_t dest_io_handle;
    globus_ftp_client_operationattr_t source_ftp_attr;
    globus_result_t result;
    globus_gass_transfer_requestattr_t source_gass_attr;
    globus_gass_copy_attr_t source_gass_copy_attr;
    globus_gass_copy_handle_t gass_copy_handle;
    globus_gass_copy_handleattr_t gass_copy_handleattr;
    globus_ftp_client_handleattr_t ftp_handleattr;
    globus_io_attr_t io_attr;
    int output_file = -1;

    if ( globus_url_parse (source_URL, &source_url) != GLOBUS_SUCCESS ) {
        printf ("can not parse source_URL \"%s\"\n", source_URL);
        return (-1);
    }

    if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_FTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
        printf ("can not copy from %s - wrong prot\n", source_URL);
        return (-1);
    }

    globus_gass_copy_handleattr_init (&gass_copy_handleattr);
    globus_gass_copy_attr_init (&source_gass_copy_attr);
    globus_ftp_client_handleattr_init (&ftp_handleattr);
    globus_io_fileattr_init (&io_attr);
    globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
    globus_gass_copy_handleattr_set_ftp_attr (&gass_copy_handleattr,
                                              &ftp_handleattr);
    globus_gass_copy_handle_init (&gass_copy_handle,
                                  &gass_copy_handleattr);

    if (source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
        source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
        globus_ftp_client_operationattr_init (&source_ftp_attr);
        globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                       &source_ftp_attr);
    }
    else {
        globus_gass_transfer_requestattr_init (&source_gass_attr,
                                               source_url.scheme);
        globus_gass_copy_attr_set_gass (&source_gass_copy_attr,
                                        &source_gass_attr);
    }

    output_file = globus_libc_open ((char*) target,
                                    O_WRONLY | O_TRUNC | O_CREAT,
                                    S_IRUSR | S_IWUSR | S_IRGRP |
                                    S_IWGRP);
    if ( output_file == -1 ) {
        printf ("could not open the file \"%s\"\n", target);
        return (-1);
    }

    /* convert the output file descriptor to a globus_io_handle */
    if ( globus_io_file_posix_convert (output_file, 0,
                                       &dest_io_handle)
         != GLOBUS_SUCCESS) {
        printf ("Error converting the file handle\n");
        return (-1);
    }

    result = globus_gass_copy_register_url_to_handle (
                 &gass_copy_handle, (char*)source_URL,
                 &source_gass_copy_attr, &dest_io_handle,
                 my_callback, NULL);
    if ( result != GLOBUS_SUCCESS ) {
        printf ("error: %s\n", globus_object_printable_to_string
                (globus_error_get (result)));
        return (-1);
    }

    globus_url_destroy (&source_url);
    return (0);
}


Copy a File: CoG/RFT

package org.globus.ogsa.gui;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.net.URL;
import java.util.Date;
import java.util.Vector;
import javax.xml.rpc.Stub;
import org.apache.axis.message.MessageElement;
import org.apache.axis.utils.XMLUtils;
import org.globus.*;
import org.gridforum.ogsi.*;
import org.gridforum.ogsi.holders.TerminationTimeTypeHolder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RFTClient {
    public static void copy (String source_url, String target_url) {
        try {
            File requestFile = new File (source_url);
            BufferedReader reader = null;
            try {
                reader = new BufferedReader (new FileReader (requestFile));
            } catch (java.io.FileNotFoundException fnfe) { }

            Vector requestData = new Vector ();
            requestData.add (target_url);

            TransferType[] transfers1 = new TransferType[transferCount];
            RFTOptionsType multirftOptions = new RFTOptionsType ();
            multirftOptions.setBinary (Boolean.valueOf (
                (String)requestData.elementAt (0)).booleanValue ());
            multirftOptions.setBlockSize (Integer.valueOf (
                (String)requestData.elementAt (1)).intValue ());
            multirftOptions.setTcpBufferSize (Integer.valueOf (
                (String)requestData.elementAt (2)).intValue ());
            multirftOptions.setNotpt (Boolean.valueOf (
                (String)requestData.elementAt (3)).booleanValue ());
            multirftOptions.setParallelStreams (Integer.valueOf (
                (String)requestData.elementAt (4)).intValue ());
            multirftOptions.setDcau (Boolean.valueOf (
                (String)requestData.elementAt (5)).booleanValue ());

            int i = 7;
            for (int j = 0; j < transfers1.length; j++)
            {
                transfers1[j] = new TransferType ();
                transfers1[j].setTransferId (j);
                transfers1[j].setSourceUrl ((String)requestData.elementAt (i++));
                transfers1[j].setDestinationUrl ((String)requestData.elementAt (i++));
                transfers1[j].setRftOptions (multirftOptions);
            }

            TransferRequestType transferRequest = new TransferRequestType ();
            transferRequest.setTransferArray (transfers1);

            int concurrency = Integer.valueOf
                ((String)requestData.elementAt (6)).intValue ();
            if (concurrency > transfers1.length)
            {
                System.out.println ("Concurrency should be less than the number " +
                                    "of transfers in the request");
                System.exit (0);
            }
            transferRequest.setConcurrency (concurrency);

            TransferRequestElement requestElement = new TransferRequestElement ();
            requestElement.setTransferRequest (transferRequest);

            ExtensibilityType extension = new ExtensibilityType ();
            extension = AnyHelper.getExtensibility (requestElement);

            OGSIServiceGridLocator factoryService = new OGSIServiceGridLocator ();
            Factory factory = factoryService.getFactoryPort (new URL (source_url));
            GridServiceFactory gridFactory = new GridServiceFactory (factory);

            LocatorType locator = gridFactory.createService (extension);
            System.out.println ("Created an instance of Multi-RFT");

            MultiFileRFTDefinitionServiceGridLocator loc
                = new MultiFileRFTDefinitionServiceGridLocator ();
            RFTPortType rftPort = loc.getMultiFileRFTDefinitionPort (locator);

            ((Stub)rftPort)._setProperty (Constants.AUTHORIZATION,
                                          NoAuthorization.getInstance ());
            ((Stub)rftPort)._setProperty (GSIConstants.GSI_MODE,
                                          GSIConstants.GSI_MODE_FULL_DELEG);
            ((Stub)rftPort)._setProperty (Constants.GSI_SEC_CONV,
                                          Constants.SIGNATURE);
            ((Stub)rftPort)._setProperty (Constants.GRIM_POLICY_HANDLER,
                                          new IgnoreProxyPolicyHandler ());

            int requestid = rftPort.start ();
            System.out.println ("Request id: " + requestid);
        }
        catch (Exception e)
        {
            System.err.println (MessageUtils.toString (e));
        }
    }
}


Copy a File: GAT/C

#include <GAT.h>

GATResult RemoteFile_GetFile (GATContext context,
                              char const* source_url, char const* target_url)
{
    GATStatus status = 0;
    GATLocation source = GATLocation_Create (source_url);
    GATLocation target = GATLocation_Create (target_url);
    GATFile file = GATFile_Create (context, source, 0);

    if (source == 0 || target == 0 || file == 0) {
        return GAT_MEMORYFAILURE;
    }

    if ( GATFile_Copy (file, target, GATFileMode_Overwrite) != GAT_SUCCESS ) {
        GATContext_GetCurrentStatus (context, &status);
        return GATStatus_GetStatusCode (status);
    }

    GATFile_Destroy (&file);
    GATLocation_Destroy (&target);
    GATLocation_Destroy (&source);

    return GATStatus_GetStatusCode (status);
}


Copy a File: GAT/Python

import GAT

def copyfile(context, source_url, target_url):
    try:
        file = GAT.File(context, GAT.Location(source_url))
        file.Copy(GAT.Location(target_url))
        return GAT.SUCCESS

    except GAT.Status, err:
        print err.args[0].message
        print err.args[0].traceback
        return err.args[0].errcode


Copy a File: GAT/C++

#include <GAT++.hpp>

GAT::Result RemoteFile::GetFile (GAT::Context context,
                                 std::string source_url, std::string target_url)
{
    try {
        GAT::File file (context, source_url);
        file.Copy (target_url);
    }
    catch (GAT::Exception const &e) {
        std::cerr << e.what() << std::endl;
        return e.Result();
    }
    return GAT_SUCCESS;
}


Code Statistics

Code          GASS   CoG   C GAT   C++ GAT
Lines total    100    80      20        15
Init            30    30       5         2
Action          30    30       1         1
Cleanup         20    10       9         2
Language        10     5       5         5


Dynamic Middleware

Globus, Unicore, my_service, your_service, ...: the same functionality has different interfaces all over the place.

• You don't want to recompile your app every time, not to speak of recoding...
• WSDL does not mean the end of all problems (see the CoG code), but the beginning of new ones: at the application level, WSDL is not simple enough.
• Restricting yourself to Globus does not help either: the version changes every couple of months (2.4.x, 3.2.y, 4.a.b) and gets bug fixes. The changes are often MAJOR; we have seen a number of them over the last couple of years...

The application that runs today will fail tomorrow!

Right now, it is basically impossible for a programmer to focus on the science rather than on IT (i.e. Grid) problems.


Dynamic Grids

• Services (and interfaces) get exchanged ("upgraded") on a regular basis. This is related to the previous point, but it is also a social problem!
• Institutions (resources, services, applications) join/leave YOUR grid without (much) notice. The grid is designed to ease and simplify that kind of fluctuation: it's not a bug, it's a feature! But applications are not able to make use of that feature right now...
• The Grid changes AT RUNTIME: services go down, resources get busy/free, disks and storage nodes are empty/full, ... THINGS CONSTANTLY CHANGE. Today's Grid middleware makes it possible to cope with that, but using it in an intelligent way is a major programming effort, and it bloats the application with code that needs constant maintenance...
• Applications need LOTS of code for handling transient problems.
• Most applications share most of these problems, but code reuse is difficult/impossible. We can reuse the Globus libraries, true, but isn't every project re-inventing its own abstraction layer for them? In our experience and projects: they do!

Aren't we all re-inventing abstraction layers for this?


Simplicity!

The key objective for application programmers. Remember: an application programmer is a physicist, chemist, linguist, medical ...

Simple APIs should:
• be easy to use: a simple, finite, consistent API which allows error tracing.
• be invariant and make upgrades really, really simple: a well-defined API which rarely changes, with an implementation which allows dynamic exchange of key elements and provides runtime abstractions.
• avoid refactoring/recoding/recompilation: the same application runs today and tomorrow; here and there; on Globus and Unicore; on Globus 2.2.4 and Globus 2.4; on Linux and on Mac; locally and on the grid.
• focus on well-known programming paradigms (e.g., for a file provide a file API, without services to services to files...). Files are the best example: expect open, close, read, write, seek. Do not introduce fancy things like the need to ask a service discovery service for the location of a service which is able to tell me the location of my file...


What Applications want

...and what they GAT: enough of 'we want this' and 'we don't want that'. You got the picture, right? So, here is what we do:

• An API that makes it possible to implement basic Grid use cases
• Stay simple! As simple as possible! But not simpler!
• Focus on applications and scientists, rather than on Grid nerds like you and me

The next slides give an overview of what we think is essential, and how we envision its use. Remember: this is version 1, our first shot. We know it's not perfect, but we are converging on something we can already work with.


GAT API Scope

• Files
• Monitoring and Events
• Resources, Jobs
• Information Exchange
• Utility classes (error handling, security, preferences...)

NOTHING ELSE


API: Sub Systems

• The pipe stuff in the file subsystem is 'historical'; pipe is VERY simple. We don't do real communication and data exchange à la MPI!
• The actual API is somewhat larger (especially the resource part): 34 objects as opposed to the 27 shown here. BUT THAT'S IT!!!


Examples in GAT

• Read a remote physical file
• Read a logical file
• Spawn a Subtask
• Migrate a Subtask


Read a remote physical File

try {
    char data[25];
    GAT::FileStream file (context, source_url);

    file.seek (100, SEEK_SET);
    file.read (data, sizeof(data));
}
catch (GAT::Exception const &e) {
    std::cerr << "Some error: " << e.what() << std::endl;
    return e.Result();
}

• Well-known paradigm.
• Whatever service/lib/... implements it, the programmer does not know (no reference to Globus...).
• Whatever the URL/protocol (ftp://, gsiftp://, http://, file://): no code changes!
• No service-specific parameter settings (can be a drawback, BUT IT SIMPLIFIES).


Read a logical file

try {
    GAT::LogicalFile logical_file (context, name);
    std::list<GAT::File> files = logical_file.get_files();
    files.front().Copy(dest_url);
}
catch (GAT::Exception const &e) {
    std::cerr << "Some error: " << e.what() << std::endl;
    return e.Result();
}

• SAME paradigm + 'private name space'
• URL unknown! Service unknown!
• Complete abstraction ==> virtualization!
• Still simple
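The idea of a logical file as a private name that resolves to one or more physical replicas can be sketched without any Grid middleware. The following is only an illustration under stated assumptions: `ReplicaCatalog`, `add_replica` and the example URLs are hypothetical and not part of GAT; only the method name `get_files` is borrowed from the slide. A real adaptor would ask a replica service instead of an in-memory map.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory replica catalog (not a GAT type).
// It maps a logical name to the URLs of its physical replicas.
class ReplicaCatalog {
public:
    // Registers one more physical location for a logical name.
    void add_replica(const std::string& logical_name, const std::string& url) {
        assert(!url.empty() && "a replica needs a location");
        replicas_[logical_name].push_back(url);
    }

    // Returns every known physical location, in registration order.
    // Throws if the logical name was never registered.
    std::vector<std::string> get_files(const std::string& logical_name) const {
        auto it = replicas_.find(logical_name);
        if (it == replicas_.end())
            throw std::runtime_error("unknown logical file: " + logical_name);
        return it->second;
    }

private:
    std::map<std::string, std::vector<std::string>> replicas_;
};
```

The caller only ever handles the logical name; which replica is used, and over which protocol, stays behind the lookup.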


Spawn a Subtask

GAT::Table sdt;
sdt.add ("location", "/bin/date");

GAT::Table hdt;
hdt.add ("machine.type", "i686");

GAT::SoftwareDescription sd (sdt);
GAT::HardwareResourceDescription hrd (hdt);

GAT::JobDescription jd (context, sd, hrd);
GAT::ResourceBroker rb (context, prefs);

GAT::Job j = rb.submit (jd);


Migrate a Subtask

GAT::Table sdt;
sdt.add ("location", "/bin/sleep");
sdt.add ("arguments", "36000");

GAT::Table hdt;
hdt.add ("machine.type", "i386");

GAT::SoftwareDescription sd (sdt);
GAT::HardwareResourceDescription hrd (hdt);

GAT::JobDescription jd (context, sd, hrd);
GAT::ResourceBroker rb (context, prefs);

GAT::Job job = rb.submit (jd);

hdt.add ("machine.name", "fs0.das2.cs.vu.nl");

std::list<GAT::Resource> resources = rb.find_resources (hrd);

// std::list has no operator[], so take the first match with front()
job.Migrate (resources.front ());

if (GATJobState_Running == job.GetState ())
    job.Stop ();


API Status

• Version: 1.1, currently converging towards 1.2
• Is object oriented, but language neutral
• Defines syntax and semantics of GRID access
• Specification is an open process
  • Hopefully gets input from many communities
  • Will evolve along with the findings of GGF's new SAGA-WG (Simple APIs for Grid Applications)


Architecture

• API is a very thin layer, provides no capabilities by itself (bind to the Grid-Environment)
• Adaptors implement capabilities, mirroring the API
• Engine mediates between API and adaptors:
  • switch adaptors at runtime (shared libraries)
  • error tracing and fallback mechanisms (default local adaptor set)
• CPI is also well defined: adaptors are interchangeable
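The mediation between API and adaptors can be pictured with a small sketch: the engine holds a list of adaptors, tries them in order, records a trace of the failures, and ends with a local default. Everything here is an assumption made for illustration: `CopyAdaptor`, `Engine`, `make_example_engine` and the scheme checks are hypothetical names, adaptors are plain function objects rather than shared libraries, and no file is actually copied.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adaptor: a name plus one capability (copy).
// The callable returns true on success, false if it cannot serve the request.
struct CopyAdaptor {
    std::string name;
    std::function<bool(const std::string& src, const std::string& dst)> copy;
};

// Hypothetical engine: tries adaptors in registration order.
class Engine {
public:
    void register_adaptor(CopyAdaptor a) {
        assert(a.copy && "an adaptor must provide a copy function");
        adaptors_.push_back(std::move(a));
    }

    // Returns the name of the first adaptor that succeeds.
    // Failed attempts are kept in trace() for error reporting.
    std::string copy(const std::string& src, const std::string& dst) {
        trace_.clear();
        for (const CopyAdaptor& a : adaptors_) {
            if (a.copy(src, dst)) return a.name;
            trace_.push_back(a.name + ": failed for " + src);
        }
        throw std::runtime_error("no adaptor could copy " + src);
    }

    const std::vector<std::string>& trace() const { return trace_; }

private:
    std::vector<CopyAdaptor> adaptors_;
    std::vector<std::string> trace_;
};

inline bool starts_with(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Example wiring: a pretend "gridftp" adaptor first, the local fallback last.
inline Engine make_example_engine() {
    Engine e;
    e.register_adaptor({"gridftp", [](const std::string& s, const std::string&) {
        return starts_with(s, "gsiftp://");
    }});
    e.register_adaptor({"local", [](const std::string& s, const std::string&) {
        return starts_with(s, "file://");
    }});
    return e;
}
```

The application only calls `Engine::copy`; exchanging or reordering adaptors changes which backend serves the call without touching the calling code, which is the property the slide attributes to a well-defined CPI.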


Implementation (Engine)

• C version fully implemented
• C++ wrapper to C fully implemented
• Python wrapper to C fully implemented
• Java version fully implemented
• Fortran, Perl (wrappers to C) to follow (SWIG?)

Focus: portability, lightness, flexibility, adaptivity


Implementation (Adaptors)

• Full set of local adaptors implemented (contained in the download)
• A couple of 'external' adaptors implemented as well (GridLab services, Globus)
• Many adaptors currently under development (DRMAA, GRAM, Curl/wget etc.)

www.gridlab.org/WorkPackages/wp1/adaptorreleases.html


File Management: Outline

• File Package
  • Overview
  • The GATFile class
• Code
  • Example
  • Exercise


File Management

File Package: Overview

The File Package allows application programmers to manipulate files in a "Grid" environment.

...and contains a single class!


File Management

File Package: the GATFile class copies, moves, deletes, examines... files


File Management

Code sample: …/GATEngine/C-Reference/examples/example_03_-_file_size.c
Examine the code.

Exercise: modify example_03_-_file_size.c to delete a file instead of getting its length.

GATResult GATFile_Delete(GATFile_const file);


FileStream Management: Outline

• FileStream Package
  • Overview
  • GATFileStream class
• Code
  • Example
  • Exercise


FileStream Management

Stream Package: Overview

The Stream Package allows application programmers to stream data to and from remote or local processes and to stream data to and from remote or local files.

...and consists of various classes and interfaces.


FileStream Management

FileStream Package: Overview

We'll cover only GATFileStream.


FileStream Management

Stream Package: the GATFileStream class streams data to/from a file


FileStream Management

Code sample: …/GATEngine/C-Reference/examples/example_20_-_filestream_simple.c
Examine the code.

Exercise: modify example_20_-_filestream_simple.c to write, say, your name to the file.

GATResult GATFileStream_Write(GATFileStream stream, void const *buffer,
                              GATuint32 buffersize, GATuint32 *written_bytes);


LogicalFile Management: Outline

• LogicalFile Package
  • Overview
  • GATLogicalFile class
• Code
  • Example


LogicalFile Management

LogicalFile Package: Overview

The LogicalFile Package exists to efficiently replicate files which are identical but geographically dispersed.

...and consists of a single class!


LogicalFile Management

LogicalFile Package: the GATLogicalFile class replicates files in a "Grid" environment


LogicalFile Management

Code sample: …/GATEngine/C-Reference/examples/example_31_-_logicalfile_ops.c
Examine the code.


Advert Management

• Advert Package
  • Overview
  • GATInterface_IAdvertisable
  • GATAdvertService class
• Code
  • Example


Advert Management

Advert Package: Overview

The Advert Package allows an application to persistently store objects (advertisables), query such stored advertisables, and move them across machine and language boundaries.

...and consists of only one class and one interface.


Advert Management

Advert Package: GATInterface_IAdvertisable

The interface GATInterface_IAdvertisable marks objects capable of being persisted in a GATAdvertService; it is similar to Java's Serializable.


Advert Management

Advert Package: GATAdvertService

Stores advertisables and allows one to query for these advertisables across machine boundaries.
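The store-and-query pattern of an advert service can be sketched in a few lines. This is a minimal in-memory illustration and not the GAT API: `Advertisable`, `FileRef`, `AdvertService`, the use of a path as key, and the key/value metadata used for queries are all assumptions made for the example. Nothing here is persistent or crosses machine boundaries.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical marker interface: anything that can turn itself into a string.
struct Advertisable {
    virtual ~Advertisable() = default;
    virtual std::string serialize() const = 0;
};

// Hypothetical advertisable object: a reference to a file by URL.
struct FileRef : Advertisable {
    std::string url;
    explicit FileRef(std::string u) : url(std::move(u)) {}
    std::string serialize() const override { return "file:" + url; }
};

using MetaData = std::map<std::string, std::string>;

// Hypothetical advert service: stores serialized objects under a path,
// together with metadata that queries are matched against.
class AdvertService {
    struct Entry { std::string data; MetaData meta; };

public:
    void add(const std::string& path, const Advertisable& obj, const MetaData& meta) {
        assert(!path.empty() && "an advert needs a path");
        entries_[path] = Entry{obj.serialize(), meta};
    }

    // Returns the paths of all entries whose metadata contains every
    // key/value pair of the query (an empty query matches everything).
    std::vector<std::string> find(const MetaData& query) const {
        std::vector<std::string> hits;
        for (const auto& entry : entries_) {
            bool ok = true;
            for (const auto& kv : query) {
                auto it = entry.second.meta.find(kv.first);
                if (it == entry.second.meta.end() || it->second != kv.second) {
                    ok = false;
                    break;
                }
            }
            if (ok) hits.push_back(entry.first);
        }
        return hits;
    }

    // Returns the serialized form stored under a path (throws if absent).
    std::string get(const std::string& path) const { return entries_.at(path).data; }

private:
    std::map<std::string, Entry> entries_;
};
```

Storing the serialized form rather than the object itself is what would let another process, possibly written in another language, read the advert back.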


Advert Management

Code sample: …/GATEngine/C-Reference/examples/example_31_-_advertservice_ops.c
Examine the code.


Job Management: Outline

• Job Package
  • Overview
  • GATResourceDescriptions
  • GATResources
  • GATResourceBroker
  • GATSoftwareDescription
  • GATJobDescription
  • GATJob
• Code
  • Example


Job Management

Job Package: Overview

The Job Package allows an application to obtain resources, reserve resources, and submit and manage jobs.

… and consists of many classes and interfaces.


Job Management

Job Package: GATResourceDescriptions

A GATHardwareResourceDescription describes a hardware resource, such as an i686 box with 1 GB of memory. A GATSoftwareResourceDescription describes a software resource, such as an OS.


Job Management

Job Package: GATResources

A GATHardwareResource represents a hardware resource, such as an i686 box with 1 GB of memory. A GATSoftwareResource represents a software resource, such as a running OS.


Job Management

Job Package: GATResourceBroker

A GATResourceBroker instance brokers resources: it can find or reserve resources, and it can submit jobs to such resources.
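The relationship between a description (what is required), a resource (what exists) and a broker (which matches the two) can be sketched as follows. This is a toy model under stated assumptions: a description holds only an architecture and a memory floor, the pool is a local array, and "reserve" just sets a flag. None of the names are GAT identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical requirement record (NOT the GAT API), loosely modelled on
 * the idea of a hardware resource description. */
typedef struct hw_description {
    const char *arch;        /* required architecture, e.g. "i686" */
    unsigned min_memory_mb;  /* minimum memory in MB */
} hw_description;

/* Hypothetical concrete machine known to the broker. */
typedef struct hw_resource {
    const char *host;
    const char *arch;
    unsigned memory_mb;
    int reserved;            /* 0 = free, 1 = reserved */
} hw_resource;

/* A resource satisfies a description if the architecture is identical
 * and it has at least the requested amount of memory. */
static int hw_matches(const hw_description *d, const hw_resource *r)
{
    return strcmp(d->arch, r->arch) == 0 && r->memory_mb >= d->min_memory_mb;
}

/* Find: first free resource in the pool that satisfies d, or NULL. */
static hw_resource *broker_find(hw_resource *pool, size_t n,
                                const hw_description *d)
{
    assert(pool != NULL && d != NULL);
    for (size_t i = 0; i < n; ++i)
        if (!pool[i].reserved && hw_matches(d, &pool[i]))
            return &pool[i];
    return NULL;
}

/* Reserve: find a matching free resource and mark it as taken. */
static hw_resource *broker_reserve(hw_resource *pool, size_t n,
                                   const hw_description *d)
{
    hw_resource *r = broker_find(pool, n, d);
    if (r != NULL)
        r->reserved = 1;
    return r;
}
```

Keeping the description separate from the resource is what allows the same request to be matched against whatever machines happen to be available at submission time.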


Job Management

Job Package: GATSoftwareDescription

A GATSoftwareDescription instance describes an executable, for example /bin/date.


Job Management

Job Package: GATJobDescription

A GATJobDescription instance describes a job which can be executed. It includes a description of the hardware and software for the job.


Job Management

Job Package: GATJob

A GATJob represents a job that has been submitted to a resource management system.


Job Management

Code sample

…/GATEngine/C-Reference/examples/example_60_-_job_submit.c

Examine the code

example_60_-_job_submit /bin/date


Utility Classes

Additional utility classes
  GATList, GATTable etc.
    Data structures, specific to the language binding
  GATSecurityContext
    Security-related operations
  GATStatus
    Error handling


Open problems

Memory management is very tedious
  The user has to do a lot manually
    Track all GAT object instance copies
    Free all GAT object instances
    Call constructor and destructor

Error handling is complicated (even with macros)
  Always check for error codes, cluttered code.

All of these problems are solved by the C++ and Python wrappers

Asynchronicity is (almost) completely missing
  Engine is not thread safe as of today (C implementation)
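The first two problems are typical of any object-style C API, and a small sketch shows why the code gets cluttered: every call returns a status that must be checked, and every successfully created object must be destroyed on every exit path. The types, the CHECK macro and the live_objects counter below are hypothetical illustrations, not GAT identifiers or GAT macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical status code and object type (NOT the GAT API). */
typedef enum status { STATUS_OK = 0, STATUS_FAIL = 1 } status;

typedef struct object {
    int value;
} object;

/* Counts objects that were created but not yet destroyed. */
static int live_objects = 0;

/* "Constructor": fails for negative values to simulate an error. */
static status object_create(object **out, int value)
{
    assert(out != NULL);
    *out = NULL;
    if (value < 0)
        return STATUS_FAIL;
    object *o = malloc(sizeof *o);
    if (o == NULL)
        return STATUS_FAIL;
    o->value = value;
    ++live_objects;
    *out = o;
    return STATUS_OK;
}

/* "Destructor": safe to call on a NULL or already destroyed handle. */
static void object_destroy(object **obj)
{
    if (obj != NULL && *obj != NULL) {
        free(*obj);
        *obj = NULL;
        --live_objects;
    }
}

/* Error-checking macro: store the status and jump to cleanup on failure. */
#define CHECK(call) do { st = (call); if (st != STATUS_OK) goto cleanup; } while (0)

/* Every exit path, success or failure, runs through the cleanup label. */
static status add_values(int a, int b, int *sum)
{
    status st = STATUS_OK;
    object *first = NULL;
    object *second = NULL;

    CHECK(object_create(&first, a));
    CHECK(object_create(&second, b));
    *sum = first->value + second->value;

cleanup:
    object_destroy(&second);
    object_destroy(&first);
    return st;
}
```

In C++ the same bookkeeping is usually handled by destructors and exceptions, and in Python by the runtime, which is the kind of relief the wrappers mentioned above provide.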


Current activities

SAGA (Simple API for Grid Applications) draft will be presented at the next GGF
  GAT and the Globus-CoG will be used as inputs to the work of the SAGA-RG

Application/vendor interest is increasing (EGEE, Vienna Uni, CUT, KISTI, LSU, Intel, HP, Platform...)

Merge of the Java-GAT and Globus-CoG is planned

Adaptor writing has been accepted in the community


Conclusions

The GAT provides a simple and stable API to various Grid environments

It is used as a prototype implementation for the ongoing standardization process at the SAGA WG of the GGF

Downloads: http://www.gridlab.org
  Currently, snapshots are available for GAT V1.1
  GAT 1.2 will be available in early December 2004
  Platforms: Linux, Windows, Mac OS X, SGI Irix, Tru64 UNIX

Support via mailing list [email protected]