Grid Interoperability Issues in Resource Management: Questions and Solutions

Attila Kertész [email protected]

MTA SZTAKI, CoreGRID Institute on Resource Management and Scheduling

CoreGrid Summer School, Budapest, Hungary, 3-7 September, 2007


Overview

Introduction: Heterogeneity in Grids -> Need for Interoperability

Solutions for Grid Interoperability:

It can be targeted at different levels of Grid systems

Regarding Resource Management, we see 3 approaches:

I. Extending current Resource Management Systems

II. Interfacing RMSs from portals

III. Developing a higher-level mediator to utilize RMSs

Conclusions and future directions


Current situation and trends in Grid Computing

Fast evolution of Grid systems and middleware: Globus Toolkit (GT2->3->4), EGEE (LCG-2->gLite), UNICORE, …

Many production Grid systems are built with them: EGEE (LCG-2 gLite), UK NGS (GT2), Open Science Grid (GT2 GT4), NorduGrid (~GT2)

Although the same set of core services is available everywhere, they are implemented in different ways: certificate management, job submission, file management


Grid Utilization


How to achieve Grid Interoperability?

[Figure: Grid architecture in three levels. 1. level: Operating Systems; 2. level: Grid Middleware; 3. level: Higher level services]


Which levels should we target?

At the 1. and 2. levels, establishing interoperability would be the smartest solution, but also the hardest.

The 3. level is the most preferable target, since it requires the fewest modifications to the major architecture.


How can we use existing Resource Managers for Grid Interoperability?

Three possible directions at the resource management level of current grids:

I. Enable Resource Brokers to access resources of different Grids

II. Interface different brokers from Portals

III. Enable communication among Resource Brokers, or coordinate them by a higher-level tool


I. Extending Current RMSs

The most obvious way to provide interoperability among different grid systems is to extend the existing and widely used Resource Brokers with support for multiple grid middleware.

This approach has both advantages and disadvantages:

This modification would probably favor the users most, since they would not need to change their habits or submission methods.

On the other hand, it requires considerable effort from the developers to interface new middleware services, so it is definitely a time-consuming solution.

Moreover, the more systems the broker supports, the bulkier and harder to manage it becomes.


Related works

The Gridbus Grid Service Broker is designed for computational and data-grid applications. Although it supports all Globus middleware versions, UNICORE and NorduGrid, and provides an interface that can be implemented to support other middleware, it is mainly used in Globus grids.

GridWay is being developed in a Globus incubation project; it therefore supports all Globus versions, and it also supports the EGEE middleware.

JSS is a decentralized resource broker that is able to utilize both GT4 and NorduGrid resources.

The UniGrids (GRIP) project aims at supporting interoperability through semantic matching of resource descriptions, enabling job submissions to Globus and UNICORE sites.


Demonstration: GTbroker

The first widespread and stable grid middleware was the Globus Toolkit 2. Since it lacked a Resource Broker, we developed a tool called GTbroker.

It uses GT2 C API functions to interact with Globus resources and perform job submissions.

To determine the available hosts in the grid, it queries the MDS.

Job submission to resources is done through GRAM, and a GASS server is used to transfer the files needed for the job to the remote host and to retrieve the result files, if there are any.

These tools enable the broker to work without additional software on Globus grids (GT2, GT3 and pre-WS GT4).

Since most of the current production grids use this kind of middleware, its simple adaptation made this broker relevant.


Extension to EGEE middleware

To extend an RMS to support other types of middleware, we need to learn how to interact with the new system.

Brokers need to gather resource information, move files, perform job submissions, track job states and retrieve output files. Most of these activities need interaction with different middleware services.

GTbroker was redesigned to support the LCG-2 (EGEE) middleware by modifying the following parts:

the information query, to be able to gather data from the BDII, and

the RSL, by adding special attributes that enable job submission in EGEE VOs.

Since file movement, job description and job state tracking can also be done through the same Globus services in LCG-2 grids, we did not modify these parts (for an entirely different middleware, however, we would have had to).


First step towards Grid Interoperability

[Figure: User, Portal and GTbroker, with the grids shown around GTbroker: VOCE (LCG-2), SEEGRID (LCG-2), NGS (GT2), Austrian Grid (GT4)]


Comparison tests

To prove usability, we evaluated broker usage on LCG-2 Grids (VOCE, SEEGRID)

The brokers were invoked by scripts that handled: multiple invocation; state checking and log gathering; output staging back for the LCG-2 broker

We performed the tests in 4 phases, varying the job types and the number of jobs started at the same time


LCG-2 broker usage

In EGEE the Workload Management System is responsible for brokering

Job properties in JDL, resource information from BDII, job states from Logging and Bookkeeping

Default matchmaking:

Only ‘Production’ state resources are taken from BDII

The rank is the response time in resource selection


GTbroker features

Quality of Service features: GTbroker uses an extended RSL file that should contain the user requirements and job properties.

Regarding information systems: in Globus grids it queries the MDS, in LCG-2 grids the BDII.

During matchmaking a ranked list is created from the found resources in the BDII.

Fault tolerance is supported by resubmissions. Should a job fail or be pending for too long on a resource (this time interval can be set in the broker), the broker cancels it and resubmits it to another high-priority resource.
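
The resubmission behaviour described above can be sketched as follows. This is only an illustration and not GTbroker's actual code: the function `run_with_resubmission`, the state names `done`, `failed` and `pending_timeout`, and the simulated resources `ce-a` to `ce-c` are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Assumed job outcomes on one resource (hypothetical names, not GTbroker states).
DONE, FAILED, PENDING_TIMEOUT = "done", "failed", "pending_timeout"


def run_with_resubmission(
    ranked_resources: List[str],
    run_on: Callable[[str, float], str],
    max_pending_s: float,
) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Try resources in rank order until the job finishes on one of them."""
    attempts: List[Tuple[str, str]] = []
    for resource in ranked_resources:
        state = run_on(resource, max_pending_s)
        attempts.append((resource, state))
        if state == DONE:
            return resource, attempts
        # FAILED or PENDING_TIMEOUT: give up on this resource and
        # resubmit to the next one in the ranked list.
    return None, attempts


# Simulated grid standing in for real submissions (invented outcomes).
SIMULATED = {"ce-a": PENDING_TIMEOUT, "ce-b": FAILED, "ce-c": DONE}


def fake_run_on(resource: str, max_pending_s: float) -> str:
    return SIMULATED[resource]


winner, attempts = run_with_resubmission(["ce-a", "ce-b", "ce-c"], fake_run_on, 600.0)
print(winner, attempts)
```

In this sketch the job ends up on `ce-c` after one timeout and one failure; a real broker would also have to cancel the pending job on the abandoned resource.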


Test Phases

1. phase: 20 small single and MPI jobs to VOCE

2. phase: 20 jobs of 10 min to both VOs, 20 MPI jobs of 10 min to SEEGRID

3. phase: 60 jobs of 10 min to SEEGRID, 20 at a time, at 5 min intervals

4. phase: 60 jobs of ~15 min to SEEGRID, 10 at a time, at 4 min intervals


2. phase results


3. phase results


All phase results


Test summary

Sometimes the LCG-2 broker selected slowly responding or even non-responding resources, and its resubmission did not always work

GTbroker made reliable resubmissions, and the hidden non-responding or draining resources were skipped

For jobs with short running times GTbroker produced better results; for larger jobs the two brokers performed about the same, but GTbroker was more reliable

As GTbroker uses eager matchmaking, it usually sends the major part of the jobs to the same (‘best’) resource

The user can set a random selection within a range of resources, but this can reduce performance


I. Conclusions

We have shown how additional middleware support can be achieved by redesigning an existing Resource Broker

The results prove that existing resource brokers can be extended to use other middleware systems, but in this way developers need to redesign the system to support the services of the additional middleware.


II. Multi-broker Utilization

To exploit the advantages of various brokers and grids at the same time, we need to use several grid Resource Management Systems.

In this situation we need to learn various job specification languages and broker capabilities.

Grid portals are the currently available tools that try to hide the details of low-level middleware utilization by providing a transparent, uniform interface.

In this kind of grid utilization we do not expect grid brokers to support more middleware, but to do their best on their own.


Related works

The well-known related works are Pegasus, GridFlow, the K-Wf Grid portal and the SPA portal of the HPC-Europa Project.

Though the first 3 examples provide high-level access to grid services, they usually operate on only one middleware.

The SPA is a portal component that enables brokers to be utilized through plug-in interfaces. These interface methods need to be implemented by all brokers, providing the same abstract functionality; therefore, during an integration the broker would also have to be modified.

Only the P-GRADE Portal supports the execution of multi-grid workflows in both Globus- and EGEE-based production Grids.


Demonstration: The P-GRADE Portal

General purpose, workflow-oriented computational Grid portal

Supports the development and execution of workflow-based Grid applications

Based on GridSphere-2

Easy to expand with new portlets (e.g. application-specific portlets)

Easy to tailor to end-user needs

Support for multi-grid workflows


What is a P-GRADE Portal workflow?

A Directed Acyclic Graph, where

Nodes represent jobs (batch programs to be executed on a computing element)

Ports represent input/output files the jobs expect/produce

Arcs represent file transfer operations

Semantics of the workflow: a job can be executed if all of its input files are available
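
The workflow semantics above can be illustrated with a small sketch. It is not the P-GRADE Portal's implementation: the job names, file names and the `execution_order` function are hypothetical, and the code only shows the rule that a job becomes runnable once all of its input files exist.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical workflow: each node (job) lists the files on its input
# and output ports. An arc exists wherever an output feeds an input.
WORKFLOW: Dict[str, Dict[str, Set[str]]] = {
    "prepare": {"inputs": {"raw.dat"}, "outputs": {"clean.dat"}},
    "simulate": {"inputs": {"clean.dat"}, "outputs": {"sim.out"}},
    "analyse": {"inputs": {"clean.dat"}, "outputs": {"stats.out"}},
    "report": {"inputs": {"sim.out", "stats.out"}, "outputs": {"report.pdf"}},
}


def execution_order(
    workflow: Dict[str, Dict[str, Set[str]]], initial_files: Iterable[str]
) -> List[str]:
    """Return one valid order in which the jobs can be executed."""
    available = set(initial_files)
    pending = dict(workflow)
    order: List[str] = []
    while pending:
        # A job is ready when every one of its input files is available.
        ready = sorted(n for n, job in pending.items() if job["inputs"] <= available)
        if not ready:
            raise ValueError(f"jobs can never start: {sorted(pending)}")
        for name in ready:
            order.append(name)
            available |= pending.pop(name)["outputs"]
    return order


order = execution_order(WORKFLOW, {"raw.dat"})
print(order)
```

Jobs that become ready in the same round (`analyse` and `simulate` here) do not depend on each other, so a workflow manager is free to run them in parallel.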


Defining broker jobs

The user can choose a broker for the job

No resource should be selected!

Further requirements can be specified by job description editors, which have similar interfaces


JDL and RSL Editor

Additional job-related requirements can be set in job description editors:

JDL Editor: creates a JDL file for the WMS. The user can set JDL attributes such as Rank and Requirements, environment variables, …

RSL Editor: creates an RSL file. Basic and special RSL attributes can be set, such as random resource, skip time…


Workflow Execution

P-GRADE Portal contains a DAGMan-based workflow manager subsystem

DAGMan decomposes workflows into elementary file transfer and job submission tasks, and schedules these tasks according to their dependencies

The submission is done by its pre/post scripts: when a broker is used for job submission, the pre script invokes the broker, and the post script waits until the execution is finished and reports the actual job status to the portal


Broker invocation

The portal can invoke different brokers to reach resources of different Grids

While DAGMan schedules the workflow nodes, the brokers do the actual job submissions


Second step towards Grid Interoperability

[Figure: the User reaches the P-GRADE Portal, which invokes the EGEE WMS, GTbroker and the NorduGrid broker; grids shown: EGEE (VOCE / SEEGRID), NGS (GT2), SwissGrid; sites shown: Manchester, Lausanne, Poznan, Budapest]


II. Conclusions

Portals provide uniform access to grids

Managing multiple Brokers simultaneously in a transparent way seems to be a good solution to establish Grid Interoperability

Though current portals provide transparent access to grids, users still need to manually set up workflows and choose an RMS for each job in the workflow.

By examining the available brokers, users can learn their capabilities, but they still lack dynamic information, such as the successful submission rate, the background load of the brokers' VOs, the reliability of the brokers and so on.


III. Meta-Brokering approaches

Users can have certificates to access several Grids or VOs

A new problem arises in this situation: which VO and which broker should I choose for my specific application?

Just as users needed Resource Brokers to choose proper resources within a VO, they now need a Meta-Brokering service to decide which broker (and VO) is the best for them, and also to hide the differences in utilizing them.


Related works

Meta-brokering is a fairly new topic, though the need for interoperable grid networks has already been identified by different research groups.

The InterGrid vision is to operate so-called Gateways communicating with IntraGrid RMs, which should be implemented in all the Grids participating in the network. This vision cannot be realized with current technologies.

Researchers of the HPC-Europa Project and of the LA Grid Project have also considered taking steps towards meta-brokering. Both are thinking of an intercommunicating peer-to-peer architecture of their current RMSs, which also takes time and requires redesigning their brokers.


Interacting with the Meta-broker

[Figure: User and Meta-Broker above VO 1 to VO 4 of Grid X and Grid Y, with numbered interaction steps]


Languages of the Meta-Broker

Job Submission Description Language (JSDL):

for specifying job requirements

extension for special attributes

Broker Property Description Language (BPDL):

for storing the properties of the utilized brokers

updating the performance data of the brokers


Languages of the Meta-Broker


JSDL extension – undefined attributes


BPDL – Data Model


Third step towards Grid Interoperability

[Figure: Meta-Broker architecture. Components: Meta-Broker Core, Parser, Matchmaker, Translator, Invoker, Information Collector, IS Agent; further labels: BPDL List, VO Load, MB Languages, MB Health. Input from the User or Portal: job description (JSDL). Output: broker name and its JDL, or job status, output and submission results. Brokers and grids shown: EGEE WMS (EGEE grid), GTbroker (GT2 grid), NorduGridBroker (SwissGrid), ...]


Scenarios a.)

[Figure: scenario a.), steps 1. to 9. Participants: User or Portal, Meta-Broker Core, Parser, MatchMaker, Translator, Information Collector, IS Agent; further labels: BPDL List, VO Load, MB Languages, MB Health. Messages shown: job request (JSDL); broker name/ID, middleware/VO, its JobDL and proxy name; BrokerID and submission results]


Scenarios b.)

[Figure: scenario b.), steps 1. to 11. Participants: User, Meta-Broker Core, Parser, MatchMaker, Translator, Invoker, Information Collector, IS Agent, Grid Brokers; further labels: BPDL List, VO Load, MB Languages, MB Health. Messages shown: job description (JSDL) and input files; submission result and output files]


Components of the architecture I.

The Meta-Broker is the core component: it communicates with the other components

The Translators are responsible for transforming the user request into the language of the selected broker (JSDL <-> JDL, RSL, xRSL…)

The Invokers hand over the job to the brokers and wait for the results

They also provide additional information about the submissions to the Information Collector
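
The role of a Translator can be sketched as follows. This is a simplified illustration under stated assumptions, not the Meta-Broker's code: the plain dictionary stands in for an already parsed JSDL request, only three attributes are handled, and the functions `to_rsl`, `to_jdl` and `translate` are hypothetical names.

```python
from typing import Callable, Dict

# Assumption: a parsed job request reduced to a few string attributes.
JobRequest = Dict[str, str]


def to_rsl(job: JobRequest) -> str:
    """Render the request as a GT2-style RSL string."""
    parts = [f'(executable="{job["executable"]}")']
    if job.get("arguments"):
        parts.append(f'(arguments="{job["arguments"]}")')
    if job.get("stdout"):
        parts.append(f'(stdout="{job["stdout"]}")')
    return "&" + "".join(parts)


def to_jdl(job: JobRequest) -> str:
    """Render the request as a JDL-style attribute list."""
    lines = [f'Executable = "{job["executable"]}";']
    if job.get("arguments"):
        lines.append(f'Arguments = "{job["arguments"]}";')
    if job.get("stdout"):
        lines.append(f'StdOutput = "{job["stdout"]}";')
    return "[\n" + "\n".join("  " + line for line in lines) + "\n]"


# One translator per target language; the selected broker decides which is used.
TRANSLATORS: Dict[str, Callable[[JobRequest], str]] = {"RSL": to_rsl, "JDL": to_jdl}


def translate(job: JobRequest, target_language: str) -> str:
    return TRANSLATORS[target_language](job)


job = {"executable": "/bin/echo", "arguments": "hello", "stdout": "out.txt"}
rsl_text = translate(job, "RSL")
jdl_text = translate(job, "JDL")
print(rsl_text)
print(jdl_text)
```

Keeping the translators in a lookup table means that support for a further job description language can be added without touching the code that selects the broker.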


Components of the architecture II.

The Information Collector stores the properties of the connected brokers and historical data of the previous submissions

This information shows:

whether the chosen broker is available, and how reliable it is

what kind of jobs can be submitted to which broker (some brokers provide QoS agreements, some better data handling, …)

what the current load is on the resources reachable by the utilized brokers; these values are regularly updated by IS Agents
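
A possible shape for the data kept by the Information Collector is sketched below. The class and method names, the `vo_load` scale and the broker name `broker-a` are assumptions made for illustration; the actual BPDL data model is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BrokerRecord:
    """Static properties plus submission history of one connected broker."""
    name: str
    middleware: str
    submitted: int = 0
    succeeded: int = 0
    vo_load: float = 0.0  # assumed scale: 0.0 idle, 1.0 fully loaded

    def success_rate(self) -> float:
        return self.succeeded / self.submitted if self.submitted else 0.0


class InformationCollector:
    def __init__(self) -> None:
        self._records: Dict[str, BrokerRecord] = {}

    def register(self, name: str, middleware: str) -> None:
        self._records[name] = BrokerRecord(name, middleware)

    def report_submission(self, name: str, ok: bool) -> None:
        """Called by an Invoker after each submission."""
        record = self._records[name]
        record.submitted += 1
        if ok:
            record.succeeded += 1

    def update_load(self, name: str, load: float) -> None:
        """Called regularly by an IS Agent."""
        self._records[name].vo_load = load

    def get(self, name: str) -> BrokerRecord:
        return self._records[name]


collector = InformationCollector()
collector.register("broker-a", "middleware-x")
for ok in (True, True, False, True):
    collector.report_submission("broker-a", ok)
collector.update_load("broker-a", 0.4)
print(collector.get("broker-a").success_rate())
```

The success rate computed here is one simple way to express how reliable a broker has been; other measures could be derived from the same history.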


Matchmaking

The Matchmaker compares the JSDL of the actual job to the BPDL of the registered resource brokers

First the basic attributes are matched against the basic properties: this selection determines a group of brokers that are able to submit the job

In the second phase those brokers are kept that are able to fulfill the special requirement attributes of the job

Finally a priority list of the remaining brokers is created, taking into account the ranks (stored for the requested features) and the load of the underlying grid of each broker
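
The three matchmaking phases can be sketched as follows. The data classes, the broker names and all numeric values are invented for the example, and the scoring formula that combines ranks and load is an assumption, since the slides do not state how the two are weighted.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Broker:
    name: str
    job_types: Set[str]         # basic properties: job types the broker accepts
    features: Dict[str, float]  # special feature -> stored rank (higher is better)
    load: float                 # load of the underlying grid, 0.0 to 1.0


@dataclass
class Job:
    job_type: str
    required_features: Set[str]


def matchmake(job: Job, brokers: List[Broker]) -> List[str]:
    # Phase 1: keep brokers whose basic properties allow submitting the job.
    able = [b for b in brokers if job.job_type in b.job_types]
    # Phase 2: keep brokers that fulfill every special requirement.
    capable = [b for b in able if job.required_features.issubset(b.features)]

    # Phase 3: order by an ASSUMED score: feature ranks scaled by free capacity.
    def score(b: Broker) -> float:
        rank_sum = sum(b.features[f] for f in job.required_features)
        return (1.0 + rank_sum) * (1.0 - b.load)

    return [b.name for b in sorted(capable, key=score, reverse=True)]


brokers = [
    Broker("broker-a", {"single", "mpi"}, {"resubmission": 0.5}, load=0.8),
    Broker("broker-b", {"single", "mpi"}, {"resubmission": 0.9}, load=0.3),
    Broker("broker-c", {"single"}, {}, load=0.1),
]
priority = matchmake(Job("mpi", {"resubmission"}), brokers)
print(priority)
```

Here `broker-c` is dropped in the first phase because it does not accept MPI jobs, and `broker-b` is listed first because it combines a higher stored rank with a lower load.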


Meta-Broker Utilization by Portals

[Figure: the User reaches the P-GRADE Portal, which uses the Meta-Broker above the EGEE WMS, GTbroker and the NorduGrid broker; grids shown: EGEE (VOCE / SEEGRID), NGS (GT2), SwissGrid; sites shown: Manchester, Lausanne, Poznan, Budapest]


III. Conclusions

The introduced meta-brokering approach opens a new way for Interoperability support

The design and the architecture of the Grid Meta-Broker enable a higher level resource management by utilizing resource brokers of different grid middleware systems

This service can act as a bridge among the separated islands of the current production Grids, therefore it solves Grid Interoperability at the level of resource management

We expect that integrating the Grid Meta-Broker into the portal will enable better application execution through a simplified and more interoperable service in the future.


Grid Interoperability levels

[Figure: interoperability levels numbered 0. to 5.; labels shown: Portal (three times), Meta-]


Final Conclusions

We introduced three different approaches in current grid research that contribute to enabling Grid Interoperability at the level of Resource Management.

We have also demonstrated, with solutions for all of these approaches, that interoperability can be achieved.

Though the first two approaches, the RMS extension and multi-brokering, offer interoperable heterogeneous resource utilization, the final solution lies in the third approach.

The meta-brokering approach opens a novel way for Grid Interoperability support. The presented Meta-Broker is a standalone Web Service that can serve both users and portals. We have shown how such a service can be realized based on the latest Web and OGF standards. The design and the architecture of the Grid Meta-Broker enable higher-level interoperable brokering by utilizing existing resource brokers of different grid middleware.


Thank You for Your Attention!