Europar Tutorial GRMSr


University Dortmund

Tutorial

Grid Resource

Management and Scheduling

EuroPar 2004

Pisa, Italy

Ramin Yahyapour

CEI, University of Dortmund



Agenda

Background on Resource Management and Scheduling

Job scheduling and resource management on HPC resources

Transition to Grid RM and Grid Scheduling

Current state-of-the-art

Future of GRMS

Requirements, outlook for upcoming GRMS



Introduction

We all know what the Grid is

one of the many definitions:

Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations (Ian Foster)

however, the actual scope of the Grid is still quite controversial

Many people consider High Performance Computing (HPC) the main Grid application.

today's Grids are mostly Computational Grids or Data Grids with HPC resources as building blocks

thus, Grid resource management is closely related to resource management on HPC resources (our starting point).

we will return to a broader Grid scope and its implications later



Resource Management

on HPC Resources

HPC resources are usually parallel computers or large scale clusters

The local resource management system (RMS) for such resources includes:

configuration management

monitoring of machine state

job management

There is no standard for this resource management.

Several different proprietary solutions are in use.

Examples for job management systems: PBS, LSF, NQS, LoadLeveler, Condor



HPC Management Architecture

in General

[Figure: a master server (control service, job master) manages the compute resources/processing nodes; each compute node runs a resource/job monitor. Together they form the resource and job monitoring and management services.]



Computational Job

A job is a computational task that requires processing capabilities (e.g. 64 nodes) and is subject to constraints (e.g. a specific other job must finish before the start of this job)

The job information is provided by the user:

resource requirements

CPU architecture, number of nodes, speed

memory size per CPU

software libraries, licenses

I/O capabilities

job description

additional constraints and preferences

The format of the job description is not standardized, but usually very similar



Example: PBS Job Description

Simple job script:

#!/bin/csh
# resource limits: allocate needed nodes
#PBS -l nodes=1
#
# resource limits: amount of memory and CPU time ([[h:]m:]s).
#PBS -l mem=256mb
#PBS -l cput=2:00:00
# path/filename for standard output
#PBS -o master:/mypath/myjob.out
./my-task

the whole job file is a shell script

information for the RMS is given in comments

the actual job is started in the script



Job Submission

The user submits the job to the RMS, e.g. by issuing qsub jobscript.pbs

The user can control the job

qsub: submit

qstat: poll status information

qdel: cancel job

It is the task of the resource management system to start a job on the required resources

Current system state is taken into account



PBS Structure

[Figure: a job is submitted with qsub jobscript to the management server and scheduler; job execution takes place on the processing nodes, each of which runs a job and resource monitor.]



Execution Alternatives

Time sharing:

The local scheduler starts multiple processes per physical CPU with the goal of increasing resource utilization.

multi-tasking

The scheduler may also suspend jobs to keep the system load under control

preemption

Space sharing:

The job uses the requested resources exclusively; no other job is allocated to the same set of CPUs.

The job has to be queued until sufficient resources are free.



Job Classifications

Batch jobs vs. interactive jobs

batch jobs are queued until execution

interactive jobs need immediate resource allocation

Parallel vs. sequential jobs

a job requires several processing nodes in parallel

the majority of HPC installations are used to run batch jobs in space-sharing mode!

a job is not influenced by other co-allocated jobs

the assigned processors, node memory, caches etc. are exclusively available for a single job.

overhead for context switches is minimized

important aspects for parallel applications



Preemption

A job is preempted by interrupting its current execution

the job might be put on hold on a CPU set and resumed later; the job is still resident on those nodes (consumption of memory)

alternatively, a checkpoint is written and the job is migrated to another resource where it is restarted later

Preemption can be useful to reallocate resources due to new job submissions (e.g. with higher priority)

or if a job is running longer than expected.



Job Scheduling

A job is assigned to resources through a scheduling process

responsible for identifying available resources

matching job requirements to resources

making decisions about job ordering and priorities

HPC resources are typically subject to high utilization

therefore, resources are not immediately available and jobs are queued for future execution

time until execution is often quite long (many production systems have an average delay until execution of >1h)

jobs may run for a long time (several hours, days or weeks)



Typical Scheduling Objectives

Minimize the Average Weighted Response Time (AWRT)

Maximize machine utilization/minimize idle time

conflicting objectives

the criterion is usually static for an installation and implicitly given by the scheduling algorithm

$$\mathrm{AWRT} = \frac{\sum_{j \in \mathrm{Jobs}} w_j \, (t_j - r_j)}{\sum_{j \in \mathrm{Jobs}} w_j}$$

r : submission time of a job

t : completion time of a job

w : weight/priority of a job
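As an illustration of the AWRT objective, the following minimal sketch computes the weighted sum of response times (completion minus submission) divided by the sum of the weights. The `Job` class and the sample values are assumptions made for this example; they are not taken from any particular RMS.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float    # r: submission time of the job
    complete: float  # t: completion time of the job
    weight: float    # w: weight/priority of the job


def awrt(jobs):
    """Average weighted response time: sum(w * (t - r)) / sum(w)."""
    total_weight = sum(job.weight for job in jobs)
    if total_weight == 0:
        raise ValueError("at least one job needs a non-zero weight")
    weighted = sum(job.weight * (job.complete - job.submit) for job in jobs)
    return weighted / total_weight


# Two hypothetical jobs: response times 10 and 20, weights 1 and 3.
jobs = [Job(submit=0, complete=10, weight=1), Job(submit=5, complete=25, weight=3)]
print(awrt(jobs))  # (1*10 + 3*20) / 4 = 17.5
```

A larger weight pulls the average towards the response time of that job, which is how priorities enter the objective.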



Job Steps

[Figure: a Grid user submits a job description to the local job queue of an HPC machine; the scheduler builds a schedule over time, and the job execution management starts the jobs through the job management on each node.]

A user job enters a job queue; the scheduler (its strategy) decides on the start time and resource allocation of the job.



Scheduling Algorithms:

FCFS

Well known and very simple: First-Come First-Serve

Jobs are started in order of submission

Ad-hoc scheduling when resources become free again

no advance scheduling

Advantage: simple to implement

easy to understand and fair for the users

(job queue represents execution order)

does not require a priori knowledge about job lengths

Problems:

performance can degrade severely; overall utilization of a machine can

suffer if highly parallel jobs occur, that is, if a significant share of nodes is

requested for a single job.
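The FCFS behaviour described above can be sketched in a few lines. This is a simplified model under stated assumptions: all jobs are already queued at time 0, run in space-sharing mode, and their runtimes are known only to the simulation (FCFS itself does not need them). The function name and the job tuples are hypothetical.

```python
import heapq


def fcfs_schedule(jobs, total_nodes):
    """jobs: list of (name, nodes, runtime) in submission order.

    Returns {name: start_time}. A job never overtakes an earlier one.
    """
    free = total_nodes
    now = 0
    running = []  # min-heap of (end_time, nodes) for running jobs
    starts = {}
    for name, nodes, runtime in jobs:
        if nodes > total_nodes:
            raise ValueError(f"job {name} can never run on this machine")
        # Wait for running jobs to finish until enough nodes are free.
        while free < nodes:
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        starts[name] = now
        free -= nodes
        heapq.heappush(running, (now + runtime, nodes))
    return starts


# 4-node machine: the wide job B blocks the small jobs C and D behind it.
queue = [("A", 2, 10), ("B", 4, 5), ("C", 1, 4), ("D", 2, 3)]
print(fcfs_schedule(queue, total_nodes=4))
# {'A': 0, 'B': 10, 'C': 15, 'D': 15}
```

In this example two nodes stay idle from time 0 to 10 because job B, which needs the whole machine, is first in line. This is the utilization problem named in the list above.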



FCFS Schedule

[Figure: the scheduler maps the job queue (jobs 1 to 4) onto a schedule for the compute resource; axes: processing nodes vs. time.]



Scheduling Algorithms:

Backfilling

Improvement over FCFS

A job can be started before an earlier submitted job if it does not delay the first job in the queue

may still cause delay of other jobs further down the queue

Some fairness is still maintained

Advantage:

utilization is improved

Information about the job execution length is needed

sometimes difficult to provide

user estimation not necessarily accurate

Jobs are usually terminated after exceeding their allocated execution time;

otherwise users may deliberately underestimate the job length to get an earlier job start time
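The backfilling rule above (a later job may start early only if it does not delay the first job in the queue) can be sketched as follows. This is a simplified model, not the implementation of any real scheduler: all jobs are queued at time 0, runtime estimates are assumed to be exact, and the function name and job tuples are hypothetical.

```python
def backfill_schedule(jobs, total_nodes):
    """jobs: list of (name, nodes, runtime) in submission order.

    Returns {name: start_time} for backfilling that protects the queue head.
    """
    if any(nodes > total_nodes for _, nodes, _ in jobs):
        raise ValueError("a job requests more nodes than the machine has")
    queue = list(jobs)
    running = []  # (end_time, nodes) of running jobs
    free, now, starts = total_nodes, 0, {}

    def start(job):
        nonlocal free
        name, nodes, runtime = job
        starts[name] = now
        free -= nodes
        running.append((now + runtime, nodes))

    while queue:
        # Start jobs from the head of the queue while they fit.
        while queue and queue[0][1] <= free:
            start(queue.pop(0))
        if not queue:
            break
        # Reservation for the head job: earliest time with enough free nodes.
        head_nodes = queue[0][1]
        avail, shadow = free, now
        for end, nodes in sorted(running):
            avail += nodes
            shadow = end
            if avail >= head_nodes:
                break
        extra = avail - head_nodes  # nodes the head job will not need
        # Backfill later jobs that do not delay the head job.
        for job in queue[1:]:
            _, nodes, runtime = job
            ends_in_time = now + runtime <= shadow
            if nodes <= free and (ends_in_time or nodes <= extra):
                start(job)
                queue.remove(job)
                if not ends_in_time:
                    extra -= nodes
        # Advance time to the next job completion.
        running.sort()
        end, nodes = running.pop(0)
        now, free = end, free + nodes
    return starts


queue = [("A", 2, 10), ("B", 4, 5), ("C", 1, 4), ("D", 2, 3)]
print(backfill_schedule(queue, total_nodes=4))
# {'A': 0, 'C': 0, 'D': 4, 'B': 10}
```

Jobs C and D use the nodes that would otherwise idle until job A finishes, while job B still starts at time 10. If the runtime estimates were wrong, this guarantee would no longer hold, which is why accurate job length information matters.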



Backfill Scheduling

[Figure: the scheduler maps the job queue (jobs 1 to 4) onto a schedule for the compute resource; axes: processing nodes vs. time.]

Job 3 is started before Job 2 as it does not delay it



Backfill Scheduling

[Figure: the same backfill schedule (job queue with jobs 1 to 4; axes: processing nodes vs. time), with one job marked "Job finishes earlier!".]

However, if a job finishes earlier than expected, the backfilling causes delays that otherwise would not occur

need for accurate job length information (difficult to obtain)



Job Execution Manager

After the scheduling process, the RMS is responsible for the job execution:

sets up the execution environment for a job,

starts a job,

monitors job state, and cleans up after execution (copying output files etc.)

notifies the user (e.g. sending email)



Scheduling Options

Parallel job scheduling algorithms are well studied; performance is usually acceptable

Real implementations may have additional requirements rather than a need for more complex theoretical algorithms:

Prioritization of jobs, users, or groups while maintaining fairness

Partitioning of machines

e.g.: interactive and development partition vs. production batch partitions

Combination of different queue characteristics

For instance, the Maui Scheduler is often deployed as it is quite flexible

in terms of prioritization, backfilling, fairness etc.



Transition to Grid Resource

Management and Scheduling

Current state of the art



Transition to the Grid

More resource types come into play:

Resources are any kind of entity, service or capability to perform a specific task

processing nodes, memory, storage, networks, experimental devices, instruments

data, software, licenses

people

The task/job/activity can also be of a broader meaning

a job may involve different resources and consist of several activities in a workflow with corresponding dependencies

The resources are distributed and may belong to different administrative

domains

HPC is still the key application for Grids. Consequently, the main resources in a Grid are the previously considered HPC machines with their local RMS

Implications to Grid Resource

Management

Several security-related issues have to be considered: authentication, authorization, accounting

who has access to a certain resource?

what information can be exposed to whom?

There is a lack of global information:

which resources are available when for an activity?

The resources are quite heterogeneous:

different RMS in use

individual access and usage paradigms

administrative policies have to be considered



Scope of Grids

[Figure: scope of Grids, from Cluster Grid to Enterprise Grid to Global Grid.]

Source: Ian Foster



Resource Management Layer

A Grid resource management system consists of:

Local resource management system (Resource Layer)

Basic resource management unit

Provides a standard interface for using remote resources

e.g. GRAM, etc.

Global resource management system (Collective Layer)

Coordinates all local resource management systems within multiple or distributed Virtual Organizations (VOs)

Provides high-level functionalities to efficiently use all resources

Job Submission

Resource Discovery and Selection

Scheduling

Co-allocation

Job Monitoring, etc.

e.g. Meta-scheduler, Resource Broker, etc.



Grid Middleware

Grid architecture layers (top to bottom):

Application

Collective: coordination of several resources: infrastructure services, application services

Resource: add resource: negotiate access, control access and utilization

Connectivity: communication with internal resource functions and services

Fabric: control local execution

For comparison, the Internet Protocol Architecture layers: Application, Transport, Internet, Link

Source: Ian Foster



Grid Middleware (2)

[Figure: a user/application uses higher-level services such as a resource broker; the Grid middleware provides the core Grid infrastructure services (information, monitoring, and security services) and one Grid resource manager per resource; each resource is run by a local resource management system such as PBS or LSF.]



Globus Grid Middleware

Globus Toolkit: common source for Grid middleware

GT2

GT3: Web/Grid-Service-based

GT4: WSRF-based

GRAM is responsible for providing a service for a given job specification that can:

Create an environment for a job

Stage files to/from the environment

Submit a job to a local scheduler

Monitor a job

Send job state change notifications

Stream a job's stdout/stderr during execution



Globus GT2 Execution

[Figure: the user/application passes RSL to a resource broker, which consults MDS and hands specialized RSL to the resource allocation; one GRAM per resource forwards the job to the local system (e.g. PBS, LSF).]



RSL

Grid jobs are described in the Resource Specification Language (RSL)

RSL Version 1 is used in GT2

It has an LDAP filter-like syntax that supports boolean expressions:

Example:

& (executable = a.out)(directory = /home/nobody )

(arguments = arg1 "arg 2")(count = 1)
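To show how such a conjunction of (attribute = value) relations is put together, here is a minimal sketch that renders a Python dictionary in the style of the example above. The helper `to_rsl` is hypothetical and not part of the Globus Toolkit; it covers only plain values and value lists, and it assumes that values containing whitespace or parentheses must be double-quoted.

```python
def to_rsl(attributes):
    """Render a dict as an RSL v1 style conjunction: & (name = value)..."""

    def quote(value):
        text = str(value)
        if '"' in text:
            raise ValueError("embedded double quotes are not handled in this sketch")
        needs_quotes = any(ch.isspace() or ch in "()" for ch in text)
        return f'"{text}"' if needs_quotes else text

    def relation(name, value):
        # A list becomes a space-separated sequence of values.
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(item) for item in value)
        else:
            rendered = quote(value)
        return f"({name} = {rendered})"

    return "& " + "".join(relation(name, value) for name, value in attributes.items())


job = {
    "executable": "a.out",
    "directory": "/home/nobody",
    "arguments": ["arg1", "arg 2"],
    "count": 1,
}
print(to_rsl(job))
# & (executable = a.out)(directory = /home/nobody)(arguments = arg1 "arg 2")(count = 1)
```

Generating the string from a structured description keeps quoting in one place, which is useful when a broker has to rewrite a request for a specific resource.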



Globus Job States

[Figure: state diagram of the Globus job states pending, stage-in, active, suspended, stage-out, done, and failed.]



Globus GT3

With the transition to Web/Grid Services, the job management becomes

the Master Managed Job Factory Service (MMJFS)

the Managed Job Factory Service (MJFS)

the Managed Job Service (MJS)

The client contacts the MMJFS

The MMJFS informs the MJFS to create an MJS for the job

The MJS takes care of managing the job actions.

interact with local scheduler

file staging

store job status



Globus GT3 Job Execution

[Figure: components involved in GT3 job execution: user/application, Master Managed Job Factory Service, Managed Job Factory Service, Managed Job Service, File Streaming Factory Service, File Streaming Service, Resource Information Provider Service, and the local scheduler.]

Globus as a toolkit does not perform scheduling and automatic

resource selection


Example: Extending the Globus Architecture at KAIST

[Figure: a client uses the Job Submission Service and the Job Monitoring Service; further components are the Scheduling Service, Resource Selection Service, Resource Reservation Service, Resource Information Service, the Local Resource Monitoring Service (RIPS) with its information providers, the Resource Preference Provider, the Job Manager Service (MJS), and the local resource manager (PBS). Numbered arrows 1 to 9 mark the workflow, monitoring, and information flows.]

Source: Jin-Soo Kim



Job Description with RSL2

Version 2 of RSL is XML-based

Two namespaces are used:

rsl: for basic types such as int, string, path, url

gram: for the elements of a job

*GNS = http://www.globus.org/namespaces



RSL2 Attributes

(type = rsl:integerType)

Number of processes to run (default is 1)

(type = rsl:integerType)

On SMP multi-computers, number of nodes to distribute the count processes across

count/hostCount = number of processes per host

(type = rsl:stringType)

Queue into which to submit the job

(type = rsl:longType)

Maximum wall clock runtime in minutes

Maximum CPU runtime in minutes

Maximum wall clock or CPU runtime (scheduler's choice) in minutes

Only applies if the two limits above are not used



Job Submission Tools

GT 3 provides the Java class GramClient

GT 2.x: command line programs for job submission

globus-job-run: interactive jobs

globus-job-submit: batch jobs

globusrun: takes RSL as input



Globus 2 Job Client Interface

A multirequest specifies multiple resources for a job

globus-job-run -dumprsl -: host1 /bin/uname -a \
    -: host2 /bin/uname -a

+ ( &(resourceManagerContact="host1")
    (subjobStartType=strict-barrier) (label="subjob 0")
    (executable="/bin/uname") (arguments= "-a"))
  ( &(resourceManagerContact="host2")
    (subjobStartType=strict-barrier) (label="subjob 1")
    (executable="/bin/uname") (arguments= "-a"))

A simple job submission requiring 2 nodes:

globus-job-run -np 2 -s myprog arg1 arg2



Globus 2 Job Client Interface

The full flexibility of RSL is available through the command line tool globusrun

Support for file staging: executable and stdin/stdout

Example:

globusrun -o -r hpc1.acme.com/jobmanager-pbs \
'&(executable=$(HOME)/a.out) (jobtype=single)
(queue=time-shared)'


Problem: Job Submission Descriptions Differ

The deliverables of the GGF Working Group JSDL:

A specification for an abstract standard Job Submission Description Language (JSDL) that is independent of language bindings, including:

the JSDL feature set and attribute semantics,

the definition of the relationship between attributes,

and the range of attribute values.

A normative XML Schema corresponding to the JSDL specification.

A document of translation tables to and from the scheduling languages of a set of popular batch systems for both the job requirements and resource description attributes of those languages, which are relevant to the JSDL.



JSDL Attribute Categories

The job attribute categories will include:

Job Identity Attributes

ID, owner, group, project, type, etc.

Job Resource Attributes

hardware, software, including applications, Web and Grid Services, etc.

Job Environment Attributes

environment variables, argument lists, etc.

Job Data Attributes

databases, files, data formats, and staging, replication, caching, and disk

requirements, etc.

Job Scheduling Attributes

start and end times, duration, immediate dependencies etc.

Job Security Attributes

authentication, authorisation, data encryption, etc.


Problem: Resource Management Systems Differ Across Each Component

LSF

Interface format: API plus batch utilities via LSF scripts

Execution environment: User: local disk exported; System: remote initialized (option)

Platform mix: Unix, Windows

Grid Engine

Interface format: GDI API interface plus command line interface

Execution environment: System: remote initialized, with SGE local variables exported

Platform mix: Unix only

PBS

Interface format: API (script option); batch utilities via PBS scripts

Execution environment: System: remote initialized, with PBS local variables exported

Platform mix: Unix only

[Figure: an application hands a task definition to the scheduler (submit, task analysis); the task is staged to the executing host.]

Source: Hrabri Rajic



GGF-WG DRMAA

GGF Working Group Distributed Resource Management Application API

From the charter:

Develop an API specification for the submission and control of jobs to one or more Distributed Resource Management (DRM) systems.

The scope of this specification is all the high level functionality which is necessary for an application to consign a job to a DRM system including common operations on jobs like termination or suspension.

The objective is to facilitate the direct interfacing of applications to today's DRM systems by application's builders, portal builders, and Independent Software Vendors (ISVs).



DRMAA State Diagram

The remote job could be in the following states:

system hold

user hold

system and user hold simultaneously

queued active

system suspended

user suspended

system and user suspended simultaneously

running

finished (un)successfully

Source: Hrabri Rajic



Example: Condor-G

Condor-G is a Condor system enhanced to manage Globus jobs.

It provides two main features

Globus Universe: interface for submitting, queuing and monitoring jobs that use Globus resources

GlideIn: system for efficient execution of jobs on remote Globus resources

Condor-G runs as a personal Condor system

daemons run as non-privileged user processes

each user runs her/his own Condor-G



Condor-G: GlideIn

Globus is used to run the Condor daemons on Grid resources

Condor daemons run as a Globus-managed job

GRAM service starts daemons rather than the Condor jobs

When the resources run these GlideIn jobs, they will join the personal Condor pool

These daemons can be used to launch a job from Condor-G to a Globus resource

Jobs are submitted as Condor jobs and they will be matched and run on the Grid resources

the daemons receive jobs from the user's Condor queue

combines the benefits of Globus and Condor



Using GlideIn

[Figure: Condor-G side with Schedd (600 Condor jobs), GridManager, and Collector; Grid resource side with the Globus JobManager, LSF, and a Startd started by the glide-ins.]

Source: ANL/USC ISI


Example: DAGMan

Directed Acyclic Graph Manager

DAGMan allows you to specify the dependencies between your Condor-G jobs, so it can manage them automatically for you.

(e.g., "Don't run job B until job A has completed successfully.")

A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

each node will run the Condor-G job specified by its accompanying Condor submit file

[Figure: diamond-shaped DAG with Job A at the top, Jobs B and C in the middle, and Job D at the bottom.]

Source: Miron Livny
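To make the dependency handling concrete, the following sketch reads the diamond example and derives the order in which the jobs may run. It is a toy parser written for this illustration, not DAGMan itself: it understands only the Job and Parent ... Child lines shown on the slide, and the function names are hypothetical.

```python
from collections import defaultdict

DIAMOND = """\
# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D
"""


def parse_dag(text):
    """Return ({job: submit_file}, {job: set of parent jobs})."""
    jobs, parents = {}, defaultdict(set)
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        keyword = words[0].lower()
        if keyword == "job":
            jobs[words[1]] = words[2]
        elif keyword == "parent":
            split = [word.lower() for word in words].index("child")
            for child in words[split + 1:]:
                parents[child].update(words[1:split])
    return jobs, parents


def execution_waves(jobs, parents):
    """Group jobs into waves; a job is ready once all its parents are done."""
    done, waves = set(), []
    while len(done) < len(jobs):
        ready = sorted(j for j in jobs if j not in done and parents[j] <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
    return waves


jobs, parents = parse_dag(DIAMOND)
print(execution_waves(jobs, parents))  # [['A'], ['B', 'C'], ['D']]
```

Jobs in the same wave (here B and C) have no mutual dependencies and could be submitted at the same time.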


Grid Scheduling

How to select resources in the Grid?


Different Levels of Scheduling

Resource-level scheduler

also known as low-level scheduler, local scheduler, local resource manager

scheduler close to the resource, controlling a supercomputer, cluster, or network of workstations, on the same local area network

Examples: Open PBS, PBS Pro, LSF, SGE

Enterprise-level scheduler

Scheduling across multiple local schedulers belonging to the same organization

Examples: PBS Pro peer scheduling, LSF Multicluster

Grid-level scheduler

also known as super-scheduler, broker, community scheduler

Discovers resources that can meet a job's requirements

Schedules across lower level schedulers


Grid-Level Scheduler

Discovers & selects the appropriate resource(s) for a job

If selected resources are under the control of several local schedulers, a meta-scheduling action is performed

Architecture:

Centralized: all lower level schedulers are under the control of a single Grid scheduler

not realistic in global Grids

Distributed: lower level schedulers are under the control of several grid scheduler components; a local scheduler may receive jobs from several components of the grid scheduler


Grid Scheduling

[Figure: a Grid user submits jobs to a Grid scheduler, which distributes them to Machines 1 to 3; each machine has its own scheduler, job queue, and schedule over time.]


Activities of a Grid Scheduler

GGF Document: 10 Actions of Super Scheduling (GFD-I.4)

Phase One - Resource Discovery

1. Authorization Filtering

2. Application Definition

3. Min. Requirement Filtering

Phase Two - System Selection

4. Information Gathering

5. System Selection

Phase Three - Job Execution

6. Advance Reservation

7. Job Submission

8. Preparation Tasks

9. Monitoring Progress

10. Job Completion

11. Clean-up Tasks

Source: Jennifer Schopf


Grid Scheduling

A Grid scheduler allows the user to specify the required resources and environment of the job without having to indicate the exact location of the resources

A Grid scheduler answers the question: to which local resource manager(s) should this job be submitted?

Answering this question is hard:

resources may dynamically join and leave a computational grid

not all currently unused resources are available to grid jobs:

resource owner policies such as maximum number of grid jobs allowed

it is hard to predict how long jobs will wait in a queue


Select a Resource for Execution

Most systems do not provide advance information about future job execution

user information not accurate as mentioned before

new jobs arrive that may surpass current queue entries due to higher priority

Grid scheduler might consider current queue situation, however this does not give reliable information for future executions:

A job may wait long in a short queue while it would have been executed earlier on another system.

Available information:

Grid information service gives the state of the resources and possibly authorization information

Prediction heuristics: estimate a job's wait time for a given resource, based on the current state and the job's requirements.


Selection Criteria

Distribute jobs in order to balance load across resources

not suitable for large scale grids with different providers

Data affinity: run job on the resource where data is located

Use heuristics to estimate job execution time.

Best-fit: select the set of resources with the smallest capabilities and capacities that can meet the job's requirements

Quality of Service of a resource or its local resource management system

what features does the local RMS have?

can they be controlled from the Grid scheduler?
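As an example of one of these criteria, the best-fit rule can be sketched as a filter followed by a minimum search. The resource records, attribute names (`nodes`, `memory_gb`), and the tie-breaking order are assumptions for this illustration; a real Grid scheduler would take them from its information service.

```python
def best_fit(job, resources):
    """Pick the smallest resource that still meets the job's requirements.

    job and each resource are dicts with 'nodes' and 'memory_gb';
    resources additionally carry a 'name'. Returns None if nothing fits.
    """
    feasible = [
        res for res in resources
        if res["nodes"] >= job["nodes"] and res["memory_gb"] >= job["memory_gb"]
    ]
    if not feasible:
        return None
    # Smallest surplus of nodes first, then smallest surplus of memory.
    return min(
        feasible,
        key=lambda res: (res["nodes"] - job["nodes"],
                         res["memory_gb"] - job["memory_gb"]),
    )


resources = [
    {"name": "big-cluster", "nodes": 512, "memory_gb": 4},
    {"name": "mid-cluster", "nodes": 64, "memory_gb": 2},
    {"name": "workstations", "nodes": 16, "memory_gb": 1},
]
job = {"nodes": 48, "memory_gb": 1}
print(best_fit(job, resources)["name"])  # mid-cluster
```

Best-fit keeps the large machine free for jobs that really need it, at the price of ignoring the current load, so in practice it would be combined with the other criteria listed above.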


Scheduling Attributes

Working Group in the Global Grid Forum to define the attributes of a lower-level scheduling instance that can be exploited by a higher-level scheduling instance.

The following attributes have been defined:

Attributes of allocation properties

Revocation of an allocation

The local scheduler reserves the right to withdraw a given allocation.

Guaranteed completion time of allocation

A deadline for job completion is provided by the local scheduler.

Guaranteed Number of Attempts to Complete a Job

A local scheduler will retry a given job task, e.g. useful for data transfer actions.

Allocations run-to-completion

A job is not preempted if it has been started

Exclusive Allocations

A job has exclusive access to the given resources; e.g. no time-sharing is performed

Malleable Allocations

The given resource set may change during runtime; e.g. a computational job will gain (moldable job) or lose processors


Scheduling Attributes (2)

Attributes of available information

Access to tentative schedule

The local scheduler exposes its schedule of future allocations

Option: Only the projected start time of a specified allocation is available

Option: Only partial information on the current schedule is available

Exclusive control

The local scheduler is exclusively in charge of the resources; no other jobs can appear on the resources

Event Notification

The local scheduler provides an event subscription service

Attributes for manipulating allocation execution

Preemption

Checkpointing

Migration

Restart





Attributes for requesting resources

Allocation Offers

The local system can provide an interface to request offers for an allocation

Allocation Cost or Objective Information

The local scheduler can provide cost or objective information

Advance Reservation

Allocations can be reserved in advance

Requirement for Providing Maximum Allocation Length in Advance

The higher-level scheduler must provide a maximum job execution length

Deallocation Policy

A policy applies for the allocation that must be met to stay valid

Remote Co-Scheduling

A schedule can be generated by a higher-level instance and imposed on the local scheduler

Consideration of Job Dependencies

The local scheduler can deal with dependency information of jobs; e.g. for workflows

CSF: Community Scheduler Framework

An open source implementation of an OGSA-based metascheduler for VOs.

Supports emerging WS-Agreement spec

Supports GT GRAM

Fills in gaps in existing resource management picture

Contributed by Platform to the Globus Toolkit

Extensible, Open Source framework for implementing meta-schedulers

Provides basic protocols and interfaces to help resources work

together in heterogeneous environments


CSF Architecture

[Figure: Platform LSF users and Globus Toolkit users submit to a meta-scheduler hosted in a Grid Service hosting environment with Job Service, Reservation Service, Queuing Service, and Global Information Service; Platform LSF is attached through a meta-scheduler plugin and an RM Adapter with RIPS, while SGE and PBS are attached through GRAM with RIPS.]

RIPS = Resource Information Provider Service

Source: Chris Smith

Global Information Service





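[Figure: components of the Global Information Service: an Index Service with Registry, SD Aggregator, and SD Provider Manager on top of a data storage (DB); the RIPS of each cluster registers resource, job, and reservation information; the requests shown are resource info, cluster info, data store, and data load.]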

Source: Chris Smith


Support for Virtual Organizations

[Figure: a user of VO A and a user of VO B each submit to the community scheduler of their virtual organization; the two virtual organizations span resources of Organizations 1, 2, and 3.]

Source: Chris Smith


CSF Grid Services

Job Service:

creates, monitors and controls compute jobs

Reservation Service:

guarantees resources are available for running a job

Queuing Service:

provides a service where administrators can customize and define scheduling policies at the VO level and/or at the different resource manager level

Defines an API for plug-in schedulers

RM Adapter Service:

provides a Grid service interface that bridges the Grid service protocol and resource managers (LSF, PBS, SGE, Condor and other RMs)

GT3 Job Submission / Architecture

[Figure: three sites (Site A with MMJFS on node1, Site B with MMJFS on node2, Site C with MMJFS on node3); each site runs an MMJFS, an MJS for its local resource manager (SGE, PBS, or LSF), and a RIPS that feeds a common Index Service; jobs are submitted to each site with managed-job-globusrun.]

MMJFS = Master Managed Job Factory Service

MJS = Managed Job Service

Blue indicates a Grid Service hosted in a GT3 container

Source: Chris Smith


GT3 + CSF Architecture

[Figure: the Virtual Organization hosts the Queuing Service, Job Service, Reservation Service, and Index Service; Site A (LSF) is attached through the RM Adapter for LSF, Site B (PBS) through the RM Adapter for PBS, and Site C (SGE) through MMJFS/MJS; each site runs a RIPS.]

Source: Chris Smith


Queue Service

In CSF, Job Service instances are submitted to the Queue Service for dispatch to a resource manager.

The Queue Service provides a plug-in API for extending the scheduling algorithms provided by default with CSF.

The Queue Service is responsible for:

loading and validating configuration information

loading all configured scheduler plugins

calling the plugin API functions

schedInit() after loading the plugin successfully

schedOrder() when a new job is submitted

schedMatch() during the scheduling cycle

schedPost() before the scheduling cycle ends, and after scheduling

decisions are sent to the job service instances
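The order in which the four plugin functions are called can be illustrated with a small mock-up. Only the function names schedInit, schedOrder, schedMatch, and schedPost come from the slide; the classes, the argument lists, and the FCFS matching below are hypothetical and do not reproduce the actual CSF plugin API.

```python
class FcfsPlugin:
    """Hypothetical scheduler plugin that keeps jobs in submission order."""

    def __init__(self):
        self.calls = []  # records the callback order for inspection

    def schedInit(self):
        self.calls.append("schedInit")

    def schedOrder(self, queue, job):
        queue.append(job)  # FCFS: a new job goes to the end of the queue
        self.calls.append("schedOrder")

    def schedMatch(self, queue, free_hosts):
        decisions = []
        while queue and free_hosts:
            decisions.append((queue.pop(0), free_hosts.pop(0)))
        self.calls.append("schedMatch")
        return decisions

    def schedPost(self, decisions):
        self.calls.append("schedPost")


class QueueService:
    """Mock-up of the call sequence, not the real Queue Service."""

    def __init__(self, plugin):
        self.plugin = plugin
        self.queue = []
        self.plugin.schedInit()  # after loading the plugin successfully

    def submit(self, job):
        self.plugin.schedOrder(self.queue, job)  # when a new job is submitted

    def scheduling_cycle(self, free_hosts):
        # During the scheduling cycle; the plugin gets its own copy of the hosts.
        decisions = self.plugin.schedMatch(self.queue, list(free_hosts))
        # A real service would now send the decisions to the job service instances.
        self.plugin.schedPost(decisions)  # before the scheduling cycle ends
        return decisions


plugin = FcfsPlugin()
service = QueueService(plugin)
service.submit("job-1")
service.submit("job-2")
print(service.scheduling_cycle(["hostA"]))  # [('job-1', 'hostA')]
print(plugin.calls)
# ['schedInit', 'schedOrder', 'schedOrder', 'schedMatch', 'schedPost']
```

Separating ordering (schedOrder) from matching (schedMatch) lets a plugin replace one policy without touching the other.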



Co-allocation



It is often requested that several resources are used for a single job.

that is, a scheduler has to assure that all resources are available when needed.

in parallel (e.g. visualization and processing)

with time dependencies (e.g. a workflow)

The task is especially difficult if the resources belong to different

administrative domains.

The actual allocation time must be known for co-allocation

or the different local resource management systems must synchronize with

each other (wait for availability of all resources)
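If the free time windows of all resources were known, co-allocation would reduce to finding a common window. The sketch below assumes exactly that: each resource publishes a list of free (start, end) intervals, which many local systems do not offer. The function name and the sample data are hypothetical.

```python
def earliest_common_start(free_slots, duration):
    """free_slots: {resource: [(start, end), ...]} of free intervals.

    Returns the earliest time t at which every resource is free for
    [t, t + duration], or None if no such time exists.
    """
    # The earliest feasible start always coincides with the start of a slot.
    candidates = sorted({start for slots in free_slots.values() for start, _ in slots})
    for t in candidates:
        if all(
            any(start <= t and t + duration <= end for start, end in slots)
            for slots in free_slots.values()
        ):
            return t
    return None


free_slots = {
    "cluster": [(0, 4), (6, 12)],
    "visualization": [(2, 5), (7, 10)],
    "network": [(0, 20)],
}
print(earliest_common_start(free_slots, duration=3))  # 7
```

Without such information the schedulers have to fall back on synchronization, i.e. each allocated resource waits until all others are available, which wastes capacity.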


Example Multi-Site Job Execution

[Figure: the Grid scheduler places one multi-site job on Machines 1 to 3 at the same time; each machine has its own scheduler, job queue, and schedule over time.]

A job uses several resources at different sites in parallel.

Network communication is an issue.


Advanced Reservation

Co-allocation and other applications require a priori information about the precise resource availability

With the concept of advanced reservation, the resource provider guarantees a specified resource allocation.

includes a two- or three-phase commit for agreeing on the reservation

Implementations:

GARA/DUROC/SNAP provide interfaces for Globus to create advanced reservations

implementations for network QoS available.

setup of a dedicated bandwidth between endpoints


Limitations of current Grid RMS

The interaction between local scheduling and higher-level Grid scheduling is currently a one-way communication

current local schedulers are not optimized for Grid-use

limited information available about future job execution

a site is usually selected by a Grid scheduler and the job enters the

remote queue.

The decision about job placement is inefficient.

Actual job execution is usually not known

Co-allocation is a problem as many systems do not provide advance

reservation

Example of Grid Scheduling Decision Making

[Figure: a Grid user submits a job to the Grid scheduler, which has to choose among Machines 1 to 3, each with its own scheduler, job queue, and schedule; the reported loads are 15 jobs running / 20 queued, 5 jobs running / 2 queued, and 40 jobs running / 80 queued.]

Where to put the Grid job?

Available Information from the


78/153

78

Local Schedulers

Decision making is difficult for the Grid scheduler

limited information about local schedulers is available

available information may not be reliable

Possible information:

queue length, running jobs

detailed information about the queued jobs

execution length, process requirements,

tentative schedule about future job executions

This information is often not technically provided by the local scheduler

In addition, this information may be subject to privacy concerns!

Consequence


79/153

79

Consequence

Consider a workflow with 3 short steps (e.g. 1 minute each) that depend on each other

Assume available machines with an average queue waiting time of 1 hour.

The Grid scheduler can only submit the subsequent step once the previous job step has finished.

Result:

The completion time of the workflow may be larger than 3 hours (compared to 3 minutes of execution time)

Current Grids are suitable for simple jobs, but still quite inefficient in

handling more complex applications

Need for better coordination of higher- and lower-level scheduling!
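The arithmetic behind this workflow example, as a small sketch. It assumes that every step pays the full average queue wait before it runs, which is a simplification:

```python
steps = 3
run_minutes = 1          # execution time per step
queue_wait_minutes = 60  # assumed average wait in the remote queue

# Pure execution time if the steps could run back to back.
execution_only = steps * run_minutes

# Each step is submitted only after its predecessor finished,
# so the queue wait is paid once per step.
with_queueing = steps * (queue_wait_minutes + run_minutes)

print(execution_only, "minutes of execution")
print(with_queueing, "minutes until completion")
```

Under these assumptions the workflow completes after 183 minutes, slightly more than 3 hours, although only 3 minutes are spent computing.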



80/153

GRMS in

Next Generation Grids

Outlook on future Grid Resource Management and Scheduling

Example Grid Scenario


81/153

81

Example Grid Scenario

Remote Center

Reads and Generates TB of Data

LAN/WAN Transfer

WAN Transfer

Compute Resources

Visualization

Assume a data-intensive simulation

that should be visualized and

steered during runtime!

Resource Request of a Simple

Grid Job


82/153

82

Grid Job

A specified architecture with

48 processing nodes,

1 GB of available memory, and

a specified licensed software package

for 1 hour between 8am and 6pm of the following day

Time must be known in advance.

A specific visualization device during program execution

Minimum bandwidth between the VR device and the main computer during program

execution

Input: a specified data set from a data repository

at most 4

preference of cheaper job execution over an earlier execution.

actually a pretty simple example (no complex workflows)
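One way to make such a request machine-readable is a small record type. The sketch below is an assumption for illustration, not a format defined in the tutorial; the class `GridJobRequest`, its field names, and the placeholder strings are hypothetical, and the bandwidth requirement is omitted because the slide gives no value for it:

```python
from dataclasses import dataclass
from datetime import time, timedelta


def _as_delta(t: time) -> timedelta:
    """Time of day as an offset from midnight."""
    return timedelta(hours=t.hour, minutes=t.minute)


@dataclass(frozen=True)
class GridJobRequest:
    architecture: str
    nodes: int
    memory_gb: int
    software_package: str
    duration: timedelta
    window_open: time      # earliest allowed start on the following day
    window_close: time     # latest allowed end on the following day
    visualization_device: str
    input_dataset: str
    prefer_cheaper_over_earlier: bool

    def admits(self, start: time) -> bool:
        """True if a run starting at `start` fits into the time window."""
        begin = _as_delta(start)
        return (_as_delta(self.window_open) <= begin
                and begin + self.duration <= _as_delta(self.window_close))


request = GridJobRequest(
    architecture="<specified architecture>",
    nodes=48,
    memory_gb=1,
    software_package="<licensed package>",
    duration=timedelta(hours=1),
    window_open=time(8, 0),
    window_close=time(18, 0),
    visualization_device="<visualization device>",
    input_dataset="<data set in repository>",
    prefer_cheaper_over_earlier=True,
)
print(request.admits(time(17, 0)), request.admits(time(17, 30)))
```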

Use-Case Example: Coordinated

Simulation and Visualization


83/153

83

Simulation and Visualization

Expected output of a Grid scheduler:

[Figure: schedule over time, one row per resource]

Network 1: Data Transfer

Computer 1: Loading Data, Parallel Computation, Providing Data

Computer 2: Parallel Computation

Network 3: Communication for Computation

VR-Cave: Visualization

Network 2: Communication for Visualization

Software License: Software Usage

Storage: Data Storage

Data: Data Access, Storing Data

Reservations are

necessary!

Need for a Grid Scheduling

Architecture


84/153

84

Architecture

Grid-

Scheduler

Grid-

User/Application

Information

Services

Monitoring

Services

Security

Services

Accounting/Billing

Other

Grid Services

Local RMS

Schedule

time

Other

Resources/

Services

Schedule

time

Schedule

time

Schedule

time

Grid

Middleware

Local RMS

Grid

Middleware

Local RMS

Grid

Middleware

Local RMS

Grid

Middleware

Data

Resources

Compute

Resources

Network

Resources


85/153

Service Oriented Architectures


86/153

86

Service Oriented Architectures

Services are used to abstract all resources and functionalities.

Concept of OGSI and WSRF

using WebServices, SOAP, XML to implement the services

OGSI idea of GridServices is implemented in GT3

transition to WSRF with GT4

Core services for building a Grid are discussed in the Open Grid Services Architecture (OGSA)

Open Grid Services Architecture


87/153

87

Web Services: Basic Functionality

OGSA

OGSI: Interface to Grid Infrastructure

Applications in Problem Domain X

Compute, Data & Storage Resources

Distributed

Application & Integration Technology for Problem Domain X

Users in Problem Domain X

Virtual Integration Architecture

Generic Virtual Service Access and Integration Layer

-

Structured Data Integration

Structured Data Access

Structured Data: Relational, XML, Semi-structured

Transformation

Registry

Job Submission

Data Transport Resource Usage

Banking

Brokering Workflow

Authorisation


OGSA Outlook


88/153

88

Context

Services

Status +

Monitoring Services

Infrastructure

Services

Security

Services

Resource

Management

Services

Job &

Workflow

Management

Data

Services

Policy +

Agreement

Virtual

Organization

Data Access

Data Integration

Data Provision

Data Catalog

Firewall Transition

Delegation

Authorization

Authentication

WS-RF

(OGSI)Notification

WS

Distributed Management

Event

Problem

Determination

Information

Service

Job

Manager

Job

Service

Logging

Service

Broker

Execution

Planning

Service

Workflow

Manager

Workload

Manager

Provisioning

Application

Content

Manager

Deploy + Configuration

Service

Service

Container

Reservation

Service

OGSA Execution Planning


89/153

89


Workload Mgmt.

Framework

User/Job

Proxies

Environment

Mgmt.

Policies

SupplyDemand

CMM

Resource Mgmt. Framework

Reservation

Optimizing Framework

Resource Selection

Resource Workload

Optimal Mapping

Workload Optimization

Workload Post Balancing

Resource Provisioning

Workload Optimizing Framework

Workload Optimization

Workload Orchestration

Workload Models (History/Prediction)

Dependency management

Scheduling Resource Optimizing Framework

Capacity Management

Resource Placement

Primary Interaction

Meta - Interaction

Represents one or more OGSA services

Resource

Allocation

(or Binding)

Job

Factory

Admission Control (Resources)

Admission Control (Workload)

SLA Management (Workload)

Quality of Service (Resources)

Queuing Services

Resource

Factory

Information Provider

Selection Context (e.g. VO)

Functional Requirements

for Grid Scheduling


90/153

90

for Grid Scheduling

Functional Requirements:

Cooperation between different resource providers

Interaction with local resource management systems

Support for reservations and service level agreements

Orchestration of coordinated resource allocations

Automatic handling of accounting and billing

Distributed Monitoring

Failure Transparency

What are Basic Blocks for

a Grid Scheduling Architecture?


91/153

91

a Grid Scheduling Architecture?

Scheduling

Service

Data Management

Service

Network Management

Service

Information Service

Resources

Data

Network

Network-Resources

Management System

Network

Network Manager

Management

System

Compute/ Storage /Visualization etc

Compute Manager Data Manager

Data-Resources

Query for resources

Maintain information

Maintain information

static &

scheduled/forecasted

Reservation

Accounting

and Billing

Service

Job

Supervisor Service

Scheduling-relevant Interfaces

of Basic Blocks are still to be

defined!

Information Service

Resource Discovery


92/153

92

Resource Discovery

Relevant for Grid Scheduling: Access to static and dynamic information

Dynamic information includes data about planned or forecasted future events

e.g.: existing reservations, scheduled tasks, future availabilities

need for anonymous and limited information (privacy concerns)

Information about all resource types

including e.g. data and network

future reservation, data transfers etc.

Job/Workflow Description

Requirement Description


93/153

93

Requirement Description

Information about the job specifics (what is the job) and job requirements (what is required for the job)

including data access and creation

Need for common workflow description

e.g. a DAG formulation

include static and dynamic dependencies

need for the ability to extract workflow information to schedule a whole

workflow in advance
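A DAG formulation can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The step names are hypothetical; each entry maps a step to the set of steps it depends on, and the topological order is one valid sequence in which a scheduler could plan the whole workflow in advance:

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: step -> set of steps that must finish first.
workflow = {
    "load_data": set(),
    "simulate": {"load_data"},
    "visualize": {"simulate"},
    "store_results": {"simulate"},
}

# static_order() yields every step after all of its predecessors.
order = list(TopologicalSorter(workflow).static_order())
print(order)
```

Only static dependencies are captured here; dynamic dependencies, which become known at run time, would need the graph to be extended while the workflow executes.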

Reservation Management

Agreement and Negotiation


94/153

94

Agreement and Negotiation

Interaction between scheduling instances, between resource/agreement providers and agreement initiators (higher-level scheduler)

access to tentative information necessary

negotiations might take very long

individual scheduling objectives to be considered

probably market-oriented and economic scheduling needed

Need for combining agreements from different providers

coordinate complex resource requests or workflows

Maintain different negotiations at the same time

probably several levels of negotiations, agreement commitment and reservation

Accounting and Billing


95/153

95


Interaction with budget information

Charging for allocations, reservations; preliminary allocation of budgets

Concepts for reliable authorization of Grid schedulers to spend

money on behalf of the user

Refunding in case of resource/SLA failure, re-scheduling etc.

Reliable monitoring and accounting

required for tracing whether a party fulfilled an agreement

Monitoring Services


96/153

96

Monitoring of resource conditions

agreements

schedules

program execution

SLA conformance

workflow

Monitoring must be reliable as it is part of accountability

failure or fulfillment of a service/resource provider must be clearly identifiable

Conclusions for Grid Scheduling


97/153

97

Grids ultimately require coordinated scheduling services.

Support for different scheduling instances

different local management systems

different scheduling algorithms/strategies

For arbitrary resources

not only computing resources, also

data, storage, network, software etc.

Support for co-allocation and reservation

necessary for coordinated grid usage

(see data, network, software, storage)

Different scheduling objectives: cost, quality, other

Scheduling Model


98/153

98

Using a Brokerage/Trading strategy:

Submit Grid

Job Description

Analyze Query

Query for

Allocation Offers

Higher-level

scheduling

Lower-level scheduling

Discover Resources

Select Offers

Collect Offers

Coordinate Allocations

Generate

Allocation Offer

Consider individual owner

policies

Consider individual user policies

Properties of Multi-Level Scheduling Model


99/153

99

Multi-Level Scheduling Model

Multi-level scheduling must support different RM systems and strategies.

Providers can enforce individual policies in generating resource offers.

Users receive resource allocations optimized to their individual objective

Different higher-level scheduling strategies can be applied.

Multiple levels of scheduling instances are possible

Support for fault-tolerant and load-balanced services.

Negotiation in Grids


100/153

100

Multilevel Grid scheduling architecture

Lower level local scheduling instance

Implementation of owner policies

Higher level Grid scheduling instance

Resource selection and coordination

(Static) Interface definition between both instances

Different types of resources

Different local scheduling systems with different properties

Different owner policies

(Dynamic) Communication between both instances

Resource discovery

Job monitoring

Using Service Level Agreements


101/153

101

The mapping of jobs to resources can be abstracted using the concept of Service Level Agreements (SLAs) (Czajkowski, Foster, Kesselman & Tuecke)

SLA: Contract negotiated between

resource provider, e.g. local scheduler

resource consumer, e.g., grid scheduler, application

SLAs provide a uniform approach for the client to

specify resource and QoS requirements, while

hiding from the client details about the resources,

such as queue names and current workload

Service Level Agreement Types


102/153

102

Resource SLA (RSLA)

A promise of resource availability

Client must utilize promise in subsequent SLAs

Advance Reservation is an RSLA

Task SLA (TSLA)

A promise to perform a task

Complex task requirements

May reference an RSLA

Binding SLA (BSLA)

Binds a resource capability to a TSLA

May reference an RSLA (i.e. a reservation)

May be created lazily to provision the task

Allows complex resource arrangements

Agreement-Based Negotiation


103/153

103

A client (application) submits a task to a Grid scheduler

The client negotiates a TSLA for the task with the Grid scheduler

In order to provision the TSLA, the Grid scheduler may obtain an RSLA with the Grid resource or may use a pre-existing RSLA that the Grid scheduler has negotiated speculatively

A TSLA that refers to an RSLA assures that the job gets the reserved resources at a specified time

A TSLA without an RSLA tells little about when the resources will be available to the job
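The relationship between the three SLA types can be sketched with plain record types. The field names and example values are assumptions made for illustration; they are not taken from the SNAP or WS-Agreement specifications:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RSLA:
    """Promise of resource availability, e.g. an advance reservation."""
    resource: str
    nodes: int
    start: int  # minutes from now
    end: int


@dataclass
class TSLA:
    """Promise to perform a task; may reference an RSLA."""
    task: str
    rsla: Optional[RSLA] = None

    def guaranteed_start(self) -> Optional[int]:
        # Only a TSLA backed by an RSLA says when the task will run.
        return self.rsla.start if self.rsla else None


@dataclass
class BSLA:
    """Binds a resource capability to a TSLA."""
    tsla: TSLA
    rsla: RSLA


reservation = RSLA(resource="cluster-A", nodes=48, start=120, end=180)
backed = TSLA(task="simulation", rsla=reservation)
unbacked = TSLA(task="post-processing")
binding = BSLA(tsla=backed, rsla=reservation)
print(backed.guaranteed_start(), unbacked.guaranteed_start())
```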


104/153


105/153

GGF: GRAAP-WG


106/153

106

Goal: Defining WebService-based protocols for negotiation and agreement management

WS-Agreement Protocol:

Towards Grid Scheduling


107/153

107

Grid Scheduling Methods:

Support for individual scheduling objectives and policies

Multi-criteria scheduling models

Economic scheduling methods for Grids

Architectural requirements:

Generic job description

Negotiation interface between higher- and lower-level scheduler

Economic management services

Workflow management

Integration of data and network management

Grid Scheduling Strategies


108/153

108

Current approach:

Extension of job scheduling for parallel computers.

Resource discovery and load-distribution to a remote resource

Usually batch job scheduling model on remote machine

But actually required for Grid scheduling is:

Co-allocation and coordination of different resource allocations for a Grid job

Instantaneous ad-hoc allocation is not always suitable

This complex task involves:

Cooperation between different resource providers

Interaction with local resource management systems

Support for reservations and service level agreements

Orchestration of coordinated resource allocations

User Objective


109/153

109

Local computing typically has:

A given scheduling objective such as minimization of response time

Use of batch queuing strategies

Simple scheduling algorithms: FCFS, Backfilling

Grid Computing requires:

Individual scheduling objective

better resources

faster execution

cheaper execution

More complex objective functions apply for individual Grid jobs!

Provider/Owner Objective


110/153

110

Local computing typically has:

Single scheduling objective for the whole system:

e.g. minimization of average weighted response time

or high utilization/job throughput

In Grid Computing:

Individual policies must be considered:

access policy,

priority policy,

accounting policy, and other

More complex objective functions apply for individual resource allocations!

User and owner policies/objectives may be subject to privacy

considerations!

Grid Economics

Different Business Models


111/153

111

Different Business Models

Cost model

Use of a resource

Reservation of a resource

Individual scheduling objective functions

User and owner objective functions

Formulation of an objective function

Integration of the function in a scheduling algorithm

Resource selection

The scheduling instances act as broker

Collection and evaluation of resource offers

Scheduling Objectives in the Grid


112/153

112

In contrast to local computing, there is no general scheduling

objective anymore

minimizing response time

minimizing cost

tradeoff between quality, cost, response-time etc.

Cost and different service quality come into play

the user will introduce individual objectives

the Grid can be seen as a market where resources are competing alternatives

Similarly, the resource provider has individual scheduling policies

Problem:

the different policies and objectives must be integrated in the scheduling process

different objectives require different scheduling strategies

part of the policies may not be suitable for public exposure

(e.g. different pricing or quality for certain user groups)

Grid Scheduling Algorithms


113/153

113

Due to the mentioned requirements in Grids, it is not to be expected that a single scheduling algorithm or strategy is suitable for all problems.

Therefore, there is need for an infrastructure that

allows the integration of different scheduling algorithms

the individual objectives and policies can be included

resource control stays at the participating service providers

Transition into a market-oriented Grid scheduling model

Economic Scheduling


114/153

114

Market-oriented approaches are a suitable way to implement the interaction of different scheduling layers

agents in the Grid market can implement different policies and strategies

negotiations and agreements link the different strategies together

participating sites stay autonomous

Need for suitable scheduling algorithms and strategies for creating and selecting offers

need for creating the Pareto-optimal scheduling solutions

Performance relies highly on the available information

negotiation can be a hard task if many potential providers are available.
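To illustrate what Pareto-optimal means for offer selection, the sketch below filters a set of hypothetical offers, each described by a latency and a price where lower is better for both. An offer is dropped if another offer is at least as good in both criteria and strictly better in one. The offer values are invented for the example:

```python
def pareto_front(offers):
    """Names of offers not dominated by any other offer.

    offers: list of (name, latency, price); lower is better for both.
    """
    front = []
    for name, latency, price in offers:
        dominated = any(
            other_lat <= latency and other_price <= price
            and (other_lat < latency or other_price < price)
            for _, other_lat, other_price in offers
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical offers: (name, latency in minutes, price).
offers = [("A1", 10, 8), ("A2", 30, 2), ("A3", 15, 4), ("A4", 20, 9)]
print(pareto_front(offers))
```

Which offer on the front is finally chosen depends on the individual objective of the user.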

Economic Scheduling (2)


115/153

115

Several possibilities for market models:

auctions of resources/services

auctions of jobs

Offer-request mechanisms support:

inclusion of different cost models, price determination

individual objective/utility functions for optimization goals

Market-oriented algorithms are considered:

robust

flexible in case of errors

simple to adapt

markets can have unforeseeable dynamics


116/153

Offer Creation (2)


117/153

117

R1 R2 R3 R4 R5 R6 R7 R8

t

Offer 1

End of current

schedule


118/153

Offer Creation (4)


119/153

119

R1 R2 R3 R4 R5 R6 R7 R8

t

Offer 3

End of current

schedule

New end of current schedule

Evaluate Offers


120/153

120

Evaluation with utility functions

A utility function is a mathematical representation of a user's preference

The utility function may be complex and contain several different criteria

Example using response time (or delay time) and price:

$U_{\mathrm{util}} = \max\,(a_1 \cdot \mathrm{latency} + a_2 \cdot \mathrm{price})$
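A minimal sketch of evaluating offers with a weighted sum of latency and price. The offers and the weights are invented for the example; the weights are chosen negative here, which is an assumption, so that a lower latency and a lower price yield a higher utility:

```python
def utility(offer, a1, a2):
    """Weighted sum of the two criteria of an offer."""
    return a1 * offer["latency"] + a2 * offer["price"]


def best_offer(offers, a1, a2):
    """Offer with the maximum utility for the given weights."""
    return max(offers, key=lambda o: utility(o, a1, a2))


# Hypothetical offers returned by three providers.
offers = [
    {"name": "A1", "latency": 10.0, "price": 8.0},
    {"name": "A2", "latency": 30.0, "price": 2.0},
    {"name": "A3", "latency": 15.0, "price": 4.0},
]

time_sensitive = best_offer(offers, a1=-1.0, a2=-0.5)
cost_sensitive = best_offer(offers, a1=-0.1, a2=-1.0)
print(time_sensitive["name"], cost_sensitive["name"])
```

The same set of offers leads to different selections depending on how a user weighs the criteria against each other.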

Optimization Space


121/153

121

latency

price

A1

A1

A3

A3

A2

A2

The utility depends on the user and provider objectives

Example: NWIRE


122/153

122

Following are some slides to illustrate the scheduling process in the

NWIRE project.

NWIRE models a Grid infrastructure without central services

A P2P scheduling model is employed where Grid Schedulers/Managers that reside on each local site are queried for offers.

A Grid scheduler asks other known schedulers.

Similar to P2P systems, no Grid manager knows the whole Grid but keeps a list of other known Grid managers

Those schedulers can forward the request to other schedulers

The scope and depths of the query can be parameterized

Individual objective functions are used to evaluate the utility for a user and the resource provider.

The scheduler selects the allocation offer with the highest utility value

The whole model represents a Grid market and auction system.
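The hop-limited query described above can be sketched as a breadth-first search over the known neighbours of each Grid manager. This is an illustration of the idea only, not NWIRE code; the topology, the utility values, and the function `collect_offers` are hypothetical:

```python
from collections import deque


def collect_offers(start, neighbours, local_offers, max_hops):
    """Gather (manager, utility) offers within `max_hops` of `start`."""
    seen = {start}
    queue = deque([(start, 0)])
    offers = []
    while queue:
        manager, hops = queue.popleft()
        offers.extend((manager, u) for u in local_offers.get(manager, []))
        if hops < max_hops:  # forward the request only within the limit
            for other in neighbours.get(manager, []):
                if other not in seen:
                    seen.add(other)
                    queue.append((other, hops + 1))
    return offers


# Each manager only knows a few other managers, never the whole Grid.
neighbours = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
local_offers = {"A": [1.0], "B": [3.0], "C": [5.0], "D": [9.0]}

offers = collect_offers("A", neighbours, local_offers, max_hops=2)
best = max(offers, key=lambda o: o[1])  # highest utility value wins
print(best)
```

With a limit of two hops the best offer at manager D is never seen, which shows the trade-off between query cost and result quality.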

Evaluation of Economic Approach


123/153

123

Real Workload Data of Grids is not yet available.

However, available are workloads of single HPC installations.

Example of evaluation results:

considered: traces from Cornell Theory Center (430 nodes IBM RS/6000 SP)

4 workloads adapted from the real traces, each 10000 jobs

Assumed Synthetic Grid machine configurations:

Name   Configuration         Largest Machine
m64    4*64 + 6*32 + 8*8     64
m128   4*128                 128
m256   2*256                 256
m384   1*384 + 1*64 + 4*16   384
m512   1*512                 512
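A quick check of the configurations, reading an entry such as 4*64 as four machines with 64 nodes each (this reading is an interpretation of the table). Under it, every configuration has the same total number of nodes and differs only in how they are partitioned:

```python
# (number of machines, nodes per machine) for each configuration
configs = {
    "m64": [(4, 64), (6, 32), (8, 8)],
    "m128": [(4, 128)],
    "m256": [(2, 256)],
    "m384": [(1, 384), (1, 64), (4, 16)],
    "m512": [(1, 512)],
}

totals = {name: sum(count * size for count, size in parts)
          for name, parts in configs.items()}
largest = {name: max(size for _, size in parts)
           for name, parts in configs.items()}

for name in configs:
    print(name, totals[name], largest[name])
```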

Scheduling Process


124/153

124

[Figure: an Application and its Grid Manager above three Resource Managers. The Request carries Requirements, Job Attributes, and an Objective Function.]

1) Client sends a

request

Example Request


125/153

125

KEY Hops {VALUE HOPS {2}}
KEY MaxOfferNumber {VALUE MaxOfferNumber {10}}
KEY MinMulti-Site {VALUE MinMulti-Site {10}}
...
KEY Utility {
    ELEMENT 1 {
        CONDITION{((OperatingSystem EQ Linux)&&( NumberOfProcessors >= 8))}
        VALUE UtilityValue {-EndTime}
        VALUE RunTime {5}
    }
    ...
    ELEMENT 2 {
        CONDITION{ ... }
        VALUE UtilityValue {-JobCost}
    }
    ...
}

Scheduling Process


126/153

126

[Figure: three Grid Managers, each above a Resource Manager; the Request from the Application is forwarded between the Grid Managers.]

Scheduler requests offers from other Grid Managers/Schedulers.

Search depths and scope can be parameterized.

Request

Requirements

Job Attributes

Objective Function

2) Query for

other offers

3) Limitations of

direction and depth

Scheduling Process


127/153

127

[Figure: three Grid Managers, each above a Resource Manager; Offers (Attributes, Objective Function) are returned in response to the Requests.]

Selection of allocation according to the utility value

Returned offers are collected.

4) Allocation;

Maximization of the

objectives

Scheduling Process


128/153

128

[Figure: three Grid Managers, each above a Resource Manager, with the Request and Offer exchanged for the Application.]

The user can directly be informed about the schedule for his execution.

The Grid Manager may re-schedule his allocations to optimize the result further.

Schedule: Allocation_1, Allocation_2, Allocation_3

5) User and Resource

Managers are informed

about allocations

Offer Creation


129/153

129

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]

Offer Creation


130/153

130

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]

Offer Creation


131/153

131

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]


132/153

Offer Creation


133/153

133

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]


134/153

Offer Creation


135/153

135

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]

Offer Creation


136/153

136

[Figure: offer-creation schedule; axes: time and resources (1 to 7); blocks labeled A, B, C, D]

AWRT for Economic Scheduling Compared to Conservative Backfilling


137/153

137

[Figure: bar chart of changes in average completion time, market-economic vs. conventional scheduler. Y-axis: change in percent (-15 to 10); x-axis: resource configuration (M128, M256, M384). The two sides of the axis are labeled "conventional algorithms better" and "market-oriented better".]

Utilization


138/153

138

[Figure: bar chart of changes in utilization, market-economic vs. conventional schedulers. Y-axis: change in percent (-3 to 3); x-axis: resource configurations (M128, M256, M384). The two sides of the axis are labeled "conventional better" and "market-oriented better".]


139/153

Results on Economic Models


140/153

140

Economic models provide results in the range of conventional algorithms in terms of AWRT

Economic methods leave much more flexibility in defining the desired resources

Problems of site autonomy, heterogeneous resources and individual owner and user preferences are solved

Advantage of reservation of resources ahead of time

Further analysis needed:

Usage of other utility functions


141/153


142/153

Data Management


143/153

143

Access to information about the location of data sets

Information about transfer costs

Scheduling of data transfers and data availability

optimize data transfers in regards to available network bandwidth and

storage space

Coordination with network or other resources

Similarities with general grid scheduling:

access to similar services

similar tasks to execute

interaction necessary

Functional View of Grid Data Management


144/153

144

Location based on

data attributes

Location of one or

more physical replicas

State of grid resources, performance measurements and predictions

Metadata Service

Application

Replica Location

Service

Information Services

Planner: Data location, Replica selection, Selection of compute and storage nodes

Security and Policy

Executor: Initiates data transfers and computations

Data Movement

Data Access

Compute Resources Storage Resources

Source: Carl Kesselman

Replica Location Service In Context


145/153

145

Replica Location Service

Reliable Data Transfer Service

GridFTP

Reliable Replication Service

Replica Consistency Management Services

Metadata

Service

The Replica Location Service is one component in a layered data management architecture

Provides a simple, distributed registry of mappings

Consistency management provided by higher-level services

Source: Carl Kesselman

Example of a Scheduling Process


146/153

146

Scheduling Service:

1. receives job description

2. queries Information Service for static resource information

3. prioritizes and pre-selects resources

4. queries for dynamic information about resource availability

5. queries Data and Network Management Services

6. generates schedule for job

7. reserves allocation if possible, otherwise selects another allocation

8. delegates job monitoring to Job Supervisor

Job Supervisor/Network and Data Management: service, monitor and initiate allocation

Example:

40 resources of requested type are

found.

12 resources are selected.

8 resources are available.

Network and data dependencies are

detected.

Utility function is evaluated.

6th tried allocation is confirmed.

Data/network provided and

job is started
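The selection and reservation steps of this process can be sketched as a filter followed by a ranked retry loop. All callbacks (`is_available`, `utility`, `try_reserve`) stand in for queries to the information service and the local resource managers and are hypothetical; the numbers are chosen to mirror the example above:

```python
def schedule(candidates, is_available, utility, try_reserve):
    """Return (resource, attempt number) of the first confirmed allocation."""
    # Keep only resources that report availability (dynamic information).
    available = [c for c in candidates if is_available(c)]
    # Rank the remaining resources by utility, best first.
    ranked = sorted(available, key=utility, reverse=True)
    # Try to reserve in ranked order until one provider confirms.
    for attempt, resource in enumerate(ranked, start=1):
        if try_reserve(resource):
            return resource, attempt
    return None, len(ranked)


preselected = list(range(12))          # 12 pre-selected resources
result = schedule(
    preselected,
    is_available=lambda r: r < 8,      # 8 of them are available
    utility=lambda r: -r,              # lower id ranks higher here
    try_reserve=lambda r: r == 5,      # only resource 5 confirms
)
print(result)
```

Each failed attempt costs a round trip to a provider, which is one reason why reliable availability information matters.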

Use-Case Example: Coordinated Simulation and Visualization


147/153

147

Expected output of a Grid scheduler:

[Figure: schedule over time, one row per resource]

Network 1: Data Transfer

Computer 1: Loading Data, Parallel Computation, Providing Data

Computer 2: Parallel Computation

Network 3: Communication for Computation

VR-Cave: Visualization

Network 2: Communication for Visualization

Software License: Software Usage

Storage: Data Storage

Data: Data Access, Storing Data

Re-Scheduling


148/153

148

Reconsidering a schedule with already made agreements may be a good idea from time to time because

the resource situation may have changed, or

the workload situation has changed

Optimization of the schedule can only work within the bounds of made agreements and reservations

given guarantees must be observed

The schedulers can try to maximize the utility values of the overall schedule

a Grid scheduler may negotiate with other resource providers in order to get better agreements; may cancel previous agreements

a local scheduler may optimize the local allocations to improve the schedule.


149/153

Conclusion


150/153

150

Resource management and scheduling is a key service in a Next Generation Grid.

In a large Grid the user cannot handle this task.

Nor is the orchestration of resources a provider task.

System integration is complex but vital.

The local systems must be enabled to interact with the Grid.

Providing sufficient information, expose services for negotiation

Basic research is still required in this area.

No ready-to-implement solution is available.

New concepts are necessary.

Current efforts provide the basic Grid infrastructure.

Higher-level services such as Grid scheduling are still lacking.

Future RMS systems will provide extensible negotiation interfaces

Grid scheduling will include coordination of different resources

Further Outlook


151/153

151

Commercial interest in the Grid may have a stronger focus on non-HPC-related aspects

eCommerce solutions

Dynamically migrating application server load

Inter-operation between companies

Supply-Chain Management

request distributed resources under certain constraints

Enterprise Application Integration (EAI)

dynamically discover and use available services

The problems tackled by Grid RMS are common to many application fields.

Grid solutions may become a key technology infrastructure

or solutions from other areas (e.g. the WebService community) may surpass Grid efforts?

References


152/153

152

Book: Grid Resource Management: State of the Art and Future Trends, co-editors Jarek Nabrzyski, Jennifer M. Schopf, and Jan Weglarz, Kluwer Publishing, 2004

PBS, PBS pro: www.openpbs.org and www.pbspro.com

LSF, CSF: www.platform.com

Globus: www.globus.org

Global Grid Forum: www.ggf.org, see SRM area


153/153

Contact:

[email protected]

Computer Engineering Institute

University of Dortmund
