
Job Scheduling for Grid Computing on Metacomputers


Page 1: Job Scheduling for Grid Computing on Metacomputers


Job Scheduling for Grid Computing on Metacomputers

Keqin Li

Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS’05)

Page 2: Job Scheduling for Grid Computing on Metacomputers


Outline

Introduction
The Scheduling Model
A Communication Cost Model
Scheduling Algorithms
Worst-Case Performance Analysis
Experimental Data

Page 3: Job Scheduling for Grid Computing on Metacomputers


Introduction 1

A metacomputer is a network of computational resources linked by software in such a way that they can be used as easily as a single computer.

A metacomputer is able to support distributed supercomputing applications by combining multiple high-speed high-capacity resources on a computational grid into a single, virtual distributed supercomputer.

Page 4: Job Scheduling for Grid Computing on Metacomputers


Introduction 2

The most significant result of the paper is that, with any initial order of jobs and any processor allocation algorithm, the list scheduling algorithm can achieve the worst-case performance bound stated in the theorem on Page 26.

Notation:

p is the maximum size of an individual machine;

P is the total size of a metacomputer;

s is the minimum job size, with s ≥ p;

α is the ratio of the communication bandwidth within a parallel machine to the communication bandwidth of the network;

β is the fraction of the communication time in the jobs.

Page 5: Job Scheduling for Grid Computing on Metacomputers


The Scheduling Model

Page 6: Job Scheduling for Grid Computing on Metacomputers


A metacomputer is specified as M = (P1, P2, ..., Pm), where Pj , 1 ≤ j ≤ m, is the name as well as the size (i.e., the number of processors) of a parallel machine.

Let P = P1 + P2 + … + Pm denote the total number of processors. The m machines are connected by a LAN, MAN, WAN, or the Internet.

A job J is specified as (s, t), where s is the size of J (i.e., the number of processors required to execute J) and t is J’s execution time. The cost of J is the product st.

Given a metacomputer M and a list of jobs L = (J1, J2, ..., Jn), where Ji = (si, ti), 1 ≤ i ≤ n, we are interested in scheduling the n jobs on M.
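The model translates directly into code. The machine sizes and jobs below are hypothetical values chosen only for illustration, not data from the paper:

```python
from dataclasses import dataclass

# Hypothetical metacomputer M = (P1, P2, P3): machine sizes in processors.
M = [16, 8, 4]
P = sum(M)  # total number of processors

@dataclass(frozen=True)
class Job:
    s: int    # size: number of processors required
    t: float  # execution time

    @property
    def cost(self) -> float:
        # The cost of a job J = (s, t) is the product s*t.
        return self.s * self.t

# A job list L = (J1, ..., Jn), also hypothetical.
L = [Job(8, 3.0), Job(12, 2.0), Job(5, 4.0)]
```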

Page 7: Job Scheduling for Grid Computing on Metacomputers


A schedule of a job Ji = (si, ti) is ψi = (τi, (Pj1, si,1), (Pj2, si,2), ..., (Pjri, si,ri)), where

τi is the starting time of Ji;

Ji is divided into ri subjobs Ji,1, Ji,2, ..., Ji,ri, of sizes si,1, si,2, ..., si,ri, respectively, with si = si,1 + si,2 + … + si,ri;

the subjob Ji,k is executed on Pjk by using si,k processors, for all 1 ≤ k ≤ ri.

Page 8: Job Scheduling for Grid Computing on Metacomputers


A Communication Cost Model

Page 9: Job Scheduling for Grid Computing on Metacomputers


si processors allocated to Ji communicate with each other during the execution of Ji.

Communication time between two processors residing on different machines connected by a LAN, MAN, WAN, or the Internet is significantly longer than that on the same machine.

The communication cost model takes both inter-machine and intra-machine communications into consideration.

The execution time ti is divided into two components: ti = ti,comp + ti,comm.

Each processor on Pjk needs to communicate with the si,k processors on Pjk and the si − si,k processors on Pjk′ with k′ ≠ k. The effective execution time t*i,k of the subjob Ji,k on Pjk is defined accordingly.

Page 10: Job Scheduling for Grid Computing on Metacomputers


Page 11: Job Scheduling for Grid Computing on Metacomputers


The execution time of job Ji is t*i = max(t*i,1, t*i,2, …, t*i,ri); we call t*i the effective execution time of job Ji.

The above measure of extra communication time among processors on different machines discourages division of a job into small subjobs.
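The defining equation for t*i,k survives only as an image in the slides, so the sketch below uses one plausible instantiation, assumed here, that is consistent with the stated ingredients: a processor's communication directed at the si − si,k off-machine processors is slowed by the bandwidth ratio α, while its locally directed share is unaffected.

```python
def effective_subjob_time(t, s, s_k, alpha, beta):
    """t*_{i,k} for a subjob of size s_k of a job (s, t), under an
    ASSUMED instantiation of the model: the communication component
    beta*t is split pro rata between the s_k - 1 local partners and
    the s - s_k remote partners, and the remote share is inflated by
    the bandwidth ratio alpha >= 1."""
    if s == 1:
        return t                       # no communication partners
    t_comm = beta * t                  # communication component t_{i,comm}
    t_comp = t - t_comm                # computation component t_{i,comp}
    local, remote = s_k - 1, s - s_k
    return t_comp + t_comm * (local + alpha * remote) / (s - 1)

def effective_time(t, s, split, alpha, beta):
    """t*_i = max(t*_{i,1}, ..., t*_{i,r_i}) over the subjob sizes."""
    return max(effective_subjob_time(t, s, s_k, alpha, beta) for s_k in split)
```

Under this instantiation, α = 1 makes any split free (t*i = ti), matching the later remark that α = 1 reduces the problem to ordinary multiprocessor scheduling, while any split with α > 1 and β > 0 strictly increases t*i, which is exactly the stated disincentive to divide a job.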

Page 12: Job Scheduling for Grid Computing on Metacomputers


Our job scheduling problem for grid computing on metacomputers can be formally defined as follows:

given a metacomputer M = (P1, P2, ..., Pm) and a list of jobs

L = (J1, J2, ..., Jn), where Ji = (si, ti), 1 ≤ i ≤ n, find a schedule ψ of L, ψ = (ψ1, ψ2, ..., ψn), with ψi = (τi, (Pj1, si,1), (Pj2, si,2), ..., (Pjri, si,ri )),

where Ji is executed during the time interval [τi, τi + t*i] by using si,k processors on Pjk for all 1 ≤ k ≤ ri, such that the total execution time of L on M, namely max1≤i≤n (τi + t*i), is minimized.
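The objective (the makespan of the schedule) is a one-liner, assuming each scheduled job is represented by its (τi, t*i) pair:

```python
def makespan(schedule):
    """Total execution time of L on M: the latest completion time
    max_i (tau_i + t*_i), where `schedule` is a list of
    (tau_i, t_star_i) pairs (starting time, effective execution time)."""
    return max(tau + t_star for tau, t_star in schedule)
```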

Page 13: Job Scheduling for Grid Computing on Metacomputers


When α = 1, that is, extra communication time over a LAN, MAN, WAN, or the Internet is not a concern, the above scheduling problem is equivalent to the problem of scheduling independent parallel tasks in multiprocessors, which is NP-hard even when all tasks are sequential.

Page 14: Job Scheduling for Grid Computing on Metacomputers


Scheduling Algorithms

Page 15: Job Scheduling for Grid Computing on Metacomputers


A complete description of the list scheduling (LS) algorithm is given in the next slide.

There is a choice of the initial order of the jobs in L. Four ordering strategies:

Largest Job First (LJF) – Jobs are arranged such that s1 ≥ s2 ≥ … ≥ sn.

Longest Time First (LTF) – Jobs are arranged such that t1 ≥ t2 ≥ … ≥ tn.

Largest Cost First (LCF) – Jobs are arranged such that s1t1 ≥ s2t2 ≥ … ≥ sntn.

Unordered (U) – Jobs are arranged in any order.
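The four strategies are simple sort keys over the (s, t) pairs:

```python
def order_jobs(jobs, strategy):
    """jobs: list of (s, t) pairs.  Returns a new list ordered per strategy."""
    if strategy == "LJF":   # Largest Job First: s1 >= s2 >= ...
        return sorted(jobs, key=lambda j: j[0], reverse=True)
    if strategy == "LTF":   # Longest Time First: t1 >= t2 >= ...
        return sorted(jobs, key=lambda j: j[1], reverse=True)
    if strategy == "LCF":   # Largest Cost First: s1*t1 >= s2*t2 >= ...
        return sorted(jobs, key=lambda j: j[0] * j[1], reverse=True)
    return list(jobs)       # "U": unordered, keep the given order
```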

Page 16: Job Scheduling for Grid Computing on Metacomputers


The number of available processors P’j on machine Pj is dynamically maintained. The total number of available processors is P’ = P’1 + P’2 + · · · + P’m
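The full LS pseudocode referenced above does not survive in this transcript, so the following is a minimal Graham-style sketch under assumed details: jobs are taken in list order, each job starts as soon as enough processors are free, each job's effective execution time is taken as given, and `naive_allocate` is a stand-in for the Naive allocation algorithm mentioned later.

```python
import heapq

def naive_allocate(avail, s):
    """Naive allocation: take available processors from machines in
    index order until the job's size s is covered."""
    alloc, need = {}, s
    for j, free in enumerate(avail):
        take = min(free, need)
        if take > 0:
            alloc[j] = take
            need -= take
        if need == 0:
            break
    return alloc

def list_schedule(machines, jobs, allocate=naive_allocate):
    """List scheduling sketch.  machines: sizes Pj; jobs: (s, t_eff)
    pairs in list order.  The per-machine available counts P'_j are
    maintained dynamically as jobs start and finish."""
    avail = list(machines)            # P'_j
    events = []                       # running jobs: (finish_time, i, alloc)
    now = 0.0
    finish = 0.0
    for i, (s, t) in enumerate(jobs):
        while sum(avail) < s:         # wait for running jobs to finish
            now, _, done = heapq.heappop(events)
            for j, k in done.items():
                avail[j] += k
        alloc = allocate(avail, s)    # split the job across machines
        for j, k in alloc.items():
            avail[j] -= k
        heapq.heappush(events, (now + t, i, alloc))
        finish = max(finish, now + t)
    return finish
```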

Page 17: Job Scheduling for Grid Computing on Metacomputers


Page 18: Job Scheduling for Grid Computing on Metacomputers


Page 19: Job Scheduling for Grid Computing on Metacomputers


Each job scheduling algorithm needs to use a processor allocation algorithm to find resources in a metacomputer.

Several processor allocation algorithms have been proposed, including Naive, LMF (largest machine first), SMF (smallest machine first), and MEET (minimum effective execution time).
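The slides name these algorithms without reproducing their definitions, so the sketch below assumes LMF and SMF rank machines by their currently available processors; MEET, which would instead search for the split minimizing the effective execution time t*i, is not sketched.

```python
def allocate_ordered(avail, s, order):
    """Take processors machine-by-machine following a preference order
    (a list of machine indices) until size s is covered."""
    alloc, need = {}, s
    for j in order:
        take = min(avail[j], need)
        if take > 0:
            alloc[j] = take
            need -= take
        if need == 0:
            break
    return alloc

def smf(avail, s):
    """Smallest machine first: prefer machines with fewer free processors."""
    return allocate_ordered(avail, s, sorted(range(len(avail)), key=lambda j: avail[j]))

def lmf(avail, s):
    """Largest machine first: prefer machines with more free processors."""
    return allocate_ordered(avail, s, sorted(range(len(avail)), key=lambda j: -avail[j]))
```

LMF tends to keep a job on few machines (less inter-machine communication), while SMF preserves large free blocks for later jobs; the paper's experiments compare these trade-offs.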

Page 20: Job Scheduling for Grid Computing on Metacomputers


Worst-Case Performance Analysis

Page 21: Job Scheduling for Grid Computing on Metacomputers


Let A(L) be the length of a schedule produced by algorithm A for a list L of jobs, and OPT(L) the length of an optimal schedule of L. We say that algorithm A achieves worst-case performance bound B if A(L)/OPT(L) ≤ B for all L.

Page 22: Job Scheduling for Grid Computing on Metacomputers


Let t*i,LS be the effective execution time of a job Ji in an LS schedule. Assume that all the n jobs are executed during the time interval [0, LS(L)]. Let Ji be a job which finishes at time LS(L). Clearly, before Ji is scheduled at time LS(L) − t*i,LS, there are no si processors available; otherwise, Ji would have been scheduled earlier. That is, during the time interval [0, LS(L) − t*i,LS], the number of busy processors is at least P − si + 1, and during the time interval [LS(L) − t*i,LS, LS(L)], the number of busy processors is at least si. Define the effective cost of L in an LS schedule as C*LS = s1t*1,LS + s2t*2,LS + … + snt*n,LS. Then, we have
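Reconstructing the missing equations from the surrounding prose (a standard Graham-style step, not the slides' own rendering): the two busy-period counts give a lower bound on the effective cost, which can then be solved for LS(L).

```latex
C^{*}_{LS} \;=\; \sum_{i=1}^{n} s_i\, t^{*}_{i,LS}
\;\ge\; (P - s_i + 1)\bigl(LS(L) - t^{*}_{i,LS}\bigr) \;+\; s_i\, t^{*}_{i,LS}
\quad\Longrightarrow\quad
LS(L) \;\le\; \frac{C^{*}_{LS} - s_i\, t^{*}_{i,LS}}{P - s_i + 1} \;+\; t^{*}_{i,LS}.
```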

Page 23: Job Scheduling for Grid Computing on Metacomputers


No matter which processor allocation algorithm is used, we always have

The effective execution time of Ji in an optimal schedule is

Thus, we get

where

It is clear that φi is an increasing function of si, which is minimized when si = s. Hence, we have

where

Page 24: Job Scheduling for Grid Computing on Metacomputers


Combining the preceding relations yields a chain of inequalities. The right-hand side of the final inequality is minimized at a particular choice of the parameters.

Page 25: Job Scheduling for Grid Computing on Metacomputers


The right-hand side of the above inequality is a decreasing function of si, and is therefore maximized when si = s.

Page 26: Job Scheduling for Grid Computing on Metacomputers


Theorem. If Pj ≤ p for all 1 ≤ j ≤ m, and si ≥ s for all 1 ≤ i ≤ n, where p ≤ s, then algorithm LS can achieve worst-case performance bound

where

The above performance bound is independent of the initial order of L and the processor allocation algorithm.

Page 27: Job Scheduling for Grid Computing on Metacomputers


Corollary. If a metacomputer only contains sequential machines, i.e., p = 1, communication heterogeneity vanishes and the worst-case performance bound in the theorem becomes

Page 28: Job Scheduling for Grid Computing on Metacomputers
