[IEEE 2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing (PDGC) - Solan, India (2012.12.6-2012.12.8)] 2012 2nd IEEE International Conference on Parallel,

2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing

Load Balanced Static Grid Scheduling Using Max-Min Heuristic

Tarun Kumar Ghosh, Rajmohan Goswami, Sumit Bera and Subhabrata Barman

Abstract- Grid computing, inspired by the electrical power grid, is an emerging approach to providing easy access to computing resources. Its main objective is to solve computationally hard problems that cannot otherwise be solved on a single CPU. This high computing power is achieved through effective utilization of geographically distributed, heterogeneous resources that would otherwise lie idle. Load balanced task scheduling is a very important problem in a complex Grid environment, and task scheduling, which is NP-Complete, has therefore become a focus of research in the Grid computing area. The traditional Min-Min and Max-Min algorithms are simple algorithms that produce schedules with a smaller makespan than the other traditional algorithms in the literature. In real scenarios, however, Min-Min and Max-Min fail to produce a load balanced schedule. The proposed method has two phases. In the first phase the traditional Max-Min algorithm is executed, and in the second phase the tasks are rescheduled to use the unutilized resources effectively.

Index Terms-Grid Computing, Grid Scheduling, Makespan, Max-Min Heuristic, GridSim.

I. INTRODUCTION

The term "Grid" was coined in the mid-1990s to denote a distributed computing infrastructure for advanced science and engineering. Grids enable the formation of Virtual Organizations (VO) and facilitate collaboration among multiple organisations for sharing of resources. Grids also provide fault tolerance and reliability: the Grid makes provision for automatic resubmission of jobs to other available resources when a failure is detected. Grids balance and share varied resources, which enables the Grid to distribute tasks evenly across the available resources. Grids perform parallel processing: some tasks can be broken into multiple subtasks, each of which can run on a different machine. Grids assure Quality of Service (QoS). A Service Level Agreement (SLA) specifies the minimum quality of service, availability, etc., expected by the user and the charges levied on those services [1], [2].

This paper is organized as follows. Section II introduces Grid scheduling issues. Section III briefly outlines the relevant past work on static scheduling in the Grid environment. In Section IV, the framework of the Grid scheduling problem is defined. Section V highlights the Max-Min heuristic. Section VI describes the Load Balanced Max-Min algorithm for Grid scheduling. Section VII presents the results obtained from simulation. Finally, Section VIII concludes the paper.

T. K. Ghosh (email: [email protected]), S. Bera and S. Barman (email: [email protected]) are with the Computer Science and Engineering, Haldia Institute of Technology, Haldia, Purba Medinipur and R. Goswami (email: [email protected]) is with the Computer Applications, Pailan College of Management and Technology, Kolkata, West Bengal, India.

978-1-4673-2925-5/12/$31.00 ©2012 IEEE

II. GRID SCHEDULING ISSUES

Because of the heterogeneous and dynamic nature of the Grid, scheduling in a Grid environment is significantly complicated.

A. Resource Broker

Most Grid systems use the Grid resource broker for resource discovery, deciding the allocation of a job to a particular resource, binding user applications (files) to hardware resources, initiating computations, adapting to changes in Grid resources and presenting the Grid to the user as a single, unified resource. It finally controls the physical allocation of the tasks and manages the available resources constantly, dynamically updating the Grid scheduler whenever there is a change in resource availability [3].

B. Types of Grid Scheduling

Of the various scheduling policies known, static and dynamic scheduling are important and easy to implement.

Static scheduling, also called off-line scheduling, is scheduling in which all decisions are taken before the execution of a schedule. It is suitable when all the tasks (or applications) and resources are known in advance.

Dynamic scheduling, also called on-line scheduling, is scheduling in which some or all of the decisions are taken during execution. It is suitable when jobs and machines come on-line or go off-line due to failures, when the speed of each processor varies during scheduling, and when the cost of applications is difficult to predict. Dynamic mapping is performed when the arrival of tasks is not known beforehand [4], [5].

III. RELATED WORKS

In classical distributed systems, comprised of homogeneous and dedicated resources, load balancing algorithms have been intensively studied. But these algorithms will not work well in a Grid architecture because of its heterogeneity, scalability and autonomy. This makes load balanced scheduling for Grid computing more difficult and an interesting topic for many researchers [6].

A. Min-Min

The Min-Min algorithm starts with the set of all unmapped tasks. For each task, the machine that gives the minimum completion time is found. Then the task with the overall minimum completion time is selected and mapped to that machine. The ready time of that resource is updated. This process is repeated until all the unmapped tasks are assigned. Compared to the Minimum Completion Time algorithm, this algorithm considers all the jobs at a time, so it typically produces a better makespan.
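The Min-Min loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function name `min_min`, the list-of-lists ETC layout (rows are tasks, columns are machines), the all-zero initial ready times and the sample values are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_min(etc: List[List[float]]) -> Tuple[Dict[int, int], List[float]]:
    """Map every task to a machine with the Min-Min rule.

    etc[t][j] is the expected execution time of task t on machine j
    (assumed layout). Returns (task -> machine, machine ready times).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines            # assumed: all machines idle at time 0
    unmapped = set(range(len(etc)))
    assignment: Dict[int, int] = {}
    while unmapped:
        best = None                       # (completion time, task, machine)
        for t in sorted(unmapped):
            # earliest completion time of task t and the machine giving it
            ct, m = min((etc[t][j] + ready[j], j) for j in range(n_machines))
            if best is None or ct < best[0]:
                best = (ct, t, m)
        ct, t, m = best
        assignment[t] = m                 # map the task with the overall minimum
        ready[m] = ct                     # update that machine's ready time
        unmapped.remove(t)
    return assignment, ready


# Illustrative values only (3 tasks, 2 machines).
etc = [[2, 4], [3, 1], [5, 7]]
assignment, ready = min_min(etc)
print(assignment, max(ready))
```

Ties between tasks or machines are broken here by the lowest index, which is one possible choice; the text does not prescribe a tie-breaking rule.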

B. Max-Min

Max-Min is similar to the Min-Min algorithm. For each task, the machine that gives the minimum completion time is found. Then the task with the overall Maximum Completion Time (MCT) among these minima is selected and mapped to that machine. The ready time of that resource is updated. This process is repeated until all the unmapped tasks are assigned. The idea of this algorithm is to reduce the wait time of the large jobs.

C. Double Min-Min

Doreen D. et al. [7] have proposed an efficient Set Pair Analysis (SPA) based task scheduling algorithm named the Double Min-Min algorithm, which performs scheduling in order to enhance system performance in Hypercubic P2P Grid (HPGRID). The simulation results show that the SPA based Double Min-Min scheduling minimizes the makespan with load balancing and guarantees high system availability.

D. QoS Guided Min-Min

He X. et al. [8] have presented a new algorithm based on the conventional Min-Min algorithm. The proposed algorithm, called QoS guided Min-Min, schedules tasks requiring high bandwidth before the others. Therefore, if the bandwidth required by different tasks varies widely, the QoS guided Min-Min algorithm provides better results than the Min-Min algorithm. When the bandwidth requirement of all the tasks is almost the same, the QoS guided Min-Min algorithm behaves like the Min-Min algorithm.

E. Min-Mean

Kamalam et al. [9] present a new scheduling algorithm named the Min-Mean heuristic scheduling algorithm for static mapping to achieve better performance. The proposed algorithm reschedules the Min-Min produced schedule by considering the mean makespan of all the resources. The algorithm deviates in producing a better schedule than the Min-Min algorithm when the task heterogeneity increases.

F. Weighted Heuristics

Sameer Singh et al. [10] have presented two heuristic algorithms: QoS Guided Weighted Mean Time-Min (QWMTM) and QoS Guided Weighted Mean Time Min-Min Max-Min Selective (QWMTS). Both algorithms are for batch mode scheduling of independent tasks. The network bandwidth is taken as the QoS parameter.

G. Predictive Heuristics

Singh M. et al. [11] present a QoS based predictive Max-Min, Min-Min Switcher algorithm for scheduling jobs in a Grid. The algorithm selects between the QoS based Max-Min and the QoS based Min-Min algorithm on the basis of the heuristic applied, before scheduling the next job. The effect of the non-dedicated property of the resources on the execution time of Grid jobs has also been considered. The algorithm uses history information about the execution of jobs to predict the performance of non-dedicated resources.

IV. PROBLEM DEFINITION

In static heuristics, an accurate estimate of the expected execution time for each task on each machine is known prior to execution and is contained within an ETC (Expected Time to Compute) matrix, where $ETC(t_i, m_j)$ is the estimated execution time of task $i$ on machine $j$ [3].

The main aim of the scheduling algorithm is to minimize the makespan. Using the ETC matrix model, the scheduling problem can be defined as follows:

Let the task set $T = \{t_1, t_2, t_3, \ldots, t_n\}$ be the group of tasks submitted to the scheduler.

Let the resource set $R = \{m_1, m_2, m_3, \ldots, m_k\}$ be the set of resources available at the time of task arrival.

The makespan produced by any algorithm for a schedule can be calculated as follows:

$$makespan = \max(ct(t_i, m_j))$$

where

$$ct_{ij} = et_{ij} + r_j$$

and

$ct_{ij}$ is the completion time of task $i$ on machine $j$, that is, $ct(t_i, m_j)$,

$et_{ij}$ is the expected execution time of job $i$ on resource $j$,

$r_j$ is the ready time or availability time of resource $j$ after completing the previously assigned jobs.
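As a small numerical check of these definitions, the sketch below computes machine ready times and the makespan for a given mapping, using completion time = execution time + ready time. The function name `completion_times`, the data layout and the sample numbers are assumptions made for illustration only.

```python
from typing import Dict, List


def completion_times(etc: List[List[float]], assignment: Dict[int, int]) -> List[float]:
    """Return the ready time of every machine after its assigned tasks have run."""
    ready = [0.0] * len(etc[0])           # assumed: machines start idle
    for task, machine in assignment.items():
        # completion time of this task = its execution time + current ready time;
        # the machine becomes available again at that completion time
        ready[machine] = etc[task][machine] + ready[machine]
    return ready


# Hypothetical ETC matrix: 3 tasks (rows) on 2 machines (columns).
etc = [[4.0, 6.0], [3.0, 5.0], [2.0, 1.0]]
assignment = {0: 0, 1: 1, 2: 1}           # task index -> machine index
ready = completion_times(etc, assignment)
makespan = max(ready)
print(ready, makespan)  # [4.0, 6.0] 6.0
```

The makespan is simply the largest machine ready time once every task has been placed.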

V. MAX-MIN - A META-HEURISTIC

In the Max-Min heuristic (see Algorithm 1), $U$ is the set of all unassigned tasks. First, the set of minimum completion times, $M = \min_{0 \le j < \mu}(ct(t_i, m_j))$, for each $t_i \in U$, is found, where $\mu$ is the number of machines. Next, the task with the overall maximum completion time (MCT) from $M$ is selected and assigned to the corresponding machine (hence the name Max-Min). Finally, the newly mapped task is removed from $U$, and the process is repeated until $U$ is empty, that is, until all tasks are assigned.

The advantage of Max-Min is that it minimizes the problems incurred by tasks that have longer execution times. Assigning the task with the longest execution time to its best machine first allows this task to be executed concurrently with the remaining tasks, which have shorter execution times [12]. A dynamic version is also available.

Algorithm 1 Max-Min Algorithm

for all tasks ti in meta-task Mv (in an arbitrary order) do
    for all machines mj (in a fixed arbitrary order) do
        ctij = etij + rj
    end for
end for
while there are unmapped tasks in Mv do
    for each task ti in Mv, find its earliest completion time and the machine that obtains it
    find the task tk with the maximum earliest completion time
    assign the task tk to the machine ml that gives its earliest completion time
    delete the task tk from Mv
    update rl
    update ctil for all ti in Mv
end while
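A compact Python rendering of the Max-Min heuristic may make the control flow easier to follow. It is a sketch under assumptions, not the authors' code: the name `max_min`, the ETC layout (rows are tasks, columns are machines), zero initial ready times and index-based tie-breaking are all choices made here. The sample matrix uses the values of Table I, with tasks T1..T4 as indices 0..3 and resources R1, R2 as indices 0, 1.

```python
from typing import Dict, List, Tuple


def max_min(etc: List[List[float]]) -> Tuple[Dict[int, int], List[float], List[int]]:
    """Return (task -> machine, machine ready times, mapping order) for Max-Min."""
    n_machines = len(etc[0])
    ready = [0.0] * n_machines            # assumed: all machines idle at time 0
    unmapped = set(range(len(etc)))
    assignment: Dict[int, int] = {}
    order: List[int] = []
    while unmapped:
        candidates = []                   # (earliest completion time, task, machine)
        for t in sorted(unmapped):
            ct, m = min((etc[t][j] + ready[j], j) for j in range(n_machines))
            candidates.append((ct, t, m))
        # pick the task whose earliest completion time is the largest
        ct, t, m = max(candidates)
        assignment[t] = m
        ready[m] = ct                     # the chosen machine is busy until ct
        unmapped.remove(t)
        order.append(t)
    return assignment, ready, order


# Values of Table I: rows T1..T4, columns R1, R2.
etc = [[3, 10], [2, 13], [5, 15], [1, 12]]
assignment, ready, order = max_min(etc)
print(order, assignment, max(ready))
```

On this input every task lands on the first machine and the second stays idle, which is the imbalance the second phase of the proposed algorithm is meant to address.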

VI. LOAD BALANCED MAX-MIN ALGORITHM

The Max-Min algorithm seems to do better than the Min-Min algorithm when the number of short tasks is much larger than the number of long ones. For example, if there is only one long task, the Max-Min algorithm executes many short tasks concurrently with the long task. In this case, the makespan of the system is most likely determined by the execution time of the long task. The Load Balanced Max-Min algorithm (see Algorithm 2) therefore executes Max-Min in the first round. In the second round it chooses tasks on the resources with heavy load and reassigns them to the resources with light load.

A. Basic Concepts

Load Balanced Max-Min identifies the resources with heavy load by choosing the resource with the highest makespan in the schedule produced by Max-Min. It then considers the tasks assigned to that resource and chooses the task with the minimum execution time on that resource. The completion time for that task is calculated for all resources in the current schedule. Then the MCT of that task is compared with the makespan produced by Max-Min. If it is less than the makespan, the task is rescheduled on the resource that produces it, and the ready times of both resources are updated. Otherwise the next MCT of that task is selected and the steps are repeated. The process stops when all resources and all tasks assigned to them have been considered for rescheduling. Thus tasks are rescheduled, where possible, on resources which are idle or have minimum load. This makes Load Balanced Max-Min produce a schedule with improved load balancing. Since it compares the MCT with the makespan, Load Balanced Max-Min also reduces the overall completion time.

Algorithm 2 Second Round of Load Balanced Max-Min Algorithm

sort the resources in the order of completion time
for all resources R do
    compute makespan = max(ct(R))
end for
for all resources Rj do
    for all tasks assigned to Rj do
        find the task Ti that has minimum et in Rj
        find the MCT of task Ti
        if MCT < makespan then
            reschedule the task Ti to the resource that produces it
            update the ready time of both resources
        end if
    end for
end for
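One possible reading of this second round is sketched below in Python. The pseudocode leaves several details open, so the following are assumptions of this sketch rather than statements about the authors' implementation: resources are visited from most to least loaded, a task's candidate completion time on another resource is that resource's ready time plus the task's execution time there, the smallest such candidate is tested, and the makespan used in the test is the one from the first phase, held fixed. The name `rebalance` is hypothetical. The input is the Max-Min schedule for the Table I values (all four tasks on the first resource).

```python
from typing import Dict, List, Tuple


def rebalance(etc: List[List[float]],
              assignment: Dict[int, int]) -> Tuple[Dict[int, int], List[float]]:
    """Second-round sketch: move tasks off loaded machines if that beats the makespan."""
    n_machines = len(etc[0])
    assignment = dict(assignment)         # do not mutate the caller's schedule
    ready = [0.0] * n_machines
    for t, m in assignment.items():
        ready[m] += etc[t][m]
    makespan = max(ready)                 # first-phase makespan, kept fixed (assumption)
    # visit machines from most to least loaded
    for src in sorted(range(n_machines), key=lambda j: ready[j], reverse=True):
        # tasks on this machine, shortest execution time first
        tasks = sorted((t for t, m in assignment.items() if m == src),
                       key=lambda t: etc[t][src])
        for t in tasks:
            others = [(ready[j] + etc[t][j], j) for j in range(n_machines) if j != src]
            if not others:
                continue
            ct, dst = min(others)         # best completion time elsewhere
            if ct < makespan:
                assignment[t] = dst
                ready[src] -= etc[t][src] # source machine frees up
                ready[dst] = ct           # destination is busy until ct
    return assignment, ready


# Table I values; first-phase (Max-Min) schedule puts T1..T4 on R1 (index 0).
etc = [[3, 10], [2, 13], [5, 15], [1, 12]]
first_phase = {0: 0, 1: 0, 2: 0, 3: 0}
new_assignment, new_ready = rebalance(etc, first_phase)
print(new_assignment, new_ready, max(new_ready))
```

Because a move is accepted only when the new completion time is strictly below the first-phase makespan, this sketch never produces a schedule with a larger makespan than the one it starts from.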

B. Example with Analysis

Consider a Grid environment with two resources R1 and R2 and a meta-task group Mv with four tasks T1, T2, T3 and T4. The Grid scheduler is supposed to schedule all the tasks within Mv on the available resources R1 and R2. Since the Max-Min algorithm is simple and produces a better makespan than the other algorithms discussed in the literature, the proposed algorithm executes the Max-Min algorithm in the first phase to schedule the jobs. To remove the limitation of unbalanced load in Max-Min, the jobs are rescheduled in the second phase. In this problem the execution times of all tasks are known in advance. They can also be calculated if the number of instructions in each job and the computation rate of each resource are known. Table I gives the execution time of the tasks on each resource.
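The remark about deriving execution times from instruction counts and computation rates amounts to a division per task and resource. The sketch below assumes task lengths in millions of instructions (MI) and resource rates in millions of instructions per second (MIPS); these units, the function name `build_etc` and the sample numbers are assumptions for illustration and are not taken from the paper.

```python
from typing import List


def build_etc(task_lengths_mi: List[float], machine_rates_mips: List[float]) -> List[List[float]]:
    """ETC[t][j] = length of task t / rate of machine j, in seconds."""
    return [[length / rate for rate in machine_rates_mips]
            for length in task_lengths_mi]


# Hypothetical workload: two tasks, two resources.
etc = build_etc([3000, 2000], [1000, 500])
print(etc)  # [[3.0, 6.0], [2.0, 4.0]]
```

A matrix built this way is consistent across resources (a faster machine is faster for every task), which is only one of the heterogeneity cases an ETC matrix can represent.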

The static mapping of tasks to machines based on Max-Min is shown in Figure 1. All tasks are scheduled on resource R1 and resource R2 remains idle. The makespan produced by Max-Min is 11 seconds.

According to the proposed Load Balanced Max-Min algorithm, task T1's MCT is less than the makespan produced by Max-Min. The other tasks' MCTs are not less than the makespan. So task T1 is rescheduled on resource R2 and the remaining tasks stay on resource R1. The mapping of tasks based on Load Balanced Max-Min is shown in Figure 2. Thus the rescheduling of the Max-Min schedule utilizes the idle resource R2 and also reduces the makespan to 10 seconds.

TABLE I

EXECUTION TIME (IN SECONDS) OF THE TASKS ON EACH OF THE RESOURCES

Tasks/Resources    R1    R2
T1                  3    10
T2                  2    13
T3                  5    15
T4                  1    12
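The two makespans quoted for this example can be re-derived by hand from Table I. The derivation below reads the table entries as execution times in seconds, assumes both resources are idle at time 0, and takes a task's candidate completion time on R2 to be R2's ready time plus the task's execution time on R2; these are interpretive assumptions.

```latex
\begin{align*}
\text{Max-Min, first phase:}\quad
  & T3 \to R1:\ 0 + 5 = 5, \qquad T1 \to R1:\ 5 + 3 = 8 \ (\text{vs. } 10 \text{ on } R2), \\
  & T2 \to R1:\ 8 + 2 = 10 \ (\text{vs. } 13 \text{ on } R2), \qquad
    T4 \to R1:\ 10 + 1 = 11 \ (\text{vs. } 12 \text{ on } R2), \\
  & makespan = \max(11,\ 0) = 11. \\[4pt]
\text{Second phase:}\quad
  & ct(T4, R2) = 12 \not< 11, \qquad ct(T2, R2) = 13 \not< 11, \qquad ct(T1, R2) = 10 < 11, \\
  & \text{after moving } T1:\quad r_{R1} = 11 - 3 = 8, \qquad r_{R2} = 10, \\
  & ct(T3, R2) = 10 + 15 = 25 \not< 11, \\
  & makespan = \max(8,\ 10) = 10.
\end{align*}
```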

Fig. 1. Gantt chart of the Max-Min algorithm

Fig. 2. Gantt chart of the Load Balanced Max-Min algorithm

VII. COMPARISON AND DISCUSSION

The proposed algorithm was simulated in the Grid scheduling simulation environment Alea 3.0.

A. Alea 3.0

Alea 3.0 is an extension of the Java package GridSim. GridSim is a toolkit for the modeling and simulation of distributed resource management and scheduling for Grid computing [13]. Alea 3.0 is used for the evaluation of various job scheduling techniques. Functions that approximate makespan, tardiness, and other values important for scheduling algorithms are implemented in Alea 3.0 [14].

The evaluation of the proposed algorithm was a large simulation. Running Alea 3.0 during the simulation required a lot of memory, since many objects were created. The simulation was performed on an Intel Pentium 4, 2.8 GHz machine with 512 MB RAM.

B. Simulation Output

Current Grid scheduling systems are all queue based systems. These systems use one or more incoming queues where jobs are stored until they are scheduled for execution. All systems use basic First Come First Served (FCFS). Through the simulation, the proposed Load Balanced Max-Min scheduling algorithm has been compared with the FCFS algorithm. Figure 3 and Figure 4 present graphs depicting the number of waiting and running jobs per day under the FCFS and Load Balanced Max-Min algorithms respectively, as generated by Alea 3.0 during the simulation.

These graphs demonstrate major differences between the algorithms. Concerning machine usage, FCFS generates poor results: it is not able to utilize the available resources, as shown by the existence of waiting jobs even in the last part of the simulation. In contrast, the Load Balanced Max-Min based approach is able to manage the load efficiently through an efficient search technique, as shown by the increase in the number of non-waiting jobs in the last part of the simulation.

[Figure: number of waiting/running jobs (vertical axis) against days (horizontal axis); series: waiting jobs, running jobs]

Fig. 3. Waiting and running jobs by FCFS

[Figure: number of waiting/running jobs (vertical axis) against days (horizontal axis); series: waiting jobs, running jobs]

Fig. 4. Waiting and running jobs by Load Balanced Max-Min algorithm

VIII. CONCLUSION AND FUTURE WORK

Min-Min and Max-Min algorithms are applicable in small scale distributed systems. The Max-Min algorithm seems to do better than the Min-Min algorithm when the number of short tasks is much larger than the number of long ones. For example, if there is only one long task, the Max-Min algorithm executes many short tasks concurrently with the long task. Although load balancing in small scale distributed systems is desirable and leads to reduced total completion times, in large scale distributed systems load balancing does not necessarily result in the shortest makespan. The proposed algorithm outperforms Max-Min in large scale systems, because it focuses on minimizing the completion time of tasks. The proposed algorithm is executed in two phases. It uses the advantages of Max-Min and covers its disadvantages by reducing makespan and maximizing resource utilization.

The experimental results, obtained by applying the proposed algorithm to various problems, show that it outperforms the existing Grid scheduling algorithms. This study is concerned only with a limited number of resources and with task execution time. It can be extended by considering low and high machine heterogeneity and task heterogeneity. Also, the efficiency of the proposed algorithm may be observed by applying it in an actual Grid environment.

REFERENCES

[1] Anthony Sulistio, Chee Shin Yeo and Rajkumar Buyya. A taxonomy of computer-based simulations and its mappings to parallel and distributed systems simulation tools. Software - Practice and Experience, 34:653-673, April 2004.

[2] Ian Foster and Carl Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco, CA, 1999.

[3] Ajith Abraham, Rajkumar Buyya and Baikunth Nath. Nature's heuristics for scheduling jobs on computational grids. In P.S. Sinha and R. Gupta, editors, Proceedings of 8th IEEE International Conference on Advanced Computing and Communications (ADCOM 2000), pages 45-52. Tata McGraw-Hill Publishing Co. Ltd, New Delhi, 2000.

[4] J.G. Kuhl Casavant. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Transactions on Software Engineering, 14(2):141-154, 1988.

[5] T. D. Braun, Maheswaran and H. J. Siegel. Heterogeneous Distributed Computing, Encyclopedia of Electrical and Electronics Engineering. John Wiley and Sons, New York, NY, 1999.

[6] Rajmohan Goswami, Tarun Kumar Ghosh and Subhabrata Barman. Local search based approach in grid scheduling using simulated annealing. In Proceedings of IEEE International Conference on Computer and Communication Technology (ICCCT), Allahabad, India, 2011.

[7] D. Doreen Hephzibah Miriam and K.S. Easwarakumar. A double min min algorithm for task metascheduler on hypercubic p2p grid systems. IJCSI International Journal of Computer Science Issues, 7(5):8-18, July 2010.

[8] He X., X-He Sun and Laszewski G.v. Qos guided minmin heuristic for grid task scheduling. Journal of Computer Science and Technology, 18:442-451, 2003.

[9] Kamalam G.K. and Muralibhaskaran V. A new heuristic approach: minmean algorithm for scheduling meta-tasks on heterogenous computing systems. IJCSNS International Journal of Computer Science and Network Security, 10(1):442-451, January 2010.

[10] Sameer Singh Chauhan and R. Joshi. Qos guided heuristic algorithms for grid task scheduling. International Journal of Computer Applications, 2(9):24-31, 2010.

[11] Singh M. and Suri P.K. A qos based predictive max-min, min-min switcher algorithm for job scheduling in a grid. Information Technology Journal, 7(8):1176-1181, 2008.

[12] V. P. Roychowdhury, Wang, H. J. Siegel and A. A. Maciejewski. Task matching and scheduling in heterogeneous computing environments using a genetic algorithm based approach. Journal of Parallel and Distributed Computing, 220(4598):8-22, 1997.

[13] Rajkumar Buyya and Manzur Murshed. Gridsim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurrency and Computation: Practice and Experience (CCPE), 14:1175-1220, Nov.-Dec. 2002.

[14] Dalibor Klusacek, Ludek Matyska and Hana Rudova. Alea - grid scheduling simulation environment. In Proceedings of 7th International Conference on Parallel Processing and Applied Mathematics (PPAM 2007), volume 4967, pages 1029-1038. Springer, 2008.
