2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing
Load Balanced Static Grid Scheduling Using Max-Min Heuristic
Tarun Kumar Ghosh, Rajmohan Goswami, Sumit Bera and Subhabrata Barman
Abstract- Grid computing, inspired by the electrical power grid, is an emerging approach for providing easy access to computing resources. Its main objective is to solve computationally hard problems that cannot be solved by a single CPU. This high computing power is achieved by making effective use of geographically distributed, heterogeneous resources that would otherwise lie idle. Load balanced task scheduling is a very important problem in a complex Grid environment, and task scheduling, which is an NP-Complete problem, has therefore become a focus of research in the Grid computing area. The traditional Min-Min and Max-Min algorithms are simple and produce schedules with a smaller makespan than the other traditional algorithms in the literature. In real scenarios, however, Min-Min and Max-Min fail to produce a load balanced schedule. The proposed method has two phases. In the first phase the traditional Max-Min algorithm is executed, and in the second phase tasks are rescheduled to use the unutilized resources effectively.
Index Terms-Grid Computing, Grid Scheduling, Makespan, Max-Min Heuristic, GridSim.
I. INTRODUCTION
The term "Grid" was coined in the mid-1990s to denote a distributed computing infrastructure for advanced science and engineering. Grids enable the formation of Virtual Organizations (VO) and facilitate collaboration among multiple organisations for sharing of resources. Grids also provide fault tolerance and reliability: the Grid makes provision for automatic resubmission of jobs to other available resources when a failure is detected. Grids balance and share varied resources, which enables the Grid to distribute tasks evenly over the available resources. Grids perform parallel processing: some tasks can be broken into multiple subtasks, each of which can run on a different machine. Grids assure Quality of Service (QoS). A Service Level Agreement (SLA) specifies the minimum quality of service, availability, etc., expected by the user and the charges levied on those services [1], [2].
This paper is organized as follows. Section II introduces Grid scheduling issues. Section III briefly outlines the relevant past work on static scheduling in the Grid environment. In Section IV, the framework of the Grid scheduling problem is defined. Section V highlights the Max-Min heuristic. Section VI describes the Load Balanced Max-Min algorithm for Grid scheduling. Section VII presents the results obtained from simulation.
T. K. Ghosh(email:[email protected]). S. Bera and S. Barman (email:[email protected]) are with the Computer Science and Engineering, Haldia Institute of Technology, Haldia, Purba Medinipur and R. Goswami(email:[email protected]) is with the Computer Applications, Pailan College of Management and Technology, Kolkata, West Bengal, India.
978-1-4673-2925-5/12/$31.00 ©2012 IEEE
Finally, Section VIII concludes the paper.
II. GRID SCHEDULING ISSUES
Because of the heterogeneous and dynamic nature of the Grid, scheduling in a Grid environment is significantly complicated.
A. Resource Broker
Most Grid systems use a Grid resource broker for resource discovery, deciding the allocation of a job to a particular resource, binding user applications (files) to hardware resources, initiating computations, adapting to changes in Grid resources and presenting the Grid to the user as a single, unified resource. The broker finally controls the physical allocation of the tasks and manages the available resources constantly, dynamically updating the Grid scheduler whenever there is a change in resource availability [3].
B. Types of Grid Scheduling
Out of the various scheduling policies known, static and dynamic scheduling are important and easy to implement.
Static Scheduling, also called off-line scheduling, is a scheduling technique in which all decisions are taken before the execution of a schedule. It is suitable when all the tasks (or applications) and resources are known in advance.
Dynamic Scheduling, also called on-line scheduling, is a scheduling technique in which some or all of the decisions are taken during the execution. It is suitable when jobs and machines come on-line or go off-line due to failures, when processor speeds vary during scheduling, and when the cost of applications is difficult to predict. Dynamic mapping is performed when the arrival of tasks is not known beforehand [4], [5].
III. RELATED WORKS
In classical distributed systems, comprised of homogeneous
and dedicated resources, load balancing algorithms have been
intensively studied. But these algorithms will not work well in
Grid architecture because of its heterogeneity, scalability and
autonomy. This makes load balanced scheduling algorithms for
Grid computing more difficult and an interesting topic for many
researchers [6].
A. Min-Min
The Min-Min algorithm starts with the set of all unmapped tasks. For each job, the machine that gives it the minimum completion time is found. Then the job with the overall minimum completion time is selected and mapped to that machine. The ready time of the resource is updated. This process is repeated until all the unmapped tasks are assigned. Compared with the Minimum Completion Time algorithm, this algorithm considers all the jobs at a time, so it produces a better makespan.
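As a rough illustration of the Min-Min loop described above, the following Python sketch works on a small ETC-style matrix. The function name `min_min`, the matrix values and the tie-breaking rule (lowest task index wins) are assumptions made for this example and are not taken from the cited work.

```python
def min_min(etc):
    """Sketch of Min-Min. etc[i][j] = expected execution time of task i on machine j."""
    n_machines = len(etc[0])
    ready = [0] * n_machines            # ready time of each machine
    unmapped = set(range(len(etc)))
    schedule = {}                       # task index -> machine index
    while unmapped:
        # For every unmapped task, find its earliest completion time and machine.
        best = {}
        for i in unmapped:
            j = min(range(n_machines), key=lambda m: ready[m] + etc[i][m])
            best[i] = (ready[j] + etc[i][j], j)
        # Min-Min step: pick the task whose earliest completion time is smallest
        # (ties broken by task index, an assumption of this sketch).
        task = min(best, key=lambda i: (best[i][0], i))
        completion, machine = best[task]
        schedule[task] = machine
        ready[machine] = completion     # update the ready time of that machine
        unmapped.remove(task)
    return schedule, max(ready)


# Hypothetical 3-task, 2-machine ETC matrix.
example_etc = [[4, 6], [3, 5], [10, 2]]
schedule, makespan = min_min(example_etc)
print(schedule, makespan)
```

Replacing the `min` in the selection step with `max` would give the Max-Min variant discussed next.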
B. Max-Min
Max-Min is similar to the Min-Min algorithm. For each job, the machine that gives it the minimum completion time is found. Then the job with the overall Maximum Completion Time (MCT) among these is selected and mapped to that machine. The ready time of the resource is updated. This process is repeated until all the unmapped tasks are assigned. The idea of this algorithm is to reduce the wait time of the large jobs.
C. Double Min-Min
Doreen D. et al. [7] have proposed an efficient Set Pair Analysis (SPA) based task scheduling algorithm, named the Double Min-Min algorithm, which performs scheduling in order to enhance system performance in a Hypercubic P2P Grid (HPGRID). The simulation results show that SPA based Double Min-Min scheduling minimizes the makespan with load balancing and guarantees high system availability.
D. QoS Guided Min-Min
He X. et al. [8] have presented a new algorithm based on the conventional Min-Min algorithm. The proposed algorithm, called QoS guided Min-Min, schedules tasks requiring high bandwidth before the others. Therefore, if the bandwidth required by different tasks varies highly, the QoS guided Min-Min algorithm provides better results than the Min-Min algorithm. Whenever the bandwidth requirement of all the tasks is almost the same, the QoS guided Min-Min algorithm behaves like the Min-Min algorithm.
E. Min-Mean
Kamalam et al. [9] present a new scheduling algorithm, named the Min-Mean heuristic scheduling algorithm, for static mapping to achieve better performance. The proposed algorithm reschedules the schedule produced by Min-Min by considering the mean makespan of all the resources. It produces a better schedule than the Min-Min algorithm when the task heterogeneity increases.
F. Weighted Heuristics
Sameer Singh et al. [10] have presented two heuristic algorithms: QoS Guided Weighted Mean Time-Min (QWMTM) and QoS Guided Weighted Mean Time Min-Min Max-Min Selective (QWMTS). Both algorithms are for batch mode scheduling of independent tasks. The network bandwidth is taken as the QoS parameter.
G. Predictive Heuristics
Singh M. et al. [11] present a QoS based predictive Max-Min,
Min-Min Switcher algorithm for scheduling jobs in a Grid. The
algorithm makes an appropriate selection among the QoS based
Max-Min or QoS based Min-Min algorithm on the basis of
heuristic applied, before scheduling the next job. The effect on
the execution time of Grid jobs due to non-dedicated property
of the resources has also been considered. The algorithm uses
the history information about the execution of jobs to predict
the performance of non-dedicated resources.
IV. PROBLEM DEFINITION
In static heuristics, an accurate estimate of the expected execution time of each task on each machine is known prior to execution and is contained within an ETC (Expected Time to Compute) matrix, where $ETC(t_i, m_j)$ is the estimated execution time of task $i$ on machine $j$ [3].
The main aim of the scheduling algorithm is to minimize the makespan. Using the ETC matrix model, the scheduling problem can be defined as follows:
Let the task set $T = \{t_1, t_2, t_3, \ldots, t_n\}$ be the group of tasks submitted to the scheduler.
Let the resource set $R = \{m_1, m_2, m_3, \ldots, m_k\}$ be the set of resources available at the time of task arrival.
The makespan produced by any algorithm for a schedule can be calculated as follows:
$$makespan = \max(ct(t_i, m_j))$$
where
$$ct_{ij} = et_{ij} + r_j$$
and
$ct_{ij} = ct(t_i, m_j)$ is the completion time of job $i$ on resource $j$,
$et_{ij}$ is the expected execution time of job $i$ on resource $j$,
$r_j$ is the ready time or availability time of resource $j$ after completing the previously assigned jobs.
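The definition above can be checked numerically with a short Python sketch. The helper name `makespan_of`, the representation of a schedule as a list of (task, machine) pairs, and the matrix values are assumptions introduced only for this illustration.

```python
def makespan_of(etc, assignment, n_machines):
    """Compute the makespan of a schedule.

    etc[i][j]  : expected execution time of task i on machine j
    assignment : list of (task, machine) pairs in the order they are scheduled
    """
    ready = [0] * n_machines                 # r_j for every machine
    for task, machine in assignment:
        # ct_ij = et_ij + r_j; the machine is then busy until ct_ij.
        ready[machine] = etc[task][machine] + ready[machine]
    # The makespan is the largest completion time over all machines.
    return max(ready)


# Hypothetical ETC matrix: 3 tasks, 2 machines.
example_etc = [[4, 6], [3, 5], [10, 2]]
print(makespan_of(example_etc, [(2, 1), (1, 0), (0, 0)], 2))
```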
V. MAX-MIN - A META-HEURISTIC
In the Max-Min heuristic (see Algorithm 1), $U$ is the set of all unassigned tasks. First, the set of minimum completion times, $M = \min_{0 \le j < \mu}(ct(t_i, m_j))$, for each $t_i \in U$, is found. Next, the task with the overall maximum completion time (MCT) from $M$ is selected and assigned to the corresponding machine (hence the name Max-Min). Finally, the newly mapped task is removed from $U$, and the process is repeated until $U$ is empty, that is, until all tasks are assigned.
The advantage of Max-Min is that it minimizes the problems incurred by tasks that have longer execution times. Assigning the task with the longest execution time to its best machine first allows this task to be executed concurrently with the remaining tasks, which have shorter execution times [12]. A dynamic version is also available.
Algorithm 1 Max-Min Algorithm
for all tasks $t_i$ in meta-task $M_v$ (in an arbitrary order) do
    for all machines $m_j$ (in a fixed arbitrary order) do
        $ct_{ij} = et_{ij} + r_j$
    end for
end for
while there are unmapped tasks in $M_v$ do
    for each task $t_i$ in $M_v$ do
        find its earliest completion time and the machine that obtains it
    end for
    find the task $t_k$ with the maximum earliest completion time
    assign the task $t_k$ to the machine $m_l$ that gives its earliest completion time
    delete the task $t_k$ from $M_v$
    update $r_l$
    update $ct_{il}$ for all $t_i \in M_v$
end while
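A minimal Python sketch of Algorithm 1 is given below, under the assumption that all machines start with a ready time of zero. The function name `max_min` and the list-based data layout are choices made for this illustration; the matrix holds the values of Table I (rows T1 to T4, columns R1 and R2).

```python
def max_min(etc):
    """Sketch of Max-Min. etc[i][j] = execution time of task i on machine j."""
    n_machines = len(etc[0])
    ready = [0] * n_machines             # r_j, assumed to start at zero
    unmapped = list(range(len(etc)))     # the meta-task Mv
    schedule = []                        # (task, machine) in mapping order
    while unmapped:
        # Earliest completion time of each unmapped task, and its machine.
        earliest = {}
        for i in unmapped:
            ct = [etc[i][j] + ready[j] for j in range(n_machines)]
            j = ct.index(min(ct))
            earliest[i] = (ct[j], j)
        # Max-Min step: take the task whose earliest completion time is largest.
        task = max(unmapped, key=lambda i: earliest[i][0])
        completion, machine = earliest[task]
        schedule.append((task, machine))
        ready[machine] = completion      # update r_l
        unmapped.remove(task)            # delete the task from Mv
    return schedule, ready


# Values of Table I: rows are T1..T4, columns are R1 and R2.
table_1 = [[3, 10], [2, 13], [5, 15], [1, 12]]
schedule, ready = max_min(table_1)
print(schedule)    # [(2, 0), (0, 0), (1, 0), (3, 0)]: every task lands on R1
print(max(ready))  # 11
```

The completion times are recomputed from the ready times in every iteration rather than updated in place, which is equivalent for this small case and keeps the sketch short.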
VI. LOAD BALANCED MAX-MIN ALGORITHM
The Max-Min algorithm seems to do better than the Min-Min algorithm in cases where the number of short tasks is much larger than the number of long ones. For example, if there is only one long task, the Max-Min algorithm executes many short tasks concurrently with the long task. In this case, the makespan of the system is most likely determined by the execution time of the long task. The Load Balanced Max-Min algorithm (see Algorithm 2) therefore executes Max-Min in the first round. In the second round it chooses tasks from the resources with heavy load and reassigns them to the resources with light load.
A. Basic Concepts
Load Balanced Max-Min identifies the resources with heavy load by choosing the resource with the highest makespan in the schedule produced by Max-Min. It then considers the tasks assigned to that resource and chooses the task with the minimum execution time on that resource. The completion time of that task is calculated for all resources in the current schedule. Then the MCT of that task is compared with the makespan produced by Max-Min. If it is less than the makespan, the task is rescheduled on the resource that produces it, and the ready times of both resources are updated. Otherwise the next MCT of that task is selected and the steps are repeated. The process stops when all resources and all tasks assigned to them have been considered for rescheduling. Thus the tasks that can be moved are rescheduled on the resources which are idle or have minimum load. This makes Load Balanced Max-Min produce a schedule with improved load balancing. Since it compares the MCT with the makespan, Load Balanced Max-Min also reduces the overall completion time.
Algorithm 2 Second Round of Load Balanced Max-Min Algorithm
sort the resources in the order of completion time
for all resources $R$ do
    compute makespan = max(ct($R$))
end for
for all resources $R_j$ do
    for all tasks assigned to $R_j$ do
        find the task $T_i$ that has minimum et in $R_j$
        find the MCT of task $T_i$
        if MCT < makespan then
            reschedule the task $T_i$ to the resource that produces it
            update the ready time of both resources
        end if
    end for
end for
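One possible reading of the second round is sketched below in Python. It is an interpretation rather than the authors' implementation: the function name `rebalance` is hypothetical, the candidate target is taken to be the other resource on which the task would complete earliest, and the makespan used in the comparison is kept fixed at the value produced by the first round. The input is the Max-Min schedule for the values of Table I, with every task on R1.

```python
def rebalance(etc, assigned, makespan):
    """Second-round sketch.

    etc[i][j]   : execution time of task i on resource j
    assigned[j] : list of task indices placed on resource j by the first round
    makespan    : makespan produced by the first round (kept fixed here)
    """
    assigned = [list(tasks) for tasks in assigned]   # do not modify the input
    n = len(assigned)
    ready = [sum(etc[t][j] for t in assigned[j]) for j in range(n)]
    # Visit resources from the most loaded to the least loaded.
    for src in sorted(range(n), key=lambda j: -ready[j]):
        # Consider the tasks of this resource in increasing execution time.
        for t in sorted(assigned[src], key=lambda t: etc[t][src]):
            others = [j for j in range(n) if j != src]
            if not others:
                continue
            # Completion time of the task on the best alternative resource.
            dst = min(others, key=lambda j: ready[j] + etc[t][j])
            if ready[dst] + etc[t][dst] < makespan:
                assigned[src].remove(t)
                assigned[dst].append(t)
                ready[src] -= etc[t][src]            # update both ready times
                ready[dst] += etc[t][dst]
    return assigned, ready


table_1 = [[3, 10], [2, 13], [5, 15], [1, 12]]       # T1..T4 on R1, R2
new_assigned, new_ready = rebalance(table_1, [[2, 0, 1, 3], []], 11)
print(new_assigned)    # [[2, 1, 3], [0]]: only T1 moves to R2
print(max(new_ready))  # 10
```

Because a task is moved only when its completion time on the target is strictly below the first-round makespan, no move in this sketch can increase the overall makespan.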
B. Example with Analysis
Consider a Grid environment with two resources R1 and R2 and a meta-task group Mv with four tasks T1, T2, T3 and T4. The Grid scheduler is supposed to schedule all the tasks within Mv on the available resources R1 and R2. Since the Max-Min algorithm is simple and produces a better makespan than the other algorithms discussed in the literature, the proposed algorithm executes the Max-Min algorithm in the first phase to schedule the jobs. To remove the limitation of unbalanced load in Max-Min, the jobs are then rescheduled in the second phase. In this problem the execution times of all tasks are known in advance. They can also be calculated if the number of instructions in each job and the computation rate of each resource are known. Table I gives the execution time of the tasks on each resource.
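The calculation of execution times from job sizes and resource speeds can be sketched as follows. The function name `build_etc`, the units (millions of instructions and millions of instructions per second) and the sample numbers are assumptions for illustration only and do not correspond to Table I.

```python
def build_etc(task_lengths_mi, resource_rates_mips):
    """Build an ETC matrix: execution time = instructions / computation rate.

    task_lengths_mi     : job sizes in millions of instructions (assumed unit)
    resource_rates_mips : resource speeds in MIPS (assumed unit)
    """
    return [[length / rate for rate in resource_rates_mips]
            for length in task_lengths_mi]


# Two hypothetical jobs on two hypothetical resources.
print(build_etc([100, 200], [10, 50]))
```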
The static mapping of tasks to machines based on Max-Min is shown in Figure 1. All tasks are scheduled on resource R1 and resource R2 remains idle. The makespan produced by Max-Min is 11 seconds.
According to the proposed Load Balanced Max-Min algorithm, the MCT of task T1 is less than the makespan produced by Max-Min: T1 would complete at 10 seconds on the idle resource R2, against a makespan of 11 seconds. The MCT of each of the other tasks is not less than the makespan. So task T1 is rescheduled on resource R2 and the remaining tasks stay on resource R1. The mapping of tasks based on Load Balanced Max-Min is shown in Figure 2. Thus the rescheduling step utilizes the idle resource R2 and reduces the makespan to 10 seconds.
TABLE I
COMPLETION TIME OF THE TASKS ON EACH OF THE RESOURCES
Tasks/Resources    R1    R2
T1                 3     10
T2                 2     13
T3                 5     15
T4                 1     12
Fig. 1. Gantt chart of the Max-Min algorithm
Fig. 2. Gantt chart of the Load Balanced Max-Min algorithm
VII. COMPARISON AND DISCUSSION
The proposed algorithm was simulated in the Grid scheduling simulation environment Alea 3.0.
A. Alea 3.0
Alea 3.0 is an extension of the Java package GridSim. GridSim is a toolkit for the modeling and simulation of distributed resource management and scheduling for Grid computing [13]. Alea 3.0 is used for the evaluation of various job scheduling techniques. Functions that approximate makespan, tardiness, and other values important for scheduling algorithms are implemented in Alea 3.0 [14].
The evaluation of the proposed algorithm required a large simulation. Running Alea 3.0 during the simulation required a large amount of memory, since many objects were created. The simulation was performed on an Intel Pentium 4, 2.8 GHz machine with 512 MB of RAM.
B. Simulation Output
Current Grid scheduling systems are all queue based systems. These systems use one or more incoming queues where jobs are stored until they are scheduled for execution. All of them use the basic First Come First Served (FCFS) policy. Through the simulation, the proposed Load Balanced Max-Min scheduling algorithm has been compared with the FCFS algorithm. Figure 3 and Figure 4 present graphs depicting the number of waiting and running jobs per day under the FCFS and Load Balanced Max-Min algorithms respectively, as generated by Alea 3.0 during the simulation.
These graphs demonstrate major differences between the algorithms. Concerning machine usage, FCFS generates poor results because it is not able to utilize the available resources, which is shown by the existence of waiting jobs even in the last part of the simulation. In contrast, the Load Balanced Max-Min based approach is able to manage the load efficiently through an efficient search technique, which is shown by the increase in the number of non-waiting jobs in the last part of the simulation.
Fig. 3. Waiting and running jobs by FCFS (x-axis: days; y-axis: number of waiting/running jobs)
Fig. 4. Waiting and running jobs by Load Balanced Max-Min algorithm (x-axis: days; y-axis: number of waiting/running jobs)
VIII. CONCLUSION AND FUTURE WORK
Min-Min and Max-Min algorithms are applicable in small scale distributed systems. The Max-Min algorithm seems to do better than the Min-Min algorithm in cases where the number of short tasks is much larger than the number of long ones. For example, if there is only one long task, the Max-Min algorithm executes many short tasks concurrently with the long task. Although load balancing in small scale distributed systems is desirable and leads to reduced total completion times, in large scale distributed systems load balancing does not necessarily result in the shortest makespan. The proposed algorithm outperforms Max-Min in large scale systems because it focuses on minimizing the completion time of tasks. The proposed algorithm is executed in two phases. It uses the advantages of Max-Min and covers its disadvantages by reducing the makespan and maximizing resource utilization.
The experimental results, obtained by applying the proposed algorithm to various problems, show that it outperforms the existing Grid scheduling algorithms considered. This study is concerned only with a limited number of resources and with task execution time. It can be further extended by considering low and high machine heterogeneity and task heterogeneity. In addition, the efficiency of the proposed algorithm may be observed by applying it in an actual Grid environment.
REFERENCES
[1] Anthony Sulistio, Chee Shin Yeo and Rajkumar Buyya. A taxonomy of computer-based simulations and its mappings to parallel and distributed systems simulation tools. Software - Practice and Experience, 34:653-673, April 2004.
[2] Ian Foster and Carl Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco, CA, 1999.
[3] Ajith Abraham, Rajkumar Buyya and Baikunth Nath. Nature's heuristics for scheduling jobs on computational grids. In P.S. Sinha and R. Gupta, editors, Proceedings of 8th IEEE International Conference on Advanced Computing and Communications (ADCOM 2000), pages 45-52. Tata McGraw-Hill Publishing Co. Ltd, New Delhi, 2000.
[4] T. L. Casavant and J. G. Kuhl. A taxonomy of scheduling in general-purpose distributed computing systems. IEEE Transactions on Software Engineering, 14(2):141-154, 1988.
[5] T. D. Braun, Maheswaran and H. J. Siegel. Heterogeneous Distributed Computing, Encyclopedia of Electrical and Electronics Engineering. John Wiley and Sons, New York, NY, 1999.
[6] Rajmohan Goswami, Tarun Kumar Ghosh and Subhabrata Barman. Local search based approach in grid scheduling using simulated annealing. In Proceedings of IEEE International Conference on Computer and Communication Technology (ICCCT), Allahabad, India, 2011.
[7] D. Doreen Hephzibah Miriam and K. S. Easwarakumar. A double min min algorithm for task metascheduler on hypercubic P2P grid systems. IJCSI International Journal of Computer Science Issues, 7(5):8-18, July 2010.
[8] He X., X.-H. Sun and G. von Laszewski. QoS guided min-min heuristic for grid task scheduling. Journal of Computer Science and Technology, 18:442-451, 2003.
[9] Kamalam G.K. and Muralibhaskaran V. A new heuristic approach: min-mean algorithm for scheduling meta-tasks on heterogeneous computing systems. IJCSNS International Journal of Computer Science and Network Security, 10(1):442-451, January 2010.
[10] Sameer Singh Chauhan and R. Joshi. QoS guided heuristic algorithms for grid task scheduling. International Journal of Computer Applications, 2(9):24-31, 2010.
[11] Singh M. and Suri P.K. A QoS based predictive max-min, min-min switcher algorithm for job scheduling in a grid. Information Technology Journal, 7(8):1176-1181, 2008.
[12] V. P. Roychowdhury, Wang, H. J. Siegel and A. A. Maciejewski. Task matching and scheduling in heterogeneous computing environments using a genetic algorithm based approach. Journal of Parallel and Distributed Computing, 220(4598):8-22, 1997.
[13] Rajkumar Buyya and Manzur Murshed. GridSim: a toolkit for the modeling and simulation of distributed resource management and scheduling for grid computing. Concurrency and Computation: Practice and Experience (CCPE), 14:1175-1220, Nov.-Dec. 2002.
[14] Dalibor Klusacek, Ludek Matyska and Hana Rudova. Alea - grid scheduling simulation environment. In Proceedings of 7th International Conference on Parallel Processing and Applied Mathematics (PPAM 2007), volume 4967, pages 1029-1038. Springer, 2008.