
[IEEE 2007 22nd international symposium on computer and information sciences - Ankara, Turkey (2007.11.7-2007.11.9)] 2007 22nd international symposium on computer and information sciences


A Load Balancing Model for Grid Environment

Belabbas Yagoubi #1 and Meriem Medebber ∗2

# Department of Computer Science, Faculty of Sciences, University of Oran, Campus Professeur Taleb Mourad, 31000 Oran, Algeria
1 [email protected]

∗ Department of Computer Science, University of Mascara, 29000 Mascara, Algeria
2 mimi mer [email protected]

Abstract: Workload and resource management are two essential functions provided at the service level of the Grid software. To improve the global throughput of these environments, effective and efficient load balancing algorithms are fundamentally important. Although the load balancing problem in classical distributed systems has been intensively studied, new challenges in Grid computing still make it an interesting topic, and many research projects are under way. This is due to the characteristics of Grids and to the complex nature of the problem.

This paper presents a task load balancing model for Grid environments. First, we propose a tree-based model to represent the Grid architecture in order to manage workload. This model is characterized by three main features: (i) it is hierarchical; (ii) it supports heterogeneity and scalability; and (iii) it is totally independent of any physical Grid architecture. Second, we develop a hierarchical load balancing strategy to balance tasks among Grid resources. The main characteristics of the proposed strategy are: (i) it uses task-level load balancing; (ii) it favors local task transfers to reduce communication cost; and (iii) it is a distributed strategy with local decision making.

I. INTRODUCTION

The availability of low-cost, powerful computers, coupled with the popularity of the Internet and high-speed networks, has led the computing environment to move from classical distributed systems to Grid environments. Recent research on computing architectures has allowed the emergence of a new computing paradigm known as Grid computing. A Grid is a type of distributed system that supports the sharing and coordinated use of resources, independently of their physical type and location [1]. This technology allows the use of geographically widely distributed, multi-owner resources to solve large-scale applications such as meteorological simulations and data-intensive applications [2].

Although the load balancing problem in conventional distributed systems has been intensively studied, new challenges in Grid computing still make it an interesting topic, and many research projects are under way. This is due to the characteristics of Grid computing and to the complex nature of the problem itself. Load balancing algorithms for classical distributed systems, which usually run on homogeneous and dedicated resources, do not work well in Grid architectures. Grids have many specific characteristics [3], such as heterogeneity, autonomy and dynamicity, which prevent applications from using conventional load balancing algorithms directly.

Load balancing is a mapping strategy that efficiently equilibrates the task load across multiple computational resources in the network, based on the system status, to improve performance [4]. The essential objective of load balancing depends on whether it is seen from the point of view of the user or of the system administrator:

• The aim for the user is to minimize the makespan of his or her own application, regardless of the performance of other applications in the system.

• The main goal for the administrator is to maximize the number of tasks that meet their deadlines by ensuring maximal utilization of available resources.

Typically, a load balancing scheme consists of four policies:

1) The information policy defines when and how the information on the availability of Grid resources is updated.

2) The location policy determines a suitable transfer partner (server or receiver) once the transference policy has decided that this resource is a server or a receiver.

3) The selection policy defines the task that should be transferred from the busiest resource to the idlest one.

4) The transference policy classifies a resource as a server or a receiver of tasks according to its availability status.

Our main contributions in this paper are twofold. First, we propose a dynamic tree-based load balancing model. Second, we develop a hierarchical strategy to balance tasks among the resources of a computational Grid. Based on a tree representation of a Grid, our strategy favors local balancing over global balancing. The main objectives addressed by this neighborhood strategy are: (i) the reduction of the average response time of tasks; and (ii) the reduction of the communication cost induced by task transfers.

The rest of this paper is organized as follows. Some related works are described in Section II. The mapping of any Grid architecture into a tree is explained in Section III. Section IV describes the main steps of the proposed load balancing strategy. The associated load balancing algorithms are given in Section V. The behavior and performance of the strategy are described in Section VI. Finally, Section VII concludes the paper and provides a preview of future research.

1-4244-1364-8/07/$25.00 ©2007 IEEE


II. RELATED WORKS

Most application-level load balancing approaches are oriented toward application partitioning via graph algorithms [5]. However, they do not address the issue of reducing migration cost, that is, the cost entailed by load redistribution, which can consume orders of magnitude more time than the actual computation of a new decomposition. Some works [6] have proposed a latency-tolerant algorithm that overlaps the computation of internal data with the communication of incoming data to reduce data migration cost. Unfortunately, it requires applications to provide such parallelism between data processing and migration, which restricts its applicability. Agent-based approaches have been tried to provide load balancing in clusters of machines [7]. Genaud et al. [8] enhance the MPI_Scatterv primitive to support master-slave load balancing by optimizing computation and data distribution with a linear programming algorithm. However, this solution is limited to static load balancing. In [9], Hu et al. propose an optimal data migration algorithm for diffusive dynamic load balancing through the calculation of the Lagrange multiplier of the Euclidean form of the transferred weight. This work can effectively minimize data movement in homogeneous environments, but it does not consider network heterogeneity. In particular, workload migration is critical to take into account, in addition to resource heterogeneity, because wide area network performance is dynamic, unstable and changes throughout execution. This communication aspect is neglected in traditional application-level load balancing strategies.

Although, as mentioned above, a large number of load balancing techniques and heuristics have been presented in the literature, most of them target only homogeneous resources. However, modern computing systems, such as the computational Grid, are most likely to be widely distributed and strongly heterogeneous. Therefore, it is essential to consider the impact of these characteristics on the design and analysis of load balancing algorithms.

The traditional objective, when balancing sets of computational tasks, is to minimize the overall execution time, called the makespan. However, in the context of heterogeneous distributed platforms, makespan minimization problems are in most cases NP-complete, sometimes even APX-complete [4]. Moreover, when dealing with large-scale systems, an absolute minimization of the total execution time is not the only objective of a load balancing strategy. We think that the communication cost induced by load redistribution is also a critical issue. For this purpose, we propose in this paper a novel load balancing strategy to address the new challenges in Grid computing. Compared with existing works, the main characteristics of our proposed strategy can be summarized as follows:
(i) It uses task-level load balancing;
(ii) It favors local task transfers over global ones, to reduce communication costs;
(iii) It is a distributed strategy with local decision making.

III. TREE-BASED LOAD BALANCING MODEL

A. Grid topology

From the topological point of view, we regard a computational Grid as a set of clusters in a multi-node platform. Each cluster owns a set of worker nodes and belongs to a local domain, i.e. a LAN (Local Area Network). Every cluster is connected to the global network, or WAN (Wide Area Network), by a switch. Figure 1 describes this topology.

Fig. 1. Example of a Grid topology

B. Mapping a Grid into a tree-based model

The load balancing strategy proposed in this paper is based on mapping any Grid architecture into a tree structure. This tree is built by aggregation as follows:

• First, for each cluster we create a two-level subtree. The leaves of this subtree correspond to the nodes of the cluster, and its root, called the cluster manager, is a virtual tree node associated with the cluster, whose role is to manage the cluster workload.

• Second, the subtrees corresponding to all clusters are aggregated to build a three-level tree whose root is a virtual tree node designated as the Grid manager.

The final tree is denoted by C/N, where C is the number of clusters that compose the Grid and N is the total number of worker nodes. As illustrated in Figure 2, this tree can be transformed into two specific trees, C/N and 1/N, depending on the values of C and N. The mapping function generates an acyclic connected graph in which each level has specific functions.

• Level 0: At this first level, we have the Grid manager, which performs the following functions:
(i) Maintains the workload information about the cluster managers.
(ii) Decides to start a global load balancing between the clusters of the Grid, which we will call intra-Grid load balancing.
(iii) Sends the load balancing decisions to the cluster managers of level 1 for execution.


• Level 1: Each cluster manager at this level is associated with a physical cluster of the Grid. In our load balancing strategy, this manager is responsible for:
(i) Maintaining the workload information relating to each one of its worker nodes.
(ii) Estimating the workload of the associated cluster and sending this information to the Grid manager.
(iii) Deciding to start a local load balancing, which we will call intra-cluster load balancing.
(iv) Sending the load balancing decisions to the worker nodes which it manages, for execution.

• Level 2: At this last level, we find the worker nodes of the Grid linked to their respective clusters. Each node at this level is responsible for:
(i) Maintaining its workload information.
(ii) Sending this information to its cluster manager.
(iii) Performing the load balancing decided by its manager.
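The three levels described above can be sketched as a small data structure. The following Python is only a minimal illustration, not the authors' GridSim implementation; the class names `WorkerNode`, `ClusterManager` and `GridManager`, their fields, and the `shape` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerNode:
    """Level 2: a leaf of the tree (hypothetical class name)."""
    name: str
    speed: float        # computing capability of the node
    load: float = 0.0   # current workload of the node


@dataclass
class ClusterManager:
    """Level 1: virtual node that manages one physical cluster."""
    name: str
    nodes: list = field(default_factory=list)

    def load(self) -> float:
        # Workload estimate that is reported upward to the Grid manager.
        return sum(n.load for n in self.nodes)


@dataclass
class GridManager:
    """Level 0: virtual root that aggregates all cluster managers."""
    clusters: list = field(default_factory=list)

    def shape(self) -> str:
        # The tree is denoted C/N: C clusters and N worker nodes in total.
        total_nodes = sum(len(c.nodes) for c in self.clusters)
        return f"{len(self.clusters)}/{total_nodes}"


grid = GridManager([
    ClusterManager("c1", [WorkerNode("n1", 10.0, 50.0),
                          WorkerNode("n2", 20.0, 30.0)]),
    ClusterManager("c2", [WorkerNode("n3", 5.0, 40.0)]),
])
print(grid.shape())             # 2/3
print(grid.clusters[0].load())  # 80.0
```

Connecting or disconnecting a resource then amounts to appending to or removing from the `nodes` or `clusters` lists.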

[Figure: tree with the Grid manager at level 0, the cluster managers at level 1 and the worker nodes at level 2, shown for models C/N and 1/N.]

Fig. 2. Tree-based representation of a Grid

C. Characteristics of the proposed model

The proposed model is characterized as follows:

1) It is hierarchical: This property facilitates the information flow through the tree. We distinguish three types of information flow:
(i) Ascending flow: This flow carries the workload information used to obtain the current workload status of Grid resources.
(ii) Horizontal flow: It concerns the inputs needed to perform load balancing operations.
(iii) Descending flow: This flow conveys the load balancing decisions made by the managers at the various levels of the model.

2) It supports heterogeneity and scalability of Grids: Connecting or disconnecting Grid resources corresponds to adding or removing leaves or subtrees in the tree model.

3) It is totally independent of any physical architecture of a Grid: The transformation of a Grid topology into a tree is univocal. With each Grid we associate one and only one tree, independently of the topological complexity of the Grid.

IV. LOAD BALANCING STRATEGY

A. Principles

In accordance with the hierarchical structure of the proposed model, we distinguish two load balancing levels: intra-cluster (or inter-nodes) and intra-Grid (or inter-clusters):

• Intra-cluster load balancing: At this first level, depending on the current workload of its associated cluster, estimated from its own worker nodes, each cluster manager decides whether or not to start a load balancing operation. If it decides to start one, it first tries to balance its workload among its own worker nodes. Hence, C local load balancing operations can proceed in parallel, where C is the number of clusters.

• Intra-Grid load balancing: The load balancing at this level is performed only if some cluster managers fail to balance their workload among their associated worker nodes. The local balancing failure may be due either to saturation of the cluster or to insufficient supply. In this case, tasks of overloaded clusters are transferred to underloaded ones, taking into account the communication cost and according to the selection criteria. The chosen underloaded clusters are those that require minimal communication cost for transferring tasks from the overloaded clusters.

The main advantage of this strategy is that it gives priority to local load balancing (first within a cluster, and then over the entire Grid). The goal of this neighbourhood approach is to decrease the number of messages exchanged between clusters. As a consequence, the communication overhead induced by task transfers and information flow is reduced.

B. Generic strategy

At any load balancing level, we propose the following three-step strategy. As the description is generic, we use the concepts of group and element. Depending on the case, a group designates either a cluster or the Grid (level 1 or level 0 in the tree). An element is a group component (a worker node of level 2 or a cluster of level 1).

The main steps of our strategy can be summarized as follows:

Step 1: Estimation of the current group workload
Here we are interested in the information policy: what information reflects the workload status of the group, when is it to be collected, and from where? Knowing the number of available elements under its control and their computing capabilities, each group manager estimates its associated group capability by performing the following actions:

• Estimates the current workload of the group based on the workload information received from its component elements.

• Computes the standard deviation over the workload index in order to measure the deviations between its elements.

• Sends workload information to its Grid manager.

To account for the heterogeneity of worker node capabilities, we propose to take as workload index the processing time, denoted $TEX$. We define the processing time of an entity (element or group) as the ratio between its workload (denoted $LOD$) and its capability ($SPD$):

$$TEX = \frac{LOD}{SPD}$$
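As a worked illustration of this workload index, the sketch below computes the processing time of each element, the processing time of the group, and the standard deviation of the element times around the group time. It assumes that the load and speed of a group are the sums of the loads and speeds of its elements, which the text does not state explicitly; the function names are hypothetical.

```python
import math


def processing_time(load: float, speed: float) -> float:
    """TEX = LOD / SPD."""
    return load / speed


def group_stats(loads, speeds):
    # Assumption: group load and speed are sums over the elements.
    lod_g = sum(loads)
    spd_g = sum(speeds)
    tex_g = processing_time(lod_g, spd_g)
    tex = [processing_time(l, s) for l, s in zip(loads, speeds)]
    # Deviation of element processing times from the group processing time.
    sigma = math.sqrt(sum((t - tex_g) ** 2 for t in tex) / len(tex))
    return tex, tex_g, sigma


tex, tex_g, sigma = group_stats([100.0, 40.0, 10.0], [10.0, 10.0, 5.0])
print(tex)    # [10.0, 4.0, 2.0]
print(tex_g)  # 6.0
```

Here the three elements have processing times 10, 4 and 2, the group time is 150/25 = 6, and the deviation is the square root of 12, about 3.46.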

Step 2: Decision making
In this step the manager decides whether or not it is necessary to perform a load balancing operation. For this purpose it executes the two following actions:

1) Defining the imbalance/saturation state of the group.
If we consider that the standard deviation measures the average deviation between the processing times of the elements and the processing time of their group, we can say that the group is in a balanced state when this deviation is small. Indeed, this implies that the processing time of each element converges to the processing time of its group. We then define the balance and saturation states as follows:
Balance state: In practice, we define a balance threshold, denoted ε, below which we consider that the standard deviation tends to zero and hence that the group is balanced. We propose for this purpose a threshold ε ∈ [0, 1], and can thus write the following expression: If (σ ≤ ε) Then the group is balanced Else it is imbalanced.
Saturation state: A group can be balanced while being saturated. In this particular case, it is not useful to start an intra-group load balancing since its elements will remain overloaded. To measure saturation, we introduce another threshold, called the saturation threshold and denoted δ. When the current workload of a group approaches its capacity, it is useless to balance since all of its components are saturated.

2) Group partitioning.
In the imbalanced case, we determine the overloaded elements (sources) and the underloaded ones (receivers), depending on the processing time of every element relative to the average processing time of the associated group.
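The decision step can be sketched as two small functions. This is an illustrative sketch only: the function names are hypothetical, the default thresholds are the values reported in the experimental section (ε = 0.5, δ = 0.8), the test order (balance first, then saturation) follows the generic algorithm of Section V, and the special handling of individually saturated elements is omitted.

```python
def group_state(sigma, load, capacity, eps=0.5, delta=0.8):
    """Classify a group as 'balanced', 'saturated' or 'imbalanced'."""
    if sigma <= eps:
        return "balanced"
    if load / capacity > delta:
        return "saturated"
    return "imbalanced"


def partition(tex, tex_g, sigma):
    """Split element indices into sources, receivers and balanced elements."""
    ges, ger, gen = [], [], []
    for i, t in enumerate(tex):
        if t > tex_g + sigma:
            ges.append(i)      # overloaded: source
        elif t < tex_g - sigma:
            ger.append(i)      # underloaded: receiver
        else:
            gen.append(i)      # within one deviation of the group time
    return ges, ger, gen


print(group_state(3.46, load=150.0, capacity=300.0))        # imbalanced
print(partition([10.0, 4.0, 2.0], tex_g=6.0, sigma=3.46))   # ([0], [2], [1])
```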

Step 3: Tasks transferring
In order to transfer tasks from overloaded elements to underloaded ones, we propose the following heuristic:

1) Evaluate the total amount of load, "Supply", available on the receiver elements.

2) Compute the total amount of load, "Demand", required by the source elements.

3) If the supply is much lower than the demand (the supply is far from satisfying the request), it is not recommended to start a local load balancing. We introduce a third threshold, called the expectation threshold and denoted ρ, to measure the relative deviation between supply and demand. We can then write the following expression: If (Supply/Demand > ρ) Then perform a local load balancing Else perform a higher-level load balancing.

4) Perform the task transfers, taking into account the communication cost induced by each transfer and according to the selection criteria.

As selection criteria we propose the following:

• Shortest process time: transfer first the task with the shortest remaining processing time.

• Longest process time: transfer first the task with the longest remaining processing time.

• FIFO: transfer the first submitted task (oldest task).

• LIFO: transfer the last submitted task (youngest task).

• Random: choose a task randomly.
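The five selection criteria listed above can be expressed as interchangeable functions over a task queue. The sketch below is only an illustration; the `Task` fields and the `pick` helper are hypothetical names, not part of the paper or of GridSim.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining: float   # remaining processing time
    submitted: float   # submission time (smaller means older)


# Each criterion maps a non-empty list of tasks to the task to transfer.
SELECTION = {
    "shortest": lambda tasks: min(tasks, key=lambda t: t.remaining),
    "longest": lambda tasks: max(tasks, key=lambda t: t.remaining),
    "fifo": lambda tasks: min(tasks, key=lambda t: t.submitted),
    "lifo": lambda tasks: max(tasks, key=lambda t: t.submitted),
    "random": lambda tasks: random.choice(tasks),
}


def pick(tasks, criterion="shortest"):
    """Return the task that the given criterion would transfer first."""
    return SELECTION[criterion](tasks)


queue = [Task("a", 30.0, 1.0), Task("b", 5.0, 2.0), Task("c", 50.0, 3.0)]
print(pick(queue, "shortest").name)  # b
print(pick(queue, "fifo").name)      # a
```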

C. Supply and Demand estimation

The supply of a receiver element $E_r$ corresponds to the amount of load $X_r$ that it agrees to receive so that its processing time satisfies $TEX_r \in [TEX_G - \sigma_G,\; TEX_G + \sigma_G]$, where $\sigma_G$ is the standard deviation over the processing times of the group associated with element $E_r$, and $TEX_G$ is the processing time of this group. In practice, we aim for the convergence $TEX_r \to TEX_G$:

$$TEX_r = \frac{LOD_r + X_r}{SPD_r} \approx \frac{LOD_G}{SPD_G} \;\Rightarrow\; X_r = \frac{LOD_G \cdot SPD_r}{SPD_G} - LOD_r$$

Thus, we estimate the total supply of the receivers set $GER$ by:

$$Supply = \sum_{E_r \in GER} X_r$$

By similar reasoning, we determine the demand of a source element $E_s$, which corresponds to the load $Y_s$ that it requests to transfer so that $TEX_s \to TEX_G$:

$$TEX_s = \frac{LOD_s - Y_s}{SPD_s} \approx \frac{LOD_G}{SPD_G} \;\Rightarrow\; Y_s = LOD_s - \frac{LOD_G \cdot SPD_s}{SPD_G}$$

The total demand of the sources set $GES$ is:

$$Demand = \sum_{E_s \in GES} Y_s$$
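The supply and demand estimates translate directly into code. The sketch below is an illustration under the assumption that each element is given as a (load, speed) pair; the function names are hypothetical, and ρ = 0.75 is the value used in the experimental section.

```python
def supply(receivers, lod_g, spd_g):
    """Total load that the receivers (LOD_r, SPD_r) can accept."""
    return sum(lod_g * spd / spd_g - lod for lod, spd in receivers)


def demand(sources, lod_g, spd_g):
    """Total load that the sources (LOD_s, SPD_s) need to give away."""
    return sum(lod - lod_g * spd / spd_g for lod, spd in sources)


def local_balancing_possible(s, d, rho=0.75):
    # Local balancing is attempted only if Supply/Demand exceeds rho.
    return s / d > rho


# Group with total load 150 and total speed 25.
s = supply([(10.0, 5.0)], lod_g=150.0, spd_g=25.0)    # 150*5/25 - 10
d = demand([(100.0, 10.0)], lod_g=150.0, spd_g=25.0)  # 100 - 150*10/25
print(s, d)                            # 20.0 40.0
print(local_balancing_possible(s, d))  # False
```

In this example the single receiver can absorb only half of what the source needs to shed, so the ratio 0.5 falls below ρ and the balancing would be handed to the next level.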

V. LOAD BALANCING ALGORITHM

We define two levels of load balancing algorithms: intra-cluster and intra-Grid.

A. Intra cluster load balancing algorithm

This algorithm is the kernel of our load balancing strategy. The neighbourhood load balancing used by our strategy rests on the expectation that imbalance situations can usually be resolved within a cluster. The algorithm is triggered when a cluster manager finds that there is a load imbalance between the nodes under its control. To do this, the cluster manager periodically receives workload information from each worker node. On the basis of this information and the balance threshold ε, it analyzes the current workload of the cluster. According to the result of this analysis, it decides either to start a local balancing, in the case of an imbalanced state, or just to inform its Grid manager about its current workload. At this level, communication costs are not taken into account in the task transfer, since the worker nodes of the same cluster are interconnected by a LAN whose communication cost is constant.


B. Intra Grid load balancing algorithm

This algorithm, which uses a source-initiated approach, performs a global load balancing among all clusters of the Grid. It is started in the extreme case where some cluster managers fail to locally balance their overload. Knowing the global state of each cluster, the Grid manager can evenly distribute the global overload between its clusters. Contrary to the intra-cluster level, we must consider the communication cost between clusters. A task can be transferred only if the sum of its latency in the source cluster and its transfer cost is lower than its latency on the receiver cluster. This condition avoids useless task migrations.

C. Generic intra-group algorithm (Case of a group G):

Step 1: Workload estimation
1. Workload information about each element Ei of G:
   For every Ei, according to its specific period, do
       Send its workload LODi to its group manager
   end For
2. According to its period, the group manager:
   a- Computes the speed SPDG and the capacity SATG of G.
   b- Evaluates the current load LODG and the processing time TEXG.
   c- Computes the standard deviation σG over the processing times.
   d- Sends the workload information of G to its manager.

Step 2: Decision making
3. Balance criteria:
   Switch
       G = cluster: If (σG ≤ ε) then
                        Cluster in balance state; Return
                    end If
       G = Grid:    If (# overloaded clusters ≤ given threshold) then
                        Grid is balanced; Return
                    end If
   end Switch
4. Saturation criteria:
   If (LODG / SATG > δ) then
       Group G is saturated; load balancing fails; Return
   end If
5. Partitioning G into overloaded (GES), underloaded (GER) and balanced (GEN) elements:
   GES ← Φ; GER ← Φ; GEN ← Φ
   For every element Ei of G do
       If (Ei saturated) then
           GES ← GES ∪ {Ei}
       else
           Switch
               TEXi > TEXG + σG ⇒ GES ← GES ∪ {Ei}
               TEXi < TEXG − σG ⇒ GER ← GER ∪ {Ei}
               TEXG − σG ≤ TEXi ≤ TEXG + σG ⇒ GEN ← GEN ∪ {Ei}
           end Switch
       end If
   end For

Step 3: Tasks transferring
6. Test on Supply and Demand:
   $Supply = \sum_{E_r \in GER} \left( \frac{LOD_G \cdot SPD_r}{SPD_G} - LOD_r \right)$
   $Demand = \sum_{E_s \in GES} \left( LOD_s - \frac{LOD_G \cdot SPD_s}{SPD_G} \right)$
   If (Supply / Demand ≤ ρ) then
       Local load balancing fails; Return
   end If
7. Tasks transferring:
   If (G = Cluster) then
       Perform Heuristic 1
   else
       Perform Heuristic 2
   end If

Heuristic 1: Intra-cluster tasks transferring

- Sort the elements of GES by descending order of processing times.
- Sort the elements of GER by ascending order of processing times.
While (GES ≠ Φ And GER ≠ Φ) do
    For i = 1 To #(GER) do
        (i) Sort the tasks of the first node of GES by selection criterion,
        (ii) Transfer the highest-priority task from the first node of GES to the i-th node of GER,
        (iii) Update the current workloads of the receiver and source elements,
        (iv) Update the sets GES, GER and GEN,
        (v) If (GES = Φ OR GER = Φ) then
                Return
            end If
        (vi) Sort GES by descending order of processing times
    end For
done
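A simplified version of Heuristic 1 can be sketched as follows. This is not the authors' implementation: it uses the shortest-task criterion only, always sends to the currently least-loaded receiver instead of cycling through GER, keeps the thresholds TEXG ± σG fixed at their initial values, and adds a transfer budget as a safety bound on the loop. The node representation (a dict with `name`, `speed` and `tasks`) is a hypothetical choice.

```python
import math


def tex(node):
    """Processing time of a node: total task size divided by speed."""
    return sum(node["tasks"]) / node["speed"]


def intra_cluster_transfer(nodes):
    """Move tasks from overloaded to underloaded nodes; return the moves."""
    tex_g = sum(sum(n["tasks"]) for n in nodes) / sum(n["speed"] for n in nodes)
    sigma = math.sqrt(sum((tex(n) - tex_g) ** 2 for n in nodes) / len(nodes))
    # Assumption: the bounds are not recomputed after each transfer.
    high, low = tex_g + sigma, tex_g - sigma
    budget = sum(len(n["tasks"]) for n in nodes)  # safety bound on transfers
    moves = []
    while budget > 0:
        sources = sorted((n for n in nodes if tex(n) > high and n["tasks"]),
                         key=tex, reverse=True)
        receivers = sorted((n for n in nodes if tex(n) < low), key=tex)
        if not sources or not receivers:
            break
        src, dst = sources[0], receivers[0]
        task = min(src["tasks"])          # shortest task first
        src["tasks"].remove(task)
        dst["tasks"].append(task)
        moves.append((src["name"], dst["name"], task))
        budget -= 1
    return moves


nodes = [
    {"name": "n1", "speed": 10.0, "tasks": [30, 20, 50]},
    {"name": "n2", "speed": 10.0, "tasks": [40]},
    {"name": "n3", "speed": 5.0, "tasks": [10]},
]
moves = intra_cluster_transfer(nodes)
print(moves)  # [('n1', 'n3', 20)]
```

After the single transfer, the processing times are 8, 4 and 6, all within one initial deviation of the group time 6, so the loop stops.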

Heuristic 2: Intra-Grid tasks transferring

- Sort the items of GES by descending order of their processing times.
For every cluster Cj of set GES do
    (i) Sort the clusters Cr of GER by ascending order of inter-cluster (Cj-Cr) WAN bandwidth sizes.
    (ii) Sort the nodes of Cj by descending order of their processing times.
    (iii) While (GES ≠ Φ And GER ≠ Φ) do
        For i = 1 To #(GER) do
            (a) Sort the tasks of the first node belonging to Cj by selection criterion and communication cost,
            (b) Transfer the highest-priority task from the first node of Cj to the i-th cluster of GER,
            (c) Update the current workloads of the source and receiver clusters,
            (d) Update the sets GES, GER and GEN,
            (e) If (GES = Φ OR GER = Φ) then
                    Return
                end If
            (f) Sort GES by descending order of processing times.
        end For
    done
end For


VI. EXPERIMENTAL STUDY

A. Simulation parameters

In order to evaluate the practicability of the proposed model and the performance of our strategy, we have implemented the associated algorithms on the GridSim V4.0 simulator [10], which we extended to support the simulation of varying Grid load balancing problems. GridSim provides a comprehensive facility for the simulation of different classes of heterogeneous resources, users, applications, resource brokers and schedulers. In GridSim, application tasks/jobs are modeled as Gridlet objects that contain all the information related to the job and the execution management details, such as:

1) Resource parameters: These parameters give information about worker nodes, clusters and networks. A node is characterized by its capacity, speed and network bandwidth sizes. Each characteristic is described through a configuration file. Both intra-cluster and inter-cluster network capacities may be defined.

2) Task parameters: These parameters include the number of tasks queued at every node, the task submission date, the number of instructions per task, the cumulative processing time, the cumulative waiting time and so on.

The experiments were based on the variation of several performance parameters in a Grid, namely the number of clusters, their worker nodes and the number of tasks. We focused on the following metrics for a set of tasks submitted during a given period: average waiting time, average execution time and average response time. To evaluate the benefit of our strategy, we compute these metrics before (denoted Bef) and after (Aft) execution of our load balancing algorithm. All the experiments were performed on a 3 GHz Intel Pentium 4 with 1 GB of main memory, running Linux Red Hat 9.0. In order to obtain reliable results, we repeated each experiment more than ten (10) times. For the needs of our experiments, we considered that tasks are distributed periodically and randomly according to a uniform law. We randomly generated node capabilities and task parameters. After many evaluation tests, the thresholds were set to ε = 0.5, δ = 0.8 and ρ = 0.75. As selection criterion, we adopted the shortest process time, which transfers first the task with the shortest remaining processing time. The LAN bandwidth of the clusters was set to 500 MIPS (Million Instructions per Second), and the WAN bandwidth was set between 30 MIPS and 400 MIPS.

B. Results

Experiments 1: Intra-cluster load balancing
In the first set of experiments we studied the objective metrics for various numbers of tasks and worker nodes in intra-cluster load balancing. We varied the number of nodes from 100 to 400 in steps of 100. For each node we randomly generated an associated speed between 5 and 40 MIPS. The number of tasks was varied from 5000 to 10000 in steps of 1000, with sizes randomly generated between 1000 and 200000 MI (Million Instructions). Table I shows the variation of the average response time (in seconds) before and after execution of our load balancing strategy. We can note the following:

• The proposed strategy clearly reduces the mean response time of the tasks. We obtain a gain in more than 95% of cases, varying between 6% and 57%.

• In more than 90% of cases, this improvement is greater than 10%.

• The lowest improvements were obtained when the number of nodes exceeded 300. We attribute this to the instability of the Grid state (nodes are largely underloaded).

• The best gains were realized when the number of nodes was between 200 and 300. In this range the strategy is most effective.

• Due to GridSim limitations, we could not increase the number of tasks over 11000 in order to test the behavior of our strategy for large task numbers.

TABLE I

IMPROVEMENT REALIZED BY INTRA-CLUSTER STRATEGY (AVERAGE RESPONSE TIME IN SECONDS)

# Tasks           # Nodes: 100   200        300        400
5000      Bef     4.78E+05       2.32E+04   4.92E+03   1.75E+03
          Aft     3.32E+05       1.19E+04   3.48E+03   1.64E+03
          Gain    31%            49%        29%        6%
6000      Bef     6.00E+05       3.03E+04   6.44E+03   2.28E+03
          Aft     4.21E+05       1.49E+04   3.81E+03   2.11E+03
          Gain    30%            51%        41%        7%
7000      Bef     7.43E+05       3.89E+04   8.31E+03   2.89E+03
          Aft     5.21E+05       1.94E+04   4.84E+03   2.67E+03
          Gain    30%            50%        42%        8%
8000      Bef     9.06E+05       4.86E+04   1.05E+04   3.63E+03
          Aft     6.46E+05       2.43E+04   5.76E+03   3.04E+03
          Gain    29%            50%        45%        16%
9000      Bef     1.09E+06       6.04E+04   1.32E+04   4.56E+03
          Aft     7.84E+05       3.00E+04   6.69E+03   3.84E+03
          Gain    28%            50%        49%        16%
10000     Bef     1.30E+06       7.42E+04   1.63E+04   5.63E+03
          Aft     9.39E+05       3.93E+04   7.03E+03   4.30E+03
          Gain    28%            47%        57%        24%
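The Gain rows of Table I are consistent with the relative reduction of the average response time, (Bef − Aft) / Bef, rounded to the nearest percent. The paper does not state this formula explicitly, so the sketch below is an assumption checked against the first row of the table; the function name `gain` is hypothetical.

```python
def gain(before: float, after: float) -> float:
    """Relative reduction of the average response time, in percent."""
    return 100.0 * (before - after) / before


# Row for 5000 tasks in Table I, for 100, 200, 300 and 400 nodes.
bef = [4.78e5, 2.32e4, 4.92e3, 1.75e3]
aft = [3.32e5, 1.19e4, 3.48e3, 1.64e3]
gains = [round(gain(b, a)) for b, a in zip(bef, aft)]
print(gains)  # [31, 49, 29, 6]
```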

Experiments 2: Intra-Grid load balancing
In these experiments, we were interested in the behavior of intra-Grid load balancing. We considered different numbers of clusters and assumed that each cluster contains 30 worker nodes. For every node we generated capabilities and task characteristics in the same way as in the first experiments. Figure 3 illustrates the improvement of the mean response time obtained by our load balancing strategy for various numbers of clusters, while varying the number of tasks.

• Except in the case of 16 clusters, all the gains are higher than 10%. We consider this behaviour very promising.

• The best improvements are obtained when the Grid is in a stable state (for # clusters ∈ {4, 6, 8}).

• The lowest gains were obtained when the number of clusters was set to 16. We attribute this to the instability of the Grid state (most nodes are underloaded or even idle).


• In some infrequent cases, we noted that the gains change abruptly. We believe that this comes from the fact that the number of tasks varies suddenly and generates instability.

Fig. 3. Gain according to various number of clusters

VII. CONCLUSION AND FUTURE WORKS

This paper addressed the problem of load balancing in Grid computing. We proposed a load balancing model based on a tree representation of a Grid. The model takes into account the heterogeneity of the resources and is completely independent of any physical Grid architecture. Based on this model, we defined a hierarchical load balancing strategy with two main objectives: (i) reduction of the mean response time of tasks submitted to a Grid; and (ii) reduction of the communication costs during task transfers. Compared with existing works, our strategy uses task-level load balancing and favors, as much as possible, local load balancing to avoid the use of WAN communication. It is a distributed strategy with local decision making. Taking into account the overhead generated by the execution of the load balancing system, our strategy starts a balancing operation only when it determines that doing so is worthwhile. The first results obtained on the GridSim simulator are very promising: we have appreciably improved the average response time with a low communication cost.

In the future, we plan to integrate our load balancing strategy into other known Grid simulators such as SimGrid and HyperSim. This will allow us to measure the effectiveness of our strategy in existing simulators. We also plan to develop our strategy as a service of the GLOBUS middleware [11]. As another perspective, we plan to extend the proposed model to a fully distributed model (removal of the root from the tree structure). Finally, we think that it is important to take application characteristics into account in a balancing strategy. The better these characteristics are known, the better the strategy can be adapted, which suggests tailoring a balancing strategy to a class of applications.

REFERENCES

[1] I. Foster, C. Kesselman, and S. Tuecke, “The anatomy of the grid: Enabling scalable virtual organizations,” International Journal of High Performance Computing Applications, vol. 15, no. 3, 2001.

[2] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, “The data grid: towards an architecture for the distributed management and analysis of large scientific datasets,” Journal of Network and Computer Applications, vol. 23, no. 3, pp. 187–200, 2000.

[3] M. Baker, R. Buyya, and D. Laforenza, “Grids and grid technologies for wide-area distributed computing,” International Journal of Software: Practice and Experience (SPE), vol. 32, no. 15, 2002.

[4] C. Xu and F. Lau, Load Balancing in Parallel Computers: Theory and Practice. Kluwer, Boston, MA, 1997.

[5] H. Johansson and J. Steensland, “A performance characterization of load balancing algorithms for parallel SAMR applications,” Uppsala University, Department of Information Technology, Tech. Rep. 2006-047, 2006.

[6] H. Shan, L. Oliker, R. Biswas, and W. Smith, “Scheduling in heterogeneous grid environments: The effects of data migration,” in Proceedings of ADCOM2004: International Conference on Advanced Computing and Communication, India, December 2004.

[7] J. Cao, D. P. Spooner, S. A. Jarvis, and G. R. Nudd, “Grid load balancing using intelligent agents,” Future Generation Computer Systems, vol. 21, pp. 135–149, 2005.

[8] S. Genaud, A. Giersch, and F. Vivien, “Load balancing scatter operations for grid computing,” in Proceedings of the 12th Heterogeneous Computing Workshop (HCW’2003), Nice, France, April 2003, pp. 101–110.

[9] Y. Hu, R. Blake, and D. Emerson, “An optimal migration algorithm for dynamic load balancing,” Concurrency: Practice and Experience, vol. 10, pp. 467–483, 1998.

[10] R. Buyya. A grid simulation toolkit for resource modelling and application scheduling for parallel and distributed computing. [Online]. Available: http://www.buyya.com/gridsim/

[11] I. Foster, “Globus toolkit version 4: Software for service oriented systems,” in IFIP: International Conference on Network and Parallel Computing, Beijing, China, November 2005, pp. 2–13.