
[IEEE 2009 15th International Conference on Parallel and Distributed Systems - Shenzhen, China (2009.12.8-2009.12.11)]

Market-based Load Balancing for Distributed Heterogeneous Multi-Resource Servers

Chih-Chiang Yang Kun-Ting Chen Chien Chen Jing-Ying Chen

Department of Computer Science National Chiao Tung University, Hsinchu, Taiwan

[email protected], [email protected], {chienchen,jyc}@cs.nctu.edu.tw

Abstract

To cope with rapidly increasing Internet usage nowadays, providing Internet services using multiple servers has become a necessity. To ensure sufficient service quality and server utilization at the same time, effective methods are needed to spread load among servers properly. Existing load balancing methods often assume servers are homogeneous and consider only one type of resource, such as CPU. Such methods suffer from the fact that different requests often demand multiple types of resources with different requirements; trying to balance the usage of only one resource type may induce an inadvertent performance bottleneck, leading to low resource utilization and service quality. To address this problem, we propose a load balancing method based on the concept of distributed market mechanism, where requests are priced with respect to the load of multiple resources on each server. By migrating jobs among servers to balance inter-server load and minimize intra-server job cost at the same time, our method shows significant improvement in terms of load imbalance degrees, server utilization, and response time when compared to other published methods, especially when server heterogeneity increases.

Keywords: Load balancing, multi-resource scheduling, market mechanisms

1. Introduction

In recent years, more and more people rely on the Internet for their daily activities. Consequently, organizations and companies rush to provide their services on-line. Since constantly improving hardware performance no longer suffices to cope with the growing volume of user requests while preserving acceptable service quality, it becomes a common practice to use multiple servers to process user requests simultaneously.

However, when there are multiple servers working at the same time, load balancing becomes an issue. If we cannot spread service requests among servers effectively, some servers may become overloaded while the others remain idle, leading to low server utilization and poor quality of services for the whole system.

Conventional load balancing methods often assume servers are homogeneous and consider only one type of resource, such as CPU. In reality, however, different jobs often demand multiple types of resources with different requirements. Trying to balance the usage of only one resource type may induce an inadvertent performance bottleneck, especially when other, scarcer resources are under contention. Accordingly, an effective load balancing scheme for heterogeneous multi-resource servers needs to recognize both inter-server heterogeneity (different servers may have different resource capacities) and intra-server heterogeneity (each server contains multiple types of resources, such as CPU, memory, and network bandwidth, whose capacities can also vary).

In this paper, we focus on the so-called server-based load balancing architecture [1], where load balancing is performed collaboratively among a set of distributed, heterogeneous, multi-resource servers. We propose a load balancing method which attempts to balance not only the load among servers, but also the usage of multiple resources within each server. Specifically, our method is based on the concept of a market mechanism (MM), in which we treat resource requesters and providers as consumers and suppliers in a commodity market. Requests for resources are priced in such a way that the supply-and-demand rule is maintained. The cost of requesting a resource is directly proportional to the load of that resource on the server, and the cost of executing a job is a weighted sum of the respective resource costs. Based on this pricing model, servers constantly exchange their state information with each other to determine which jobs should be migrated and which servers they should migrate to, so as to balance inter-server load and minimize intra-server job cost at the same time. Simulation showed that, compared with other published methods, our approach improves system performance significantly in terms of load imbalance degrees, server utilization, and request response time.

The rest of this paper is organized as follows. Section 2 reviews related work on the problem of load balancing. Section 3 presents a framework characterizing load balancing methods; Section 4 presents our distributed MM method using the framework. Section 5 discusses the simulation results. Finally, a conclusion and a discussion of future work are given in Section 6.

2009 15th International Conference on Parallel and Distributed Systems

1521-9097/09 $26.00 © 2009 IEEE

DOI 10.1109/ICPADS.2009.96



2. Related work

There has been extensive research on the problem of uneven server load in recent years. [1][2] classify existing distributed load balancing methods according to the entity that distributes the incoming requests among the servers, namely client-based, DNS-based ([3]), dispatcher-based ([4][5][6]), and server-based ([7][8][9]) architectures. A detailed comparison of these four approaches can be found in [1]. For example, in the DNS-based architecture, all servers are geographically distributed, and there is no direct geographical relationship between the DNS server and the (service) servers. To perform load balancing, the DNS server distributes the requests among servers in a round-robin manner. Because the servers send their load status to the DNS server periodically, the DNS server can skip overloaded servers. [3] presents an adaptive time-to-live (TTL) policy in the DNS-based architecture which assigns a different TTL value to each client request based on server availability information as well as client proximity. To resolve uneven domain load distribution, requests coming from popular domains receive lower TTL values.

[10] presents a queuing model for analyzing distributed load balancing, to determine whether it is worthwhile to execute a job across regions. The goal of the algorithm is to re-route traffic from heavily loaded regions to lightly loaded ones, and the result indicates that when the load is heavy and there is a big difference in queue length between the two regions, routing some of the requests may improve the overall system performance. [7] presents a decentralized and distributed algorithm for scheduling tasks and balancing the load of resources in heterogeneous Grid environments. The algorithm takes into consideration the coordination and communication overhead between Grid nodes, where each node is assumed to be an N-resource server with varying resource capacities. The goal is to assign each node a job which would utilize its resources in the best possible manner, thus providing an effective scheduling strategy. One interesting result of the study shows that because the time needed for various communication overheads overlaps with that of executing the jobs already committed to the nodes, the effective scheduling overhead becomes virtually zero.

Multi-resource load balancing has also received increasing attention recently. For example, [4] proposes a heuristic method that selects jobs to correct resource imbalance for the system. The idea is that if all resource usages are kept balanced, more jobs will likely fit into the system, creating a larger backfill candidate pool. In addition, it provides different qualities of service for different classes of users. [5] introduces a heuristic method, called backfill balance, which selects a job based on its overall ability to balance the resource utilization. Specifically, the method uses a load balancing measure, i.e., maximum load / average load, to help select the server that will result in the lowest imbalance degree for a given job. In [8], the method is subsequently extended to handle multiple servers. Where intra-server heterogeneity is concerned, [11][12] discuss the characteristics of the heavy-tailed workload of web service requests, and claim that different-sized jobs should be executed on different servers.

Market-based approaches to resource allocation and management are also abundant in the literature. For example, [13] studies two economic models, namely exchange-based and price-based models, for allocating resources in computer systems. In the exchange-based model, resources are exchanged among agents until the marginal rate of substitution of resources is the same for all agents; the Pareto optimum is achieved when no further mutually beneficial resource exchanges can occur. The price-based economy, on the other hand, prices resources based on the demand, supply, and wealth in the economic system; the goal is to arrive at an equilibrium price vector where demand equals supply. [14] also explores the use of market mechanisms as a way of balancing the computational workload in distributed systems, where consumers and suppliers acquire and sell resources, respectively. Several protocols for the transactions between consumers and suppliers are employed. In the commodity-based scheme, each supplier prices its commodity based on a dynamic pricing policy. The sellers set their prices and may change them at any time depending on the consumers' demand. This model is conceptually closest to ours. In the auction-based scheme, consumers bid for resources according to a particular auction protocol. It allows one to determine a resource's value within a group of bidders. However, this scheme incurs higher communication cost and might not be appropriate for markets in which the demand for resources is low. Grid computing is another major research area where researchers experiment with different kinds of economic models for resource allocation and management [15][16]. This is not surprising, because a grid environment attempts to virtualize heterogeneous, distributed computational resources into commodities for general use, which in turn can enable a global commodity market where people sell resources for profit.

It should be noted that, compared to the more general resource allocation problem addressed by the grid computing community, the load balancing problem addressed in this paper is more focused, with specific goals.

3. Server-based load balancing architecture

In this paper we focus on server-based load balancing methods as described in [1][2]. In this model, all servers need to exchange their state information and cooperate with each other to make the necessary load balancing decisions. A small number of failed servers does not drastically affect the entire system's operation. Compared to the dispatcher-based load balancing architecture, this approach can achieve higher scalability, higher reliability, and better average turnaround time. However, its implementation is more complicated.

As described in [17][18], distributed load balancing methods can be characterized by four policies, namely the information policy, transfer policy, location policy, and selection policy:

Information policy. The information policy determines the kind of state information to be exchanged among servers. Conventional methods usually exchange information about a single resource, such as the CPU load.

Transfer policy. The transfer policy determines the set of servers that need to adjust. The most common approach is threshold-based, which calculates an upper threshold $T_s$ and a lower threshold $T_r$. The upper threshold $T_s$ may be of the form $L_{avg} + d$, where $L_{avg}$ stands for the average resource load of the whole server cluster and $d$ is a designated constant, or $a \cdot L_{avg}$, where $a$ is a constant greater than 1. Likewise, $T_r$ may be of the form $L_{avg} - d$ or $a \cdot L_{avg}$ with $a < 1$. As shown in Fig. 1, if the load of a server is greater than $T_s$, the server is said to be in the sender state; if less than $T_r$, the server is in the receiver state. Otherwise, the server is in the common state. As the names suggest, a server in sender state tends to send some of its jobs to another server with a lower load, while a server in receiver state tends to receive jobs from another server with a higher load. Servers in the common state do not have to do anything. In this paper, we also define a new state, called the exchanger state, to deal with the multi-resource load imbalance issue within each server.

Location policy. The location policy concerns the steps needed to find the target server to which a server in sender state can send jobs, and the source server from which a server in receiver state can receive jobs. A simple heuristic is for a server in sender state to match with the server with the lowest load, and similarly for a server in receiver state to match with the one with the highest load.

Selection policy. Once the server pair is chosen, the selection policy determines which jobs should be sent to or received from the matching server. In this paper, we investigate three popular job scheduling approaches, called latest arrival job, backfill lowest, and backfill balance [5]:

Latest Arrival Job (LAJ): send the latest arriving job from the sending server to the receiving server.

Backfill Lowest (BL): find the resource of the receiving server that is most available, and send the job that demands that resource most from the sending server.

Backfill Balance (BB): send the job which can minimize the (maximum load / average load) measure for the receiving server.
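To make the three policies concrete, the following sketch (helper names are illustrative; equal resource capacities are assumed, as in the paper's homogeneous example) applies each policy to the Fig. 2 example that follows. The middle job's requirements are inferred from the all-balanced result in Fig. 2d:

```python
def latest_arrival_job(jobs):
    # LAJ: jobs are assumed ordered by arrival time; pick the most recent.
    return jobs[-1]

def backfill_lowest(jobs, receiver_load):
    # BL: find the receiver's least-loaded (most available) resource,
    # then pick the job that demands that resource most.
    k = min(range(len(receiver_load)), key=lambda i: receiver_load[i])
    return max(jobs, key=lambda job: job[k])

def backfill_balance(jobs, receiver_load):
    # BB: pick the job minimizing (maximum load / average load)
    # on the receiving server after the transfer.
    def measure(job):
        new = [l + r for l, r in zip(receiver_load, job)]
        return max(new) / (sum(new) / len(new))
    return min(jobs, key=measure)

# Sender's jobs (oldest to latest) and receiver's loads from Fig. 2.
jobs = [(5, 7, 5), (3, 5, 4), (5, 3, 5)]
receiver = [7, 5, 6]
print(latest_arrival_job(jobs))          # (5, 3, 5)
print(backfill_lowest(jobs, receiver))   # (5, 7, 5)
print(backfill_balance(jobs, receiver))  # (3, 5, 4)
```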

As an example, consider Fig. 2, in which there are two servers with 3 jobs and 1 job, respectively, and the loads of their resources are 13, 15, 14 and 7, 5, 6, respectively, before job exchange (Fig. 2a). Assume the server on the left is in sender state and has to send a job to the server on the right. If the LAJ selection policy is used, the latest arriving job (with resource requirements 5, 3, and 5) will be selected (Fig. 2b). If the BL selection policy is used, since the second resource of the receiving server is least busy, the first job (with resource requirements 5, 7, and 5) of the sending server will be chosen, because it demands the second resource most (Fig. 2c). Finally, when the BB selection policy is used, since the measures of executing the three jobs on the receiving server are 1.16, 1.0, and 1.03, respectively, the second job will be selected (Fig. 2d).

[Fig. 1. Sender, receiver, and common states: a server is in the receiver state below Tr, the common state between Tr and Ts, and the sender state above Ts.]

[Fig. 2. Example of selection policies: (a) before job exchange; (b) result of LAJ selection policy; (c) result of BL selection policy; (d) result of BB selection policy.]

4. Distributed MM load balancing method

Although conventional methods can achieve acceptable load balancing among servers, the usage of resources within each server may still be highly imbalanced. This is because once the load among servers becomes sufficiently balanced, the servers enter the common state, and the chance to balance the resource load within each server becomes small. To address this problem, we present a distributed MM load balancing method which adds a new exchanger state to the servers to further improve the load imbalance degree among the resources inside a server.

Specifically, we model each resource request as a consumer's demand, and the available resources, such as the CPU, memory, and network bandwidth of each server, as commodities. We propose a dynamic pricing policy for commodity suppliers to price each request according to a supply-and-demand rule in the commodity market, in which a resource's utilization affects its price.

Below we describe our method in terms of the four policies of distributed load balancing architecture mentioned in the previous section:

Information policy. In our method, servers exchange their entire resource load vectors $L_j^1, \ldots, L_j^K$ with each other.

Transfer policy. We define the load imbalance degree for server $j$ as:

$$B_j = \sum_{k=1}^{K} \left( L_j^k - L_{avg}^k \right) \qquad (1)$$

where $L_{avg}^k$ is the average load of resource $k$ over the whole server cluster. $B_j$ measures the load imbalance degree between servers. Similarly, the absolute load imbalance degree for server $j$ is defined as the sum of the absolute values of the load differences:

$$B_{abs\_j} = \sum_{k=1}^{K} \left| L_j^k - L_{avg}^k \right| \qquad (2)$$

which measures the load imbalance degree between the resources inside server $j$. We then determine the state of a server as follows:

Sender: $B_j > T_s$.

Receiver: $B_j < T_r$.

Exchanger: $T_r \le B_j \le T_s$ and $B_{abs\_j} > T$.

Common: the rest.

$T_s$ and $T_r$ are thresholds that are also commonly used in conventional methods. The difference is that we add an exchanger state, with threshold $T$, for servers to further improve the load balancing of the resources inside a server.

Location policy. For servers in sender or receiver state, the location policy used to find matching server pairs is the same as in conventional methods. For servers in exchanger state, on the other hand, the policy is different:

Assume servers $i$ and $j$ are both in exchanger state. We define the load imbalance degree for the pair before job exchange as:

$$B_{abs\_i} + B_{abs\_j} \qquad (3)$$

and the ideal load imbalance degree after job exchange as:

$$2 \sum_{k=1}^{K} \frac{\left| (L_i^k - L_{avg}^k)\, R_i^k + (L_j^k - L_{avg}^k)\, R_j^k \right|}{R_i^k + R_j^k} \qquad (4)$$

where $R_i^k$ and $R_j^k$ are the capacity ratios of resource $k$ for heterogeneous servers $i$ and $j$. Our goal is to find the pair of servers $i$, $j$ such that their imbalance degree before job exchange minus their ideal load imbalance degree afterwards is the largest. For illustration, consider Fig. 3a, in which there are three servers; the ratio of their capacities is 1:3:4, and their current loads relative to the whole system load are +4%, -4%, and +3%, respectively. Before job exchange, $B_{abs\_1}$, $B_{abs\_2}$, and $B_{abs\_3}$ are 4, 4, and 3, respectively. Without considering server capacity, we might tend to pair server 1 with server 2, because the imbalance degree is 4 + 4 = 8 before job exchange and $2\,|4 \cdot 1 + (-4) \cdot 3| / (1 + 3) = 4$ afterwards (Fig. 3b); hence the improvement in imbalance degree is 8 - 4 = 4. If we pair server 2 and server 3 instead, the imbalance degree is 4 + 3 = 7 before job exchange and becomes $2\,|(-4) \cdot 3 + 3 \cdot 4| / (3 + 4) = 0$ afterwards (Fig. 3c), an improvement of 7.

Selection policy. In the MM model, a job to be executed on a server is associated with a cost, which is directly proportional to the resource load of that server and the job's resource requirements. We define the pricing model as follows. The cost per resource requirement of job $w$ executed on server $j$ is:

$$C_j(w) = \frac{\sum_{k=1}^{K} J_w^k\, L_j^k}{\sum_{k=1}^{K} J_w^k} \qquad (5)$$

where $J_w^k$ is the requirement of resource $k$ for job $w$, $L_j^k$ the load of resource $k$ on server $j$, and $K$ the number of resource types.
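The exchanger-state location policy can be sketched as follows for a single resource type (function names are illustrative; loads are expressed as percentage deviations from the cluster average, as in the Fig. 3 example):

```python
def ideal_imbalance(dev_i, dev_j, cap_i, cap_j):
    # Eq. (4) for one resource type: after an ideal exchange, the pair's
    # combined deviation is shared in proportion to capacity, so both
    # servers end up at the same relative deviation.
    return 2 * abs(dev_i * cap_i + dev_j * cap_j) / (cap_i + cap_j)

def improvement(dev_i, dev_j, cap_i, cap_j):
    # Imbalance before exchange, Eq. (3), minus the ideal imbalance after.
    return (abs(dev_i) + abs(dev_j)) - ideal_imbalance(dev_i, dev_j, cap_i, cap_j)

# Fig. 3: capacity ratios 1:3:4, load deviations +4%, -4%, +3%.
print(improvement(4, -4, 1, 3))   # pairing servers 1 and 2: 8 - 4 = 4.0
print(improvement(-4, 3, 3, 4))   # pairing servers 2 and 3: 7 - 0 = 7.0
```

Pairing servers 2 and 3 yields the larger improvement, matching the discussion above.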

The selection policy is as follows: for each server pair $i$, $j$ selected by the location policy among the servers in sender and receiver states, the job $w$ to be sent from server $i$ to $j$ is the one that results in the minimum $C_j(w)$ for server $j$.

For each server pair $i$, $j$ where both servers are in exchanger state, we want to pick a job from each server for exchange, such that the exchange benefits both servers overall. We define the benefit of transferring a job $w$ from server $i$ to server $j$ as:

$$C_i(w) - C_j(w) \qquad (6)$$

Accordingly, the total benefit of exchanging job $w1$ in server $i$ with job $w2$ in server $j$ becomes:

$$\left( C_i(w1) - C_j(w1) \right) + \left( C_j(w2) - C_i(w2) \right) \qquad (7)$$

Furthermore, the benefit must be larger than a threshold $T$ for the exchange to take place. Without the threshold $T$, two servers in exchanger state might constantly exchange the same pair of jobs back and forth without converging.
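A minimal sketch of the pricing model (Eq. 5) and the exchange benefit (Eq. 7), using the server loads and jobs from the Fig. 4 example discussed next; the second server's load vector is a hypothetical value chosen for illustration:

```python
def cost_per_requirement(job, load):
    # Eq. (5): load-weighted cost of the job's resource requirements,
    # normalized by the job's total requirement.
    return sum(j * l for j, l in zip(job, load)) / sum(job)

def exchange_benefit(w1, w2, load_i, load_j):
    # Eq. (7): benefit of swapping w1 (on server i) with w2 (on server j).
    return ((cost_per_requirement(w1, load_i) - cost_per_requirement(w1, load_j))
            + (cost_per_requirement(w2, load_j) - cost_per_requirement(w2, load_i)))

load_i = [80, 80, 50]          # server loads from the Fig. 4 example
j1, j2 = (1, 1, 5), (2, 2, 1)  # jobs J1 and J2 from the Fig. 4 example
print(round(cost_per_requirement(j1, load_i), 2))  # 58.57 (= 410/7)
print(cost_per_requirement(j2, load_i))            # 74.0  (= 370/5)

load_j = [50, 50, 80]          # hypothetical second server
print(round(exchange_benefit(j2, j1, load_i, load_j), 2))  # 30.86
```

A positive benefit means the swap lowers the per-requirement cost on both sides, so the exchange takes place when the benefit exceeds the threshold $T$.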

The reason we define the cost per resource requirement (5), rather than using the total cost of a job executed on a server, is that the former helps improve the load imbalance degree within a server. Consider the example shown in Fig. 4, where a server has loads of 80%, 80%, and 50% for three kinds of resources, respectively, and jobs J1 and J2 have requirements of 1%, 1%, 5% and 2%, 2%, 1% for these three resources, respectively. To decide which job to execute next on the server, intuitively we would choose J1 over J2, because it improves the load imbalance degree more. However, if we make the decision based on the total cost alone, in this case 410 for J1 and 370 for J2, we may choose J2 instead. This is because the total cost measure is also affected by the total resource requirements. By taking the cost per unit of resource requirement instead, we get 58.57 for J1 and 74 for J2, resulting in the better decision (i.e., J1).

5. Simulation

To investigate the effectiveness of our load balancing method, we compare the following five approaches:

NO: once a job is assigned to a server, it will not be transferred to another server.

LAJ (Latest Arrival Job).

BL (Backfill Lowest).

BB (Backfill Balance).

MM (Market Mechanism).

As also assumed in [4][5][8], a request can be executed on a server only when all its requirements can be met by that server. A job needs three kinds of resources: CPU, memory, and network bandwidth. For each request, the requirements for the resources are generated randomly with uniform distribution:

CPU: 1~3 Gflop (giga floating-point operations).

Memory: 1~3 GB.

Network bandwidth: 10~30 Gb/s.

[Fig. 3. Example of location policy: (a) before job exchange (+4%, -4%, +3%); (b) result of pairing server 1 and server 2 (-2%, -2%, +3%); (c) result of pairing server 2 and server 3 (+4%, 0%, 0%).]

[Fig. 4. Example of cost calculation.]

The execution time of a request is also randomly generated, ranging from 5 to 25 seconds. Job requests arrive at each server following a Poisson process with rate λ = 0.75 requests per second. The arrival rate is adjusted so that server utilization in the worst case (no load balancing) is around 70%. We also consider different degrees of server heterogeneity:

Homogeneous: the resource capacities of each server are 2 Gflops, 2 GB of memory, and 20 Gb/s of network bandwidth.

Heterogeneous (Low): the resource capacities of each server are generated randomly and independently within the following ranges: 2±0.5 Gflops, 2±0.5 GB of memory, and 20±5 Gb/s of network bandwidth.

Heterogeneous (High): similar to the above, but with wider ranges: 2±1 Gflops, 2±1 GB of memory, and 20±10 Gb/s of network bandwidth.
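The workload and server generation just described can be sketched as follows (a hypothetical rendering of the stated distributions; the heterogeneity spread is ±25% of the nominal capacity for the low setting and ±50% for the high setting):

```python
import random

def make_job(rng):
    # Uniformly distributed requirements, as in the simulation setup.
    return {
        "cpu": rng.uniform(1, 3),    # Gflop
        "mem": rng.uniform(1, 3),    # GB
        "bw": rng.uniform(10, 30),   # Gb/s
        "time": rng.uniform(5, 25),  # seconds
    }

def make_server(rng, spread=0.0):
    # spread = 0.0:  homogeneous (2 Gflops, 2 GB, 20 Gb/s);
    # spread = 0.25: low heterogeneity (2±0.5, 2±0.5, 20±5);
    # spread = 0.5:  high heterogeneity (2±1, 2±1, 20±10).
    return {
        "cpu": 2 * (1 + rng.uniform(-spread, spread)),
        "mem": 2 * (1 + rng.uniform(-spread, spread)),
        "bw": 20 * (1 + rng.uniform(-spread, spread)),
    }

def interarrival(rng, rate=0.75):
    # Poisson arrivals: exponentially distributed inter-arrival times.
    return rng.expovariate(rate)
```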

Simulation is conducted for clusters of 2, 4, …, 128 servers, respectively, and for each cluster size we perform 20 runs of simulation and obtain the average results. We use the following four measures to compare different load balancing methods:

Average Standard Deviation of Server Load. The standard deviation of server load among servers, averaged over the 20 runs.

Average Standard Deviation of Resource Load. The average standard deviation of resource load within each server.

Server Utilization. For CPU, we measure the percentage of time the CPU is in the busy state; for memory and network bandwidth, we measure the percentages of occupied capacity. A weighted average of the three utilization measures is then taken.

Average Turnaround Time. The average response time of job requests.
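The four measures can be computed along these lines (a sketch; equal weights are assumed for the utilization average, since the weights are not specified):

```python
import statistics

def avg_stddev_server_load(server_loads_per_run):
    # Std dev of total load across servers, averaged over runs (Figs. 5-7).
    return statistics.mean(statistics.pstdev(run) for run in server_loads_per_run)

def avg_stddev_resource_load(resource_loads_per_server):
    # Std dev of resource loads within each server, averaged over servers
    # (Figs. 8-10).
    return statistics.mean(statistics.pstdev(loads)
                           for loads in resource_loads_per_server)

def server_utilization(cpu_util, mem_util, bw_util, weights=(1/3, 1/3, 1/3)):
    # Weighted average of the three per-resource utilizations (Figs. 11-13).
    return sum(w * u for w, u in zip(weights, (cpu_util, mem_util, bw_util)))

def avg_turnaround_time(response_times):
    # Mean response time of job requests (Figs. 14-16).
    return statistics.mean(response_times)
```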

Fig. 5 shows the standard deviation of server load in the homogeneous environment. The LAJ, BL, BB, and MM methods all achieve slightly better load imbalance degrees than the NO method, but the effect of the choice of selection policy is not obvious. Figs. 6 and 7 show the load imbalance degrees in heterogeneous environments. The results show that increasing resource heterogeneity has a negative impact on load imbalance degrees among servers. Furthermore, the choice of selection policy also affects the load imbalance degrees. In either case, our MM method performs considerably better than the other load balancing methods.

Figs. 8, 9, and 10 show the intra-server load imbalance degrees, which exhibit a trend similar to that of the inter-server load imbalance degrees shown previously.

[Fig. 5, 6, 7. Average standard deviation of server load in homogeneous and heterogeneous (low and high) environments.]

[Fig. 8, 9, 10. Average standard deviation of resource load in homogeneous and heterogeneous environments.]

It is worth noting that the LAJ method has little impact on intra-server load imbalance degrees in all simulations, which is not surprising: unlike the LAJ method, all the other load balancing methods take the loads of the different resources into consideration in their selection policies. When comparing the BL, BB, and MM methods, it is clear that the MM method maintains low load imbalance degrees as resource heterogeneity increases, while the other methods are adversely affected.

One main idea behind the design of the MM method is that, if the load of the resources within a server is balanced, the chance of executing more requests concurrently becomes higher. This idea is confirmed by the resource utilization data shown in Fig. 11, 12, and 13 as resource heterogeneity increases. As the data show, the improvement in server utilization is modest in the homogeneous environment, because server utilization is already high even without load balancing. When resource heterogeneity increases, however, server utilization drops sharply if no load balancing is performed. Overall, our MM method maintains high server utilization (80-85%, comparable to the homogeneous case), especially when the number of servers is larger, while the other methods are affected by resource heterogeneity to varying degrees.
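The packing intuition above can be illustrated with a simple admission check; this is an illustrative sketch under our own assumptions (normalized loads, unit capacity), not the simulator's actual logic:

```python
def fits(server_load, demand, capacity=1.0):
    """A request fits on a server only if EVERY resource stays within
    capacity; one saturated resource blocks admission even when the
    others are idle."""
    return all(l + d <= capacity for l, d in zip(server_load, demand))

# Two servers with the same average load (0.5): the balanced one still
# accepts a moderate request, while the imbalanced one rejects it
# because its CPU is nearly saturated.
balanced   = (0.5, 0.5, 0.5)   # (CPU, memory, bandwidth)
imbalanced = (0.9, 0.3, 0.3)
request    = (0.2, 0.2, 0.2)
print(fits(balanced, request))    # -> True
print(fits(imbalanced, request))  # -> False
```

This is why keeping intra-server load balanced translates into more concurrently running requests at the same average load.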

The improvement in server utilization directly affects system performance. The average turnaround times of the three system configurations are shown in Fig. 14, 15, and 16, indicating that the greater the resource heterogeneity, the longer the turnaround time. Specifically, since the average request processing time is 15 seconds (uniformly distributed from 5 to 25 seconds), our MM method approaches that value in all cases, unlike the other methods, especially when the number of servers increases. This is also confirmed by the low average wait queue length measured for the MM method (not shown here).

Note that our method deals not only with the load imbalance between servers, but also with the intra-server load imbalance; it therefore has more opportunities than BL and BB to balance the system by migrating more tasks to suitable servers. In our simulation, the total number of task migrations for MM is about twice that of BL and BB on average. This is partly due to the simulation parameters we used, such as the load balancing frequency; a more realistic simulation study is underway.

Fig. 11, 12, 13. Average server utilization in homogeneous and heterogeneous environments

Fig. 14, 15, 16. Average turnaround time in homogeneous and heterogeneous environments

6. Conclusion and future work

We have proposed a load balancing method based on the concept of a distributed market mechanism. The method considers both the consumption of multiple resources during job execution and the heterogeneity of resource capacities among servers. In the transfer policy, we add an exchanger state that allows servers under moderate load to exchange their jobs, so that the intra-server load imbalance degrees can be further improved. Specifically, our method detects load imbalance phenomena inside a server and attempts to find the pair of servers whose imbalance degree after a job exchange is minimized. To do so, in the selection policy, a request to be executed on a server is associated with a cost that is directly proportional to the resource load of that server. By assigning higher cost to heavily used resources and selecting the server with the lowest cost for each job to be migrated, our method successfully distributes jobs among servers such that both inter-server and intra-server imbalance degrees are minimized. Simulations showed that the extra effort taken to ensure a low intra-server imbalance degree has a considerable positive effect on system performance, as the chance of packing more jobs into a server increases, resulting in higher overall system utilization under the same request rate. Our simulations also show that resource heterogeneity has a significant impact on system performance, and that existing methods are affected by resource heterogeneity to varying degrees. Still, our MM method maintains high system performance in all cases, which can be safely attributed to the minimization of the intra-server imbalance degree as mentioned above.
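The cost-based selection policy summarized above can be sketched as follows. The specific pricing rule (cost of each demanded resource unit proportional to that resource's current load) and all function names are our illustrative assumptions, not the paper's exact formulation:

```python
def request_cost(demand, load):
    """Price a request on a server: each demanded resource unit costs
    more the more loaded that resource already is, so heavily used
    resources carry higher cost (supply-and-demand pricing)."""
    return sum(d * l for d, l in zip(demand, load))

def select_server(demand, servers):
    """Selection policy: migrate the job to the server where it is
    cheapest, i.e. where its demanded resources are least contended."""
    return min(servers, key=lambda load: request_cost(demand, load))

# A CPU-heavy request (demand over CPU, memory, bandwidth) is steered
# toward the server whose CPU is least loaded, even though that server
# is busier on other resources.
demand  = (0.3, 0.1, 0.1)
servers = [(0.8, 0.2, 0.2),   # busy CPU -> cost 0.28
           (0.2, 0.5, 0.3)]   # idle CPU -> cost 0.14
print(select_server(demand, servers))  # -> (0.2, 0.5, 0.3)
```

Because the cost grows with the load of exactly the resources a job demands, minimizing job cost simultaneously spreads load across servers and evens out the resource mix within each server.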

In future work, we would like to extend the MM method in the context of Web-based systems or cloud computing. In these computing environments, requests tend to be large in volume but short in processing time, and therefore load balancing can no longer be done on a per-request basis. Instead, it is the workload (of various request types) that needs to be balanced dynamically among servers. Another direction we would like to explore is how well the MM method responds when request patterns are heterogeneous and vary over time.

7. References

[1] V. Cardellini, M. Colajanni, and P. S. Yu, “Dynamic Load Balancing on Web-Server Systems”, IEEE Internet Computing, Vol. 3(3), May/June 1999, pp. 28 - 39.

[2] H. Bryhni, E. Klovning, and O. Kure, “A Comparison of Load Balancing Techniques for Scalable Web Servers”, IEEE Network, Vol. 14(4), July/August 2000, pp. 58 - 64.

[3] M. Colajanni and P. S. Yu, “Dynamic Load Balancing in Geographically Distributed Heterogeneous Web Servers”, Proc. of the International Conference on Distributed Computing Systems, May 1998, pp. 295 - 302.

[4] Y.D. Lin, C.M. Tien, S.C. Tsao, R.H. Feng, and Y.C. Lai, “Multiple-resource request scheduling for differentiated QoS at website gateway”, Advanced Information Networking and Applications, Mar. 2008. pp. 433 - 440.

[5] W. Leinberger, G. Karypis, and V. Kumar, “Job Scheduling in the presence of Multiple Resource Requirements”, Proc. of the 1999 ACM/IEEE conference on Supercomputing, Article No. 47, 1999.

[6] M. Harchol-Balter, “Task Assignment with Unknown Duration”, Journal of the ACM, Vol. 49(2), 2002, pp. 260 - 288.

[7] M. Arora, S. K. Das, and R. Biswas, “A De-centralized Scheduling and Load Balancing Algorithm for Heterogeneous Grid Environments”, Proc. of the International Conference on Parallel Processing Workshops, 2002, pp. 499 - 505.

[8] W. Leinberger, G. Karypis, and V. Kumar, “Load Balancing Across Near-Homogeneous Multi-Resource Servers”, Proc. of Heterogeneous Computing Workshop, Aug. 2000, pp. 60 - 74.

[9] M. Aramudhan and V. R. Uthariaraj, “LDMA: Load Balancing Using Decentralized Decision Making Mobile Agents”, Lecture Notes in Computer Science, Vol. 3994, 2006, pp. 388 - 395.

[10] Z. Zhang and W. Fan, “Web server load balancing: A queueing analysis”, European Journal of Operational Research, Vol. 186(2), pp. 681 - 693.

[11] M. E. Crovella, M. H. Balter, and C. D. Murta, “Task Assignment in a Distributed System: Improving Performance by Unbalancing Load”, Proc. of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems, Oct. 1998, pp. 268 - 269.

[12] G. Ciardo, A. Riska, and E. Smirni, “EquiLoad: a load balancing policy for clustered web servers”, Performance Evaluation, Vol. 46, 2001, pp. 46 - 101.

[13] D. F. Ferguson, C. Nikolaou, J. Sairamesh, Y. Yemini, “Economic Models for Allocating Resources in Computer Systems”, Market-based control: a paradigm for distributed resource allocation, 1996, pp. 156 - 183.

[14] J. Gomoluch and M. Schroeder, “Flexible Load Balancing in Distributed Information Agent Systems”, Proc. of the ECCAI-ACAI/EASSS, AEMAS, HoloMAS on Multi-Agent-Systems and Applications, 2002, pp. 188 - 197.

[15] R. Buyya, "Economic-based Distributed Resource Management and Scheduling for Grid Computing", PhD dissertation, Monash University, 2002.

[16] R. Wolski, J. S. Plank, J. Brevik, and T. Bryan, “Grid resource allocation and control using computational economies”, in Grid Computing: Making The Global Infrastructure a Reality (Ed. F. Berman, et al.), John Wiley & Sons, 2003.

[17] D.L. Eager, E.D. Lazowska, and J. Zahorjan, “A Comparison of Receiver-Initiated and Sender-Initiated Adaptive Load Sharing,” Proc. of the 1985 ACM SIGMETRICS conference on Measurement and modeling of computer systems, Oct. 1985, pp. 1 - 3.

[18] N.G. Shivaratri, P. Krueger, and M. Singhal, “Load Distributing for Locally Distributed Systems”, Computer, Vol. 25, December, 1992, pp. 33 – 44.

[19] T. Osogami, M. Harchol-Balter, A. Scheller-Wolf, and L. Zhang, “Exploring Threshold-based Policies for Load Sharing”, ACM SIGMETRICS Performance Evaluation Review, Vol. 33(2), Sep. 2005, pp. 36 - 38.
