8/2/2019 APSCC206
1/6
Hybrid Genetic Algorithm for Cloud Computing Applications
Kai Zhu, Huaguang Song, Lijing Liu, Jinzhu Gao, Guojian Cheng
School of Engineering and Computer Science
University of the Pacific, Stockton, CA 95211
Email: kzhu, hsong, [email protected]
School of Computer Science
Xi'an Shiyou University, Dian Zi 2nd Road 18, Xi'an, Shaanxi P.R. China 710065
Email: [email protected], [email protected]
Abstract—In a cloud computing system, the scheduling of computing resources is a critical part of cloud computing study. An effective load balancing strategy can markedly improve the task throughput of cloud computing. Virtual machines are selected as the fundamental processing unit of cloud computing. Because of virtualization technology, the resources in cloud computing increase sharply and vary dynamically, so implementing load balancing in cloud computing has become complicated and difficult to achieve. The multi-agent genetic algorithm (MAGA) is a hybrid GA whose performance is far superior to that of the traditional GA. This paper demonstrates the advantage of MAGA over traditional GA, and then exploits MAGA to solve the load balancing problem in cloud computing by designing a load balancing model on the basis of virtualization resource management. Finally, by comparing MAGA with the Min_min strategy, the experimental results prove that MAGA achieves better load balancing performance.
Keywords-cloud computing, load balance, multi-agent genetic
algorithm, virtualization technology
I. INTRODUCTION

Cloud computing is an inevitable trend in the future development of computing technology. Its critical significance lies in its ability to provide all users with high-performance and reliable computation. Cloud computing is the evolution of distributed computing, grid computing, and multiple other techniques. One of the primary differences between cloud computing and previous large-scale cluster computing lies in the fundamental processing unit: in cloud computing, by using virtualization technology [8], one physical host can be virtualized into multiple virtual hosts, and these virtual hosts serve as the basic computing units. By adopting virtualization technology, cloud computing in tandem with conventional cluster computing greatly improves hardware utilization and also achieves automatic monitoring of all hosts. Virtualization technology has not only brought a lot of convenience to cloud computing, but has also made a large number of virtual resources available in the cloud. The quantity of these virtual resources is both enormous and dynamically changing. Therefore, load balancing of the hosts in cloud computing is one of the primary research concerns.
Proposed by Professor J. H. Holland of the University of Michigan in the early 1960s, GA was the first evolutionary computation algorithm [7], extracting, simplifying, and abstracting the basic ideas of Darwin's theory of evolution and Mendel's laws of inheritance. Using the evolutionary theory of the biosphere as a reference, the algorithm uses computers to simulate the natural selection mechanism of parent gene recombination and "survival of the fittest" in the process of species reproduction. It can be exploited to solve complicated problems in science and engineering.
In recent years, research on and application of GA have developed rapidly and been widely utilized. GA has had a remarkable impact on many growing fields, such as artificial intelligence, knowledge discovery, pattern recognition, image processing, decision analysis, product process design, resource scheduling, and stock market analysis. However, some restrictive conditions in solving high-dimensional function optimization problems render GA less effective in cloud computing. When using classic GA to solve coarse-grained, high-dimensional, large-data-set optimization problems, issues like imperfect convergence, slow convergence, and non-convergence are inevitable. Therefore, scholars have proposed a variety of improved genetic algorithms.
This paper mainly focuses on the multi-agent genetic algorithm (MAGA) [10], a hybrid algorithm combining GA and multi-agent techniques that was originally proposed by Professor Licheng Jiao. MAGA is an improved hybrid GA; in execution it demonstrates greatly improved convergence time and optimization results compared to traditional GA. MAGA shows obvious superiority, especially when handling very large-scale, high-dimensional, complex, and dynamic optimization problems.
Therefore, this paper first introduces the strengths of MAGA by comparing it with GA. Then, a load balancing model for virtualized cloud computing is built and converted into a mathematical problem. Finally, we use MAGA to solve the load balancing problem and compare the results with the common Min_min algorithm [11].
II. RELATED WORK

The basic principle of cloud computing is to distribute computing tasks across a large number of distributed computers rather than the local computer. Hu et al. [1] proposed a GA-based scheduling strategy for VM load balancing in cloud computing, which can effectively improve overall system reliability and availability, one of the primary concerns in cloud computing. Gong et al. [2] analyzed the features of cloud computing, explored the applications of virtualization technology stemming from the advanced virtual host, and then applied virtualization technology to resource management and virtualized storage.
In recent years, artificial intelligence methods such as evolutionary computation, and especially its branch of genetic algorithms, have gradually drawn attention due to their intelligence and implicit parallelism [13]. GA has been widely applied to the problem of resource scheduling in large-scale, nonlinear cluster systems, and has achieved ideal effects [3].
The core of resource scheduling technology lies in the scheduling algorithm. At present, numerous algorithms exist for cluster resource scheduling, such as the round-robin algorithm, least connection scheduling, the minimum-number-of-tasks algorithm, and the minimum response time algorithm [4]. Later, other scholars proposed a series of dynamic algorithms, such as the resource scheduling algorithm based on task priority, the dynamically weighted resource scheduling algorithm, and the queue resource scheduling algorithm [5]. Some scholars have also applied AI methods to resource scheduling of large-scale cluster systems, such as particle swarm optimization (PSO) [9] and the genetic algorithm (GA) [6]. Experiments show that these artificial intelligence methods can achieve better load balancing than traditional approaches.
III. MULTI-AGENT GENETIC ALGORITHM

A. Agent Genetic Algorithm

From the agent perspective, an individual within GA can be treated as an agent. This agent is capable of local perception, competition, cooperation, and self-learning, and it reaches the goal of global optimization through the interactions between agent and environment and between agent and agent. This is the idea of MAGA [10]. The implementation mechanism of MAGA is quite different from GA's, and is mainly manifested in the interaction, collaboration, and self-learning among individuals.
B. Individual Survival Environment

Like GA, MAGA still conducts manipulations on individuals. In MAGA, each individual is considered an agent, capable of sensing, changing, and impacting its environment autonomously, thus possessing its own characteristics. All of the agents live in the agent grid environment, as shown in Figure 1 below:
Figure 1. Agent Grid (an Lsize × Lsize lattice of agents)
C. Genetic Operators

In MAGA, the genetic operators mainly include the neighborhood competition operator, the neighborhood orthogonal crossover operator, the mutation operator, and the self-learning operator. Among these, the neighborhood competition operator realizes competition among agents; the neighborhood orthogonal crossover operator achieves collaboration among agents; and the mutation and self-learning operators accomplish the behavior of agents exploiting their own knowledge [10].
D. Comparison between MAGA and GA

Table 1 illustrates the distinctions in genetic operation between GA and MAGA.

                          GA                       MAGA
Individual                Isomorphic               Isomerous
Information interaction   After selection,         Obtains information from
method                    through crossover        four neighbors and
                          operation                self-updates
Genetic operators         Selection, crossover,    Neighborhood competition,
                          mutation                 orthogonal crossover,
                                                   mutation, self-learning
Self-learning             No                       Yes
Evolution                 Evolves without purpose  Evolves with purpose
Competition               Roulette selection       Interaction with
                                                   neighborhood

Table 1. The Operation Difference
To compare the performance of MAGA and GA, consider a function optimization example. The optimization function is shown below, with n = 20:
f(x) = Σ_{i=1}^{n} x_i · sin(√|x_i|),  x ∈ S = [-500, 500]^n    (1)
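The benchmark in Eq. (1) can be evaluated with a short script. Note that the √|x_i| argument of the sine is an assumption taken from the standard Schwefel benchmark, since the scanned formula is partly illegible:

```python
import math

def schwefel(x):
    # Eq. (1): f(x) = sum_i x_i * sin(sqrt(|x_i|)), each x_i in [-500, 500].
    # The sqrt(|x_i|) term is assumed from the standard Schwefel function.
    return sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)

print(schwefel([0.0] * 20))  # → 0.0
```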
Figure 2 shows the experimental results, comparing the optimal function values from ten runs of MAGA and GA.
Figure 2. The Performance Difference
On the basis of Figure 2, the optimization results achieved by MAGA are far superior to those of the traditional GA.

IV. THE ESTABLISHMENT OF THE LOAD BALANCING MODEL
The main parameters required from a single user group include: User (ReqPerHrPerUser, ReqSize, ReqCPU, ReqMemory, Count). Among them, ReqPerHrPerUser refers to the average number of online users per hour in a user group. ReqSize represents the size of the request sent by each user in the user group. ReqCPU indicates the amount of CPU needed to execute the request, relative to a 2.4 GHz single-core CPU; the unit is a percentage. ReqMemory means the size of the memory (in MB) consumed to execute the request. Count indicates the number of requests sent per minute.

In order to solve the issue of exploding dimensionality, a group strategy is exploited to set up the resource scheduling model. The group strategy is based on the user request parameters, and each parameter of a group has a preset maximum value. The sum of all users' parameters inside each group may not exceed the maximum value set for the group. According to the user request time sequence, we divide all user requests satisfying first arrival and the maximum-value constraint into one group, and reset all the parameters inside the group. The resetting rule is shown in the following formula:
Group_j = (1/n) Σ_{i=1}^{n} User_i    (2)
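As a rough illustration, the group strategy described above can be sketched as a greedy packing pass over requests in arrival order; the field names and cap parameters below are hypothetical, since the paper does not give concrete values:

```python
def group_requests(users, max_cpu, max_mem):
    """Greedily pack user requests (in arrival order) into groups whose
    summed CPU and memory demands stay within the group maxima."""
    groups, current = [], []
    cpu = mem = 0.0
    for u in users:
        # Close the current group when adding this request would exceed a cap.
        if current and (cpu + u["cpu"] > max_cpu or mem + u["mem"] > max_mem):
            groups.append(current)
            current, cpu, mem = [], 0.0, 0.0
        current.append(u)
        cpu += u["cpu"]
        mem += u["mem"]
    if current:
        groups.append(current)
    return groups
```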
A. Establishing the Load Balancing Model

The establishment of the load balancing model mainly refers to the design of the fitness function. On the basis of the group strategy, the VM (virtual machine) virtual resources on a physical host correspond to the grouped user requests. One host contains several VMs, and each VM can be allocated several groups. Each group can be described as: Group (ReqPerHrPerUser, ReqSize, ReqCPU, ReqMemory, Count).
The ReqMemory parameter within Group is the average size of memory consumed to execute each request. Therefore, after completing a group of tasks, the memory load of the ith VM, VM_i, is:

Ml_i = M_i + Vmp_i / Vm_i × 100%    (3)

In this formula, Vmp_i and Vm_i are constants, and M_i is the remaining memory percentage of VM_i before executing the tasks.
The ReqCPU parameter within Group is the average CPU consumption needed to execute each request. Therefore, after completing a group of tasks, the CPU load of the ith VM, VM_i, is:

Cl_i = C_i + Vmc_i / Vc_i × 100%    (4)

In this formula, Vmc_i and Vc_i are constants, and C_i is the remaining CPU percentage before executing the tasks.
On the basis of the memory and CPU consumption, the overall load Vl_i on VM_i can be calculated according to the following formula:

Vl_i = w · Ml_i + v · Cl_i    (5)

In this formula, w and v are weighting factors satisfying w + v = 1. Thus, the overall load Hl_j on the jth host is:

Hl_j = Σ_{i=0}^{m_j} Vl_{ji} = Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})    (6)
In this formula, Vl_{ji} represents the load of the ith VM on the jth host, Ml_{ji} indicates the memory load of the ith VM on the jth host, and Cl_{ji} is the CPU load of the ith VM on the jth host. m_j represents the number of VMs activated on the jth host's physical machine. The average load El of all of the hosts within the data center is calculated as:

El = (1/o) Σ_{j=0}^{o} Hl_j = (1/o) Σ_{j=0}^{o} Σ_{i=0}^{m_j} Vl_{ji} = (1/o) Σ_{j=0}^{o} Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})    (7)
In this formula, o is the number of physical host resources, and m_j represents the number of VM resources on the jth host. The load difference between each host and the system average load is |Hl_j − El|. Therefore, the fitness function can be set as:

f = Σ_{j=0}^{o} |Hl_j − El|    (8)
Restriction condition:

Ml_{ji} < 1  and  Cl_{ji} < 1    (9)

The goal is to make the function f as small as possible under the restriction condition.
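A minimal sketch of the fitness computation in Eqs. (3)-(8), assuming the reconstructed formulas above; the variable names are illustrative, not from the paper:

```python
def vm_load(m_free, c_free, vmp, vm, vmc, vc, w=0.5, v=0.5):
    ml = m_free + vmp / vm   # Eq. (3): memory load after the group of tasks
    cl = c_free + vmc / vc   # Eq. (4): CPU load after the group of tasks
    return w * ml + v * cl   # Eq. (5): weighted overall VM load, w + v = 1

def fitness(host_loads):
    # Eq. (7): average host load; Eq. (8): sum of |Hl_j - El| (to minimize).
    el = sum(host_loads) / len(host_loads)
    return sum(abs(hl - el) for hl in host_loads)
```

A fitness of zero means every host carries exactly the average load, i.e. perfect balance.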
B. Encoding

There are several encoding methods in GA, such as one-dimensional encoding, multidimensional encoding, binary encoding, decimal encoding, and floating-point encoding. All of these encoding approaches are also suitable for MAGA. For the sake of convenient operation, we exploit binary encoding [12], which is the simplest and most commonly used.
Suppose we have 10 user groups {Group_0, Group_1, …, Group_9}, and simultaneously have 30 VMs {VM_0, VM_1, …, VM_29}. Each user group is treated as one dimension. These ten dimensions are set as {x_0, x_1, …, x_9}, respectively, where x_i corresponds to Group_i. In that way, there are 30 alternative VMs available for each x_i, and since 30 < 2^5, we take each x_i's encoding length as five, with 00001 → VM_0, 00010 → VM_1, …, 11110 → VM_29. Adopting this binary multidimensional encoding, the initial encoding for the entire system is {00000, 00000, …, 00000}. A possible solution for a system with 10 user groups and 30 VM resources is as follows:

{00001, 00100, 10010, 00110, 10010, 11000, 11100, 00010, 01000, 10000}
Therefore, for a system with n user groups and M VM virtual resources, the number of solution dimensions is n, and the encoding length of each dimension of an individual is ⌈log2 M⌉.
V. ALGORITHM PROCEDURE

The algorithm execution flow is as follows:

Step 1: Randomly generate Lsize^2 agents to initialize L_0, update Best_0, and set t ← 0.
Step 2: Execute the neighborhood competition operator on each agent in L_t to obtain L_{t+1/3}.
Step 3: If U(0, 1) < Pc, apply the neighborhood orthogonal crossover operator to each agent in L_{t+1/3} to generate L_{t+2/3}.
Step 4: If U(0, 1) < Pm, apply the mutation operator to each agent in L_{t+2/3} to obtain L_{t+1}.
Step 5: Determine CBest_{t+1} from L_{t+1} and apply the self-learning operator to CBest_{t+1}.
Step 6: If Energy(CBest_{t+1}) > Energy(Best_t), then Best_{t+1} ← CBest_{t+1}; otherwise, Best_{t+1} ← Best_t and CBest_{t+1} ← Best_t.
Step 7: If the termination condition is met, output Best_t and terminate; otherwise, set t ← t + 1 and resume at Step 2.

L_t represents the tth generation agent grid, and L_{t+1/3} and L_{t+2/3} are the intermediate generations between L_t and L_{t+1}. Best_t is the optimal agent among L_0, L_1, …, L_t, and CBest_t is the optimal agent in L_t. The parameters Pc and Pm are preset, and represent the execution probabilities of the neighborhood orthogonal crossover operator and the mutation operator, respectively.
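A control-flow sketch of Steps 1-7, with the operator bodies passed in as functions; only the loop structure follows the procedure above, and the operators themselves are placeholders for the paper's implementations:

```python
import random

def maga(init_grid, compete, crossover, mutate, self_learn, energy,
         pc=0.1, pm=0.1, generations=50):
    grid = init_grid()                     # Step 1: random Lsize^2 agents (L_0)
    best = max(grid, key=energy)           # Best_0
    for _ in range(generations):
        grid = [compete(a, grid) for a in grid]        # Step 2: L_{t+1/3}
        if random.random() < pc:                       # Step 3: L_{t+2/3}
            grid = [crossover(a, grid) for a in grid]
        if random.random() < pm:                       # Step 4: L_{t+1}
            grid = [mutate(a) for a in grid]
        idx = max(range(len(grid)), key=lambda i: energy(grid[i]))
        cbest = self_learn(grid[idx])                  # Step 5: self-learning
        if energy(cbest) > energy(best):               # Step 6: elitism
            best = cbest
        else:
            cbest = best                               # CBest_{t+1} <- Best_t
        grid[idx] = cbest
    return best                                        # Step 7: output Best
```

For a toy maximization of energy(x) = -|x - 3|, one might pass compete = lambda a, g: max(a, random.choice(g), key=energy) and similar one-liners for the other operators.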
VI. SIMULATION EXPERIMENT RESULTS AND ANALYSIS

In the experiment, the parameters of MAGA are set as follows: Lsize = 5 is the agent grid size (population size); Po = 0.25 is the occupation probability of the neighborhood competition operator; Pc = 0.1 is the execution probability of the neighborhood orthogonal crossover operator; Pm = 0.1 is the execution probability of the mutation operator; in self-learning, sLsize = 2 is the population size, the search radius is 0.2, sPm = 0.05 is the mutation rate, and the number of iterations is 10.

The experiment is divided into three parts. Each part applies Min_min scheduling and MAGA scheduling respectively, and then compares and analyzes the utilization rates of CPU and memory.

In the first part of the experiment, a total of 20 heterogeneous VMs are assigned to 10 hosts. The number of user groups is 100, with weighting factors w = 0.5 and v = 0.5. In the second part, we adjust the weighting factors to w = 0.01 and v = 0.99. The last part of the experiment tests the single-point failure rate, with weighting factors w = 0.01 and v = 0.99. Figures 3-6 show the first part of the experiment.
Figure 3. Sampling Result of CPU by Using Min_min
Figure 4. Sampling Result of CPU by Using MAGA
Figure 5. Sampling Result of Memory by Using Min_min
Figure 6. Sampling Result of Memory by Using MAGA
When w = 0.5 and v = 0.5, MAGA has a significant advantage over the Min_min algorithm in load balancing of CPU utilization, but not in load balancing of RAM usage. For this reason, we adjust the weights to w = 0.01 and v = 0.99 in part 2. Figures 7 and 8 show the sampling results of CPU and memory usage using MAGA.
Figure 7. Sampling Result of CPU by Using MAGA
Figure 8. Sampling Result of Memory by Using MAGA
From Figures 7 and 8 we can see that MAGA achieves effective load balancing of both CPU and memory usage when the weighting factors are w = 0.01 and v = 0.99, and its degree of load balancing is still better than that of the Min_min algorithm. In practical applications, we usually either focus on CPU utilization and disregard memory usage, or consider only memory utilization and disregard CPU utilization. We can therefore adjust the corresponding parameter values according to the actual situation to achieve the desired load balancing state.
Figure 9 compares the single-point failure rates of the two algorithms when the weighting factors are w = 0.01 and v = 0.99. It can be seen from Figure 9 that the single-point failure rate of MAGA is much smaller than that of the Min_min algorithm.
Figure 9. Number of Single-Point of Failure
High-performance cloud computing mainly considers the efficiency of requests and can ignore the influence of memory, in which case the value of w can be set larger. Some cloud computing systems do not need high computing power but consume a large amount of memory; in that case, the value of v should be larger.
VII. CONCLUSION

This paper experimentally shows that MAGA is more appropriate than GA for handling high-dimensional function optimization problems. After establishing a cloud computing load balancing model, the Min_min and MAGA algorithms were applied for resource scheduling respectively. By adjusting the parameters, the scheduling results show that both CPU utilization and memory load balancing for MAGA are much better on average than for Min_min scheduling, and a comprehensive load balancing effect can be achieved by adjusting the weighting factors. Moreover, the MAGA scheduling algorithm results in a smaller single-point failure rate. This shows that this method for solving the load balancing strategy in virtualized cloud computing is feasible and effective.
REFERENCES

[1] Jinhua Hu, Jianhua Gu, Guofei Sun, and Tianhai Zhao, "A Scheduling Strategy on Load Balancing of Virtual Machine Resources in Cloud Computing Environment," in Proc. 3rd Int. Symp. on Parallel Architectures, Algorithms and Programming (PAAP), Dec. 2010, pp. 89-96.
[2] Chunye Gong, Jie Liu, Qiang Zhang, Haitao Chen, and Zhenghu Gong, "The Characteristics of Cloud Computing," in Proc. 39th Int. Conf. on Parallel Processing Workshops (ICPPW), Sept. 2010, pp. 275-279.
[3] Zhongni Zheng, Rui Wang, Hai Zhong, and Xuejie Zhang, "An approach for cloud resource scheduling based on Parallel Genetic Algorithm," in Proc. 3rd Int. Conf. on Computer Research and Development (ICCRD), March 2011, vol. 2, pp. 444-447.
[4] S. Shirero, M. Takashi, and H. Kei, "On the schedulability conditions on partial time slots," in Proc. 6th Int. Conf. on Real-Time Computing Systems and Applications (RTCSA '99), 1999, pp. 166-173.
[5] V. Kant Soni, R. Sharma, and M. Kumar Mishra, "An analysis of various job scheduling strategies in grid computing," in Proc. 2nd Int. Conf. on Signal Processing Systems (ICSPS), July 2010, vol. 2, pp. V2-162-V2-166.
[6] G. Alizadeh, M. Baradarannia, P. Yazdizadeh, and Y. Alipouri, "Serial configuration of genetic algorithm and particle swarm optimization to increase the convergence speed and accuracy," in Proc. 10th Int. Conf. on Intelligent Systems Design and Applications (ISDA), Nov.-Dec. 2010, pp. 272-277.
[7] John H. Holland, Adaptation in Natural and Artificial Systems. Cambridge, MA: MIT Press, 1992.
[8] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield, "Xen and the art of virtualization," in Proc. 19th ACM Symposium on Operating Systems Principles (SOSP '03), 2003, pp. 164-177.
[9] K. Deb and H. G. Beyer, "Self-adaptive genetic algorithms with simulated binary crossover," Evolutionary Computation, vol. 9, no. 2, pp. 197-221, 2001.
[10] Licheng Jiao, Jing Liu, and Weicai Zhong, Coevolutionary Computation and Multi-agent Systems. Beijing: Science Press, Aug. 2006.
[11] Yuxia Du, Fangai Liu, and Lei Guo, "Research and improvement of Min-Min scheduling algorithm," Computer Engineering and Applications, vol. 46, no. 24, pp. 107-109, 2010.
[12] R. Caruana and J. D. Schaffer, "Representation and hidden bias: Gray vs. binary coding for genetic algorithms," in Proc. 5th Int. Conf. on Machine Learning (ICML), 1988, pp. 153-161.
[13] Baowen Xu, Yu Guan, Zhenqiang Chen, and K. R. P. H. Leung, "Parallel genetic algorithms with schema migration," in Proc. 26th Annual Int. Computer Software and Applications Conference (COMPSAC), 2002, pp. 879-884.