8/2/2019 APSCC206
1/6
Hybrid Genetic Algorithm for Cloud Computing Applications
Kai Zhu, Huaguang Song, Lijing Liu, Jinzhu Gao, Guojian Cheng
School of Engineering and Computer Science
University of the Pacific, Stockton, CA 95211
Email: kzhu, hsong, [email protected]
School of Computer Science
Xi'an Shiyou University, Dian Zi 2nd Road 18, Xi'an, Shaanxi P.R. China 710065
Email: [email protected], [email protected]
Abstract—In a cloud computing system, the scheduling of computing resources is a critical part of cloud computing study. An effective load balancing strategy can markedly improve the task throughput of cloud computing. Virtual machines are selected as the fundamental processing unit of cloud computing. Because of virtualization technology, the resources in cloud computing increase sharply and vary dynamically, so implementing load balancing in cloud computing has become complicated and difficult to achieve. The multi-agent genetic algorithm (MAGA) is a hybrid GA whose performance is far superior to that of the traditional GA. This paper demonstrates the advantage of MAGA over traditional GA, and then exploits MAGA to solve the load balancing problem in cloud computing by designing a load balancing model on the basis of virtualization resource management. Finally, by comparing MAGA with the Min_min strategy, the experimental results prove that MAGA achieves better load balancing performance.
Keywords-cloud computing, load balance, multi-agent genetic
algorithm, virtualization technology
I. INTRODUCTION

Cloud computing is an inevitable trend in the future development of computing technology. Its critical significance lies in its ability to provide all users with high-performance and reliable computation. Cloud computing is the evolution of distributed computing, grid computing, and multiple other techniques. One of the primary differences between cloud computing and previous large-scale cluster computing lies in the fundamental processing unit: in cloud computing, by using virtualization technology [8], one physical host can be virtualized into multiple virtual hosts, and these virtual hosts serve as the basic computing units. By adopting virtualization technology, cloud computing in tandem with conventional cluster computing greatly improves hardware utilization and also achieves automatic monitoring of all hosts. Virtualization technology has not only brought a lot of convenience to cloud computing, but has also made a large number of virtual resources available in the cloud. The quantity of these virtual resources is both enormous and dynamically changing. Therefore, load balancing of the hosts in cloud computing is one of the primary research concerns.
Proposed by Professor J. H. Holland of the University of Michigan in the early 1960s, GA was the first evolutionary computation algorithm [7], extracting, simplifying, and abstracting the basic ideas of Darwin's theory of evolution and Mendel's laws of inheritance. Using the evolutionary theory of the biosphere as a reference, the algorithm uses computers to simulate the natural selection mechanism of parent gene recombination and "survival of the fittest" in the process of species reproduction. It can be exploited to solve complicated problems in science and engineering.
In recent years, research on and application of GA have developed rapidly and been widely utilized. GA has had a remarkable impact on many growing fields, such as artificial intelligence, knowledge discovery, pattern recognition, image processing, decision analysis, product process design, resource scheduling, and stock market analysis. However, some restrictive conditions in solving high-dimensional function optimization problems render GA less effective in cloud computing. When using classic GA to solve coarse-grained, high-dimensional, large-data-set optimization problems, issues like imperfect convergence, slow convergence, and non-convergence are inevitable. Therefore, scholars have proposed a variety of improved genetic algorithms.
This paper mainly focuses on the multi-agent genetic algorithm (MAGA) [10], a hybrid algorithm combining GA and multi-agent techniques that was originally proposed by Professor Licheng Jiao. MAGA is an improved hybrid GA; in execution it demonstrates greatly improved convergence time and optimization results compared to traditional GA. MAGA shows obvious superiority, especially when handling very large-scale, high-dimensional, complex, and dynamic optimization problems.
Therefore, this paper first introduces the strengths of MAGA by comparing it with GA. Then, a load balancing model for virtualized cloud computing is built and converted into a mathematical problem. Finally, we use MAGA to solve the load balancing problem and compare the results with the common Min_min algorithm [11].
II. RELATED WORK

The basic principle of cloud computing is to distribute computing tasks across a large number of distributed computers rather than the local computer. Hu et al. [1] proposed a GA-based scheduling strategy for VM load balancing in cloud computing, which can effectively improve overall system reliability and availability, one of the primary concerns in cloud computing. Gong et al. [2] analyzed the features of cloud computing, explored the applications of virtualization technology stemming from the advanced virtual host, and then applied virtualization technology to resource management and virtualized storage.
In recent years, artificial intelligence methods such as evolutionary computation, and especially its branch of genetic algorithms, have gradually drawn attention due to their intelligence and implicit parallelism [13]. GA has been widely applied to the problem of resource scheduling in large-scale, nonlinear cluster systems, and has achieved ideal effects [3].
The core of resource scheduling technology lies in the scheduling algorithm. At present, numerous algorithms exist for cluster resource scheduling, such as the round-robin algorithm, least connection scheduling, the minimum-number-of-tasks algorithm, and the minimum response time algorithm [4]. Later, other scholars proposed a series of dynamic algorithms, such as the resource scheduling algorithm based on task priority, the dynamically weighted resource scheduling algorithm, and the queue resource scheduling algorithm [5]. Some scholars have also applied AI methods to resource scheduling of large-scale cluster systems, such as particle swarm optimization (PSO) [9] and the genetic algorithm (GA) [6]. Experiments show that these artificial intelligence methods can achieve better load balancing than traditional approaches.
III. MULTI-AGENT GENETIC ALGORITHM

A. Agent Genetic Algorithm

From the agent perspective, an individual within GA can be treated as an agent. This agent is capable of local perception, competition, cooperation, and self-learning, and it reaches the goal of global optimization through the interactions between agent and environment and between agent and agent. This is the idea of MAGA [10]. The implementation mechanism of MAGA is quite different from GA's, and is mainly manifested in the interaction, collaboration, and self-learning among individuals.
B. Individual Survival Environment

Like GA, MAGA still conducts manipulations on individuals. In MAGA, each individual is considered an agent, capable of sensing, changing, and impacting its environment autonomously, thus possessing its own characteristics. All of the agents live in the agent grid environment, as shown in Figure 1 below:
Figure 1. Agent Grid (an Lsize × Lsize lattice of agents)
C. Genetic Operators

In MAGA, the genetic operators mainly include the neighborhood competition operator, the neighborhood orthogonal crossover operator, the mutation operator, and the self-learning operator. Among these, the neighborhood competition operator realizes competition among agents; the neighborhood orthogonal crossover operator achieves collaboration among agents; and the mutation and self-learning operators accomplish the behavior of agents exploiting their own knowledge [10].
D. Comparison between MAGA and GA

Table 1 illustrates the distinctions in genetic operation between GA and MAGA.

                          GA                       MAGA
Individual                Isomorphic               Isomerous
Information interaction   After selection,         Obtains information from
method                    through crossover        four neighbors and
                          operation                self-updates
Genetic operators         Selection, crossover,    Neighborhood competition,
                          mutation                 orthogonal crossover,
                                                   mutation, self-learning
Self-learning             No                       Yes
Evolution                 Evolves without purpose  Evolves with purpose
Competition               Roulette selection       Interaction with
                                                   neighborhood

Table 1. The Operation Difference
To compare the performance of MAGA and GA, consider a function optimization example. The optimization function is shown below, with n = 20:
f(x) = Σ_{i=1}^{n} x_i · sin(√|x_i|),  x ∈ S = [-500, 500]^n    (1)
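The benchmark in Eq. (1) can be evaluated with a short script. Note that the √|x_i| argument of the sine is an assumption taken from the standard Schwefel benchmark, since the scanned formula is partly illegible:

```python
import math

def schwefel(x):
    # Eq. (1): f(x) = sum_i x_i * sin(sqrt(|x_i|)), each x_i in [-500, 500].
    # The sqrt(|x_i|) term is assumed from the standard Schwefel function.
    return sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)

print(schwefel([0.0] * 20))  # → 0.0
```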
Figure 2 shows the experimental results, comparing the optimal function values from ten runs of MAGA and GA.
Figure 2. The Performance Difference
On the basis of Figure 2, the optimization results achieved by MAGA are far superior to those of the traditional GA.

IV. THE ESTABLISHMENT OF THE LOAD BALANCING MODEL
The main parameters required from a single user group include: User (ReqPerHrPerUser, ReqSize, ReqCPU, ReqMemory, Count). Among them, ReqPerHrPerUser refers to the average number of online users per hour in a user group. ReqSize represents the size of the request sent by each user in the user group. ReqCPU indicates the amount of CPU needed to execute the request, relative to a 2.4 GHz single-core CPU; the unit is a percentage. ReqMemory means the size of the memory (in MB) consumed to execute the request. Count indicates the number of requests sent per minute.

In order to solve the issue of exploding dimensionality, a group strategy is exploited to set up the resource scheduling model. The group strategy is based on the user request parameters, and each parameter of a group has a preset maximum value. The sum of all users' parameters inside each group may not exceed the maximum value set for the group. According to the user request time sequence, we divide all user requests satisfying first arrival and the maximum-value constraint into one group, and reset all the parameters inside the group. The resetting rule is shown in the following formula:
Group_j = (1/n) Σ_{i=1}^{n} User_i    (2)
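As a rough illustration, the group strategy described above can be sketched as a greedy packing pass over requests in arrival order; the field names and cap parameters below are hypothetical, since the paper does not give concrete values:

```python
def group_requests(users, max_cpu, max_mem):
    """Greedily pack user requests (in arrival order) into groups whose
    summed CPU and memory demands stay within the group maxima."""
    groups, current = [], []
    cpu = mem = 0.0
    for u in users:
        # Close the current group when adding this request would exceed a cap.
        if current and (cpu + u["cpu"] > max_cpu or mem + u["mem"] > max_mem):
            groups.append(current)
            current, cpu, mem = [], 0.0, 0.0
        current.append(u)
        cpu += u["cpu"]
        mem += u["mem"]
    if current:
        groups.append(current)
    return groups
```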
A. Establishing the Load Balancing Model

The establishment of the load balancing model mainly refers to the design of the fitness function. On the basis of the group strategy, the VM (virtual machine) virtual resources on a physical host correspond to the grouped user requests. One host contains several VMs, and each VM can be allocated several groups. Each group can be described as: Group (ReqPerHrPerUser, ReqSize, ReqCPU, ReqMemory, Count).
The ReqMemory parameter within Group is the average size of memory consumed to execute each request. Therefore, after completing a group of tasks, the memory load of the ith VM, VM_i, is:

Ml_i = M_i + Vmp_i / Vm_i × 100%    (3)

In this formula, Vmp_i and Vm_i are constants, and M_i is the remaining memory percentage of VM_i before executing the tasks.
The ReqCPU parameter within Group is the average CPU consumption needed to execute each request. Therefore, after completing a group of tasks, the CPU load of the ith VM, VM_i, is:

Cl_i = C_i + Vmc_i / Vc_i × 100%    (4)

In this formula, Vmc_i and Vc_i are constants, and C_i is the remaining CPU percentage before executing the tasks.
On the basis of the memory and CPU consumption, the overall load Vl_i on VM_i can be calculated according to the following formula:

Vl_i = w · Ml_i + v · Cl_i    (5)

In this formula, w and v are weighting factors satisfying w + v = 1. Thus, the overall load Hl_j on the jth host is:

Hl_j = Σ_{i=0}^{m_j} Vl_{ji} = Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})    (6)
In this formula, Vl_{ji} represents the load of the ith VM on the jth host, Ml_{ji} indicates the memory load of the ith VM on the jth host, and Cl_{ji} is the CPU load of the ith VM on the jth host. m_j represents the number of VMs activated on the jth host's physical machine. The average load El of all of the hosts within the data center is calculated as:

El = (1/o) Σ_{j=0}^{o} Hl_j = (1/o) Σ_{j=0}^{o} Σ_{i=0}^{m_j} Vl_{ji} = (1/o) Σ_{j=0}^{o} Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})    (7)
In this formula, o is the number of physical host resources, and m_j represents the number of VM resources on the jth host. The load difference between each host and the system average load is |Hl_j − El|. Therefore, the fitness function can be set as:

f = Σ_{j=0}^{o} |Hl_j − El|    (8)
Restriction condition:

Ml_{ji} < 1  and  Cl_{ji} < 1    (9)

The goal is to make the function f as small as possible under the restriction condition.
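A minimal sketch of the fitness computation in Eqs. (3)-(8), assuming the reconstructed formulas above; the variable names are illustrative, not from the paper:

```python
def vm_load(m_free, c_free, vmp, vm, vmc, vc, w=0.5, v=0.5):
    ml = m_free + vmp / vm   # Eq. (3): memory load after the group of tasks
    cl = c_free + vmc / vc   # Eq. (4): CPU load after the group of tasks
    return w * ml + v * cl   # Eq. (5): weighted overall VM load, w + v = 1

def fitness(host_loads):
    # Eq. (7): average host load; Eq. (8): sum of |Hl_j - El| (to minimize).
    el = sum(host_loads) / len(host_loads)
    return sum(abs(hl - el) for hl in host_loads)
```

A fitness of zero means every host carries exactly the average load, i.e. perfect balance.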
B. Encoding

There are several encoding methods in GA, such as one-dimensional encoding, multidimensional encoding, binary encoding, decimal encoding, and floating-point encoding. All of these encoding approaches are also suitable for MAGA. For the sake of convenient operation, we exploit binary encoding [12], which is the simplest and most commonly used.
Suppose we have 10 user groups {Group_0, Group_1, …, Group_9}, and simultaneously have 30 VMs {VM_0, VM_1, …, VM_29}. Each user group is treated as one dimension. These ten dimensions are set as {x_0, x_1, …, x_9}, respectively, where x_i corresponds to Group_i. In that way, there are 30 alternative VMs available for each x_i, and since 30 < 2^5, we take each x_i's encoding length as five, with 00001 → VM_0, 00010 → VM_1, …, 11110 → VM_29. Adopting this binary multidimensional encoding, the initial encoding for the entire system is {00000, 00000, …, 00000}. A possible solution for a system with 10 user groups and 30 VM resources is as follows:

{00001, 00100, 10010, 00110, 10010, 11000, 11100, 00010, 01000, 10000}
Therefore, for a system with n user groups and M VM virtual resources, the number of solution dimensions is n, and the encoding length of each dimension of an individual is ⌈log2 M⌉.
V. ALGORITHM PROCEDURE

The algorithm execution flow is as follows:

Step 1: Randomly generate Lsize^2 agents to initialize L_0, update Best_0, and set t ← 0.
Step 2: Execute the neighborhood competition operator on each agent in L_t to obtain L_{t+1/3}.
Step 3: If U(0, 1) < Pc, apply the neighborhood orthogonal crossover operator to each agent in L_{t+1/3} to generate L_{t+2/3}.
Step 4: If U(0, 1) < Pm, apply the mutation operator to each agent in L_{t+2/3} to obtain L_{t+1}.
Step 5: Determine CBest_{t+1} from L_{t+1} and apply the self-learning operator to CBest_{t+1}.
Step 6: If Energy(CBest_{t+1}) > Energy(Best_t), then Best_{t+1} ← CBest_{t+1}; otherwise, Best_{t+1} ← Best_t and CBest_{t+1} ← Best_t.
Step 7: If the termination condition is met, output Best_t and terminate; otherwise, set t ← t + 1 and resume at Step 2.

L_t represents the tth generation agent grid, and L_{t+1/3} and L_{t+2/3} are the intermediate generations between L_t and L_{t+1}. Best_t is the optimal agent among L_0, L_1, …, L_t, and CBest_t is the optimal agent in L_t. The parameters Pc and Pm are preset, and represent the execution probabilities of the neighborhood orthogonal crossover operator and the mutation operator, respectively.
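A control-flow sketch of Steps 1-7, with the operator bodies passed in as functions; only the loop structure follows the procedure above, and the operators themselves are placeholders for the paper's implementations:

```python
import random

def maga(init_grid, compete, crossover, mutate, self_learn, energy,
         pc=0.1, pm=0.1, generations=50):
    grid = init_grid()                     # Step 1: random Lsize^2 agents (L_0)
    best = max(grid, key=energy)           # Best_0
    for _ in range(generations):
        grid = [compete(a, grid) for a in grid]        # Step 2: L_{t+1/3}
        if random.random() < pc:                       # Step 3: L_{t+2/3}
            grid = [crossover(a, grid) for a in grid]
        if random.random() < pm:                       # Step 4: L_{t+1}
            grid = [mutate(a) for a in grid]
        idx = max(range(len(grid)), key=lambda i: energy(grid[i]))
        cbest = self_learn(grid[idx])                  # Step 5: self-learning
        if energy(cbest) > energy(best):               # Step 6: elitism
            best = cbest
        else:
            cbest = best                               # CBest_{t+1} <- Best_t
        grid[idx] = cbest
    return best                                        # Step 7: output Best
```

For a toy maximization of energy(x) = -|x - 3|, one might pass compete = lambda a, g: max(a, random.choice(g), key=energy) and similar one-liners for the other operators.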
VI. SIMULATION EXPERIMENT RESULTS AND ANALYSIS

In the experiment, the parameters of MAGA are set as follows: Lsize = 5 is the agent grid size (population size); Po = 0.25 is the occupation probability of the neighborhood competition operator; Pc = 0.1 is the execution probability of the neighborhood orthogonal crossover operator; Pm = 0.1 is the execution probability of the mutation operator; in self-learning, sLsize = 2 is the population size, the search radius is 0.2, sPm = 0.05 is the mutation rate, and the number of iterations is 10.

The experiment is divided into three parts. Each part applies Min_min scheduling and MAGA scheduling respectively, and then compares and analyzes the utilization rates of CPU and memory.

In the first part of the experiment, a total of 20 heterogeneous VMs are assigned to 10 hosts. The number of user groups is 100, with weighting factors w = 0.5 and v = 0.5. In the second part, we adjust the weighting factors to w = 0.01 and v = 0.99. The last part of the experiment tests the single-point failure rate, with weighting factors w = 0.01 and v = 0.99. Figures 3-6 show the first part of the experiment.
Figure 3. Sampling Result of CPU by Using Min_min
Figure 4. Sampling Result of CPU by Using MAGA
Figure 5. Sampling Result of Memory by Using Min_min
Figure 6. Sampling Result of Memory by Using MAGA
When w = 0.5 and v = 0.5, MAGA has a significant advantage over the Min_min algorithm in load balancing of CPU utilization, but not in load balancing of RAM usage. For this reason, we adjust the weights to w = 0.01 and v = 0.99 in part 2. Figures 7 and 8 show the sampling results of CPU and memory usage using MAGA.
Figure 7. Sampling Result of CPU by Using MAGA
Figure 8. Sampling Result of Memory by Using MAGA
From Figures 7 and 8 we can see that MAGA achieves effective load balancing of both CPU and memory usage when the weighting factors are w = 0.01 and v = 0.99, and its degree of load balancing is still better than that of the Min_min algorithm. In practical applications, we usually either focus on CPU utilization and disregard memory usage, or consider only memory utilization and disregard CPU utilization. We can therefore adjust the corresponding parameter values according to the actual situation to achieve the desired load balancing state.
Figure 9 compares the single-point failure rates of the two algorithms when the weighting factors are w = 0.01 and v = 0.99. It can be seen from Figure 9 that the single-point failure rate of MAGA is much smaller than that of the Min_min algorithm.
Figure 9. Number of Single-Point of Failure
High-performance cloud computing mainly considers the efficiency of requests and can ignore the influence of memory, in which case the value of w can be set larger. Some cloud computing systems do not need high computing power but consume a large amount of memory; in that case, the value of v should be larger.
VII. CONCLUSION

This paper experimentally shows that MAGA is more appropriate than GA for handling high-dimensional function optimization problems. After establishing a cloud computing load balancing model, the Min_min and MAGA algorithms were applied for resource scheduling respectively. By adjusting the parameters, the scheduling results show that both CPU utilization and memory load balancing for MAGA are much better on average than for Min_min scheduling, and a comprehensive load balancing effect can be achieved by adjusting the weighting factors. Moreover, the MAGA scheduling algorithm results in a smaller single-point failure rate. This shows that this method for solving the load balancing strategy in virtualized cloud computing is feasible and effective.
REFERENCES

[1] Jinhua Hu, Jianhua Gu, Guofei Sun, and Tianhai Zhao, "A Scheduling Strategy on Load Balancing of Virtual Machine Resources in Cloud Computing Environment," in Proc. 3rd Int. Symp. on Parallel Architectures, Algorithms and Programming (PAAP), Dec. 2010, pp. 89-96.
[2] Chunye Gong, Jie Liu, Qiang Zhang, Haitao Chen, and Zhenghu Gong, "The Characteristics of Cloud Computing," in Proc. 39th Int. Conf. on Parallel Processing Workshops (ICPPW), Sept. 2010, pp. 275-279.
[3] Zhongni Zheng, Rui Wang, Hai Zhong, and Xuejie Zhang, "An approach for cloud resource scheduling based on Parallel Genetic Algorithm," in Proc. 3rd Int. Conf. on Computer Research and Development (ICCRD), March 2011, vol. 2, pp. 444-447.
[4] S. Shirero, M. Takashi, and H. Kei, "On the schedulability conditions on partial time slots," in Proc. 6th Int. Conf. on Real-Time Computing Systems and Applications (RTCSA '99), 1999, pp. 166-173.
[5] V. Kant Soni, R. Sharma, and M. Kumar Mishra, "An analysis of various job scheduling strategies in grid computing," in Proc. 2nd Int. Conf. on Signal Processing Systems (ICSPS), July 2010, vol. 2, pp. V2-162-V2-166.
[6] G. Alizadeh, M. Baradarannia, P. Yazdizadeh, and Y. Alipouri, "Serial configuration of genetic algorithm and particle swarm optimization to increase the convergence speed and accuracy," in Proc. 10th Int. Conf. on Intelligent Systems Design and Applications (ISDA), Nov.-Dec. 2010, pp. 272-277.
[7] John H. Holland, Adaptation in Natural and Artificial Systems. Cambridge, MA: MIT Press, 1992.
[8] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield, "Xen and the art of virtualization," in Proc. 19th ACM Symposium on Operating Systems Principles (SOSP '03), 2003, pp. 164-177.
[9] K. Deb and H. G. Beyer, "Self-adaptive genetic algorithms with simulated binary crossover," Evolutionary Computation, vol. 9, no. 2, pp. 197-221, 2001.
[10] Licheng Jiao, Jing Liu, and Weicai Zhong, Coevolutionary Computation and Multi-agent Systems. Beijing: Science Press, Aug. 2006.
[11] Yuxia Du, Fangai Liu, and Lei Guo, "Research and improvement of Min-Min scheduling algorithm," Computer Engineering and Applications, vol. 46, no. 24, pp. 107-109, 2010.
[12] R. Caruana and J. D. Schaffer, "Representation and hidden bias: Gray vs. binary coding for genetic algorithms," in Proc. 5th Int. Conf. on Machine Learning (ICML), 1988, pp. 153-161.
[13] Baowen Xu, Yu Guan, Zhenqiang Chen, and K. R. P. H. Leung, "Parallel genetic algorithms with schema migration," in Proc. 26th Annual Int. Computer Software and Applications Conference (COMPSAC), 2002, pp. 879-884.