APSCC206



    Hybrid Genetic Algorithm for Cloud Computing Applications

    Kai Zhu, Huaguang Song, Lijing Liu, Jinzhu Gao, Guojian Cheng

    School of Engineering and Computer Science

    University of the Pacific, Stockton, CA 95211

    Email: kzhu, hsong, [email protected]

    School of Computer Science

    Xi'an Shiyou University, Dian Zi 2nd Road 18, Xi'an, Shaanxi P.R. China 710065

    Email: [email protected], [email protected]

Abstract: In the cloud computing system, the scheduling of computing resources is a critical portion of cloud computing

    study. An effective load balancing strategy is able to markedly

    improve the task throughput of cloud computing. Virtual

    machines are selected as a fundamental processing unit of cloud

computing. The resources in cloud computing will increase sharply and vary dynamically due to the utilization of

    virtualization technology. Therefore, implementation of load

    balancing in cloud computing has become complicated and it is

    difficult to achieve. Multi-agent genetic algorithm (MAGA) is a

    hybrid algorithm of GA, whose performance is far superior to

    that of the traditional GA. This paper demonstrates the

    advantage of MAGA over traditional GA, and then exploits

    multi-agent genetic algorithms to solve the load balancing

    problem in cloud computing, by designing a load balancing

    model on the basis of virtualization resource management.

    Finally, by comparing MAGA with Min_min strategy, the

    experiment results prove that MAGA is able to achieve better

    performance of load balancing.

Keywords: cloud computing, load balance, multi-agent genetic

    algorithm, virtualization technology

I. INTRODUCTION

Cloud computing is an inevitable trend in the future

development of computing technology. Its critical

    significance lies in its ability to provide all users with high

    performance and reliable calculation. Cloud computing is the

    evolution of distributed computing, grid computing, and

    multiple other techniques. One of the primary differences

    between cloud computing and previous large-scale cluster

computing is that cloud computing uses virtual machines as the fundamental

processing unit. In cloud computing, by using virtualization

    technology [8], one physical host can be virtualized into

multiple virtual hosts, and these hosts are used as basic

computing units. By adopting virtualization technology, cloud

    computing in tandem with conventional cluster computing

    will greatly improve utilization of hardware and also achieve

    automatic monitoring for all hosts. Virtualization technology

    not only has brought a lot of convenience to cloud

    computing, but it also has made a large number of virtual

    resources available in the cloud. The quantity of these virtual

    resources is both enormous and dynamically changing.

    Therefore, load balancing of the host in cloud computing is

    one of the primary concerns in research.

    Proposed by Professor J. H. Holland from the

    University of Michigan in the early 1960s, GA was the first

evolutionary computation algorithm [7], extracting, simplifying, and abstracting the basic ideas from Darwin's

theory of evolution and Mendel's laws of inheritance. Using

    the evolution theory of biosphere as a reference, the

    algorithm utilizes computers to simulate the natural selection

mechanism of parent gene recombination and "survival of

the fittest" in the process of species reproduction. It can be

    exploited to solve complicated problems in science and

    engineering.

    In recent years, research and application of GA has

been rapidly developing, and GA is now widely utilized.

    GA brings a remarkable impact to many growing fields, such

as artificial intelligence, knowledge discovery, pattern recognition, image processing, decision analysis, product

    process design, resource scheduling, and stock market

    analysis. However, there exist some restricting conditions in

    solving high-dimensional function optimization problems,

    rendering GA less effective in cloud computing. When using

    classic GA to solve problems, such as coarse-grained, high-

    dimensional, and large data set optimization problems, issues

    like imperfect convergence, slow convergence, and no

    convergence are inevitable. Therefore, scholars have

    proposed a variety of improved genetic algorithms.

    This paper mainly focuses on multi-agent genetic

    algorithm (MAGA) [10]. It is a hybrid algorithm combining

    GA and multi-agent techniques, which was originally

    proposed by Professor Licheng Jiao. MAGA is a kind of

improved hybrid GA whose execution demonstrates much

faster convergence and better optimization results

    compared to that of traditional GA. MAGA has obvious

    superiority, especially when handling very large-scale, high-

    dimensional, complex, and dynamic optimization problems.

    Therefore, this paper first introduces the strengths of MAGA

by comparing it with GA. Then, a model will be built


    to convert it into a mathematical problem based on load

balancing of virtualized cloud computing. Finally, we use

MAGA to solve the load balancing problem, and compare the

results with the common Min_min algorithm [11].

II. RELATED WORK

The basic principle of cloud computing is to distribute

    computing tasks across a large number of distributed

    computers rather than the local computer. Hu et al. [1]

    proposed a scheduling strategy for VM load balancing in

    cloud computing based on GA. This could effectively

    improve overall system reliability and availability, which is

one of the primary concerns in cloud computing. Gong et al.

[2] analyzed the features of cloud

computing and explored applications of virtualization

technology stemming from the advanced virtual host,

applying virtualization technology to resource

management and virtualized storage.

    In recent years, artificial intelligence methods such as

evolutionary computation, and especially its branch of genetic

algorithms, have gradually drawn attention due to

their intelligence and implicit parallelism [13]. GA has been

    widely applied to solve the problem of resources scheduling

    in large-scale, nonlinear cluster systems, and has achieved

    ideal effects [3].

    The core of resources scheduling technology lies in the

    scheduling algorithm. At present, there exist numerous

    algorithms for cluster resources scheduling, such as the

    round-robin algorithm, least connection scheduling, the

    minimum number of task algorithm, and the minimum

    response time algorithm [4]. Later, some other scholars

    proposed a series of dynamic algorithms, such as resources

scheduling algorithm based on task priority, dynamically weighted resources scheduling algorithm, and queue

    resources scheduling algorithm [5]. Some scholars also

    applied AI methods into resources scheduling of large-scale

    cluster system, such as particle swarm optimization

    algorithm (PSO) [9] and genetic algorithm (GA) [6].

Experiments show that these AI methods can achieve

    more optimal load balancing than traditional approaches.
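The Min_min algorithm [11], used later in this paper as the comparison baseline, follows a standard greedy rule: repeatedly pick the unscheduled task whose minimum completion time across all machines is smallest, and assign it to that machine. A minimal sketch (the task/machine execution times below are hypothetical):

```python
def min_min(task_times):
    """Min-min heuristic: repeatedly pick the unscheduled task whose best
    (minimum) completion time over all machines is smallest, and assign
    it there. task_times[t][m] = run time of task t on machine m."""
    n_machines = len(task_times[0])
    ready = [0.0] * n_machines                 # machine ready times
    unscheduled = set(range(len(task_times)))
    schedule = {}
    while unscheduled:
        finish, t, m = min(
            (ready[m] + task_times[t][m], t, m)
            for t in unscheduled
            for m in range(n_machines)
        )
        schedule[t] = m                        # assign task t to machine m
        ready[m] = finish
        unscheduled.remove(t)
    return schedule, ready

# Three tasks on two machines (hypothetical times):
sched, ready = min_min([[3, 5], [2, 4], [6, 1]])
print(sched, ready)
```

Note that Min_min makes locally optimal choices only, which is why it can leave machines imbalanced on skewed workloads; this is the weakness the GA-based approaches target.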

III. MULTI-AGENT GENETIC ALGORITHM

A. Agent Genetic Algorithm

From the agent perspective, an individual

within GA can be treated as an agent. This agent is capable of local

perception, competition, cooperation, and self-learning, and

reaches the goal of global optimization through the

interactions between agent and environment and between

agent and agent. This is the idea of MAGA [10]. The

implementation mechanism of MAGA is quite different

from GA's, and is mainly manifested in the interaction,

collaboration, and self-learning among

individuals.

B. Individual Survival Environment

Like GA, MAGA still conducts manipulations on

individuals. In MAGA, each individual is considered

    an agent, capable of sensing, changing, and impacting its

    environment autonomously, thus possessing its own

    characteristics. All of the agents live in the agent grid

    environment, as shown in Figure 1 below:

[An Lsize × Lsize lattice of agents]

    Figure 1. Agent Grid

C. Genetic Operator

In MAGA, the genetic operators mainly include: the

    neighborhood competition operator, neighborhood

    orthogonal crossover operator, mutation operator, and the

    self-learning operator. Among these operators, the

neighborhood competition operator realizes

competition among all agents; the neighborhood orthogonal

crossover operator achieves collaboration among agents; and the

mutation and self-learning operators accomplish the

behavior by which agents exploit their own knowledge [10].

D. Comparison between MAGA and GA

Table 1 illustrates the distinctions in the genetic operation between GA and MAGA.

                          GA                        MAGA
Individual                Isomorphic                Isomerous
Information interaction   After selection, through  Obtains four-neighborhood
method                    crossover operation       information and self-updates
Genetic operator          Selection, crossover,     Neighborhood competition,
                          mutation                  orthogonal crossover,
                                                    mutation, self-learning
Self-learning             No                        Yes
Evolution                 Evolves without purpose   Evolves with purpose
Competition               Roulette selection        Interaction with neighborhood

Table 1. The Operation Difference

Taking function optimization as an example, we compare the

performance of MAGA and GA. The optimization function is

shown as follows, with n = 20.


f(x) = Σ_{i=1}^{n} x_i · sin(√|x_i|),   x ∈ S = [-500, 500]^n   (1)
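For concreteness, the benchmark of Eq. (1), f(x) = Σ x_i · sin(√|x_i|) on [-500, 500]^n, can be evaluated as follows (a sketch, assuming this reconstruction of the scanned formula is correct):

```python
import math

def f(x):
    """Eq. (1): f(x) = sum_i x_i * sin(sqrt(|x_i|))  (Schwefel-type)."""
    return sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)

n = 20
print(f([0.0] * n))          # the zero vector scores 0
# Each term peaks near x_i ≈ 420.9687, a known property of this function:
print(f([420.9687] * n))
```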

Figure 2 shows the experiment results, comparing

the optimal function values from ten runs of MAGA and GA.


    Figure 2. The Performance Difference

On the basis of Figure 2, the optimization results achieved

by MAGA are far superior to those of the traditional GA.

IV. THE ESTABLISHMENT OF LOAD BALANCING MODEL

    The main parameters required from a single user

include: User(ReqPerHrPerUser, ReqSize, ReqCPU,

ReqMemory, Count). Among them, ReqPerHrPerUser

refers to the average number of online users per hour

in a user group. ReqSize represents the size of the

request sent by each user in the user group. ReqCPU

indicates the amount of CPU needed to execute the

request, relative to a 2.4 GHz single-core CPU; the unit is a

percentage. ReqMemory means the size of the memory (in

MB) consumed to execute the request. Count indicates the

number of requests sent per minute.

In order to solve the issue of exploding dimensionality,

a group strategy is exploited to set up the resource scheduling

model. The group strategy is based on the user request

parameters, and each parameter inside a group has a

maximum value. The sum of all users' parameters inside

each group is no more than the maximum value set by the

group. According to the user request time sequence, we

divide all user requests satisfying first arrival and the

maximum value into one group, and reset all the parameters

inside the group. The resetting rule is shown in the following

formula:

Group_j = Σ_{i=1}^{n} User_{i,j},   0 ≤ j ≤ 4   (2)

where n is the number of user requests in the group and j indexes the five request parameters.
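The grouping procedure above (pack requests in arrival order until a parameter cap would be exceeded, then reset) can be sketched as follows; the cap values and the two-parameter request shape are hypothetical simplifications, not taken from the paper:

```python
def group_requests(requests, cpu_cap, mem_cap):
    """Greedily pack user requests (in arrival order) into groups so that
    the summed CPU / memory demand of each group stays under the caps."""
    groups, current = [], []
    cpu_sum = mem_sum = 0.0
    for req in requests:              # req = (req_cpu, req_memory)
        cpu, mem = req
        if current and (cpu_sum + cpu > cpu_cap or mem_sum + mem > mem_cap):
            groups.append((cpu_sum, mem_sum))   # reset: group params = sums
            current, cpu_sum, mem_sum = [], 0.0, 0.0
        current.append(req)
        cpu_sum += cpu
        mem_sum += mem
    if current:
        groups.append((cpu_sum, mem_sum))
    return groups

# Five requests arriving in order, grouped under caps of 40 CPU / 256 MB:
reqs = [(10, 64), (20, 128), (15, 64), (30, 256), (5, 32)]
print(group_requests(reqs, cpu_cap=40, mem_cap=256))
```

Each emitted tuple is the group's summed parameters, matching the resetting rule of Eq. (2).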

A. Establish Load Balancing Model

The establishment of the load balancing model mainly

refers to the design of the fitness function. On the basis of the group

strategy, all of the VM (virtual machine) virtual resources

on a physical host correspond to the user group

strategy request. One host contains several VMs, and each

    VM is able to allocate several groups. Each group can be

    described as: Group (ReqPerHrPerUser, ReqSize, ReqCPU,

    ReqMemory, Count).

The ReqMemory parameter within Group is the size of

memory consumed to execute each request on average.

Therefore, after completing a group of tasks, the memory

load of the ith VM, VM_i, is:

Ml_i = M_i + (Vmp_i / Vm_i) × 100%   (3)

In this formula, Vmp_i and Vm_i are constant, and M_i is

the memory usage percentage of VM_i before it executes

the tasks.

The ReqCPU parameter within Group is the consumption of

CPU needed to execute each request on average. Therefore,

after accomplishing a group of tasks, the CPU load of the ith

VM, VM_i, is:

Cl_i = C_i + (Vmc_i / Vc_i) × 100%   (4)

In this formula, Vmc_i and Vc_i are constant, and C_i is

the CPU usage percentage of VM_i before executing the tasks.
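Equations (3) and (4) can be written directly in code; the sample capacities below are hypothetical, and loads are kept as fractions of capacity rather than percentages:

```python
def memory_load(M_i, vmp_i, vm_i):
    """Eq. (3): Ml_i = M_i + vmp_i / vm_i (as a fraction of capacity)."""
    return M_i + vmp_i / vm_i

def cpu_load(C_i, vmc_i, vc_i):
    """Eq. (4): Cl_i = C_i + vmc_i / vc_i (as a fraction of capacity)."""
    return C_i + vmc_i / vc_i

# A VM at 30% memory / 20% CPU before the group runs; the group needs
# 512 MB of the VM's 2048 MB memory and 40 of its 200 CPU units.
print(memory_load(0.30, 512, 2048), cpu_load(0.20, 40, 200))
```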

On the basis of the consumption of memory and CPU, the

overall load Vl_i on VM_i can be calculated according to the

following formula:

Vl_i = w · Ml_i + v · Cl_i   (5)

In this formula, w and v are weighting factors that satisfy

w + v = 1. Thus, the overall load Hl_j on the jth host is:

Hl_j = Σ_{i=0}^{m_j} Vl_{ji} = Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})   (6)

In this formula, Vl_{ji} represents the load of the ith VM of

the jth host, Ml_{ji} indicates the memory load of the ith VM of

the jth host, and Cl_{ji} is the CPU load of the ith VM on the

jth host. m_j represents the number of VMs activated on the

jth host's physical machine. We then calculate the average

load El of all of the hosts within the DC.

El = (Σ_{j=0}^{o} Hl_j) / o = (Σ_{j=0}^{o} Σ_{i=0}^{m_j} Vl_{ji}) / o = (Σ_{j=0}^{o} Σ_{i=0}^{m_j} (w · Ml_{ji} + v · Cl_{ji})) / o   (7)

In this formula, o is the number of host physical

resources, and m_j represents the number of VM resources.

The load difference between each host and the system average

load is |Hl_j − El|. Therefore, the fitness function can be set

as:

f = Σ_{j=0}^{o} |Hl_j − El|   (8)

    Restriction condition:


Ml_{ji} < 1  and  Cl_{ji} < 1   (9)

The goal is to make the function f as small as possible

    under the restriction condition.
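Putting Eqs. (6)-(8) together, the fitness of a placement can be sketched as follows (an illustration with made-up loads; each inner list holds the per-VM loads Vl_ji, already combined as w·Ml + v·Cl):

```python
def fitness(host_vm_loads):
    """f = sum_j |Hl_j - El| following Eqs. (6)-(8): host_vm_loads[j]
    lists the per-VM loads Vl_ji on host j."""
    hl = [sum(vms) for vms in host_vm_loads]      # Eq. (6): host loads
    el = sum(hl) / len(hl)                        # Eq. (7): average load
    return sum(abs(h - el) for h in hl)           # Eq. (8): total deviation

# Two hosts: a balanced placement scores lower (better) than a skewed one.
balanced = fitness([[0.4, 0.3], [0.5, 0.2]])      # both hosts at 0.7
skewed = fitness([[0.9, 0.5], [0.2, 0.2]])        # hosts at 1.4 and 0.4
print(balanced, skewed)
```

Since f sums absolute deviations from the mean, f = 0 corresponds to perfect balance, which is why the optimization minimizes f.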

B. Encoding

There are several encoding methods in GA, such as one-

dimensional encoding, multidimensional encoding, binary

encoding, decimal encoding, and floating-point encoding.

All these encoding approaches are also suitable for MAGA.

For the sake of convenient operation, we exploit binary

encoding [12], which is the simplest and most commonly

used.

Suppose we have 10 user groups {Group0, Group1, ...,

Group9}, and simultaneously have 30 VMs {VM_0,

VM_1, ..., VM_29}. Each user group is treated as one

dimension.

These ten dimensions are set as {x0, x1, ..., x9},

respectively; xi corresponds to Group i. There are thus 30

alternative VMs available for each xi, and since 30 < 2^5, we

take the encoding length of each xi as five, with 00001 →

VM_0, 00010 → VM_1, ..., 11110 → VM_29. Adopting this

binary multi-dimensional encoding fashion, the encoding for

the entire system has the form {00000, 00000, ..., 00000}. A

possible solution for a system with 10 user groups and 30 VM

resources is as follows:

{00001, 00100, 10010, 00110, 10010,

11000, 11100, 00010, 01000, 10000}

Therefore, for a system with n user groups and M VM

virtual resources, the number of solution dimensions of the

issue is n, and the encoding length of each dimension for

each individual is ⌈log2 M⌉.
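The mapping just described (code value k selects VM_{k-1}, five bits for 30 VMs) can be sketched as:

```python
def encode(assignment, n_vms):
    """Encode VM indices (one per user group) as fixed-width bit strings,
    following the text's convention that 00001 -> VM_0, 11110 -> VM_29."""
    width = n_vms.bit_length()        # 30 VMs -> 5 bits, since 30 < 2**5
    return [format(vm + 1, "0{}b".format(width)) for vm in assignment]

def decode(chromosome):
    """Inverse mapping: each bit string back to a VM index."""
    return [int(bits, 2) - 1 for bits in chromosome]

codes = encode([0, 3, 29], n_vms=30)
print(codes)          # ['00001', '00100', '11110']
print(decode(codes))  # [0, 3, 29]
```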

V. ALGORITHM PROCEDURE

The algorithm execution flow is shown as follows:

Step 1: Randomly generate Lsize^2 agents to initialize L_0, then update Best_0, and set t = 0.

Step 2: Execute the neighborhood competition operator for each agent in L_t, obtaining L_{t+1/3}.

Step 3: If U(0, 1) < Pc, apply the neighborhood orthogonal crossover operator to each agent in L_{t+1/3}, generating L_{t+2/3}.

Step 4: If U(0, 1) < Pm, apply the mutation operator to each agent in L_{t+2/3}, obtaining L_{t+1}.

Step 5: Determine CBest_{t+1} from L_{t+1} and apply the self-learning operator to CBest_{t+1}.

Step 6: If Energy(CBest_{t+1}) > Energy(Best_t), then set Best_{t+1} = CBest_{t+1}; otherwise, set Best_{t+1} = Best_t and CBest_{t+1} = Best_t.

Step 7: If the termination condition is met, output Best_t and terminate; otherwise, set t = t + 1 and resume at Step 2.

L_t represents the tth generation agent network, and L_{t+1/3} and L_{t+2/3} are the intermediate generation agent networks between L_t and L_{t+1}. Best_t is the optimal agent among L_0, L_1, ..., L_t, and CBest_t represents the optimal agent within L_t. The parameters Pc and Pm are preset, and represent the execution probabilities of the neighborhood orthogonal crossover operator and the mutation operator, respectively.
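The seven steps can be sketched as a generic loop. The operator arguments below are stand-ins for the neighborhood competition, orthogonal crossover, mutation, and self-learning operators, Energy is modeled as a plain score to be maximized, and the per-generation (rather than per-agent) application of Pc and Pm is a simplification:

```python
import random

def maga(init, compete, crossover, mutate, self_learn, energy,
         pc=0.1, pm=0.1, generations=10, seed=0):
    """Skeleton of the MAGA flow in Section V."""
    rng = random.Random(seed)
    lattice = init()                              # Step 1: Lsize^2 agents
    best = max(lattice, key=energy)               # Best_0
    for _ in range(generations):                  # Step 7: fixed-count stop
        lattice = compete(lattice)                # Step 2: L_{t+1/3}
        if rng.random() < pc:                     # Step 3: L_{t+2/3}
            lattice = crossover(lattice)
        if rng.random() < pm:                     # Step 4: L_{t+1}
            lattice = mutate(lattice)
        cbest = self_learn(max(lattice, key=energy))   # Step 5
        if energy(cbest) > energy(best):          # Step 6
            best = cbest
    return best

# Toy instantiation on scalar agents (all operator choices hypothetical):
result = maga(
    init=lambda: [i / 25 for i in range(25)],     # 5 x 5 grid, flattened
    compete=lambda L: [max(L)] * len(L),          # winners occupy the grid
    crossover=lambda L: L,
    mutate=lambda L: [x + 0.01 for x in L],
    self_learn=lambda a: a,
    energy=lambda a: a,
)
print(result)
```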

VI. SIMULATION EXPERIMENT RESULT AND ANALYSIS

In the experiment, the parameters of MAGA are set

as follows: Lsize = 5 represents the agent grid size (population

size); Po = 0.25 represents the occupation strategy of the

neighborhood competition operator; Pc = 0.1 represents the

execution probability of the neighborhood orthogonal crossover

operator; Pm = 0.1 represents the execution probability of the

mutation operator. In self-learning, sLsize = 2 refers to the

population size, the search radius is 0.2, sPm = 0.05 represents

the mutation rate, and the number of iterations is 10.

The experiment has been divided into three parts. Each part will apply Min_min scheduling and MAGA scheduling

    respectively, and then compare and analyze the utilization

    rate of CPU and memory.

In the first part of the experiment, a total of 20

heterogeneous VMs are set on 10 hosts. The number of user groups

is 100, with weighting factors w = 0.5 and v = 0.5. In the

second part, we adjust the weighting factors to w = 0.01 and

v = 0.99. The last part of the experiment tests the single-

point failure rate, where the weighting factors are w = 0.01 and

v = 0.99. Figures 3-6 show the first part of the

    experiment.


    Figure 3. Sampling Result of CPU by Using Min_min


    Figure 4. Sampling Result of CPU by Using MAGA



    Figure 5. Sampling Result of Memory by Using Min_min


    Figure 6. Sampling Result of Memory by Using MAGA

When w = 0.5 and v = 0.5, MAGA has a significant

advantage over the Min_min algorithm in the load

balancing of CPU utilization, but not in the load balancing

of RAM usage. For this reason, we adjust the weights to

w = 0.01 and v = 0.99 in part 2. For this part, Figures 7 and 8

show the sampling results of CPU and memory usage

obtained with MAGA.


    Figure 7. Sampling Result of CPU by Using MAGA


    Figure 8. Sampling Result of Memory by Using MAGA

From Figures 7 and 8 we can see that MAGA can

achieve effective load balancing of both CPU and memory

usage when the weighting factors are w = 0.01 and v = 0.99, and

its degree of load balancing is still better than the Min_min

algorithm's. In practical applications, we usually either focus

on CPU utilization and disregard memory usage, or consider

only memory utilization and disregard CPU utilization. In

either case, we can adjust the corresponding parameter values

according to the actual situation to achieve the desired load

balancing state.

Figure 9 shows a comparison of the two algorithms'

single-point failure rates when the weighting factors are

w = 0.01 and v = 0.99. It can be seen from Figure 9 that the

single-point failure rate of MAGA is much smaller than

that of the Min_min algorithm.


    Figure 9. Number of Single-Point of Failure

High performance cloud computing mainly

considers the efficiency of requests and can ignore the

influence of memory, so the value of w can be set

larger. Some cloud computing systems do not need

high computing power, but consume a large amount

of memory; in that case the value of v should be larger.

VII. CONCLUSION

This paper experimentally shows that MAGA is more

appropriate than GA for handling high-dimensional function

optimization problems. Then, after establishing a cloud

computing load balancing model, the Min_min and MAGA

algorithms were applied for resource scheduling

respectively. By adjusting the parameters, the scheduling

results show that both CPU utilization and memory load

balancing for MAGA are much better than for Min_min

scheduling on average, and a comprehensive load

balancing effect can be achieved by adjusting the weighting

factors. Moreover, the MAGA scheduling algorithm

results in a smaller single-point failure rate. This shows that

this method, used for solving the load balancing strategy based

on virtualized cloud computing, is feasible and effective.

REFERENCES

[1] Jinhua Hu, Jianhua Gu, Guofei Sun, and Tianhai Zhao, "A Scheduling Strategy on Load Balancing of Virtual Machine Resources in Cloud Computing Environment," Parallel Architectures, Algorithms and Programming (PAAP), 2010 Third International Symposium on, pp. 89-96, 18-20 Dec. 2010.

[2] Chunye Gong, Jie Liu, Qiang Zhang, Haitao Chen, and Zhenghu Gong, "The Characteristics of Cloud Computing," Parallel Processing Workshops (ICPPW), 2010 39th International Conference on, pp. 275-279, 13-16 Sept. 2010.

[3] Zhongni Zheng, Rui Wang, Hai Zhong, and Xuejie Zhang, "An approach for cloud resource scheduling based on Parallel Genetic Algorithm," Computer Research and Development (ICCRD), 2011 3rd International Conference on, vol. 2, pp. 444-447, 11-13 March 2011.

[4] S. Shirero, M. Takashi, and H. Kei, "On the schedulability conditions on partial time slots," Real-Time Computing Systems and Applications, 1999. RTCSA '99. Sixth International Conference on, pp. 166-173, 1999.

[5] V. Kant Soni, R. Sharma, and M. Kumar Mishra, "An analysis of various job scheduling strategies in grid computing," Signal Processing Systems (ICSPS), 2010 2nd International Conference on, vol. 2, pp. V2-162-V2-166, 5-7 July 2010.

[6] G. Alizadeh, M. Baradarannia, P. Yazdizadeh, and Y. Alipouri, "Serial configuration of genetic algorithm and particle swarm optimization to increase the convergence speed and accuracy," Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on, pp. 272-277, Nov. 29-Dec. 1, 2010.

[7] John H. Holland, Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA, USA, 1992.

[8] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield, "Xen and the art of virtualization," in Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP '03), ACM, New York, NY, USA, pp. 164-177, 2003.

[9] K. Deb and H. G. Beyer, "Self-adaptive genetic algorithms with simulated binary crossover," Evolutionary Computation, vol. 9, no. 2, pp. 197-221, 2001.

[10] Licheng Jiao, Jing Liu, and Weicai Zhong, Coevolutionary Computation and Multi-agent Systems. Beijing: Science Press, August 2006.

[11] Yuxia Du, Fangai Liu, and Lei Guo, "Research and improvement of Min-Min scheduling algorithm," Computer Engineering and Applications, vol. 46, no. 24, pp. 107-109, 2010.

[12] R. Caruana and J. D. Schaffer, "Representation and hidden bias: Gray vs. binary coding for genetic algorithms," in Proceedings of the 5th International Conference on Machine Learning, 1988, pp. 153-161.

[13] Baowen Xu, Yu Guan, Zhenqiang Chen, and K. R. P. H. Leung, "Parallel genetic algorithms with schema migration," Computer Software and Applications Conference, 2002. COMPSAC 2002. Proceedings. 26th Annual International, pp. 879-884, 2002.

    [13] Baowen Xu; Yu Guan; Zhenqiang Chen; Leung, K.R.P.H.; , "Parallelgenetic algorithms with schema migration," Computer Software andApplications Conference, 2002. COMPSAC 2002. Proceedings. 26thAnnual International, vol., no., pp. 879- 884, 2002