
Future Generation Computer Systems 28 (2012) 871–880


Evaluation of hierarchical desktop grid scheduling algorithms✩

Z. Farkas ∗, P. Kacsuk
MTA SZTAKI, 1518 Budapest P.O. Box 63., Hungary

Article info

Article history:
Received 20 May 2010
Received in revised form 24 November 2010
Accepted 28 December 2010
Available online 4 January 2011

Keywords:
Desktop grid
Scheduling
Volunteer computing
Model
Simulation

Abstract

Desktop grids, as opposed to service grids, are the cost-effective way to gather a large amount of volunteer computing resources for solving scientific problems. It is possible to create desktop grids with the help of only one single machine, to which volunteer desktops connect to process work. MTA SZTAKI has created the hierarchical desktop grid concept, where not only single computers, but also desktop grids may join other systems, increasing their performance significantly. In this paper we investigate scheduling issues of hierarchical desktop grid systems, present scheduling algorithms focusing on different properties, and compare them using HierDGSim, the Hierarchical Desktop Grid Simulator.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Desktop grids are the cost-effective way of gathering a large amount of computing resources from the operator's point of view: a small set (even one machine is feasible) of central services is used to store the applications and their workunits to be processed, while the computing resources are offered by volunteers, who can join by installing a client application and registering in the central services (project servers). Afterwards, volunteer clients periodically communicate with the project servers to fetch work, process it, and upload the results back to the servers. Attractive scientific projects may have connected clients of the order of one million: for example, SETI@home [1] has over two million and Einstein@Home [2] has over one million machines connected according to BoincStats [3]. The work performed is rewarded by credits, a virtual measurement of how much work the given client has performed in favor of the project.

It follows from the nature of desktop grids that attached clients are likely to behave unexpectedly: they may produce false results due to CPU, memory or any other hardware problems, or they may want to achieve higher credits without actually performing the computation. In order to filter out such behavior, desktop grids use redundant computing: a given piece of work is computed by a given number of clients, and if a given amount of the reported results are equivalent, the credit is granted to those clients that have reported the correct result.

✩ The EDGeS (Enabling Desktop Grids for e-Science) project receives community funding from the European Commission within the Research Infrastructures initiative of FP7 (grant agreement number 211727).
∗ Corresponding author. Tel.: +36 13297864; fax: +36 13297864.
E-mail addresses: [email protected] (Z. Farkas), [email protected] (P. Kacsuk).

doi:10.1016/j.future.2010.12.013

Access to desktop grid project servers is limited to a small set of scientists and administrators; thus the applications run by desktop grids are limited to these persons' needs.

Examples of desktop grid implementations are BOINC [4], XtremWeb [5] and OurGrid [6].

1.1. Hierarchical desktop grid concept

A natural extension of a given desktop grid's (e.g. A) performance is the addition of new client machines. This might be inconvenient, as each client may have to register in the given desktop grid. In the case of a heterogeneous infrastructure and a large number of machines, this requires notable effort from the infrastructure administrators. However, if the newly attaching clients already take part in a desktop grid project (e.g. B), it is practical to connect this desktop grid (B) to the one whose performance we would like to increase (A). This scenario is a simple use-case of a hierarchical desktop grid system, where a desktop grid (B) is processing workunits of another desktop grid (A).

SZTAKI Desktop Grid (SZDG), created by SZTAKI, is based on BOINC, but adds various enhancements to desktop grid computing, while the aim of SZDG remains volunteer computing. One of the enhancements is support for the hierarchical desktop grid concept as described by Kacsuk et al. [7], which allows a set of projects to be connected to form a directed acyclic graph where work is distributed along the edges of this graph. The hierarchy concept


Fig. 1. Hierarchical system example.

is implemented with the help of a modified BOINC client application, the Hierarchy Client.

The Hierarchy Client always runs beside a child project, and its only task is to connect to the parent desktop grid, report itself as a powerful client consisting of a given number of processors, and inject fetched workunits into the local desktop grid's database. Generally, a project acting as a parent does not have to be aware of the hierarchy; it simply sees the child desktop grid as one powerful client. Additional details of this solution are described in our papers [8,9]. Marosi et al. [9] show how to implement automatic application deployment in hierarchical desktop grid systems, so that administrators of lower-level desktop grids do not have to deal with deploying applications of higher-level parent desktop grids. An example of a hierarchical system can be seen in Fig. 1.

1.2. Outline of the paper

The main aim of this paper is to examine how child desktop grids can determine their performance and reflect it in the number of processors reported by the Hierarchy Client. The reported CPU number should result in as few deadline violations and as low a makespan as possible. Minimization of deadline violations is important as this metric reflects how much unnecessary work has been performed. On the other hand, makespan is important as it indicates how "fast" a given algorithm is under certain circumstances.

The paper is organized as follows: Section 2 surveys related work in hierarchical structure scheduling, Section 3 introduces the algorithms considered, Section 4 evaluates the scheduling algorithms, and finally Section 5 concludes our work.

2. Related work in hierarchical structure scheduling

The hierarchical desktop grid concept is relatively new, and as such, there hasn't really been any research that focuses on scheduling questions related to these kinds of systems. Moreover, the desktop grid concept is different from the traditional grid concept: in the latter case, schedulers have to send a job with specified requirements to one of the services that best satisfies them, so we can think of them as systems that implement the push model of job execution. In the case of desktop grids the situation changes: resources (clients) contact a central service (the desktop grid server) and fetch some work, thus implementing the pull model. As a consequence, the scheduling question changes: how much work should a client fetch for processing, and how does the central server concept of BOINC influence scheduling?

Anderson et al. [10] described the task server component of BOINC, and showed using measurements that a single machine can distribute as many as 8.8 million tasks a day, which is much more than the 100 000 jobs the EGEE infrastructure processes a day [11]. So we can say that the single central server concept doesn't introduce a bottleneck.

Regarding client-side scheduling, Anderson and McLeod [12] describe BOINC's local scheduling policies, and Kondo et al. [13] present some algorithms and compare them with the help of simulation using different properties. Domingues et al. [14] focus on scheduling techniques that improve the turnaround time of desktop grid applications. For this they use a 39-day trace of computer availability of 32 machines in two classrooms, and compare the results with the ideal execution time. Domingues et al. have also created a tool called DGSchedSim [15] that can be used to evaluate different scheduling algorithms on desktop grids using an existing trace. In [16] the authors present a scheduler for desktop grids that is based on stochastic modeling of client availability. Besides this, in his Ph.D. thesis Kondo [17] introduces the cluster equivalence ratio M/N of desktop grids, where this ratio shows how many (M) dedicated cluster machines can be used to represent the performance of a desktop grid consisting of N machines. This ratio depends on application characteristics.

There has been a notable amount of work regarding service grid scheduling. Fibich et al. [18] present the Grid Scheduling Problem: a set of heterogeneous resources, different jobs, constraints and an objective. They also present a model for the scheduling problem. Spooner et al. [19] present the TITAN scheduling architecture that uses performance prediction (the performance analysis and characterization environment, PACE [20]), and focus on local scheduling. Weng et al. [21] show a cost-based online scheduling algorithm with a scheduling framework, and analyze the performance of the presented algorithm theoretically, compared to the optimal offline algorithm. The two variants are compared using simulation, too. The work presented by Chapman et al. [22] uses a formal framework based on Kalman filter theory to predict CPU utilization in clusters. Using the presented predictions, the authors measured a precision of 15%–20%. Xhafa and Abraham [23] overview computational models for the grid scheduling problem, with service grids in focus.

3. Scheduling algorithms

In this section we overview some possible scheduling algorithms for hierarchical desktop grid systems. We have already described the algorithms in our paper [24] in detail.

An important property of these algorithms is that they are local, that is, each child desktop grid runs an instance of one of the scheduling algorithms. The task of the scheduling algorithm is not to send workunits to attached clients, but to determine a number of CPU cores reflecting the performance of the given desktop grid, to be reported by the Hierarchy Client. As the child desktop grid connects to its parent, it represents itself as a powerful client consisting of that many cores, so it will process at most that many workunits originating from its parent in parallel. The reported CPU core number is denoted by N.

Within the scheduling algorithms we assume that each client has one CPU core.

The algorithms are grouped based on the property they operate on: client properties, workunit processing properties, and the local desktop grid's status.

We use the following notations: DG for a desktop grid entity, DG.nperf for N, DG.DGset for the set of child desktop grids of DG, DG.CLset for the client set of a desktop grid, DG.ldl for the deadline of the last workunit fetched by DG from its parent, Cl for a client entity, Cl.lconn for a client's last connection time, and Cl.dg for a client's desktop grid.


3.1. Algorithms using client properties

Algorithms of this group try to determine N using properties of the clients attached to the desktop grid. Two algorithms are considered in this paper: one depending on the total number of clients (NC), and one depending on the number of active clients (NAC).

The NC algorithm simply sums up the number of clients attached to the desktop grid together with the performance reported by its child desktop grids, and reports this number as N:

DG.nperf = \sum_{dg \in DG.DGset} dg.nperf + |DG.CLset|.

The NAC algorithm takes only those clients into account that are connecting to the desktop grid periodically. The length of the period considered is the maximum deadline of the fetched workunits, or a predefined value if there are no workunits currently fetched from the parent desktop grid. The total number of clients active within this period is reported as N:

isAct(Cl) = \begin{cases} Cl.lconn < Cl.dg.ldl & \text{if } Cl.dg.ldl \neq 0 \\ Cl.lconn < 86400 & \text{if } Cl.dg.ldl = 0 \end{cases} \quad (1)

DG.nperf = \sum_{dg \in DG.DGset} dg.nperf + |\{ cl \in DG.CLset \mid isAct(cl) \}|. \quad (2)
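To make the two rules concrete, the following is a minimal Java sketch of how a desktop grid entity could compute N under NC and NAC. It is an illustration only, not HierDGSim's actual code: the class and field names are hypothetical stand-ins for the notation above, and Cl.lconn is interpreted here as the time elapsed since the client's last connection.

import java.util.ArrayList;
import java.util.List;

class Client {
    long lastConnection;   // Cl.lconn: time elapsed since the client last contacted the server
    DesktopGrid dg;        // Cl.dg: the desktop grid the client is attached to
}

class DesktopGrid {
    static final long DEFAULT_DEADLINE = 86_400;       // fallback window when nothing is fetched

    long lastDeadline;                                 // DG.ldl (0 if no workunit is fetched)
    List<Client> clients = new ArrayList<>();          // DG.CLset
    List<DesktopGrid> children = new ArrayList<>();    // DG.DGset

    // NC: every attached client counts, plus the children's reported performance.
    int nperfNC() {
        int n = clients.size();
        for (DesktopGrid child : children) {
            n += child.nperfNC();
        }
        return n;
    }

    // isAct(Cl) of Eq. (1): active if the client connected within the deadline window.
    boolean isActive(Client cl) {
        long window = (lastDeadline != 0) ? lastDeadline : DEFAULT_DEADLINE;
        return cl.lastConnection < window;
    }

    // NAC, Eq. (2): only clients active within the deadline window count.
    int nperfNAC() {
        int n = 0;
        for (Client cl : clients) {
            if (isActive(cl)) n++;
        }
        for (DesktopGrid child : children) {
            n += child.nperfNAC();
        }
        return n;
    }
}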

3.2. Algorithms using workunit properties

The other algorithm group operates using properties of workunits. The three examined algorithms are the timeout (TO), the client timeout (CTO) and the active client timeout (ACTO) algorithms.

The TO algorithm keeps a record of workunit processing times and the number of workunits processed. Using these values it calculates the average turnaround time (ATT), that is, the average time elapsed between a workunit being fetched and its result being reported towards the parent desktop grid. It periodically checks the ATT value, and if it is below the last fetched workunit's deadline, N is increased by one. If ATT is above the deadline, N is decreased by one, and ATT is decreased by 1%. N is limited by the total number of clients assigned to the desktop grid:

DG.nperf = \begin{cases} DG.nperf + 1 & \text{if } DG.ATT < DG.ldl \\ DG.nperf - 1 & \text{if } DG.nperf > 1 \text{ and } DG.ATT > DG.ldl \\ DG.nperf & \text{otherwise.} \end{cases} \quad (3)
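As an illustration, the periodic update of Eq. (3) could be written as the following Java sketch. The 1% decay and the client-count cap come from the description above; the class and field names themselves are hypothetical.

class SchedAlgTO {
    double att;           // DG.ATT: average turnaround time of reported results
    long lastDeadline;    // DG.ldl: deadline of the last fetched workunit
    int nperf = 1;        // DG.nperf: the reported CPU core number N
    int totalClients;     // cap on N: total number of clients assigned to the desktop grid

    // Called periodically (every 100 s in the experiments of Section 4).
    void update() {
        if (att < lastDeadline && nperf < totalClients) {
            nperf++;              // turnaround below the deadline: offer one more core
        } else if (att > lastDeadline && nperf > 1) {
            nperf--;              // turnaround above the deadline: back off by one core
            att *= 0.99;          // and decay ATT by 1% so that N can recover later
        }
        // otherwise nperf stays unchanged, matching Eq. (3)
    }
}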

The CTO algorithm is similar to the TO algorithm; the difference is that per-client ATT values are recorded, and N is set to the number of clients whose ATT is below the last fetched workunit's deadline. If a client's ATT value is above this deadline, its ATT value is decreased by 1%. It follows from the operation of this algorithm that N is limited by the total number of assigned clients:

DG.nperf = |\{ Cl \in DG.CLset \mid Cl.ATT < DG.ldl \}| + \sum_{dg \in DG.DGset,\ dg.ATT < DG.ldl} dg.nperf. \quad (4)

The ACTO algorithm is exactly the same as the CTO algorithm; the only difference is that N is limited by the number of active clients, that is, the ACTO algorithm is the combination of the CTO and NAC algorithms, N_ACTO = min{N_CTO, N_NAC}:

DG.nperf = |\{ Cl \in DG.CLset \mid Cl.ATT < DG.ldl \wedge isAct(Cl) \}| + \sum_{dg \in DG.DGset,\ dg.ATT < DG.ldl} dg.nperf. \quad (5)
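The per-client variants admit a similar sketch, under the same caveats: CTO counts the clients whose individual ATT is below the last fetched deadline (decaying the ATT of the ones above it), and ACTO additionally requires isAct(Cl), which in effect yields N_ACTO = min{N_CTO, N_NAC}. The ClientStats holder is hypothetical.

import java.util.List;

class SchedAlgACTO {
    long lastDeadline;   // DG.ldl

    // Hypothetical per-client bookkeeping entry.
    static class ClientStats {
        double att;      // Cl.ATT: the client's average turnaround time
        boolean active;  // isAct(Cl), as computed by the NAC rule
    }

    // CTO, Eq. (4): count clients whose ATT is below the last fetched deadline.
    int nperfCTO(List<ClientStats> clients) {
        int n = 0;
        for (ClientStats cl : clients) {
            if (cl.att < lastDeadline) {
                n++;
            } else {
                cl.att *= 0.99;   // decay the ATT of too-slow clients by 1%
            }
        }
        // child desktop grids with dg.ATT < DG.ldl would add their nperf here
        return n;
    }

    // ACTO, Eq. (5): same as CTO, but the client must also be active.
    int nperfACTO(List<ClientStats> clients) {
        int n = 0;
        for (ClientStats cl : clients) {
            if (cl.att < lastDeadline) {
                if (cl.active) n++;
            } else {
                cl.att *= 0.99;
            }
        }
        return n;
    }
}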

3.3. Algorithms using local status

Finally, algorithms of the last group consider the local status of the desktop grid. To be precise, they try to calculate the estimated execution time of fetched workunits, taking into consideration the workunits originally assigned to the desktop grid. No workunits are fetched as long as the last fetched one cannot be processed within its deadline. Actually, this algorithm group contains one skeleton algorithm: based on the estimation function used, different variants can be created. For example, a pessimistic version would estimate the execution time by allocating workunits to the slowest clients, while an optimistic version would do so using the fastest clients.

Algorithms of this group aren’t examined within this paper, aswe assume child desktop grids have been added to a hierarchybecause their resources are not underutilized as long as they areworking on some workunits.

4. Evaluation of scheduling algorithms

Within this section we overview the evaluation of the algorithms described in Section 3 using simulation. First we introduce HierDGSim, the simulation framework used. Next, we examine the algorithms in different scenarios.

4.1. HierDGSim, the hierarchical desktop grid scheduling framework

Evaluation of the algorithms introduced in Section 3 has been performed with the help of simulation. Before starting the implementation of the simulation system, we examined the following grid simulation frameworks: GridSim [25], SimGrid [26] and SimBOINC [27].

GridSim is a mature open-source grid simulation framework looking back on over eight years of development. It has been implemented in the Java programming language, and has been used multiple times for simulating algorithms on the grid. Besides basic features like modeling and simulation of grid entities (users, applications, resources, resource brokers, etc.), it offers advanced features like an auction model, and datagrid or network extensions.

SimGrid is an open-source toolkit that provides core functionalities for the simulation of distributed applications in heterogeneous distributed environments, implemented in C++. Well-defined system components provide the core functionalities in the framework. It offers complete tutorials on how to use the system in order to achieve different goals during the simulation. SimGrid is continuously developed, with new features added.

Finally, SimBOINC is a simulator for heterogeneous and volatile desktop grids and volunteer computing systems, based on the SimGrid framework. It has been created to examine different scheduling algorithms for the client CPU scheduler and the work fetch policies of BOINC. However, the project has been frozen due to major changes within both BOINC and SimGrid.

Out of the three simulation frameworks we have selected GridSim: although GridSim and SimGrid offer more or less the same features, we found GridSim more convenient to use when integrated within the Eclipse [28] development environment.

4.1.1. Entities supported by HierDGSim

Our simulation framework offers the following entities: clients, desktop grids, and workunits.

Clients. For client entities, it is desirable to be able to specify the following attributes: CPU performance contains the processing power of the client, measured in million operations per second; network bandwidth stores the input bandwidth of the client, measured in bytes per second; availability ratio is a number in the [0...1] interval reflecting how efficient the client is from the workunit processing point of view (the value 0 means the client will never finish a workunit, 1 means the client works on the workunits using its full computing power); failure ratio is a number in the [0...1] interval representing the probability that the client will report an error (0 meaning the client never fails, 1 meaning the client will report every workunit as failed); first appearance is the first moment from the start of the simulation when the client becomes active; last operation time is the last moment from the start of the simulation when the client shows any signs of activity (e.g. asks for work, or reports results); finally, desktop grid is the desktop grid entity the client is connected to.

Desktop grids. Desktop grid entities should have the following attributes: workunits is the set of workunits belonging to the desktop grid; clients is the set of clients processing workunits of the given desktop grid; scheduling algorithm is the scheduling algorithm the desktop grid uses to determine its performance; current performance reflects the performance as measured by the scheduling algorithm (N); finally, parent is the parent desktop grid this desktop grid fetches workunits from.

Workunits. Workunits should have the following attributes: length is the required amount of processing power, measured in million operations; size is the size of the input files belonging to the workunit; finally, deadline is the amount of time a client has to process the workunit once it has been assigned to it.
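For concreteness, the three entity types could be represented by plain data holders like the following Java sketch. The field sets mirror the attribute lists above; the class and field names themselves are hypothetical.

import java.util.ArrayList;
import java.util.List;

class ClientEntity {
    double mips;           // CPU performance, million operations per second
    double bandwidth;      // input bandwidth, bytes per second
    double availability;   // [0..1]: 0 never finishes a workunit, 1 fully available
    double failureRatio;   // [0..1]: probability of reporting an error
    long firstAppearance;  // first moment of activity (-1: available from the start)
    long lastOperation;    // last moment of activity (-1: available throughout)
    DesktopGridEntity dg;  // the desktop grid the client is connected to
}

class WorkunitEntity {
    double lengthMi;       // required processing power, million operations
    long sizeBytes;        // size of the input files (input sandbox)
    long deadline;         // allowed processing time once assigned
}

class DesktopGridEntity {
    List<WorkunitEntity> workunits = new ArrayList<>();  // workunits of this desktop grid
    List<ClientEntity> clients = new ArrayList<>();      // clients processing its workunits
    String schedulingAlgorithm;                          // e.g. "SchedAlgNAC"
    int currentPerformance;                              // N, as measured by the algorithm
    DesktopGridEntity parent;                            // null for the top-level desktop grid
}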

4.1.2. Features of HierDGSim

HierDGSim has been created keeping the above properties in mind. It has been implemented in the Java programming language based on the GridSim grid simulation framework, and offers the following features:

• reading the structure of the hierarchy from an XML file, in order to allow easy modification of the structure,
• reading client and workunit data from plain-text files, in order to reduce the amount of data to be processed in case of a huge number of clients or workunits (over one million),
• a client can attach to at most one desktop grid,
• a desktop grid can fetch workunits from at most one desktop grid,
• multiple desktop grids can fetch workunits from a given desktop grid,
• the scheduling algorithms of desktop grids can be set on a per-desktop grid basis,
• the logging level of different entities can be set as requested.

An example hierarchy description used by HierDGSim, describing a very simple scenario, is the following:

<?xml version="1.0" encoding="UTF-8"?>
<simulation>
  <dgs>
    <dg>
      <name>DG_1</name>
      <wufile>data/workunits_long_longdeadline</wufile>
      <loglevel>ERROR</loglevel>
    </dg>
    <dg>
      <name>DG_2</name>
      <alg>SchedAlgNAC</alg>
      <clientfile>data/clients_fast_reliable</clientfile>
      <loglevel>DEBUG</loglevel>
      <parent>DG_1</parent>
    </dg>
  </dgs>
</simulation>

In the above example, the desktop grid called DG_1 contains a set of workunits, but no clients at all. On the other hand, DG_2 has some clients attached, but doesn't have any workunits to process. DG_2 is attached to DG_1, and the performance of DG_2 is determined with the help of the NAC algorithm as described in Section 3.1. The workunit file referenced by the above XML describes one workunit per line, with three values in each line: the first value represents the requested number of million operations for the workunit, the second represents the size of the input sandbox belonging to the workunit, and the last contains the deadline of the workunit. Finally, the client file referenced by the above XML contains a client's description per line, and each line stores 6 properties of the client: the MIPS performance of the client, the input bandwidth, the client's availability, the client's error rate, the first appearance of the client (−1 if the client is available from the start of the simulation), and the last activity time of the client (−1 if the client is available throughout the simulation).
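For illustration, one line of each input file could look as follows. The concrete numbers are hypothetical, chosen to match a short workunit (10^9 MI, 7000 s deadline) and a fast, reliable, always-available client from Table 1; the lines starting with # are annotations for the reader, not part of the file format:

# workunit file line: MI, input sandbox size, deadline
1000000000 100000 7000

# client file line: MIPS, bandwidth, availability, error rate, first appearance, last activity
200000 100000 1.0 0.0 -1 -1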

HierDGSim implements the following policy regarding work distribution to clients and child desktop grids. If a desktop grid receives a work fetch request from one of its child desktop grids, the child will receive work given that there are available workunits. In case a client asks for a workunit, the desktop grid will send a workunit the client is able to process within its deadline, that is, the following should hold:

\frac{mi_{wu}}{mips_{client}} < deadline_{wu}.

Thus, a given client receives a workunit from a desktop grid if and only if there is at least one unassigned workunit satisfying this formula.
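In Java this check is a one-line predicate; a sketch, assuming mi is given in million operations and mips in million operations per second, so that the quotient is a time in seconds (the method name is ours):

// True if the client can be expected to finish the workunit within its deadline.
static boolean feasible(double miWu, double mipsClient, double deadlineWu) {
    return miWu / mipsClient < deadlineWu;
}

// Example with Table 1 values: a short workunit on a fast client needs
// 1e9 / 2e5 = 5000 s, which is below the strict 7000 s deadline, so
// feasible(1e9, 2e5, 7000) == true.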

HierDGSim is an open-source project that is available for download and use from SourceForge: http://sourceforge.net/projects/hierdgsim/.

4.2. Evaluation methodology

The first thing we have to declare is that we do not consider workunit input sandbox sizes; our evaluation focuses on client and workunit computing properties. Next, the "goodness" of an algorithm can be measured using different metrics. We have selected the following two metrics: the total processing time of one scenario, and the number of workunit deadline violations. The total processing time represents the total makespan, that is, the time elapsed between the start of the simulation and the processing of the last workunit in the system. The number of deadline violations is also important, as deadline violations result in processing a given workunit multiple times.

We have created different client and workunit sets for the evaluation of the scheduling algorithms. In the case of workunits, the set of workunits belonging to a desktop grid usually has more or less the same computing, data transfer and deadline requirements, thus the properties (mi, isb) of workunits belonging to a given set follow a normal distribution, that is: mi ∼ N(μ_mi, σ²_mi), isb ∼ N(μ_isb, σ²_isb), and deadlines are the same (deadline ∼ N(μ_deadline, 0)). Similarly, the important properties of clients (mips, ibw) also follow a normal distribution, that is: mips ∼ N(μ_mips, σ²_mips) and ibw ∼ N(μ_ibw, σ²_ibw). Other properties, like failure ratio, first appearance or last activity time, are selected specifically for a given experiment. Different performance groups can be created for clients depending on μ_mips. We have selected to create three performance groups: slow, average performance and fast clients. Workunit deadline groups have been created so that, based on client processing speeds and workunit processing requirements, clients are divided into two groups: clients capable and not capable of processing workunits within the deadline. Table 1 summarizes our selected values for the μ and σ² variables.
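The client and workunit sets can be generated by straightforward Gaussian sampling. The following is a minimal Java sketch using the Table 1 values for fast clients and short workunits; the generator class itself is ours, not part of HierDGSim:

import java.util.Random;

class EntityGenerator {
    private final Random rng = new Random(42);   // fixed seed for repeatable sets

    // Sample from N(mu, sigma^2); nextGaussian() returns a standard normal value.
    double normal(double mu, double sigmaSquared) {
        return mu + Math.sqrt(sigmaSquared) * rng.nextGaussian();
    }

    void generate() {
        // Fast client: mips ~ N(2*10^5, 2*10^4), cf. Table 1.
        double mips = normal(2e5, 2e4);
        // Short workunit: mi ~ N(10^9, 10^6); the strict deadline is constant (sigma^2 = 0).
        double mi = normal(1e9, 1e6);
        double deadline = normal(7e3, 0);        // always exactly 7000 s
        System.out.printf("client mips=%.0f  workunit mi=%.0f deadline=%.0f%n",
                          mips, mi, deadline);
    }

    public static void main(String[] args) {
        new EntityGenerator().generate();
    }
}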

For testing the performance of the scheduling algorithms, we have prepared the following scenarios: no hierarchy, one-level hierarchy with one child desktop grid, and one-level hierarchy with two children. All of these evaluation scenarios have been executed for each scheduling algorithm, using the same inputs (client and workunit sets) for the simulation on a per-scenario basis. Each case has been executed 10 times, and the average values of the metrics mentioned at the beginning of this section have been calculated. We also ran the simulation using different input data sets to examine the effects of special cases, like exiting clients or the presence of clients unable to process any of the workunits.

Table 1
μ and σ² values for creating entities.

Entity case                                                        μ          σ²
Fast client MIPS                                                   2 ∗ 10^5   2 ∗ 10^4
Slow client MIPS                                                   10^4       10^3
Long workunit MI (about one day of processing with fast clients)   8 ∗ 10^9   10^7
Short workunit MI (two hours of processing with fast clients)      10^9       10^6
Loose workunit deadline (one week)                                 6 ∗ 10^5   0
Strict workunit deadline (two hours)                               7 ∗ 10^3   0

Fig. 2. Scenario 1: no hierarchy.

4.3. Measurement of algorithms

Within this subsection we examine the different scheduling algorithms in the scenarios mentioned above. All the scenarios have been executed on a single machine consisting of 8 CPU cores and 12 GB of memory. Each sub-scenario has been executed 10 times for each algorithm, and the summary tables contain the average values of the measurements. We have executed 600 simulations altogether; the shortest simulation ran for 3 min, the longest for 45 min. We have used Condor as a local job scheduler on the machine to make use of each CPU core in an efficient way.

4.3.1. Scenario 1: no hierarchy

The first evaluation case is the basic test using execution on one desktop grid. This case simply measures the execution time of workunits on one desktop grid; no real examination of the scheduling algorithms is performed. We expect that in this case the makespan doesn't depend on the scheduling algorithm used, simply because the performance of the desktop grid isn't relevant here. This scenario is shown in Fig. 2.

All the tests of this scenario have been executed with a workunit set consisting of short workunits with strict deadlines. The following sub-scenarios have been executed: fast clients (1), fast and appearing fast clients (2), fast and exiting fast clients (3), and finally fast and fast but low-availability (10%) clients (4), as shown in Table 1. The size of the workunit set and of the individual client sets was 10^5 and 10^2, respectively.

Fast clients (1_1). In this sub-scenario 10^2 clients were processing 10^5 workunits using the NC, NAC, TO, CTO and ACTO algorithms. The expected execution time can be calculated using the following formula:

pTime(N_{WUs}, N_{Cls}, mi_{WU}, mips_{Cl}) = \frac{N_{WUs}}{N_{Cls}} \cdot \frac{mi_{WU}}{mips_{Cl}}. \quad (6)

Table 2
Scenario 1/1: average values.

Algorithm   Makespan    Deadline WUs
NC          4989428.1   0
NAC         4989412.5   0
TO          4989399.6   0
CTO         4989600.9   0
ACTO        4989504.8   0

Table 3
Scenario 1/2: average values.

Algorithm   Makespan    Deadline WUs
NC          2997845.9   0
NAC         2997776.2   0
TO          2997711.6   0
CTO         2997731.7   0
ACTO        2997720.1   0

In the case of our scenario, pTime(10^5, 10^2, 10^9, 2 ∗ 10^5) = 5 ∗ 10^6 s. Our experiences are summarized in Table 2. As expected, the difference of the average makespan values compared to the expected pTime value is negligible, and each algorithm processed the workunits without any deadline violations.

Fast and appearing fast clients (1_2). In this sub-scenario 10^2 permanent and 10^2 appearing clients were processing 10^5 workunits, where the appearing clients entered the system after 10^6 s.

In order to calculate the expected execution time, we first calculate the number of workunits processed until the additional clients appear using the formula:

pNum(t, N_{Cls}, mi_{WU}, mips_{Cl}) = N_{Cls} \cdot \frac{t}{mi_{WU} / mips_{Cl}}. \quad (7)

Thus, in our case pNum(10^6, 10^2, 10^9, 2 ∗ 10^5) = 2 ∗ 10^4. Afterwards, the total time to process (pTime∗) the remaining workunits (8 ∗ 10^4) can be calculated using (6): pTime∗ = pTime(8 ∗ 10^4, 2 ∗ 10^2, 10^9, 2 ∗ 10^5) = 2 ∗ 10^6. That is, the expected total processing time is pTime = 10^6 + pTime∗ = 3 ∗ 10^6 s. Our experiences are summarized in Table 3. As expected, the difference of the average makespan values compared to the expected pTime value is negligible, and each algorithm processed the workunits without any deadline violations.

Fast and exiting fast clients (1_3). In this sub-scenario 10^2 permanent and 10^2 exiting clients were processing 10^5 workunits, where the exiting clients left the system after 10^6 s.

Just as in the case of the previous scenario, we first calculate the number of workunits processed until the clients exited using (7), and the required amount of time to process the remaining workunits with the still available clients using (6). Our calculations are as follows:

pTime_1 = 10^6
pNum_1 = pNum(pTime_1, 2 ∗ 10^2, 10^9, 2 ∗ 10^5) = 4 ∗ 10^4
pTime_2 = pTime(10^5 − pNum_1, 10^2, 10^9, 2 ∗ 10^5) = pTime(6 ∗ 10^4, 10^2, 10^9, 2 ∗ 10^5) = 3 ∗ 10^6
pTime = pTime_1 + pTime_2 = 4 ∗ 10^6.

Table 4
Scenario 1/3: average values.

Algorithm   Makespan    Deadline WUs
NC          3991608     100
NAC         3992199.6   100
TO          3991845     100
CTO         3991849.5   100
ACTO        3992150.1   100

Table 5
Scenario 1/4: average values.

Algorithm   Makespan    Deadline WUs
NC          5052273.7   10044
NAC         5048522.9   10042
TO          5050106.1   10042
CTO         5048786.6   10041
ACTO        5050577.3   10042

The number of deadline violations will equal the number of exiting clients (10^2), as that many clients haven't returned results for an assigned workunit after they exited.

Our experiences are summarized in Table 4. As expected, the difference of the average makespan values compared to the expected pTime value is negligible, and each algorithm processed the workunits with as many deadline violations as clients exited.

Fast and low-availability (10%) fast clients (1_4). In this sub-scenario 2 ∗ 10^2 clients, out of which 10^2 were available 10% of their time, were processing 10^5 workunits.

The processing time can be calculated depending on the relation of client MIPS and availability to workunit MI and deadline properties. If for a given client

\frac{mi_{WU}}{mips_{Cl} \cdot ava_{Cl}} < deadline_{WU} \quad (8)

doesn't hold, then the given client doesn't perform useful work from the whole system's point of view. Thus, the total computation time can be calculated by taking into account those clients for which (8) holds, and using (6). Additionally, clients for which (8) doesn't hold still receive work, and may extend the total processing time if they receive workunits close to the end of the simulation; the amount of this extension depends on how many clients satisfy (8) and how many do not: the fewer that satisfy (8), the bigger the likelihood that deadline-violated workunits will be assigned to clients not satisfying (8). Thus, we can say that the minimum processing time is calculated using (6), taking into account the clients for which (8) holds; for this scenario it is 5 ∗ 10^6 s.

The minimum number of deadline violations can also be calculated based on the processing time of the scenario, using the following formula:

\frac{pTime}{mi_{WU} / (mips_{Cl} \cdot ava_{Cl})} \cdot Num_{Cl} \quad (9)

taking into account those clients for which (8) doesn't hold. For this scenario the value is 10^4.

Our experiences are summarized in Table 5. As expected, the difference of the average makespan values compared to the expected pTime value is negligible, as is the difference in the number of deadline violations.

Fig. 3. Scenario 2: one-level hierarchy with one DG.

Table 6
Scenario 2/1: average values.

Algorithm   Makespan    Deadline WUs
NC          5000600.6   0
NAC         5000549.1   0
TO          5005293.1   0
CTO         5000487.4   0
ACTO        5000728.6   0

4.3.2. Scenario 2: one-level hierarchy with one child

The second evaluation case is the one-level hierarchy, using two desktop grids. In this case the top-level desktop grid has the same workunits as in the first scenario, but no clients. On the other hand, the low-level desktop grid has all the clients of the first scenario, but no workunits to process. In this scenario we test how this hierarchy's performance compares to the first case. The scenario is shown in Fig. 3. We have executed exactly the same sub-scenarios as in the first scenario: fast clients; fast and appearing fast clients; fast and exiting fast clients; and finally fast and low-availability (10%) fast clients. We summarize our results in tables, and in charts where relevant.

Fast clients (2_1). In this case we expect that all algorithms will behave similarly, as clients make contact within the workunit deadline, so they will report the same number of CPU cores. Moreover, the exact number of CPU cores will be reported almost immediately, thus we expect that the makespan will be as determined in the first scenario. Moreover, no deadline violations will occur, as none of the algorithms will report more clients than the size of the active client set. However, it is important to note that there will be a minor delay in the makespan when compared to the same sub-scenario in scenario 1. This delay is introduced by the communication delay of the DG ↔ DG communication. Table 6 summarizes our measurements.

Fast and appearing fast clients (2_2). In this case the NC and NAC algorithms will behave as in the first scenario, the only difference being some additional delay in the makespan. However, the TO, CTO and ACTO algorithms will react more slowly, as the CPU number is updated periodically based on the ATT values. Thus, in the case of these algorithms we expect a slightly bigger increase in makespan when compared to the first scenario. The increase depends on the length of the update period set in these algorithms; in the case of our tests we have set a period of 100 s. Table 7 summarizes our measurements.

Fast and exiting fast clients (2_3). A more interesting case is how the algorithms handle clients leaving the system. Our expectation is that the makespan will be longer than in the first scenario, and deadline violations will also occur. The reason is very simple: in the first scenario exiting clients caused deadline violations at most once, but in this case, if the scheduling algorithm


Table 7
Scenario 2/2: average values.

Algorithm   Makespan    Deadline WUs
NC          3002204.1   0
NAC         3002545.6   0
TO          3007935.9   0
CTO         3002380.2   0
ACTO        3002447.1   0

Fig. 4. Workunit processing and deadline violations of the NAC algorithm.

considers inactive clients as active, it might fetch more workunits than the active clients are able to process within the deadline. Thus, we expect that the algorithms considering the activity of clients (NAC, ACTO) will perform much better than the ones that don't (NC, TO, CTO). Fig. 4 shows the workunit processing performance of the NAC algorithm (the figure is almost identical for the ACTO algorithm). As can be seen in the figure, the performance of the NAC algorithm was halved once the clients had left the system.

Table 8 summarizes our measurements. As expected, the deadline violations of the NAC and ACTO algorithms are minimal. However, the makespan and deadline violations of the NC, TO and CTO algorithms are relatively high. This can be explained with the help of Fig. 5: as can be seen in the figure, the workunit processing throughput decreased dramatically, and the deadline violation rate is much higher than the workunit processing rate. The cause is that once a workunit is sent to a child desktop grid, the workunit's deadline starts to tick at the parent level, but not at the child level: the inherited workunit's deadline starts to tick only when a client has fetched the inherited workunit. Moreover, the inherited workunit has the original deadline of the parent workunit. Besides this, all of the NC, TO and CTO algorithms report exited clients as active, thus an inherited workunit has to wait longer in the child desktop grid's queue to be assigned to a client. It follows from these that the successful workunit processing rate will be very low; on the other hand, the workunit deadline violation rate will be very high, as fetched workunits spend more time in the child desktop grid's queue, and thus it is more likely that they cannot be processed within the deadline from the parent desktop grid's point of view.

Fast and low-availability (10%) fast clients (2_4). This case can be considered a mixture of the appearing and the exiting fast client cases: once a low-availability client becomes active, it receives work, but isn't able to report the result within the deadline, thus we can think of it as an exiting client. When the client reports the result, it becomes active again, just like an appearing client.

From the computation's point of view such clients simply generate deadline violations, and how many deadline violations happen until every workunit is processed depends on the other clients' performance. Table 9 summarizes our measurements.

Fig. 5. Workunit processing and deadline violations of the NC algorithm.

Table 8
Scenario 2/3: average values.

Algorithm   Makespan      Deadline WUs
NC          120261634.2   254538
NAC         4005660.0     104
TO          12153021.7    256791
CTO         11886605.3    249660
ACTO        4005949.5     101

Table 9
Scenario 2/4: average values.

Algorithm   Makespan      Deadline WUs
NC          17818087.1    402715
NAC         5475426.4     10760
TO          18021713.2    410147
CTO         17624891.9    394445
ACTO        5473037.1     10759

Fig. 6. NAC and ACTO N values for scenario 2/4.

As can be seen in Table 9, the NAC and ACTO algorithms outperform the other algorithms in both makespan and deadline violations. This follows from the relation of low-availability client response times and workunit deadlines: in this case the deadlines are 7000 s, while low-availability clients interact with the desktop grid only every 50 000 s on average. Low-availability clients won't be considered by the scheduling algorithms during 86% of the simulation's lifetime, thus in the longer term we expect the NAC and ACTO algorithms to report around 14 CPUs on average. Fig. 6 shows a diagram of the N values computed by the NAC and ACTO algorithms during the simulation, which confirms our expectation: the average reported value is somewhere between 15 and 20.


Fig. 7. Scenario 3: one-level hierarchy with two DGs.

Table 10
Scenario 3/1: average values.

Algorithm   Makespan    Deadline WUs
NC          2502653.3   0
NAC         2502978.2   0
TO          2508160.5   0
CTO         2503013.9   0
ACTO        2503212.4   0

4.3.3. Scenario 3: one-level hierarchy with two children

The third evaluation scenario uses a one-level hierarchy, with one top-level desktop grid (A) and two children (B, C). This scenario is shown in Fig. 7.

Our assumption was that in this case both makespan and deadline violations can be computed based on the experiences of scenarios one and two; that is, given an existing hierarchy, we can predict the computation time if we add a new child desktop grid.

In each examined sub-scenario, we assume that one of the child desktop grids (B) contains only fast and reliable clients, always ready to process work. The variable performance is represented by the clients of the other child desktop grid (C): fast clients, fast appearing clients, fast exiting clients, and finally, fast low-availability clients.

Fast clients at C (3/1). In this sub-scenario both child desktop grids have the same number of clients attached, with the same characteristics. The natural assumption is that the makespan will be half of the values measured in sub-scenario 2/1. This assumption is confirmed by our measurements, as shown in Table 10.

Fast appearing clients at C (3/2). Based on our experiences in the previous scenarios, and using Eqs. (6) and (7), we can predict the total makespan for this sub-scenario. First, the number of workunits processed until the appearance of the additional clients can be computed using (7) and the results of sub-scenario 2/1: at most pNum(10^6, 10^2, 10^9, 2 ∗ 10^5) = 2 ∗ 10^4 workunits will be processed. Based on the results of sub-scenario 2/2, we can state that once the new clients have appeared, the whole system's performance will double, thus we can compute the remaining processing time with (6): pTime(8 ∗ 10^4, 2 ∗ 10^2, 10^9, 2 ∗ 10^5) = 2 ∗ 10^6, so the total expected processing time will be 3 ∗ 10^6 s. Table 11 summarizes our measurements. As can be seen in Table 11, the makespan values are almost the same as in sub-scenario 2/2. An important deduction is that it doesn't matter whether the new clients are attached to an existing desktop grid or to a new desktop grid at the same level: the average makespan will be the same.

Fast exiting clients at C (3/3). Within this sub-scenario we can also make use of Eqs. (6) and (7) to compute the expected makespan. First, we can compute the number of workunits processed up to the time when the first clients exit: pNum(10^6, 2 ∗ 10^2, 10^9, 2 ∗ 10^5) = 2 ∗ 10^4. Afterwards, the C child will only produce deadline violations, thus real work will be performed only by the clients of B, and the processing time will be pTime(8 ∗ 10^4, 10^2, 10^9, 2 ∗ 10^5) = 4 ∗ 10^6; thus the total expected makespan is at least 5 ∗ 10^6 s.

Regarding deadline violations, we expect this value to be close to the number of exiting clients for the NAC and ACTO algorithms, with much bigger values for NC, TO and CTO. Actually, these algorithms will produce deadline-violated workunits only during the last 4 ∗ 10^6 s of the simulation. The value of N reported by the NC, TO and CTO algorithms is 100, so the predicted maximum number of deadline violations is 10^2 ∗ 4 ∗ 10^6 / (7 ∗ 10^3) = 57 000. Table 12 summarizes our measurements.

Fast low-availability (10%) clients at C (3/4). This sub-scenario is similar to 3/3; however, the periodically activating low-availability clients will result in more deadline violations. Also, the total makespan will be longer, as low-availability clients will extend the total time spent with computation close to the end of the computation. However, the makespan will be more or less the same for each algorithm, as the set of clients able to process the workunits is the same. Table 13 summarizes our measurements.

Table 11
Scenario 3/2: average values.

Algorithm   Makespan    Deadline WUs
NC          3002880.5   0
NAC         3002666.6   0
TO          3008566.7   0
CTO         3002753.1   0
ACTO        3002958.9   0

Table 12
Scenario 3/3: average values.

Algorithm   Makespan    Deadline WUs
NC          4027925.1   43255
NAC         4003204.3   112
TO          4025473.7   43278
CTO         4026588.4   43275
ACTO        4003263.2   112

Table 13
Scenario 3/4: average values.

Algorithm   Makespan    Deadline WUs
NC          5107994.2   71199
NAC         5084525.5   71226
TO          5107370.5   69595
CTO         5108279.9   67753
ACTO        5084248.6   11624

Fig. 8. NAC and ACTO N values for scenario 3/4.

As can be seen, there is now a big difference between the NAC and ACTO algorithms, which can be explained using Fig. 8. This figure shows the N values reported by the NAC and ACTO algorithms for the C desktop grid during a given period of the simulation. As can be seen, there are high peaks (100) in the case of the NAC algorithm, whereas the ACTO algorithm is relatively balanced. The reason for this is that in the case of the NAC algorithm, once no clients have reported any workunit results within the deadline, the fetched workunits are removed from the local queue of desktop grid C, so the algorithm uses a predefined deadline value (86 400 s, i.e. 1 day, by default) to determine client activity. As this is much bigger than the originally used 7000 s deadline, all clients will be considered active, and the C desktop grid will report 100 for N and fetch new workunits. As a consequence, the deadline used by the NAC algorithm will be 7000 s again, which explains the shortness of the peak. Unfortunately, this behavior results in a constant, high rate of deadline violations during the whole simulation for the NAC algorithm.

4.4. Summary of algorithm evaluation

Based on the evaluation of the scheduling algorithms in the different scenarios, we can state that the NAC and ACTO algorithms outperform the other algorithms both in makespan and in deadline violations.

In the case of appearing clients, there is no real difference between any of the algorithms: all of them adapt to the changes almost immediately.

In the case of exiting clients, the NAC and ACTO algorithms are able to filter out clients that have left the system, unlike NC, TO and CTO. The speed of adaptation of the NAC and ACTO algorithms depends on the deadline of the fetched workunits: the longer the deadline, the longer an exited client is still considered active.

In the case of low-availability clients, we can state that the NAC and ACTO algorithms outperform the NC, TO and CTO algorithms, as they are able to mark low-availability clients as inactive as needed. Also, as we have seen in scenario 2/4, the NAC and ACTO algorithms count low-availability clients in the ratio of their activity periods and the workunit deadlines. However, in the case of desktop grids consisting only of clients unable to process the fetched workunits, ACTO outperforms the NAC algorithm.

To summarize our measurements, we recommend using the ACTO algorithm in hierarchical desktop grid systems: although both NAC and ACTO provide more or less the same results, the ACTO algorithm adapts better in the case where none of the desktop grid's clients have an availability period within the fetched workunit deadlines.

5. Conclusions and future work

Within this paper we have examined a number of scheduling algorithms for hierarchical desktop grids with the help of HierDGSim, the Hierarchical Desktop Grid Simulator.

As desktop grids are based on the work pull mechanism (contrary to the push model of service grids), the aim of our scheduling algorithms is to ask for as much work from higher-level desktop grids as the attached local desktop grid is able to process, with as few deadline violations as possible. Thus, the task is to determine N, the reported CPU number, so that it reflects the local desktop grid's performance.

We have examined five different algorithms: two (NC and NAC) considering the number of clients, and three (TO, CTO and ACTO) operating with the average processing times of workunits. The algorithms have been evaluated in different scenarios focusing on the following metrics: makespan and number of deadline violations. The lower these numbers are for a given algorithm, the better the algorithm is.

The evaluation has been performed using HierDGSim, an open-source Hierarchical Desktop Grid Simulator based on GridSim. HierDGSim is able to simulate work distribution in hierarchical desktop grid systems, where one desktop grid may fetch workunits from at most one parent desktop grid. The hierarchical system can be described using a very simple XML file, whereas workunits and clients can be specified in text files. HierDGSim is able to record different metrics, like the workunit processing rate, the N determined by the scheduling algorithms, and the deadline violation rate. Scheduling algorithms can be set on a per-desktop grid basis.

Based on the algorithm evaluation results, we recommend using the ACTO algorithm, as it outperforms the other algorithms in both makespan and number of deadline violations.

5.1. Future work

Based on the results and tools presented in this paper, it is possible to create a generic framework for simulating pull-model work distribution in tree-like structures, and not only in hierarchical desktop grids. Within this paper we have focused on only two properties of desktop grids: clients and workunit processing; but it would be possible to investigate the effect of focusing on other properties, like the relation of client input bandwidths and workunit input sandbox sizes, or the additional computing power offered by GPUs. Also, it would be feasible to examine more complex hierarchies.

Keeping desktop grids in mind, it is possible to examine different desktop grid → client work distribution policies using HierDGSim. In order to perform such experiments, only the implementation of clients and desktop grids has to be modified; thus HierDGSim is a good framework for investigating problems outside its original focus.

Finally, HierDGSim can be used to validate theoretical assumptions about tree structures, as we have done during the evaluation of the scheduling algorithms: in some cases we first created a formula to determine the makespan or the number of violated deadlines, and afterwards used HierDGSim to verify its correctness.

References

[1] SETI@home, 2010. http://setiathome.berkeley.edu/.
[2] Einstein@Home, 2010. http://www.einsteinathome.org/.
[3] BoincStats, 2010. http://boincstats.com/.
[4] D.P. Anderson, BOINC: a system for public-resource computing and storage, in: Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing, IEEE Computer Society, 2004, pp. 4–10.
[5] G. Fedak, C. Germain, V. Neri, F. Cappello, XtremWeb: a generic global computing system, in: Proceedings of the 1st International Symposium on Cluster Computing and the Grid, CCGrid'01, p. 582.
[6] N. Andrade, W. Cirne, F. Brasileiro, P. Roisenberg, OurGrid: an approach to easily assemble grids with equitable resource sharing, in: Job Scheduling Strategies for Parallel Processing, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2003, pp. 61–86.
[7] P. Kacsuk, N. Podhorszki, T. Kiss, Scalable desktop grid system, in: High Performance Computing for Computational Science – VECPAR 2006, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2007, pp. 27–38.
[8] A. Marosi, G. Gombas, Z. Balaton, P. Kacsuk, T. Kiss, SZTAKI Desktop Grid: building a scalable, secure platform for desktop grid computing, in: Proceedings of the CoreGRID Workshop on Programming Models Grid and P2P System Architecture, Grid Systems, Tools and Environments, Springer US, 2007, pp. 365–376.
[9] A. Marosi, G. Gombas, Z. Balaton, Secure application deployment in the hierarchical local desktop grid, in: Proceedings of the 6th Austrian–Hungarian Workshop on Distributed and Parallel Systems, DAPSYS 2006, pp. 145–154.
[10] D.P. Anderson, E. Korpela, R. Walton, High-performance task distribution for volunteer computing, in: Proceedings of the First International Conference on e-Science and Grid Computing, IEEE Computer Society, 2006, pp. 196–203.
[11] EGEE in numbers, 2008. http://egee-na2.web.cern.ch/egee-NA2/numbers.html.
[12] D.P. Anderson, J. McLeod, Local scheduling for volunteer computing, in: Workshop on Large-Scale, Volatile Desktop Grids, PCGrid 2007, held in conjunction with the IEEE International Parallel & Distributed Processing Symposium, IPDPS, 2007.
[13] D. Kondo, D.P. Anderson, J. McLeod, Performance evaluation of scheduling policies for volunteer computing, in: 3rd IEEE International Conference on e-Science and Grid Computing, 2007, pp. 415–422.
[14] P. Domingues, A. Andrzejak, L. Silva, Scheduling for fast turnaround time on institutional desktop grid, Technical Report, Institute on System Architecture, CoreGRID – Network of Excellence, 2006.
[15] P. Domingues, P. Marques, L. Silva, DGSchedSim: a trace-driven simulator to evaluate scheduling algorithms for desktop grid environments, in: Proceedings of the 14th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP'06, pp. 83–90.
[16] E. Byun, S. Choi, M. Baik, J. Gil, C. Park, C. Hwang, MJSA: Markov job scheduler based on availability in desktop grid computing environment, Future Generation Computer Systems 23 (2007) 616–622.
[17] D. Kondo, Scheduling task parallel applications for rapid turnaround on enterprise desktop grids, Ph.D. Thesis, University of California at San Diego, 2005.
[18] P. Fibich, L. Matyska, H. Rudová, Model of grid scheduling problem, in: Exploring Planning and Scheduling for Web Services, Grid and Autonomic Computing, pp. 17–24.
[19] D.P. Spooner, S.A. Jarvis, J. Cao, S. Saini, G.R. Nudd, Local grid scheduling techniques using performance prediction, IEE Proceedings Computers and Digital Techniques 150 (2003) 87–96.
[20] G.R. Nudd, D.J. Kerbyson, E. Papaefstathiou, S.C. Perry, J.S. Harper, D.V. Wilcox, PACE – a toolset for the performance prediction of parallel and distributed systems, International Journal of High Performance Computing Applications 14 (2000) 228–251.
[21] C. Weng, M. Li, X. Lu, An online scheduling algorithm for assigning jobs in the computational grid, IEICE Transactions on Information and Systems E89-D (2006) 597–604.
[22] C. Chapman, M. Musolesi, W. Emmerich, C. Mascolo, Predictive resource scheduling in computational grids, in: Parallel and Distributed Processing Symposium, IPDPS 2007.
[23] F. Xhafa, A. Abraham, Computational models and heuristic methods for grid scheduling problems, Future Generation Computer Systems 26 (2010) 608–621.
[24] Z. Farkas, A.C. Marosi, P. Kacsuk, Job scheduling in hierarchical desktop grids, in: Remote Instrumentation and Virtual Laboratories, Springer US, 2010, pp. 79–97.
[25] A. Sulistio, U. Cibej, S. Venugopal, B. Robic, R. Buyya, A toolkit for modelling and simulating data grids: an extension to GridSim, Concurrency and Computation: Practice and Experience (CCPE) 20 (2008) 1591–1609.
[26] H. Casanova, A. Legrand, M. Quinson, SimGrid: a generic framework for large-scale distributed experiments, in: 10th IEEE International Conference on Computer Modeling and Simulation.
[27] SimBOINC, 2007. http://simboinc.gforge.inria.fr/.
[28] Eclipse, 2010. http://www.eclipse.org/.

Z. Farkas has been a research fellow at the Laboratory of Parallel and Distributed Systems in the Computer and Automation Research Institute of the Hungarian Academy of Sciences since 2003. He received his M.Sc. from the Eotvos Lorand Science University of Budapest in 2004, and started his Ph.D. studies at the same university in 2005. His research interests include grid computing, interoperability solutions, and the use of standards. He is the lead developer of the open-source grid portal P-GRADE. Within the Enabling Desktop Grids for e-Science (EDGeS) project he created the core component of the grid interoperability solution, the 3G Bridge, and is coordinating its development. He is a coauthor of more than 25 scientific papers in journals and conference proceedings on grid computing.

P. Kacsuk is the Head of the Laboratory of Parallel and Distributed Systems in the Computer and Automation Research Institute of the Hungarian Academy of Sciences. He received his M.Sc. and university doctorate degrees from the Technical University of Budapest in 1976 and 1984, respectively. He received the kandidat degree (Ph.D.) from the Hungarian Academy in 1989. He habilitated at the University of Vienna in 1997. He received his professor title from the Hungarian President in 1999 and the Doctor of Academy degree (DSc) from the Hungarian Academy of Sciences in 2001. He has been a part-time full professor at the Cavendish School of Computer Science of the University of Westminster and the Eötvös Lóránd University of Science since 2001. He has published two books, two lecture notes and more than 200 scientific papers on parallel computer architectures, parallel software engineering and grid computing. He is co-editor-in-chief of the Journal of Grid Computing published by Springer.