

IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. X, NO. X, XXXX 1

A Combinatorial Auction-Based Mechanism for Dynamic VM Provisioning and Allocation in Clouds

Sharrukh Zaman, Student Member, IEEE and Daniel Grosu, Senior Member, IEEE

Abstract—Cloud computing providers provision their resources into different types of virtual machine (VM) instances which are then allocated to the users for specific periods of time. The allocation of VM instances to users is usually determined through fixed-price allocation mechanisms, which cannot guarantee an economically efficient allocation or the maximization of the cloud provider’s revenue. A better alternative would be to use combinatorial auction-based resource allocation mechanisms. This argument is supported by economic theory: when the auction costs are low, as is the case in the context of cloud computing, auctions are more efficient than fixed-price markets since products are matched to the customers having the highest valuation. The existing combinatorial auction-based VM allocation mechanisms do not take into account the users’ demand when making provisioning decisions, that is, they assume that the VM instances are statically provisioned. We design an auction-based mechanism for dynamic VM provisioning and allocation that takes into account the user demand when making provisioning decisions. We prove that our mechanism is truthful (i.e., a user maximizes its utility only by bidding its true valuation for the requested bundle of VMs). We evaluate the proposed mechanism by performing extensive simulation experiments using real workload traces. The experiments show that the proposed mechanism yields higher revenue for the cloud provider and improves the utilization of cloud resources.

Index Terms—Cloud Computing, VM Allocation, VM Provisioning, Dynamic Resource Configuration, Combinatorial Auctions.

1 INTRODUCTION

Cloud providers provision their resources into virtual machine (VM) instances and allocate them to the users for specific periods of time. Provisioning, allocating, and pricing these VM instances are challenging issues that have to be addressed by the cloud providers. The fixed-price allocation mechanisms employed by commercial cloud providers (e.g., Microsoft Azure [1], Amazon EC2 [2]) cannot allocate the VM instances efficiently or price the resources in a way that reflects the dynamically changing user demands. Economic theory suggests that when the auction costs are low, auctions are more efficient than fixed-price mechanisms since products are matched to the customers having the highest valuation [3]. In particular, combinatorial auction-based mechanisms are best suited for resource allocation in clouds because of the nature of the allocation requests, in which users ask for bundles of VM instances. However, we have to overcome certain challenges when using combinatorial auction-based mechanisms for VM provisioning and allocation in clouds. Winner determination in a combinatorial auction is an NP-hard problem [4], and therefore solving it for a large number of users and resources will require a considerable amount of time. Since the majority of current cloud providers serve a large number of users and have a large amount of resources available for allocation, they will need to employ approximation algorithms in order to solve the winner determination problem in a reasonable amount of time.

• The authors are with the Department of Computer Science, Wayne State University, 5057 Woodward Avenue, Detroit, MI 48202. E-mail: [email protected], [email protected]

In our previous work [5], we designed two combinatorial auction-based approximation mechanisms for VM instance allocation. Although these mechanisms are able to increase the allocation efficiency of VM instances and also increase the cloud provider’s revenue, they assume static provisioning of VM instances. That is, they require that the VM instances are already provisioned and do not change. Static provisioning leads to inefficiencies due to under-utilization of resources if the mechanism cannot accurately predict the user demand. Since a regular auction computes the price of the items based on user demands, a very low demand may require the auctioneer to set a reserve price to prevent losses.

In this paper, we address the VM provisioning and allocation problem by designing a combinatorial auction-based mechanism that produces an efficient allocation of resources and high profits for the cloud provider. The mechanism extends one of the mechanisms we proposed in [5] to include dynamic configuration of VM instances and reserve prices. The proposed mechanism, called CA-PROVISION, treats the set of available computing resources as ‘liquid’ resources that can be configured into different numbers and types of VM instances depending on the requests of the users. Each user desires a specific bundle of VM instances and bids only on one such bundle (i.e., the users are single-minded). The mechanism determines the allocation based on the users’ valuations until all resources are allocated. It involves a reserve price determined by the operating cost of the resources.


The reserve price ensures that a user pays a minimum amount to the cloud provider so that the provider does not suffer any losses from the VM provisioning and allocation.

1.1 Our Contribution

Our main contribution is the design of a combinatorial auction-based mechanism for dynamic provisioning and allocation of VM instances in clouds. The existing combinatorial auction-based VM allocation mechanisms do not take into account the users’ demand when making provisioning decisions, that is, they assume that the VM instances are statically provisioned. Our design is novel in the sense that it eliminates the static provisioning requirement and takes into account the changing user demand when making allocation decisions. Our design also includes reserve prices and a method for determining them. We prove that the proposed mechanism is truthful, that is, it guarantees that a participating user maximizes its utility only by bidding its true valuation for the bundle of VMs. We evaluate our mechanism by performing extensive simulation experiments using traces of real workloads from the Parallel Workloads Archive [6]. These experiments show that the proposed mechanism yields higher revenue for the cloud provider and improves the utilization of cloud resources. Finally, we analyze the drawbacks and benefits of employing the proposed mechanism and provide guidelines for its implementation.

1.2 Organization

The rest of the paper is organized as follows. In Section 2, we formulate the problem of dynamic VM provisioning and allocation in clouds. In Section 3, we discuss the related work. In Section 4, we present our proposed mechanism for solving the VM provisioning and allocation problem and characterize its theoretical properties. In Section 5, we perform extensive simulations using real workload traces to investigate the properties of our proposed mechanism. In Section 6, we conclude the paper and discuss possible future research directions.

2 DYNAMIC VM PROVISIONING AND ALLOCATION PROBLEM

Virtualization technology allows the cloud computing providers to configure computational resources into virtually any combination of different types of VMs. Hence, it is possible to determine the best combination of VM instances through a combinatorial auction and then dynamically provision them. This will ensure that the numbers of VM instances of the different types are determined based on the market demand and that the instances are then allocated efficiently to the users. We formulate the Dynamic VM Provisioning and Allocation Problem (DVMPA) as follows.

A cloud provider offers computing services to users through $m$ different types of VM instances, $VM_1, \ldots, VM_m$. The computing power of a VM instance of type $VM_i$, $i = 1, \ldots, m$, is $w_i$, where $w_1 = 1$ and $w_1 < w_2 < \ldots < w_m$. We denote by $w = (w_1, w_2, \ldots, w_m)$ the vector of computing powers of the $m$ types of VM instances. In the rest of the paper we will refer to this vector as the ‘weight vector’. As an example of how we use this vector, let us consider a cloud provider offering three types of VM instances: $VM_1$, consisting of one 2 GHz processor, 4 GB memory, and 500 GB storage; $VM_2$, consisting of two 2 GHz processors, 8 GB memory, and 1 TB storage; and $VM_3$, consisting of four 2 GHz processors, 16 GB memory, and 2 TB storage. The weight vector characterizing the three types of VM instances is thus $w = (1, 2, 4)$. We consider the weight-based model for VM instances in order to make the bidding and allocation more practical. First, considering the CPU, memory, and storage separately for composing the users’ bundles would make the users’ task of bidding very complex, and complex bidding would be a deterrent to many of the users. Second, the current cloud providers already bundle different combinations of resources into different types of VM instances (e.g., Amazon EC2’s High-Memory, High-CPU instances [7]) that the users can choose based on their specific needs. Since the provider knows the computational power of its VM types, all she needs to do is calculate a weight factor for each type of VM and use these factors as the components of the weight vector $w$. Following the practice of current cloud providers, such as Amazon EC2, we consider that the weight is determined mainly by the number of cores. Since the current cloud providers allocate matching resources for memory and storage, we also consider this in our model.

We assume that the cloud provider has enough resources to create a maximum of $M$ VM instances of the least powerful type, $VM_1$. The cloud provider can provision the VM instances in several ways according to the specified types given by $VM_1, \ldots, VM_m$. Let us denote by $k_i$ the number of $VM_i$ instances provisioned by the cloud provider. The provider can provision any combination of instances given by the vector $(k_1, k_2, \ldots, k_m)$ as long as

\[ \sum_{i=1}^{m} w_i k_i \le M. \tag{1} \]
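As a small illustration of constraint (1), the following Python sketch checks whether a provisioning vector fits within the provider’s capacity. The weight vector $w = (1, 2, 4)$ is the one from the three-type example in the text; the capacity $M = 16$ and the function names are assumptions made only for this illustration.

```python
def capacity_used(weights, counts):
    """Total 'unit' resources consumed: sum of w_i * k_i over all VM types."""
    return sum(w * k for w, k in zip(weights, counts))


def is_feasible(weights, counts, capacity):
    """True if the provisioning vector (k_1, ..., k_m) satisfies constraint (1)."""
    return capacity_used(weights, counts) <= capacity


w = (1, 2, 4)   # weight vector from the three-type example in the text
M = 16          # hypothetical capacity, expressed in units of VM_1

# (4, 2, 2) uses 4 + 4 + 8 = 16 units; (4, 2, 3) uses 4 + 4 + 12 = 20 units.
print(capacity_used(w, (4, 2, 2)), is_feasible(w, (4, 2, 2), M))
print(capacity_used(w, (4, 2, 3)), is_feasible(w, (4, 2, 3), M))
```

Under these assumed numbers the first configuration exactly exhausts the capacity, while the second exceeds it and could not be provisioned.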

We consider $n$ users $u_1, \ldots, u_n$ who request computing resources from the cloud provider specified as bundles of VM instances. A user $u_j$ requests VM instances by submitting a bid $B_j = (r_1^j, \ldots, r_m^j, v_j)$ to the cloud provider, where $r_i^j$ is the number of instances of type $VM_i$ requested and $v_j$ is the price user $u_j$ is willing to pay to use the requested bundle of VMs for a unit of time. An example of a bid submitted by a user to a cloud provider that offers three types of VMs is $B_j = (2, 1, 4, 10)$. This means that the user is bidding ten units of currency for using two instances of type $VM_1$, one instance of type $VM_2$, and four instances of type $VM_3$ for one unit of time. Here, we assume that the users are single-minded, i.e., a user bids for only one bundle. Based on the auction outcome, the cloud provider will either allocate the entire bundle to the user or not provide any VM instance at all. The provider runs a mechanism, in our case an auction, periodically (e.g., once an hour) to provision and allocate the VM instances such that its profit is maximized. Thus, the users bid for obtaining the requested VM bundles for one unit of time. If the user’s job requires a bundle for more than one unit of time, the user has to bid again in the next round of mechanism execution. In order to define the profit obtained by the cloud provider we need to introduce additional notation. Let us denote by $p_j$ the amount paid by user $u_j$ for using her requested bundle of VMs. Note that depending on the pricing and allocation mechanism used by the cloud provider, $p_j$ and $v_j$ can have different values, usually $p_j < v_j$.

Let us assume that the time interval between two consecutive auctions is one unit of time. Let $c_R$ and $c_I$ be the costs associated with running and idling, respectively, a $VM_1$ instance for one unit of time. Obviously, $c_R > c_I$. The cloud provider’s cost of running all available resources (i.e., all $M$ $VM_1$ instances) is $M \cdot c_R$, while the cost of keeping all the available resources idle is $M \cdot c_I$. We denote by $x = (x_1, x_2, \ldots, x_n)$ the allocation vector, where $x_j = 1$ if the bundle $(r_1^j, \ldots, r_m^j)$ requested by user $u_j$ is allocated to her, and $x_j = 0$ otherwise. Given a particular allocation vector and payments, the cloud provider’s profit is given by

\[ \Pi = \sum_{j=1}^{n} x_j p_j - c_R \sum_{j=1}^{n} x_j s_j - c_I \left( M - \sum_{j=1}^{n} x_j s_j \right) \tag{2} \]

where $s_j = \sum_{i=1}^{m} w_i r_i^j$ is the amount of ‘unit’ computing resources requested by user $u_j$. The ‘unit’ computing resource is equivalent to one VM instance of type $VM_1$ (i.e., the least powerful instance offered). The first term of the equation gives the revenue, the second term gives the running cost of the VM instances that are allocated to the users, and the third term gives the cost of keeping the remaining resources idle.
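To make the three terms of the profit in Eq. (2) concrete, the sketch below evaluates it for a small, made-up instance. The requests, payments, capacity, and cost values are hypothetical and chosen as integers so that the arithmetic is easy to follow; only the formulas for $s_j$ and $\Pi$ come from the text.

```python
def bundle_size(weights, request):
    """s_j: 'unit' resources in a bundle, i.e. sum of w_i * r_i^j."""
    return sum(w * r for w, r in zip(weights, request))


def provider_profit(x, p, s, capacity, c_run, c_idle):
    """Profit from Eq. (2): revenue - running cost - idle cost."""
    revenue = sum(xj * pj for xj, pj in zip(x, p))
    used = sum(xj * sj for xj, sj in zip(x, s))
    return revenue - c_run * used - c_idle * (capacity - used)


w = (1, 2, 4)                                  # weight vector from the text
requests = [(2, 1, 4), (1, 0, 0), (0, 2, 0)]   # hypothetical bundles
s = [bundle_size(w, r) for r in requests]      # bundle sizes: 20, 1, 4
x = (1, 0, 1)                                  # users 1 and 3 win
p = (80, 0, 30)                                # hypothetical payments

# revenue = 110, running cost = 2 * 24 = 48, idle cost = 1 * (30 - 24) = 6
print(s, provider_profit(x, p, s, capacity=30, c_run=2, c_idle=1))
```

With these assumed values the profit is 110 − 48 − 6 = 56.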

The purpose of considering the costs $c_R$ and $c_I$ is to reduce the cloud provider’s losses when the demand is very low or when the user valuations are too low. In both cases, allocating the resources to the users via an auction will lead to very low user payments, which will implicitly result in a loss of revenue for the auctioneer. As we will see later in the paper, considering these costs helps determine a ‘reserve price’ that will prevent users with very low bids from participating in the auction and implicitly guarantees that the provider is still making some profit when the demand is low.

The problem of Dynamic VM Provisioning and Allocation (DVMPA) in clouds is defined as follows:

\[ \max \; \Pi \tag{3} \]

subject to:

\[ \sum_{j=1}^{n} x_j s_j \le M \tag{4} \]

\[ x_j \in \{0, 1\}, \quad j = 1, \ldots, n \tag{5} \]

\[ 0 \le p_j \le v_j, \quad j = 1, \ldots, n \tag{6} \]
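The constraints (4) to (6) can be checked mechanically for any candidate solution. The following sketch is one possible way to do so; the function name and the sample numbers are hypothetical and serve only to illustrate the three conditions.

```python
def satisfies_constraints(x, p, v, s, capacity):
    """Check constraints (4)-(6) for allocation x and payments p."""
    within_capacity = sum(xj * sj for xj, sj in zip(x, s)) <= capacity  # (4)
    binary = all(xj in (0, 1) for xj in x)                              # (5)
    bounded = all(0 <= pj <= vj for pj, vj in zip(p, v))                # (6)
    return within_capacity and binary and bounded


s = (20, 1, 4)      # hypothetical bundle sizes s_j
v = (100, 3, 40)    # hypothetical valuations v_j
M = 24              # hypothetical capacity

print(satisfies_constraints((1, 0, 1), (80, 0, 30), v, s, M))   # 24 units, fits
print(satisfies_constraints((1, 1, 1), (80, 2, 30), v, s, M))   # 25 units, too many
print(satisfies_constraints((1, 0, 1), (120, 0, 30), v, s, M))  # payment above bid
```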

The solution to this problem consists of the allocation $x_j$ and the price $p_j$ for each user $u_j$ who requested the bundle $(r_1^j, \ldots, r_m^j)$, $j = 1, \ldots, n$. The allocation determines the number of VMs of each type that need to be provisioned as follows. We compute $k_i = \sum_{j=1}^{n} x_j r_i^j$ for each type $VM_i$ and provision $k_i$ VM instances of type $VM_i$.

Current cloud service providers use a fixed-price mechanism to allocate the VM instances and rely on statistical data to provision the VMs in a static manner. In our previous work [5], we have shown that combinatorial auction-based mechanisms can allocate VM instances in clouds while generating higher revenue than the currently used fixed-price mechanisms. However, the combinatorial auction-based mechanisms we explored in our previous work [5] require that the VMs are provisioned in advance, that is, they require static provisioning. We argue that the overall performance of the system can be increased by carefully selecting the set of VM instances in a dynamic fashion which reflects the market demand at the time when an auction is executed. In Section 4, we propose a combinatorial auction-based mechanism that solves the DVMPA problem by determining the allocation, pricing, and the best configuration of VMs that need to be provisioned by the cloud provider in order to obtain higher profits. Since very little is known about profit-maximizing combinatorial auctions [4], we cannot provide theoretical guarantees that our auction-based mechanism maximizes the profit. The only guarantee we can provide is that the mechanism determines an approximately efficient allocation (i.e., approximately maximizes the sum of the users’ valuations). In designing our mechanism we also use reserve prices, which are known to increase the revenue of the auctioneer, in our case, the revenue of the cloud provider.
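The provisioning rule stated earlier in this section, $k_i = \sum_{j} x_j r_i^j$, amounts to summing the winning bundles component by component. A minimal sketch, with hypothetical requests and a hypothetical allocation vector, could look as follows.

```python
def provisioning_vector(requests, x):
    """k_i = sum over users j of x_j * r_i^j, for every VM type i."""
    m = len(requests[0])
    return tuple(sum(xj * r[i] for xj, r in zip(x, requests))
                 for i in range(m))


requests = [(2, 1, 4), (1, 0, 0), (0, 2, 0)]   # hypothetical bundles
x = (1, 0, 1)                                  # users 1 and 3 are allocated

# Type 1: 2 + 0, type 2: 1 + 2, type 3: 4 + 0
print(provisioning_vector(requests, x))
```

Here the provider would provision two $VM_1$, three $VM_2$, and four $VM_3$ instances, which is exactly what the winning users requested.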

3 RELATED WORK

Researchers approached the problem of VM provisioning in clouds from different points of view. Shivam et al. [8] presented two systems called Shirako and NIMO that complement each other to obtain on-demand provisioning of VMs for database applications. Shirako does the actual provisioning and NIMO guides it through active learning models. The CA-PROVISION mechanism we propose in this paper performs both demand tracking and provisioning via a combinatorial auction. Dynamic provisioning of computing resources was investigated by Quiroz et al. [9], who proposed a decentralized online clustering algorithm for VM provisioning based on the workload characteristics. The authors proposed a model-based approach to generate workload estimates on a long-term basis. Our proposed mechanism provisions the VMs dynamically and does not require the prediction of the workload characteristics; rather, the current demand for VMs is captured and the provisioning is decided by a combinatorial auction-based mechanism. Vecchiola et al. [10] proposed a deadline-driven provisioning mechanism supporting the execution of scientific applications in clouds.

Recently, researchers investigated economic models for resource allocation in computational grids. Wolski et al. [11] compared commodities markets and auctions in grids in terms of price stability and market equilibrium. Das and Grosu [12] proposed a combinatorial auction-based protocol for resource allocation in grids. They considered a model where different grid providers can provide different types of computing resources. An ‘external auctioneer’ collects the information about resources and runs a combinatorial auction-based allocation mechanism where users participate by requesting bundles of resources.

Several researchers have investigated the economic aspects of cloud computing from different points of view. Wang et al. [13] studied different economic and system implications of pricing resources in clouds. Altmann et al. [14] proposed a marketplace for resources where the allocation and pricing are determined using an exchange market of computing resources. In this exchange, the service providers and the users both express their ask and bid prices, and matching pairs are granted the allocation and removed from the system. Risch et al. [15] designed a testbed for cloud services that enables the testing of different mechanisms on clouds. They deployed the exchange mechanism described by Altmann et al. [14] on this platform. In this paper, we consider designing a combinatorial auction mechanism with a reserve price instead of an exchange. In this case, instead of specifying an asking price, the cloud provider determines a reserve price that is based on its cost parameters. Also, the outcome of the auction determines the configuration of VM instances that needs to be provisioned.

Amazon EC2 Spot Instances [16] is an example of an auction-based mechanism used in a commercial cloud. Although Amazon publishes historic spot prices, the method used for determining the prices is not disclosed [17]. Therefore, most research work about the Spot Instances revolves around the issue of how Spot Instances (SIs) can be utilized in order to reduce the costs of running different applications on Amazon EC2. Wee [18] showed that the SIs are about 52% cheaper and that shifting the time of computation in SIs can save about 4% of the cost. Zhang et al. [19] addressed the problem of dynamically allocating computing resources for different VM types in Amazon Spot Instances in order to maximize the revenue. Ben-Yehuda et al. [17] showed that the SI prices are not necessarily market-driven.

The complexity of solving combinatorial auctions, specifically the winner determination problem, was first addressed by Rothkopf et al. [20]. Sandholm [21] proved that solving the winner determination problem is computationally hard. Zurel and Nisan [22] proposed an approximation algorithm for solving combinatorial auctions. The book by Cramton et al. [23] provides good foundational knowledge on combinatorial auctions. Lehmann et al. [24] initiated the study of combinatorial auctions with single-minded bidders and devised a greedy mechanism for combinatorial auctions. Mu‘alem and Nisan [25] showed how to obtain truthful mechanisms for single-minded settings by combining approximation algorithms. Briest et al. [26] and Chekuri and Gamzu [27] proposed additional construction techniques for the design of truthful mechanisms in single-minded settings.

In our previous work [5], we extended the mechanism proposed by Lehmann et al. [24] and developed CA-GREEDY, a combinatorial auction-based mechanism to allocate VM instances in clouds. We showed that CA-GREEDY obtains an allocation of VM instances that is approximately efficient and generates higher revenue than the currently used fixed-price mechanisms. However, CA-GREEDY requires that the VMs are provisioned in advance, that is, it requires static provisioning. The mechanism we propose in this paper is different from CA-GREEDY in that it selects the set of VM instances in a dynamic fashion which reflects the market demand at the time when the mechanism is executed.

Several researchers addressed the design of auction mechanisms for resource allocation in clouds, the closest works to ours being by Lampe et al. [28] and Prasad et al. [29]. Lampe et al. [28] proposed an equilibrium price auction mechanism that considers the physical machine capacities in determining the number of VMs to be provisioned. Their proposed mechanism provides an approximate solution to the procurement auction but, in contrast to our work, it does not guarantee truthfulness. Prasad et al. [29] formulated the resource allocation problem as a procurement auction. In their model, the users express their requirements to an auction broker and cloud providers participate in an auction run by the broker agent. This approach assumes the existence of several cloud providers with the auction taking place among them. This is different from our setting, which focuses on auction mechanisms run by individual cloud providers. Another important difference from our work is that their mechanism is not truthful.

4 COMBINATORIAL AUCTION-BASED DYNAMIC VM PROVISIONING AND ALLOCATION MECHANISM

We present a combinatorial auction-based mechanism, called CA-PROVISION, that computes an approximate solution to the DVMPA problem. That is, it determines the prices the winning users have to pay, and the set of VM instances that need to be provisioned to meet the winning users’ demand. The mechanism also ensures that the maximum possible number of resources are allocated and no VM instance is allocated for less than the reserve price. The design of the mechanism is based on the ideas presented in [24].

CA-PROVISION uses a reserve price to guarantee that users pay at least a given amount determined by the cloud provider. Thus, the cloud provider needs to set the reserve price, denoted by $v_{res}$, to a value which depends on its costs associated with running the VMs. To do that, we observe that the reserve price should be the break-even point between $c_R$ and $c_I$, which is given by $c_R - c_I$. If a unit resource is not allocated, it incurs a loss of $c_I$. If it is allocated for a price of $c_R - c_I$, the loss is again $c_R - (c_R - c_I) = c_I$, so at this price the provider is indifferent between allocating the resource and keeping it idle. In other words, the minimum price a user has to pay for using the least powerful VM for a unit of time is equal to the difference between the cost of running the resource and the cost of keeping it idle. An auction with reserve price $v_{res}$ can be modeled by an auction without reserve price in which we artificially introduce a dummy bidder $u_0$ having as its valuation the reserve price, i.e., $v_0 = v_{res}$. The dummy user $u_0$ bids $B_0 = (1, 0, \ldots, 0, v_{res})$, i.e., $r_1^0 = 1$, $r_i^0 = 0$ for all $i = 2, \ldots, m$, and $v_0 = v_{res}$. CA-PROVISION uses the density of the bids to determine the allocation. User $u_j$’s bid density is $d_j = v_j / s_j$, where $s_j = \sum_{i=1}^{m} w_i r_i^j$, $j = 0, \ldots, n$. The bid density is a measure of how much a user bids per unit of allocation. In our case the unit of allocation corresponds to one VM instance of type $VM_1$. To guarantee that the users are paying at least the reserve price, the mechanism will discard all users for which $d_j < d_0$.

CA-PROVISION first collects bids from users, calculates the bid density for all bids, and sorts the bids according to their bid density. It then calculates the reserve price and discards bids whose bid density falls below the reserve price. Next, it allocates computing resources to the users in the sorted order and provisions the resources accordingly. Finally, it calculates the payment of each winning user, that is, the amount she must pay to the cloud provider. The payment is the minimum value a winning user must bid in order to obtain the resources she requests (i.e., the critical payment [24]). All losing users pay zero.

CA-PROVISION is given in Algorithm 1. The mechanism requires some information from the system such as the total amount of computing resources $M$, expressed as the total number of VMs of type $VM_1$ that can be provisioned by the cloud provider. The mechanism also requires as input the number of available VM types, $m$, and their weight vector $w$. It also needs to know $c_R$, the cost of running a VM instance of type $VM_1$, and $c_I$, the cost of keeping idle a VM instance of type $VM_1$.
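The reserve price and the density filter described above can be sketched in a few lines of Python. The cost values and the two bids are hypothetical (the first bid reuses the bundle $(2, 1, 4)$ with valuation 10 from the bid example in Section 2); the function names are likewise assumptions for this illustration.

```python
def reserve_price(c_run, c_idle):
    """v_res = c_R - c_I, the break-even price per unit resource."""
    return c_run - c_idle


def bid_density(weights, request, value):
    """d_j = v_j / s_j, with s_j the weighted size of the bundle."""
    size = sum(w * r for w, r in zip(weights, request))
    return value / size


c_run, c_idle = 3.0, 1.0                 # hypothetical costs, c_R > c_I
v_res = reserve_price(c_run, c_idle)

# Break-even check: an idle unit loses c_I; a unit sold at v_res loses c_R - v_res.
loss_idle = c_idle
loss_at_reserve = c_run - v_res

w = (1, 2, 4)
bids = {"u1": ((2, 1, 4), 10.0),         # density 10 / 20
        "u2": ((1, 0, 0), 5.0)}          # density 5 / 1
kept = [u for u, (r, v) in bids.items() if bid_density(w, r, v) >= v_res]

print(v_res, loss_idle == loss_at_reserve, kept)
```

With these assumed costs the reserve price is 2 per unit resource, so the first bid (density 0.5) would be discarded and only the second would enter the allocation phase.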

Algorithm 1 CA-PROVISION Mechanism

Require: $M$; $m$; $w_i : i = 1, \ldots, m$; $c_R$; $c_I$
Ensure: $W$; $p_j : j = 1, \ldots, n$; $k_i : i = 1, \ldots, m$
 1: {Phase 1: Collect bids}
 2: for $j = 1, \ldots, n$ do
 3:     collect bid $B_j = (r_1^j, \ldots, r_m^j, v_j)$ from user $u_j$
 4: end for
 5: {Phase 2: Winner determination and provisioning}
 6: $W \leftarrow \emptyset$ {set of winners}
 7: $v_{res} \leftarrow c_R - c_I$
 8: add dummy user $u_0$ with bid $B_0 = (1, 0, 0, \ldots, 0, v_{res})$
 9: for $j = 0, \ldots, n$ do
10:     $s_j \leftarrow \sum_{i=1}^{m} r_i^j w_i$
11:     $d_j \leftarrow v_j / s_j$ {‘bid density’}
12: end for
13: re-order users $u_1, \ldots, u_n$ such that $d_1 \ge d_2 \ge \ldots \ge d_n$
14: let $l$ be the index such that $d_j \ge d_0$ if $j \le l$, and $d_j < d_0$ otherwise
15: discard users $u_{l+1}, \ldots, u_n$
16: rename user $u_0$ as $u_{l+1}$
17: set $n \leftarrow l + 1$
18: $R \leftarrow M$
19: for $j = 1, \ldots, n - 1$ do {leave out dummy user}
20:     if $s_j \le R$ then
21:         $W \leftarrow W \cup \{u_j\}$
22:         $R \leftarrow R - s_j$
23:     end if
24: end for
25: for $i = 1, \ldots, m$ do {determine VM configuration}
26:     $k_i \leftarrow \sum_{j : u_j \in W} r_i^j$
27: end for
28: {Phase 3: Payment}
29: for all $u_j \in W$ do
30:     $W'_j \leftarrow \{u_l : u_l \notin W \wedge (v_j = 0 \Rightarrow u_l \in W)\}$
31:     $l \leftarrow$ lowest index in $W'_j$
32:     $p_j \leftarrow d_l s_j$
33: end for
34: for all $u_j \notin W$ do
35:     $p_j \leftarrow 0$
36: end for
37: return ($W$; $p_j : j = 1, \ldots, n$; $k_i : i = 1, \ldots, m$)

The mechanism works in three phases. In Phase 1, it collects the users’ bids $B_j$ (lines 1 to 4). In Phase 2, the mechanism determines the winning bidders and the VM configuration that needs to be provisioned by the cloud provider as follows. It adds a dummy user $u_0$ with a bid that contains only one instance of $VM_1$ and has a valuation of $v_{res} = c_R - c_I$ (line 8). This dummy user is only used to model the auction with reserve price and will not receive any allocation. It then computes the bundle size $s_j$ and bid density $d_j$ of all users (lines 9 to 12). Then, all users except the dummy user are ordered in decreasing order of their bid densities and all users $u_j$ with $d_j < d_0$ are discarded (lines 13 to 15). The dummy user $u_0$ is then moved to the end of the list of the remaining users since it has the lowest density in the current set of users. The mechanism reassigns $n$ to be the total number of users under consideration, including the dummy user (lines 16 and 17).

Next, the mechanism determines the winning users in a greedy fashion. It allocates the requested bundles to users in decreasing order of their bid density, as long as there are resources available (lines 18 to 24). However, the dummy user is not considered for allocation. Once the winners are determined, the mechanism determines the VM configuration that needs to be provisioned by aggregating the bundles requested by the winning users (lines 25 to 27).

In Phase 3, the mechanism determines the payment for all users. For each winning bidder $u_j$ the mechanism finds the set of losing bidders $W'_j$ who would otherwise win if $v_j = 0$, i.e., when user $u_j$ is not participating (line 30). From this set, the user $u_l$ with the highest bid density is selected. This is determined by taking the lowest indexed user from set $W'_j$, since the set of users is already sorted in non-increasing order of the users' bid densities (line 31). User $u_j$'s payment is then calculated by multiplying her bundle size $s_j$ with the bid density $d_l$ of $u_l$. All losing bidders pay zero. This type of payment is known in the mechanism design literature as the critical payment [24]. The reason we choose this type of payment is that it is a necessary condition for obtaining a truthful mechanism (i.e., a mechanism that provides incentives to the users to bid their true valuations for the requested bundles). In the next subsection, we show that our proposed mechanism is truthful.
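The three phases of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names `Bid`, `bundle_size`, `greedy`, and `ca_provision` are hypothetical, bundles are assumed to be non-empty and user identifiers unique, and the payment is found by re-running the greedy pass without the winner rather than by the linear scan analyzed later in the paper. It is also assumed that a winner who displaces no losing bidder pays the reserve density $d_0$ times her bundle size, which is how the dummy user is read here.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Bid:
    user: str                # user identifier (hypothetical field name)
    bundle: Tuple[int, ...]  # (r_1, ..., r_m): instances requested per VM type
    value: float             # reported valuation v_j for the whole bundle


def bundle_size(bundle: Sequence[int], w: Sequence[int]) -> int:
    """s_j = sum_i r_i^j * w_i (line 10 of Algorithm 1)."""
    return sum(r * w_i for r, w_i in zip(bundle, w))


def greedy(bids: List[Bid], w: Sequence[int], M: int) -> List[Bid]:
    """Greedy pass of lines 18-24; `bids` must already be sorted by density."""
    remaining, winners = M, []
    for b in bids:
        s = bundle_size(b.bundle, w)
        if s <= remaining:
            winners.append(b)
            remaining -= s
    return winners


def ca_provision(bids: List[Bid], w: Sequence[int], M: int,
                 c_R: float, c_I: float):
    """Return (winning users, payments per user, VM configuration k)."""
    def density(b: Bid) -> float:
        return b.value / bundle_size(b.bundle, w)

    # Dummy bid (1, 0, ..., 0, v_res) has density (c_R - c_I) / w_1.
    d0 = (c_R - c_I) / w[0]
    # Lines 13-15: drop bids below the reserve density, sort the rest.
    live = sorted((b for b in bids if density(b) >= d0),
                  key=density, reverse=True)
    winners = greedy(live, w, M)
    won = {b.user for b in winners}
    # Lines 25-27: aggregate the winning bundles per VM type.
    k = [sum(b.bundle[i] for b in winners) for i in range(len(w))]

    payments: Dict[str, float] = {b.user: 0.0 for b in bids}
    for bj in winners:
        # Line 30: losers who would win if u_j did not participate.
        rerun = greedy([b for b in live if b.user != bj.user], w, M)
        displaced = [b for b in rerun if b.user not in won]
        # ASSUMPTION: fall back to the reserve density if nobody is displaced.
        d_l = density(displaced[0]) if displaced else d0
        payments[bj.user] = d_l * bundle_size(bj.bundle, w)
    return sorted(won), payments, k


if __name__ == "__main__":
    w = (1, 2, 4, 8)
    bids = [Bid("A", (2, 1, 0, 0), 4.0), Bid("B", (0, 0, 1, 0), 3.0),
            Bid("C", (0, 2, 0, 0), 2.0), Bid("D", (1, 0, 0, 0), 0.1)]
    print(ca_provision(bids, w, M=8, c_R=0.25, c_I=0.10))
```

In this made-up example with eight processors, users A and B win, the provisioned configuration is two $VM_1$, one $VM_2$, and one $VM_3$ instance, and each winner pays the density of the displaced user C (0.5) times her own bundle size (4), i.e., 2.0. User D is discarded because her density is below the reserve density.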

4.1 Properties of CA-PROVISION

We now investigate the properties of the proposed mechanism. An important property of a mechanism is incentive compatibility, which is also called truthfulness. This is important because the mechanism computes the allocation and payment based on the information reported by the users (i.e., bids), which is private information. A rational user may manipulate the mechanism by bidding false valuations if it benefits her to do so. The challenge of designing a mechanism, therefore, involves designing the winner determination and payment functions that give the users incentives to bid truthfully. This is very important since the users participating in a truthful allocation mechanism do not have to employ sophisticated bidding strategies to maximize their utilities. They just need to bid their true valuation for the bundle of VMs.

In the following, we denote by $B = (B_1, \ldots, B_n)$ the vector representing the bids of all users and by $B_{-j} = (B_1, \ldots, B_{j-1}, B_{j+1}, \ldots, B_n)$ the vector of all users' bids except the bid $B_j$ of user $u_j$. Hence, $B$ can also be represented as $B = (B_j, B_{-j})$. We also assume that $B_j = (r_1^j, \ldots, r_m^j, v_j)$ is the 'true bid' of the user, i.e., the user requires the bundle $(r_1^j, \ldots, r_m^j)$ and she values it at $v_j$. We denote by $\hat{B}_j = (\hat{r}_1^j, \ldots, \hat{r}_m^j, \hat{v}_j)$ the bid the user submits to the mechanism, which may or may not be the same as $B_j$. We denote by $\hat{B} = (\hat{B}_1, \ldots, \hat{B}_n)$ the vector of all users' bids reported to the mechanism.

Here, we also abuse the notation for the set of winners $W$ and the payments $p_1, \ldots, p_n$. We will use them as the winner determination function $W(\cdot)$ and the payment functions $p_1(\cdot), \ldots, p_n(\cdot)$. $W(\hat{B})$ computes the set of winners from the bid vector $\hat{B}$ and $p_j(\hat{B})$ computes the payment for user $u_j$ from $\hat{B}$. We express the fact that user $u_j$ values her requested bundle at $v_j$ by the valuation function

$$V_j(W(\hat{B}), B_j) = \begin{cases} v_j & \text{if } u_j \in W(\hat{B}) \\ 0 & \text{otherwise} \end{cases} \tag{7}$$

That is, user $u_j$ obtains a valuation of $v_j$ if her requested bundle is allocated and a valuation of 0, otherwise.

The utility that user $u_j$ derives from obtaining the requested bundle is the difference between her valuation $V_j$ and payment $p_j$ (i.e., 'quasi-linear' utility) as follows

$$U_j(W(\hat{B}), B_j) = V_j(W(\hat{B}), B_j) - p_j(\hat{B}). \tag{8}$$

We assume that the users are rational, that is, their goals are to maximize their utilities. A truthful mechanism guarantees that a user maximizes her utility only by bidding her true valuation for the bundle. In the following we define the concept of truthful mechanism.

Definition 1 (Truthful mechanism): A mechanism defined by the winner determination function $W(\cdot)$ and payment functions $p_1(\cdot), \ldots, p_n(\cdot)$ is truthful if for all $u_j$, $\hat{B}_j$, and $B_{-j}$,

$$U_j(W(B_j, B_{-j}), B_j) \geq U_j(W(\hat{B}_j, B_{-j}), B_j) \tag{9}$$

That is, a user participating in a truthful mechanism maximizes her utility only by bidding her true valuation for the bundle regardless of the other users' bids.

Truthfulness was well investigated and characterized
in the mechanism design literature [4]. One such useful characterization gives the conditions under which a mechanism is truthful. Stated informally, a mechanism is truthful if the allocation function is monotone and the payments are the critical payments [25]. We define these properties in the context of CA-PROVISION below.

Definition 2 (Monotonicity): An allocation function $W(\cdot)$ is monotone if, for every user $u_j$ and every $B_{-j}$, whenever $B_j = (r_1^j, \ldots, r_m^j, v_j)$ is a winning bid, every $B'_j = (r_1^{\prime j}, \ldots, r_m^{\prime j}, v'_j)$ with $s'_j \leq s_j$ and $v'_j \geq v_j$ is also a winning bid. Here $s'_j = \sum_{i=1}^{m} w_i r_i^{\prime j}$ and $s_j = \sum_{i=1}^{m} w_i r_i^j$.

In other words, an allocation function is monotone if a winning user also wins if she bids a higher valuation for a smaller size bundle.

Definition 3 (Critical value): The critical value $v_j^c$ for user $u_j \in W(B)$ is defined as the unique value such that $B_j = (r_1^j, \ldots, r_m^j, v_j)$ is a winning bid for any $v_j \geq v_j^c$ and a losing bid for any $v_j < v_j^c$.

Thus, the critical value is the minimum valuation a user must declare in order to obtain her requested bundle.

Definition 4 (Critical payment): The critical value payment function $p$ associated with the monotone allocation function $W(B)$ is defined by: $p_j = v_j^c$, if user $u_j$ wins, and $p_j = 0$, otherwise.

In other words, under the critical payment function a winning user pays her critical value, while a losing user pays zero (i.e., the mechanism is a normalized mechanism).

Next, we present two lemmas and one theorem to prove that CA-PROVISION is truthful.


Lemma 1: CA-PROVISION implements a monotone allocation function.

Proof: CA-PROVISION allocates resources to users in non-increasing order of $d_j = v_j / s_j$, where $s_j$ is the sum of the weights of VMs in the requested bundle. Hence, a bid with higher $v_j$ and lower $s_j$ is preferable to the mechanism. Assume user $u_j$ gets the allocation by bidding $B_j = (r_1^j, \ldots, r_m^j, v_j)$. If she changes her bid to $\hat{B}_j = (r_1^j, \ldots, r_m^j, \hat{v}_j)$ where $\hat{v}_j \geq v_j$, she stays at least at the same rank in the ordered list. Since she is requesting the same resource, this implies that her bid is a winning bid. On the other hand, if user $u_j$ bids $\hat{B}_j = (\hat{r}_1^j, \ldots, \hat{r}_m^j, v_j)$, where $\hat{s}_j = \sum_{i=1}^{m} w_i \hat{r}_i^j \leq s_j$, then $d_j$ increases and user $u_j$ stays at least at the same rank in the greedy order of users (Algorithm 1, line 13). Since she is requesting fewer resources, her bid $\hat{B}_j$ is a winning bid. By Definition 2, CA-PROVISION implements a monotone allocation function.

Lemma 2: CA-PROVISION charges the winning users their critical payments.

Proof: To compute the payment for a winning user $u_j$, CA-PROVISION finds a losing user $u_l$ who would win if user $u_j$ would not participate. That means user $u_j$ needs to defeat user $u_l$ with her bid to get her required bundle (i.e., $d_j \geq d_l$). This means that $v_j / s_j \geq d_l$, and therefore $v_j \geq d_l \cdot s_j$. CA-PROVISION charges $p_j = d_l \cdot s_j$ to user $u_j$ (line 32 of Algorithm 1), which is the minimum valuation $u_j$ must bid to obtain her required bundle. The losing users pay zero (lines 34-35 of Algorithm 1). Therefore, CA-PROVISION implements the critical value payment (Definition 4).

Theorem 1: CA-PROVISION is truthful.

Proof: According to Lemmas 1 and 2, CA-PROVISION implements a monotone allocation function and charges the winning users their critical payments. Following the results of Mu'alem and Nisan [25], CA-PROVISION is a truthful mechanism. The reserve prices do not affect the truthfulness of the mechanism since they are basically bids put out by the dummy user controlled by the cloud provider and truthful bidding is still a dominant strategy for the users.

Now, we investigate the complexity of CA-PROVISION. The loops in lines 19-24 and lines 29-33 constitute the major computational load of Algorithm 1. The first loop has a worst case complexity of $O(M)$. The worst case is when all winning bidders bid for bundles containing exactly one unit of $VM_1$ instances. The total execution time of the loop in lines 29-33 is $O(n)$. This is because it iterates over the set of winning bidders and the search is performed on the losing bidders. Since the bidders are already sorted, the search for a critical payment for a winner $u_{j+1}$ actually starts from the 'critical payment bidder' $u_l$ of $u_j$ (without loss of generality, we assume both $u_j$ and $u_{j+1}$ are winners in this case). Hence, the overall worst case complexity of this loop is $O(n)$, whereas the sorting in line 13 costs $O(n \log n)$. Thus, the complexity of CA-PROVISION is $O(M + n \log n)$.

5 EXPERIMENTAL RESULTS

We perform extensive simulation experiments with real workload data to evaluate the CA-PROVISION mechanism. We compare the performance of CA-PROVISION with the performance of a combinatorial auction-based mechanism, called CA-GREEDY, that uses static VM provisioning [5]. In our previous work [5], we investigated the performance of CA-GREEDY against the performance of the fixed-price VM allocation mechanism in use by current cloud providers. The mechanism showed significant improvements over the fixed-price allocation mechanism, thus making it a good candidate for our current experiments.

The CA-GREEDY mechanism uses the same type of payment determination as CA-PROVISION. The difference between the two is in how they allocate the VM instances. CA-GREEDY assumes that the VM instances are already provisioned (i.e., static provisioning), while CA-PROVISION makes the provisioning decisions dynamically.

We perform a total of 264 experiments with data generated using eleven workload logs from the Parallel Workloads Archive [6] and 24 different combinations of other parameters for each workload. In this section, we describe the experimental setup and discuss the experimental results.

5.1 Experimental Setup

The experiments consist of generating job submissions from a given workload and then running both CA-GREEDY and CA-PROVISION concurrently to allocate the jobs and provision the VMs. For setting up the experiments we have to address several issues such as workload selection, bid generation, and setting up the auction. We discuss all these issues in the following subsections.

5.1.1 Workload selection

To the best of our knowledge, standard cloud computing workloads were not publicly available at the time of writing this paper. Thus, to overcome this limitation we rely on well studied and standardized workloads from the Parallel Workloads Archive [6]. This archive contains a rich collection of workloads from various grid and supercomputing sites. Out of the twenty-six real workloads available, we selected eleven logs that were recorded most recently. These logs are: 1) ANL-Intrepid-2009, from a Blue Gene/P system at Argonne National Lab; 2) DAS2-fs0-2003 to DAS2-fs4-2003, from a research grid of five clusters at the Advanced School of Computing and Imaging in the Netherlands; 3) LLNL-Atlas-2006 and LLNL-Thunder-2007, from two Linux clusters (Atlas and Thunder) located at Lawrence Livermore National Lab; 4) LLNL-uBGL-2006, from a Blue Gene/L system at Lawrence Livermore National Lab; 5) LPC-EGEE-2004, from a Linux cluster at The Laboratory for Corpuscular Physics, Univ. Blaise-Pascal, France; and 6) SDSC-DS-2004, from a 184-node IBM eServer pSeries 655/690 called DataStar located at the San Diego Supercomputer Center.

TABLE 1: Workload logs

Logfile              Duration     Jobs       Processors
ANL-Intrepid-2009    8 months     68,936     163,840
DAS2-fs0-2003        12 months    225,711    144
DAS2-fs1-2003        12 months    40,315     64
DAS2-fs2-2003        12 months    66,429     64
DAS2-fs3-2003        12 months    66,737     64
DAS2-fs4-2003        11 months    33,795     64
LLNL-Atlas-2006      8 months     42,725     9,216
LLNL-Thunder-2007    5 months     121,039    4,008
LLNL-uBGL-2006       7 months     112,611    2,048
LPC-EGEE-2004        9 months     234,889    140
SDSC-DS-2004         13 months    96,089     1,664

In Table 1 we provide a brief description of the workloads we use in our experiments. The table contains the name of the log file, the length of time the logs were recorded, the total number of submitted jobs, and the total number of processors available in the system. The log file name generally contains the acronym of the organization, the name of the system, and the year of its generation. From the duration column, we see that the logs were generated for long periods of time, as long as thirteen months for the SDSC log. The number of jobs submitted ranges from many thousands to more than a couple of hundred thousand, while the number of processors ranges from 64 to 163,840. These large variations in the number of processors and the number of submitted jobs make these logs very suitable for experimentation, providing us with a wide range of simulation scenarios.

The workloads are given in the Standard Workload Format (swf) described in [30]. In this format, the information corresponding to every job submitted to the system is stored as a record with eighteen fields. To generate the workload for our simulation experiments, we need the information from six fields of the log files as follows: (1) Job Number: stores the job's identifier; (2) Submit Time: stores the job submission time; (3) Run Time: stores the time the job needs to complete its execution. We use this as the time required to complete the job. We round this up to the nearest hour because we run hourly auctions in the experiments. (4) Number of Allocated Processors: for our purposes this represents the number of requested processors; (5) Average CPU Time Used: average time a CPU was running. We use this field in conjunction with the preceding two parameters to determine the amount of communication and the parallel speedup of the job. (6) User ID: stores the ID of the user who submitted the job. We use this ID to place users into different classes having different bidding behaviors. We list some statistics of the workload files in Table 2.
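As a rough illustration of how the six fields listed above could be read from a log, the following Python sketch parses one record. It relies on the field order of the Standard Workload Format (Job Number, Submit Time, Wait Time, Run Time, Number of Allocated Processors, Average CPU Time Used, ..., User ID in position 12), on the format's conventions that header lines start with `;`, that unknown values are stored as -1, and that times are in seconds. The names `JobRecord`, `parse_swf_line`, and `required_hours` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, NamedTuple, Optional


class JobRecord(NamedTuple):
    job_id: int
    submit_time: Optional[float]   # seconds since the start of the log
    run_time: Optional[float]      # seconds
    processors: Optional[int]      # number of allocated processors
    avg_cpu_time: Optional[float]  # seconds
    user_id: Optional[int]


def _field(fields: List[str], index: int) -> Optional[float]:
    """Return the 1-based field `index`, or None if absent or negative."""
    if index > len(fields):
        return None  # record has fewer fields than expected
    value = float(fields[index - 1])
    return None if value < 0 else value


def parse_swf_line(line: str) -> Optional[JobRecord]:
    """Parse one SWF record; return None for blank and comment lines."""
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    fields = line.split()
    procs = _field(fields, 5)
    user = _field(fields, 12)
    return JobRecord(
        job_id=int(fields[0]),
        submit_time=_field(fields, 2),
        run_time=_field(fields, 4),
        processors=None if procs is None else int(procs),
        avg_cpu_time=_field(fields, 6),
        user_id=None if user is None else int(user),
    )


def required_hours(run_time_seconds: float) -> int:
    """Round the run time up to whole hours, matching hourly auctions."""
    return math.ceil(run_time_seconds / 3600)
```

Fields that come back as `None` are the ones that need the corrections described next in the text.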

The logs from the Parallel Workloads Archive [6] were collected from different heterogeneous sources and then converted into the standard format. Therefore, in some logs, some of the fields are not specified since the original files had missing information. Some records in a log file may also have fewer fields than the other records from the same file. We make corrections on these records as follows:

TABLE 2: Statistics of workload logs

Logfile              Duration    Jobs /    Avg. runtime    Avg. procs.
                     (hours)     hour      (hours)         per job
ANL-Intrepid-2009    5759        12        2.09            5063
DAS2-fs0-2003        8744        26        1.09            10
DAS2-fs1-2003        8633        5         1.23            8
DAS2-fs2-2003        8760        8         1.29            9
DAS2-fs3-2003        8712        8         1.17            5
DAS2-fs4-2003        7963        4         1.67            4
LLNL-Atlas-2006      4308        10        2.52            401
LLNL-Thunder-2007    3605        34        1.52            43
LLNL-uBGL-2006       5339        21        1.25            576
LPC-EGEE-2004        5728        41        1.80            1
SDSC-DS-2004         9387        10        2.88            62

• If the job starting time is missing, we consider it to be equal to the previous job's start time. The logs record the jobs in order of their arrival times. Matching a missing arrival time with the previous job maintains the job order.

• If the execution time is missing, we randomly generate an execution time between one and two hours from a uniform distribution. As can be seen in Table 2, most of the workloads have an average runtime within this range.

• If the number of processors is missing, we generate a number between 10 and 60 randomly, from a uniform distribution. Since the average number of processors per job differs widely among the workload logs (from 1 processor/job up to 5063 processors/job), we select a distribution that has a mean (35 processors/job) approximately equal to the average of the two-digit numbers in the list (i.e., 10, 43, and 62 processors/job).

• If the average CPU time is missing, we generate a random number between 50% and 100% of the total run time using the uniform distribution. This generates jobs with communication to computation ratios between 0 and 0.5.

• We assign user IDs randomly in cases in which they are not provided.
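The correction rules above can be sketched in Python as follows. This is a minimal sketch, not the authors' code: the dictionary keys, the function name `fill_missing`, and the parameter `max_user_id` (the range from which random user IDs are drawn, which the text does not specify) are hypothetical. Times are assumed to be expressed in hours, and a missing submit time on the very first record is assumed to default to 0.

```python
import random
from typing import Dict, List, Optional


def fill_missing(jobs: List[Dict], rng: Optional[random.Random] = None,
                 max_user_id: int = 100) -> List[Dict]:
    """Return copies of `jobs` with missing (None) fields filled in.

    Each job is a dict with keys 'submit', 'run_hours', 'procs',
    'cpu_hours' and 'user' (hypothetical names).
    """
    rng = rng or random.Random()
    previous_submit = 0.0  # ASSUMPTION: default for a missing first record
    cleaned = []
    for job in jobs:
        j = dict(job)
        # Missing submit time: reuse the previous job's time (keeps order).
        if j.get("submit") is None:
            j["submit"] = previous_submit
        previous_submit = j["submit"]
        # Missing execution time: uniform between one and two hours.
        if j.get("run_hours") is None:
            j["run_hours"] = rng.uniform(1.0, 2.0)
        # Missing processor count: uniform integer between 10 and 60.
        if j.get("procs") is None:
            j["procs"] = rng.randint(10, 60)
        # Missing average CPU time: 50% to 100% of the total run time.
        if j.get("cpu_hours") is None:
            j["cpu_hours"] = rng.uniform(0.5, 1.0) * j["run_hours"]
        # Missing user ID: assigned randomly (range is an assumption).
        if j.get("user") is None:
            j["user"] = rng.randrange(max_user_id)
        cleaned.append(j)
    return cleaned
```

Passing a seeded `random.Random` instance makes the generated values reproducible across simulation runs.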

5.1.2 Job and bid generation

For each record in a log file we generate a job that a user needs to execute and create a bid for it. There are two important parameters associated with a job that we need to generate, the requested bundle of VMs and the associated bid. First, to generate the bundle of VM instances for a job $j$, we determine its communication to computation ratio, $\rho_j = 1 - \frac{T_j^{CPU}}{T_j^{R}}$, where $T_j^{CPU}$ is the average CPU time and $T_j^{R}$ is the total run time of the job. The communication to computation ratio measures the fraction of the total runtime that is spent by the job on communication and synchronization among its processes. Based on this value, we categorize the job into one of $m$ categories, where $m$ is the number of VM types available. The job category specifies a 'first choice' of VM type for the job. This works as follows. We define a factor $\mu$ that characterizes how many of the total requested VMs will be requested as 'first choice' type VM instances. A job of category $i$ requesting $P_j$ processors will create a bundle comprising a number of $VM_i$ instances required to allocate $\mu P_j$ processors. The rest of the processors will be requested by arbitrarily choosing other VM types.

After creating the bundle, we generate the associated bid. To do that we first determine the speedup of the job, $S_j = P_j \times \frac{T_j^{CPU}}{T_j^{R}}$, where $P_j$ is the number of CPUs used, $T_j^{CPU}$ is the average CPU time, and $T_j^{R}$ is the total run time of the job. This speedup is multiplied by a 'valuation factor' to generate the bid. This valuation factor is linked to the type of user. We divide the users into five categories using their user ID, modulo five. The last parameter we set for a job is its deadline. Since there is no deadline information provided in the workload logs, we assume that the deadline is between 4 and 8 times the time required to complete the job. Hence, we set the deadline of a job to the required time multiplied by a random number between 4 and 8. We would like to mention here that the deadline is solely an attribute of the user and that the auction is not aware of the individual deadlines. They are only used to model the user behavior, that is, to determine when the user stops bidding if her requested bundle is not allocated, and whether a user's submitted job is completed or not.

We run the CA-GREEDY and CA-PROVISION mechanisms concurrently and independently considering the users who have jobs available for execution. A user (or job) participates in the auction until her job completes or it becomes certain that the job cannot finish by the deadline. A user is 'served' if her job completes its execution and 'not served', otherwise. Without loss of generality, in the rest of the paper, we assume that each user is submitting only one job and we will use 'user' and 'job' interchangeably.
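The per-job quantities defined in this subsection can be computed as in the Python sketch below. The function names are hypothetical, the inputs are assumed to be in seconds, and the deadline multiplier is assumed to be drawn from a uniform distribution on [4, 8], since the text only says "a random number between 4 and 8".

```python
import math
import random


def comm_ratio(t_cpu: float, t_run: float) -> float:
    """rho_j = 1 - T_j^CPU / T_j^R."""
    return 1.0 - t_cpu / t_run


def speedup(processors: int, t_cpu: float, t_run: float) -> float:
    """S_j = P_j * T_j^CPU / T_j^R."""
    return processors * t_cpu / t_run


def deadline_hours(run_seconds: float, rng: random.Random) -> float:
    """Required time (rounded up to whole hours) times a factor in [4, 8]."""
    required = math.ceil(run_seconds / 3600)
    # ASSUMPTION: the multiplier is uniformly distributed.
    return required * rng.uniform(4.0, 8.0)


if __name__ == "__main__":
    rng = random.Random(1)
    # Made-up job: 16 processors, 5000 s run time, 4000 s average CPU time.
    print(comm_ratio(4000, 5000), speedup(16, 4000, 5000),
          deadline_hours(5000, rng))
```

For the made-up job in the example, one fifth of the run time is attributed to communication, so the 16 requested processors yield a speedup of about 12.8.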

5.1.3 Auction setup

We consider a cloud provider that offers four different types of virtual machine instances, $VM_1$, $VM_2$, $VM_3$, and $VM_4$. These VM types are characterized by the weight vector $w = (1, 2, 4, 8)$. From each workload file, we extract $N$, the total number of users, and $M$, the total number of processors available. The number of users participating in a particular auction is determined dynamically as the auction progresses. That is, $n$ is the number of users that have been generated, have not yet been served, and whose job deadlines have not yet been exceeded.

We set up a few parameters to generate bundles specific to the jobs submitted by a user. The vector $(C_1, C_2, C_3)$ determines the communication ratios used to categorize the jobs. We use $(C_1, C_2, C_3) = (0.05, 0.15, 0.25)$, as follows. A job having a communication ratio below 0.05 is a job of type 1 and the majority of its needed VM instances, $\mu P_j$, will be requested as $VM_1$, where $P_j$ is the number of processors requested by user $u_j$. We consider the following values for $\mu$: 0.5 and 0.75. The rest of the bundle is arbitrarily determined using the other types of VM instances. We use the user ID field of the log file to determine the valuation range of the user. There are five classes of users submitting jobs. The class $t$ of a user is determined by ((user ID) mod 5). The logs have real user IDs, therefore this classification virtually creates a realistic distribution of users. Each class $t$ of users is associated with a 'valuation factor' $f_t$. Having determined that a user is of class $t$, we determine the valuation of her bundle using the speedup (as shown in the previous subsection) and the 'valuation factor' $f_t$ from the vector $f$. The vector $f$ has five elements (equal to the number of classes of users), each representing the mean value of how much a user of that class 'values' each 'unit of speedup'. In particular, a user $u_j$ having a speedup of $S_j$ for her job is willing to pay $f_t S_j$ on average for each hour of her requested bundle of VMs, given that $u_j$ falls in class $t$. We generate a random value between 0 and $2 f_t$, and then multiply it with $S_j$ to generate valuations with a mean of $f_t S_j$. We use two sets of vectors for $f$, as shown in Table 3.

TABLE 3: Simulation Parameters

Name               Description                                         Value(s)
$N$                Total users                                         From log file
$M$                Total CPUs                                          From log file
$T$                Simulation hours                                    From log file
$(c_I, c_R)$       Idle and running cost of unit VM                    (.05, .1), (.1, .25), (.15, .5)
$\mu$              Factor for CPUs for 'first choice' VM type          0.5, 0.75
$h$                Static distribution of processors among VM types    (.25, .25, .25, .25), (.07, .13, .27, .53)
$f$                Valuation factors for types of users                (.5, 1, 1.5, 2, 2.5), (1, 1.5, 2, 3, 4)
$C_1, C_2, C_3$    Boundaries of communication ratios                  (.05, .15, .25)
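A possible reading of the job classification and valuation rules is sketched below in Python. Several details are assumptions rather than statements from the paper: only the rule for type 1 (ratio below 0.05) is given explicitly, so the remaining boundaries are assumed to split the jobs into types 2, 3, and 4 in the same way; the number of 'first choice' instances is assumed to be rounded up; and the random factor is assumed to be uniform on $[0, 2 f_t]$, which gives the stated mean of $f_t S_j$. All function names are hypothetical.

```python
import bisect
import math
import random
from typing import Sequence

C = (0.05, 0.15, 0.25)          # boundaries (C_1, C_2, C_3) from Table 3
F = (0.5, 1.0, 1.5, 2.0, 2.5)   # first valuation-factor vector f
W = (1, 2, 4, 8)                # weight vector w of the four VM types


def job_category(rho: float, boundaries: Sequence[float] = C) -> int:
    """Map a communication ratio to a job type in 1..4.

    ASSUMPTION: type i covers ratios in [C_{i-1}, C_i); only the rule
    for type 1 is stated explicitly in the text.
    """
    return bisect.bisect_right(boundaries, rho) + 1


def first_choice_instances(processors: int, mu: float, category: int,
                           w: Sequence[int] = W) -> int:
    """Number of VM_i instances covering mu * P_j processors (rounded up)."""
    return math.ceil(mu * processors / w[category - 1])


def valuation(user_id: int, speedup: float, rng: random.Random,
              f: Sequence[float] = F) -> float:
    """Draw a valuation with mean f_t * S_j for user class t = id mod 5."""
    f_t = f[user_id % 5]
    return rng.uniform(0.0, 2.0 * f_t) * speedup
```

For example, under these assumptions a job of type 3 requesting 16 processors with $\mu = 0.5$ would ask for two $VM_3$ instances as its 'first choice', leaving eight processors to be requested through other VM types.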

CA-PROVISION determines by itself the configuration of the VMs that needs to be provisioned by the cloud provider, whereas CA-GREEDY assumes static VM provisioning, and thus, needs the VM configuration provisioned in advance. To generate the static provision of VMs required by CA-GREEDY we use a vector $h$ as follows. We use two instances of $h$ in our simulation experiments. The first one, $h = (0.07, 0.13, 0.27, 0.53)$, ensures that, given the weight vector $w$, the number of VM instances of each type is not the same. However, this vector ensures that about the same amount of computing resources as in the case of CA-PROVISION are provisioned. This same amount of resources is then provisioned as different types of VMs in the case of CA-GREEDY. This could be verified by multiplying each component of $h$ with the corresponding component of $w$ taken in reverse order, i.e., (8, 4, 2, 1), which gives 0.56, 0.52, 0.54, and 0.53. The other instance of vector $h$ that we use in our simulations, $h = (0.25, 0.25, 0.25, 0.25)$, ensures that the total number of processors is equally provisioned into different types of VMs.

We also evaluate CA-PROVISION against a modified version of CA-GREEDY which considers forecasting the demand of each type of VM instance. The forecasted demand is determined by executing CA-PROVISION for the $z$ percentage of the past user requests and calculating the average of the number of VM instances of each type provisioned by CA-PROVISION. This number is then used to statically provision the VM instances when running CA-GREEDY. These experiments give us insight on whether the demand forecast using a moving average can generate comparable performance to that obtained by CA-PROVISION. When running the experiments we select three values for $z$: 10%, 25%, and 50%.

We list all simulation parameters in Table 3. With all combinations of values, we perform 24 experiments with each log file. As an implementation note, the frequency of running the mechanisms is to be determined by the system designers based on the performance of the underlying software layer. In our experiments we executed the mechanisms every hour just to follow the standard practice in Amazon EC2. In a real cloud system, the actual time for provisioning must be considered when determining the frequency of running the mechanisms.

Fig. 1: CA-PROVISION vs. CA-GREEDY: (a) Average profit per processor-hour; (b) Average revenue per processor-hour; (c) Average cost per processor-hour. Horizontal axis shows the log file name with normalized load in parentheses: DAS2-fs4 (0.40), LPC-EGEE (0.53), LLNL-Thunder (0.54), DAS2-fs3 (0.69), DAS2-fs1 (0.75), ANL-Intrepid (0.77), LLNL-Atlas (1.09), SDSC-DS (1.10), DAS2-fs2 (1.44), DAS2-fs0 (2.01), LLNL-uBGL (7.41).

5.2 Analysis of Results

We investigate the performance of the two mechanisms for different workloads. Since the workloads are heterogeneous in several dimensions, we first define a metric in order to characterize the workloads, and thus, be able to establish an order among them. Then, we normalize the performance metrics of the mechanisms and compare them with respect to the workload characteristics. Finally, we try to gain more insight by comparing the allocation determined by the two mechanisms side by side.

We define a metric for comparing the workload logs as follows. Looking at the workload characteristics listed in Tables 1 and 2, we determine that the best metric to compare the workloads is the normalized load $\eta_\omega$, defined as $\eta_\omega = \frac{J_\omega \times T_\omega \times P_\omega}{M_\omega}$, where $\eta_\omega$ is the normalized load of workload $\omega$, $J_\omega$ is the average number of jobs submitted per hour, $T_\omega$ is the average runtime of the jobs, $P_\omega$ is the average number of processors required per job, and $M_\omega$ is the total number of processors in the system corresponding to workload $\omega$. The number of jobs per hour multiplied by the average processors per job determines how many processors are requested by the jobs arriving each hour. Multiplying this with the average runtime gives an estimate of the average number of processors requested by all jobs in an hour. Essentially, the normalized load measures the average amount of load per processor. When analyzing the results, we use the normalized load to rank the otherwise heterogeneous log files.
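As a quick check of the normalized load definition, the Python sketch below evaluates $\eta_\omega$ for three workloads using the rounded values listed in Tables 1 and 2. The function name `normalized_load` and the dictionary layout are illustrative choices only.

```python
def normalized_load(jobs_per_hour: float, avg_runtime_hours: float,
                    avg_procs_per_job: float, total_procs: int) -> float:
    """eta = (J * T * P) / M for one workload."""
    return jobs_per_hour * avg_runtime_hours * avg_procs_per_job / total_procs


# (J, T, P, M) taken from Tables 1 and 2.
workloads = {
    "ANL-Intrepid-2009": (12, 2.09, 5063, 163840),
    "LLNL-uBGL-2006": (21, 1.25, 576, 2048),
    "DAS2-fs4-2003": (4, 1.67, 4, 64),
}

if __name__ == "__main__":
    for name, params in workloads.items():
        print(name, round(normalized_load(*params), 2))
```

With these rounded inputs the three loads come out at roughly 0.78, 7.38, and 0.42, close to the values 0.77, 7.41, and 0.40 shown on the horizontal axis of Figure 1. The small differences are presumably due to rounding in the tabulated averages.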

From each set of simulation experiments, we compute the total revenue generated, the total cost incurred, and the total profit earned by each mechanism. Since the workloads were generated for different durations of time for systems with different numbers of processors, we scale the profit, revenue, and cost with respect to the total simulation hours and the number of processors. We define the profit per processor-hour as $\Pi^{ph}_\omega = \frac{\Pi_\omega}{M_\omega \times L_\omega}$, where $\Pi_\omega$ is the profit computed on workload $\omega$, $M_\omega$ is the total number of processors, and $L_\omega$ is the number of hours of data provided in workload $\omega$. We define revenue per processor-hour and cost per processor-hour in a similar fashion.

We plot the average profit, the average revenue, and the average cost per processor-hour versus the workload logs in Figures 1a to 1c. In these figures, the workloads are sorted in ascending order of their normalized load. Note that the CA-PROVISION mechanism yields higher revenue in most of the cases. For workloads with normalized loads greater than 1.44, the revenue obtained by CA-PROVISION steadily increases, exceeding that obtained by CA-GREEDY by up to 40%. This leads us to conclude that CA-PROVISION is capable of generating higher revenue where there is high demand for resources.

In Figure 1c, we observe that CA-PROVISION incurs a higher total cost for all workloads. Since CA-PROVISION decides about the number of VMs dynamically, it can allocate a higher number of VM instances than CA-GREEDY in an auction with identical bidders. This explains the higher cost incurred by CA-PROVISION; a unit VM instance costs $c_I$ per unit time when idle, and $c_R > c_I$ per unit time while running (i.e., allocated to a user), as we assumed in Section 2. Therefore, by provisioning and allocating more VM instances, CA-PROVISION incurs higher costs to the cloud provider.

Fig. 2: CA-PROVISION vs. CA-GREEDY: (a) Resource utilization; (b) Percentage of users served. Horizontal axis shows the log file name with normalized load in parentheses.

Now, the question is whether the interplay between increased revenue and increased cost can generate a higher profit. Utilizing more resources means serving more customers and hence selecting more bidders as winners. This interplay has two mutually opposite effects on the revenue. Obviously, increasing the number of winners has a positive effect on the revenue. On the other hand, selecting more winners pushes down their critical values, and thus, individual payments decrease. If the net effect is positive, we get a higher revenue and when it surpasses the increase in cost, we obtain a higher profit, and thus, achieve economies of scale. From Figure 1a, we see that for normalized loads greater than 1.44, CA-PROVISION consistently generates higher profit than CA-GREEDY and the difference in profit grows rapidly. We also observe that for the workloads having load factors below 1.44, CA-PROVISION and CA-GREEDY obtain higher profit in an equal number of cases. This suggests that for low loads the relative outcome of the mechanisms depends on other parameters.

In Figures 2a and 2b, we compare the resource utilization and the percentage of served users obtained by the two mechanisms. CA-PROVISION achieves higher values for both utilization and percentage of served users. We want to draw the attention of the reader to the fact that in most of the cases the difference in utilization is around 30%. This is where we can improve a lot if we switch from static to dynamic provisioning and allocation. Since combinatorial auctions are already established tools for efficient allocation, combining them with dynamic provisioning can lead to a highly efficient resource allocation mechanism for clouds.

The number of users served is higher for CA-PROVISION because the VM instances are not statically provisioned. Therefore, a user requesting two $VM_1$ instances will not be left unallocated if there are no $VM_1$ instances available but a $VM_2$ instance is available, as in the case of CA-GREEDY. Rather, CA-PROVISION 'sees' the available resource as a computing resource equivalent to two $VM_1$ instances and will allocate this, for instance, to a user bidding for two $VM_1$ instances or a user bidding for one $VM_2$ instance, depending on whose reported valuation is higher. This approach increases the number of users served by CA-PROVISION.

We now go into the details of the VM allocation by CA-GREEDY and CA-PROVISION for the DAS2-fs3-2003 workload. We pick a sample scenario from the various combinations of input parameters. In this experiment, the static VM allocation consists of 16 instances of type VM1, 8 instances of type VM2, 4 instances of type VM3, and 2 instances of type VM4. This is equivalent to 64 instances of unit size (i.e., type VM1). For this workload, a total of 4100 auctions were held, and in Figures 3 to 6 we show the allocation of the different VM instance types in all these auctions. The figures corresponding to the CA-PROVISION mechanism show the number of VM instances provisioned by the mechanism as thin vertical lines, where each line corresponds to an auction. For example, in Figure 3a, we see that in many auctions all 64 processors are configured as VM1 instances. On the other hand, there are auction outcomes where no VM1 instances are provisioned, as evidenced by the white strips touching the horizontal axis. The plots corresponding to the CA-GREEDY mechanism show (as thin vertical lines) the number of VM instances that are allocated to the users. In both categories of plots, we show the static allocation line to highlight the differences between static and dynamic provisioning.
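The unit-size equivalence of the static configuration can be checked with a few lines of arithmetic. The instance counts are those quoted in the text; the unit sizes of VM3 and VM4 (4 and 8) are an assumption, chosen because they are consistent with the stated total of 64 and with a VM2 being equivalent to two VM1 instances.

```python
# Statically provisioned instance counts quoted in the text.
static_counts = {"VM1": 16, "VM2": 8, "VM3": 4, "VM4": 2}

# Assumed sizes in VM1-equivalent units (VM3 and VM4 sizes are assumptions).
unit_size = {"VM1": 1, "VM2": 2, "VM3": 4, "VM4": 8}

# Capacity held by each instance type, in unit-size instances.
units_per_type = {t: static_counts[t] * unit_size[t] for t in static_counts}
total_units = sum(units_per_type.values())

print(units_per_type)  # {'VM1': 16, 'VM2': 16, 'VM3': 16, 'VM4': 16}
print(total_units)     # 64
```

Under these assumed sizes, the static configuration assigns a quarter of the capacity to each VM type, which is why a dynamically provisioned outcome with all 64 units configured as VM1 instances lies far above the static VM1 line.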

Figure 3a is particularly interesting because it shows that at times the demand for VM1 goes far beyond what we would even think of allocating in advance. In some auctions the demand for VM1 instances is much higher, pushing the allocation to the boundary. On the other hand, comparing it with Figure 3b, we see that CA-GREEDY can indeed capture the demand and allocate all sixteen available instances of VM1 in most of the auctions, but it is limited by the availability of statically provisioned VMs. Eventually it has to serve other, lower-valued bids and loses revenue. CA-GREEDY also suffers from under-allocation, as is clear from Figures 5a and 5b. We see that the actual demand for VM3 instances is lower than what we allocate statically (Figure 5a) and that the VM instances indeed remain unallocated in many cases (Figure 5b).

Fig. 3: Allocation of VM1 instances: (a) by CA-PROVISION; (b) by CA-GREEDY. Workload file: DAS2-fs3-2003. [Plots: number of VM1 instances (0 to 80) versus auction number (0 to 4000); series: actual allocation, static allocation.]

Fig. 4: Allocation of VM2 instances: (a) by CA-PROVISION; (b) by CA-GREEDY. Workload file: DAS2-fs3-2003. [Plots: number of VM2 instances (0 to 20) versus auction number (0 to 4000); series: actual allocation, static allocation.]

Fig. 5: Allocation of VM3 instances: (a) by CA-PROVISION; (b) by CA-GREEDY. Workload file: DAS2-fs3-2003. [Plots: number of VM3 instances (0 to 10) versus auction number (0 to 4000); series: actual allocation, static allocation.]

In the last set of experiments, we evaluate CA-PROVISION against a modified version of CA-GREEDY that forecasts the demand for each type of VM instance (as described in Subsection 5.1.3). In Figure 7a, we plot the overall profit for CA-GREEDY when 10% of the previous requests are used to compute the average number of VM instances to provision. We observe that even with demand forecasting CA-GREEDY cannot match the profit obtained by CA-PROVISION.

Fig. 6: Allocation of VM4 instances: (a) by CA-PROVISION; (b) by CA-GREEDY. Workload file: DAS2-fs3-2003. [Plots: number of VM4 instances (0 to 5) versus auction number (0 to 4000); series: actual allocation, static allocation.]

Fig. 7: CA-PROVISION vs. CA-GREEDY, where CA-GREEDY employs demand forecast over 10% of the past requests: (a) Average profit per processor-hour; (b) Resource utilization. Horizontal axis shows the log file name with normalized load in parentheses. [Workload files (normalized load): DAS2-fs4 (0.40), LPC-EGEE (0.53), LLNL-Thunder (0.54), DAS2-fs3 (0.69), DAS2-fs1 (0.75), ANL-Intrepid (0.77), LLNL-Atlas (1.09), SDSC-DS (1.10), DAS2-fs2 (1.44), DAS2-fs0 (2.01), LLNL-uBGL (7.41).]

This is due mainly to the large variations in the number of VM instances requested by users in the log files considered for the experiments. Since similar performance is obtained when the moving average is calculated over 25% and 50% of the previous users’ requests, we do not present those results here. When we examine the utilization obtained by the two mechanisms (Figure 7b), we observe that CA-GREEDY with demand forecast is not able to reach the level of utilization achieved by CA-PROVISION. Since the number of requested VM instances of each type varies greatly over time in the considered logs, an average over a window of time cannot capture the demand well and cannot improve the utilization of resources. Demand forecasting using the moving average may improve the performance of CA-GREEDY in cases where the users’ requests do not exhibit large variations over time.
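To illustrate why a moving average struggles with highly variable demand, the following sketch forecasts the number of instances of one VM type to provision as the mean of the most recent fraction of observed demands. This is a simplified stand-in for the forecasting scheme of Subsection 5.1.3, not a reproduction of it: the function names, the window rule, and the two demand sequences are hypothetical.

```python
from statistics import mean


def forecast_supply(demand_history, fraction):
    """Average the most recent `fraction` of observed demands (rounded)."""
    if not demand_history:
        return 0
    window = max(1, int(len(demand_history) * fraction))
    return round(mean(demand_history[-window:]))


def mean_abs_error(history, fraction, warmup):
    """Mean absolute gap between each forecast and the demand that follows."""
    errors = [
        abs(forecast_supply(history[:t], fraction) - history[t])
        for t in range(warmup, len(history))
    ]
    return mean(errors)


# Hypothetical per-auction demand for a single VM type.
steady = [8, 9, 8, 9, 8, 9, 8, 9, 8, 9]        # small variations over time
bursty = [0, 16, 0, 16, 0, 16, 0, 16, 0, 16]   # large variations over time

steady_err = mean_abs_error(steady, fraction=0.5, warmup=4)
bursty_err = mean_abs_error(bursty, fraction=0.5, warmup=4)

print(f"steady demand: mean forecast error = {steady_err:.2f}")
print(f"bursty demand: mean forecast error = {bursty_err:.2f}")
```

For the steady sequence the forecast stays within about one instance of the realized demand, while for the bursty sequence it is off by several instances in every auction, so the statically provisioned supply is either idle or insufficient.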

We can summarize the experimental results as follows. The CA-GREEDY mechanism is capable of generating higher revenue than CA-PROVISION when the demand matches the supply. Also, in an auction where items are not ‘configurable’, CA-GREEDY is a very efficient auction. But when the items are reconfigurable, as is the case in clouds, it is very hard to predict the demand well in advance. In that case, CA-PROVISION is the better option and, since today’s technology supports it, it can be deployed as a stand-alone configuration and allocation tool.

6 CONCLUSION

We addressed the problem of dynamically provisioning VM instances in clouds in order to generate higher profit, while determining the VM allocation with a combinatorial auction-based mechanism. We designed a mechanism called CA-PROVISION to solve this problem. We performed extensive simulation experiments with real workloads to evaluate our mechanism. The results showed that CA-PROVISION can effectively capture the market demand, provision the computing resources to match the demand, and generate higher revenue than CA-GREEDY, especially in high demand cases. In some of the low demand cases, CA-GREEDY performs better than CA-PROVISION in terms of profit but not in terms of utilization and percentage of served users. We conclude that a highly effective VM instance provisioning and allocation system can be designed by combining these two combinatorial auction-based mechanisms. We look forward to setting up a private cloud and implementing such a system in the near future.

ACKNOWLEDGMENTS

This paper is a revised and extended version of [31], presented at the 3rd IEEE International Conference on Cloud Computing Technology and Science (IEEE CloudCom 2011). This work was supported in part by NSF grants DGE-0654014 and CNS-1116787.

REFERENCES

[1] Microsoft, “Windows Azure platform,” http://www.microsoft.com/windowsazure/.

[2] Amazon, “Amazon Elastic Compute Cloud (Amazon EC2),” http://aws.amazon.com/ec2/.

[3] R. Wang, “Auctions versus posted-price selling,” The American Economic Review, vol. 83, no. 4, pp. 838–851, 1993.

[4] N. Nisan, T. Roughgarden, E. Tardos, and V. V. Vazirani, Algorithmic Game Theory. Cambridge University Press, 2007.

[5] S. Zaman and D. Grosu, “Combinatorial auction-based allocation of virtual machine instances in clouds,” in Proc. 2nd IEEE Intl. Conf. on Cloud Comp. Technology and Science, 2010, pp. 127–134.

[6] D. G. Feitelson, “Parallel Workloads Archive: Logs,” http://www.cs.huji.ac.il/labs/parallel/workload/logs.html.

[7] Amazon, “Amazon EC2 instance types,” http://aws.amazon.com/ec2/instance-types/.

[8] P. Shivam, A. Demberel, P. Gunda, D. Irwin, L. Grit, A. Yumerefendi, S. Babu, and J. Chase, “Automated and on-demand provisioning of virtual machines for database applications,” in Proc. ACM SIGMOD International Conference on Management of Data, 2007, pp. 1079–1081.

[9] A. Quiroz, H. Kim, M. Parashar, N. Gnanasambandam, and N. Sharma, “Towards autonomic workload provisioning for enterprise grids and clouds,” in Proc. 10th IEEE/ACM International Conference on Grid Computing, 2009, pp. 50–57.

[10] C. Vecchiola, R. N. Calheiros, D. Karunamoorthy, and R. Buyya, “Deadline-driven provisioning of resources for scientific applications in hybrid clouds with Aneka,” Future Generation Computer Systems, vol. 28, pp. 58–65, 2012.

[11] R. Wolski, J. S. Plank, J. Brevik, and T. Bryan, “Analyzing market-based resource allocation strategies for the computational grid,” The International Journal of High Performance Computing Applications, vol. 15, no. 3, pp. 258–281, 2001.

[12] A. Das and D. Grosu, “Combinatorial auction-based protocols for resource allocation in grids,” in Proc. 19th International Parallel and Distributed Processing Symposium, 6th Workshop on Parallel and Distributed Scientific and Engineering Computing, 2005.

[13] H. Wang, Q. Jing, R. Chen, B. He, Z. Qian, and L. Zhou, “Distributed systems meet economics: Pricing in the cloud,” in Proc. 2nd USENIX Workshop on Hot Topics in Cloud Computing, 2010.

[14] J. Altmann, C. Courcoubetis, G. D. Stamoulis, M. Dramitinos, T. Rayna, M. Risch, and C. Bannink, “GridEcon: A market place for computing resources,” in Proc. Workshop on Grid Economics and Business Models, 2008, pp. 185–196.

[15] M. Risch, J. Altmann, L. Guo, A. Fleming, and C. Courcoubetis, “The GridEcon platform: A business scenario testbed for commercial cloud services,” in Proc. Workshop on Grid Economics and Business Models, 2009, pp. 46–59.

[16] Amazon, “Amazon EC2 spot instances,” http://aws.amazon.com/ec2/spot-instances/.

[17] O. A. Ben-Yehuda, M. Ben-Yehuda, A. Schuster, and D. Tsafrir, “Deconstructing Amazon EC2 spot instance pricing,” in Proc. 3rd IEEE Int’l Conf. on Cloud Computing Technology and Science, 2011.

[18] S. Wee, “Debunking real-time pricing in cloud computing,” in Proc. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2011, pp. 585–590.

[19] Q. Zhang, Q. Zhu, and R. Boutaba, “Dynamic resource allocation for spot markets in cloud computing environments,” in Proc. 4th IEEE Intl. Conf. on Utility and Cloud Computing, 2011, pp. 178–185.

[20] M. H. Rothkopf, A. Pekec, and R. M. Harstad, “Computationally manageable combinatorial auctions,” Management Science, vol. 44, no. 8, pp. 1131–1147, 1998.

[21] T. Sandholm, “Algorithm for optimal winner determination in combinatorial auctions,” Artificial Intelligence, vol. 135, no. 1-2, pp. 1–54, 2002.

[22] E. Zurel and N. Nisan, “An efficient approximate allocation algorithm for combinatorial auctions,” in Proc. 3rd ACM Conference on Electronic Commerce, 2001, pp. 125–136.

[23] P. Cramton, Y. Shoham, and R. Steinberg, Combinatorial Auctions. The MIT Press, 2005.

[24] D. Lehmann, L. I. O’Callaghan, and Y. Shoham, “Truth revelation in approximately efficient combinatorial auctions,” Journal of the ACM, vol. 49, no. 5, pp. 577–602, 2002.

[25] A. Mu’alem and N. Nisan, “Truthful approximation mechanisms for restricted combinatorial auctions,” in Proc. 18th National Conference on Artificial Intelligence, 2002, pp. 379–384.

[26] P. Briest, P. Krysta, and B. Vöcking, “Approximation techniques for utilitarian mechanism design,” SIAM Journal on Computing, vol. 40, no. 6, pp. 1587–1622, 2011.

[27] C. Chekuri and I. Gamzu, “Truthful mechanisms via greedy iterative packing,” in Proc. 12th Workshop on Approximation Algorithms for Combinatorial Optimization Problems, 2009, pp. 56–69.

[28] U. Lampe, M. Siebenhaar, A. Papageorgiou, D. Schuller, and R. Steinmetz, “Maximizing cloud provider profit from equilibrium price auctions,” in Proc. 5th IEEE Intl. Conf. on Cloud Computing, 2012, pp. 83–90.

[29] V. Prasad, S. Rao, and A. Prasad, “A combinatorial auction mechanism for multiple resource procurement in cloud computing,” in Proc. 12th Intl. Conf. on Intelligent Systems Design and Applications, 2012, pp. 337–344.

[30] D. G. Feitelson, “Parallel Workloads Archive: Standard Workload Format,” http://www.cs.huji.ac.il/labs/parallel/workload/swf.html.

[31] S. Zaman and D. Grosu, “Combinatorial auction-based dynamic VM provisioning and allocation in clouds,” in Proc. 3rd IEEE Intl. Conf. on Cloud Comp. Technology and Science, 2011, pp. 107–114.

Sharrukh Zaman received his Bachelors of Computer Science and Engineering degree from Bangladesh University of Engineering and Technology, Dhaka, Bangladesh. He is currently a Ph.D. candidate in the Department of Computer Science, Wayne State University, Detroit, Michigan. His research interests include cloud computing, distributed systems, game theory and mechanism design. He is a student member of the IEEE.

Daniel Grosu received the Diploma in engineering (automatic control and industrial informatics) from the Technical University of Iasi, Romania, in 1994 and the MSc and PhD degrees in computer science from the University of Texas at San Antonio in 2002 and 2003, respectively. Currently, he is an associate professor in the Department of Computer Science, Wayne State University, Detroit. His research interests include distributed systems and algorithms, resource allocation, computer security and topics at the border of computer science, game theory and economics. He has published more than eighty peer-reviewed papers in the above areas. He has served on the program and steering committees of several international meetings in parallel and distributed computing. He is a senior member of the ACM, the IEEE, and the IEEE Computer Society.