
International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013


IJCC Editorial Board

Editors-in-Chief
Hemant Jain, University of Wisconsin–Milwaukee, USA
Rong Chang, IBM T.J. Watson Research Center, USA

Associate Editor-in-Chief
Bing Li, Wuhan University, China

Editorial Board
Danilo Ardagna, Politecnico di Milano, Italy
Janaka Balasooriya, Arizona State University, USA
Roger Barga, Microsoft Research, USA
Viraj Bhat, Yahoo, USA
Rajdeep Bhowmik, Cisco Systems, Inc., USA
Jiannong Cao, Hong Kong Polytechnic University, Hong Kong
Buqing Cao, Hunan University of Science and Technology, China
Keke Chen, Wright State University, USA
Haopeng Chen, Shanghai Jiao Tong University, China
Malolan Chetlur, IBM India, India
Alfredo Cuzzocrea, ICAR-CNR & University of Calabria, Italy
Ernesto Damiani, University of Milan, Italy
De Palma, University Joseph Fourier, France
Claude Godart, Nancy University and INRIA, France
Nils Gruschka, University of Applied Sciences, Germany
Paul Hofmann, Saffron Technology, USA
Ching-Hsien Hsu, Chung Hua University, Taiwan
Patrick Hung, University of Ontario Institute of Technology, Canada
Hai Jin, HUST, China
Li Kuang, Central South University, China
Grace Lin, Institute for Information Industry, Taiwan
Xumin Liu, Rochester Institute of Technology, USA
Shiyong Lu, Wayne State University, USA
J.P. Martin-Flatin, EPFL, Switzerland
Vijay Naik, IBM T.J. Watson Research Center, USA
Surya Nepal, Commonwealth Scientific and Industrial Research Organisation, Australia
Norbert Ritter, University of Hamburg, Germany
Josef Schiefer, Vienna University of Technology, Austria
Jun Shen, University of Wollongong, Australia
Weidong Shi, University of Houston, USA
Liuba Shrira, Brandeis University, USA
Kwang Mong Sim, University of Kent, UK
Wei Tan, IBM T.J. Watson Research Center, USA
Tao Tao, IBM T.J. Watson Research Center, USA
Kunal Verma, Accenture Technology Labs, USA
Raymond Wong, University of New South Wales & NICTA, Australia
Qi Yu, Rochester Institute of Technology, USA
Jia Zhang, Carnegie Mellon University – Silicon Valley, USA
Gong Zhang, Oracle Corporation, USA


Call for Articles: International Journal of Cloud Computing

Mission

Cloud Computing has become the de facto computing paradigm for Internet-scale service development, delivery, brokerage, and consumption in the era of Services Computing, fueling innovative business transformation and a connected human society. It is projected that 15 billion smart devices will be communicating dynamically over interconnected clouds by 2015 as integral components of various industrial service ecosystems. The technical foundations of this trend include Service-Oriented Architecture (SOA), business & IT process automation, software-defined computing resources, elastic programming models & frameworks, and big data management and analytics. In terms of the delivered service capabilities, a cloud service could be, among other as-a-service types, an infrastructure service (managing compute, storage, and network resources), a platform service (provisioning generic or industry-specific programming APIs & runtimes), a software application service (offering email-like ready-to-use application capabilities), a business process service (providing a managed process for, e.g., card payment), a mobile backend service (facilitating the integration between mobile apps and backend cloud storage and capabilities), or an Internet-of-Things service (connecting smart machines with enablement capabilities for industrial clouds). The International Journal of Cloud Computing (IJCC) aims to be a reputable resource providing leading technologies, development, ideas, and trends to an international readership of researchers and engineers in the field of Cloud Computing. IJCC only considers extended versions of conference papers published at reputable conferences such as the IEEE International Conference on Cloud Computing.

Topics

The International Journal of Cloud Computing (IJCC) covers state-of-the-art technologies and best practices of Cloud Computing, as well as emerging standards and research topics which would define the future of Cloud Computing. Topics of interest include, but are not limited to, the following:

- ROI Model for Infrastructure, Platform, Application, Business, Social, Mobile, and IoT Clouds
- Cloud Computing Architectures and Cloud Solution Design Patterns
- Self-service Cloud Portal, Business Dashboard, and Operations Management Dashboard
- Autonomic Process and Workflow Management in Clouds
- Cloud Service Registration, Composition, Federation, Bridging, and Bursting
- Cloud Orchestration, Scheduling, Autoprovisioning, and Autoscaling
- Cloud Enablement in Storage, Data, Messaging, Streaming, Search, Analytics, and Visualization
- Software-Defined Resource Virtualization, Composition, and Management for Cloud
- Security, Privacy, Compliance, SLA, and Risk Management for Public, Private, and Hybrid Clouds
- Cloud Quality Monitoring, Service Level Management, and Business Service Management
- Cloud Reliability, Availability, Serviceability, Performance, and Disaster Recovery Management
- Cloud Asset, Configuration, Software Patch, License, and Capacity Management
- Cloud DevOps, Image Lifecycle Management, and Migration
- Cloud Solution Benchmarking, Modeling, and Analytics
- High Performance Computing and Scientific Computing in Cloud
- Cloudlet, Cloud Edge Server, Cloud Gateway, and IoT Cloud Devices
- Cloud Programming Model, Paradigm, and Framework
- Cloud Metering, Rating, and Accounting
- Innovative Cloud Applications and Experiences
- Green Cloud Computing and Cloud Data Center Modularization
- Economic Model and Business Consulting for Cloud Computing


International Journal of Cloud Computing

October-December 2013, Vol. 1, No. 2

Table of Contents

EDITOR-IN-CHIEF PREFACE  iv
Hemant Jain, University of Wisconsin–Milwaukee, USA
Rong Chang, IBM T.J. Watson Research Center, USA

RESEARCH ARTICLES

1  Cost-Driven Optimization of Cloud Resource Allocation for Elastic Processes
Stefan Schulte, Vienna University of Technology, Austria
Philipp Hoenisch, Vienna University of Technology, Austria
Schahram Dustdar, Vienna University of Technology, Austria
Dieter Schuller, Technische Universität Darmstadt, Germany
Ulrich Lampe, Technische Universität Darmstadt, Germany
Ralf Steinmetz, Technische Universität Darmstadt, Germany

15  Recommending Optimal Cloud Configuration based on Benchmarking in Black-Box Clouds
Gueyoung Jung, Xerox Research Center, USA
Naveen Sharma, Xerox Research Center, USA
Tridib Mukherjee, Xerox Research Center, USA
Frank Goetz, Xerox Research Center, USA
Julien Bourdaillet, Xerox Research Center, USA

28  Taming the Uncertainty of Public Clouds
Maxim Schnjakin, Hasso Plattner Institute, Germany
Christoph Meinel, Potsdam University, Germany

48  Cloud Standby System and Quality Model
Alexander Lenk, FZI Forschungszentrum Informatik, Germany
Frank Pallas, FZI Forschungszentrum Informatik, Germany

60  Efficient Private Cloud Operation using Proactive Management Service
Dapeng Dong, University College Cork, Ireland
John Herbert, University College Cork, Ireland

72  Call for Papers: IEEE CLOUD/ICWS/SCC/MS/BigData/SERVICES 2014
Call for Articles: International Journal of Services Computing (IJSC)
Call for Articles: International Journal of Big Data (IJBD)


Editor-in-Chief Preface: Cloud Management and Assessment

Hemant Jain, University of Wisconsin–Milwaukee, USA
Rong Chang, IBM T.J. Watson Research Center, USA

Welcome to the second issue of the inaugural volume of the International Journal of Cloud Computing (IJCC), the first open-access online journal on cloud computing. The increasing importance of cloud computing is evidenced by the rapid adoption of this technology in businesses around the globe. Cloud computing is redefining the business model of various industries, from video rental (Netflix is enabled by the cloud) to small start-up companies (which can be started with very little investment using cloud infrastructure). The potential of cloud computing is even more promising: combined with developments like the Internet of Things, it can significantly change life as we know it today. However, to deliver on these promises and to prevent cloud computing from becoming a passing fad, significant technical, economic, and business issues need to be addressed. IJCC is designed to be an important platform for disseminating high-quality research on the above issues in a timely manner and for providing an ongoing platform for continuous discussion of research published in this journal. We aim to publish high-quality research that addresses important technical challenges, the economics of sustaining this environment, and business issues related to the use of this technology, including privacy and security concerns, legal protection, etc. We seek to publish original research articles, expanded versions of papers presented at high-quality conferences, key survey articles that summarize the research done so far and identify important research issues, and some visionary articles. We will make every effort to publish articles in a timely manner.

This issue collects the extended versions of five IEEE CLOUD 2013 articles in the general area of managing cloud computing environments.

The first article is titled "Cost-Driven Optimization of Cloud Resource Allocation for Elastic Processes" by Schulte, Schuller, Hoenisch, Lampe, Steinmetz, and Dustdar. Based upon the notion of Elastic Processes, they present an automated cost-driven approach to optimally leasing and releasing cloud-based computational resources for each process step, thus avoiding the over-provisioning of fixed resources for each process. An empirical study and analysis are presented.

The second article is titled "Recommending Optimal Cloud Configuration based on Benchmarking in Black-Box Clouds" by Jung, Sharma, Mukherjee, Goetz, and Bourdaillet. They present a benchmark-based modeling approach to recommending the optimal cloud configuration for deploying user workloads, in terms of the various non-standardized configuration options offered by cloud providers. The approach employs a workload-specific VM performance assessment process and an efficient search algorithm that recommends the near-optimal cloud configuration costing the least while meeting the throughput goal. Experimental results are reported.

The third article is titled "Taming the Uncertainty of Public Clouds" by Schnjakin and Meinel. They present a framework featuring improved availability, confidentiality, and reliability of data stored in the cloud. User data is encrypted, and RAID technology is used to manage data distribution across cloud storage providers. Experiments are conducted to evaluate the performance and cost-effectiveness of the presented approach.

The fourth article is titled "Cloud Standby System and Quality Model" by Lenk and Pallas. The authors present a cost-effective approach to deploying cloud standby systems, which enhance the availability of a specific IT infrastructure by replicating it to the Cloud. The approach uses a Markov-based quality model to assist in analyzing and configuring cloud standby systems.

The fifth article is titled "Efficient Private Cloud Operation Using Proactive Management Service" by Dong and Herbert.


The authors present a distributed service architecture aiming to provide an automated, shared, and off-site operation management service for private clouds. A prototype system and empirical study are presented.

We would like to thank the authors for their effort in delivering these five quality articles. We would also like to thank the reviewers as well as the Program Committee of IEEE CLOUD 2013 for their help with the review process. Finally, we are grateful for the effort Jia Zhang and Liang-Jie Zhang have made for this issue of the International Journal of Cloud Computing (IJCC).

About the Editors-in-Chief

Dr. Hemant Jain is the Interim Director of the Biomedical and Health Informatics Research Institute, Roger L. Fitzsimonds Distinguished Scholar, and Professor of Information Technology Management at the University of Wisconsin–Milwaukee. Dr. Jain specializes in information system agility through web services, service-oriented architecture, and component-based development. His current interests include the development of systems to support real-time enterprises which have situational awareness, can quickly sense and respond to opportunities and threats, and can track and trace important items. He is also working on issues related to providing quick access to relevant knowledge for cancer treatment and to providing medical services through a virtual world. Dr. Jain is an expert in architecture design, database management, and data warehousing. He teaches courses in database management, IT infrastructure design and management, and process management using SAP. Dr. Jain was the Associate Editor-in-Chief of IEEE Transactions on Services Computing and is an Associate Editor of the Journal of AIS, the flagship journal of the Association for Information Systems.

Dr. Rong N. Chang is Manager & Research Staff Member at the IBM T.J. Watson Research Center. He received his Ph.D. degree in computer science & engineering from the University of Michigan at Ann Arbor in 1990 and his B.S. degree in computer engineering with honors from the National Chiao Tung University in Taiwan in 1982. Before joining IBM in 1993, he was with Bellcore researching B-ISDN realization. He is a holder of the ITIL Foundation Certificate in IT Service Management. His accomplishments at IBM include the completion of a Micro MBA Program, one IEEE Best Paper Award, and many IBM awards, including four corporate-level Outstanding Technical Achievement Awards and six division-level accomplishments. He is an Associate Editor of the IEEE Transactions on Services Computing and the International Journal of Services Computing. He has chaired many conferences & workshops on cloud computing and Internet-enabled distributed services and applications. He is an ACM Distinguished Member/Engineer, a Senior Member of the IEEE, and a member of the Eta Kappa Nu and Tau Beta Pi honor societies.


COST-DRIVEN OPTIMIZATION OF CLOUD RESOURCE ALLOCATION FOR ELASTIC PROCESSES

Stefan Schulte¹, Dieter Schuller², Philipp Hoenisch¹, Ulrich Lampe², Ralf Steinmetz², Schahram Dustdar¹
¹Distributed Systems Group, Vienna University of Technology, Austria
Email: {s.schulte, p.hoenisch, dustdar}@infosys.tuwien.ac.at
²Multimedia Communications Lab (KOM), Technische Universität Darmstadt, Germany
Email: {firstName.lastName}@KOM.tu-darmstadt.de

Abstract

Today's extensive business process landscapes make it necessary to handle the execution of a large number of business processes and individual process steps. Especially if process steps require the invocation of resource-intensive applications or a large number of applications need to be executed concurrently, process owners may have to allocate extensive computational resources, leading to high fixed cost. In the work at hand, we propose an alternative to the provision of fixed resources, based on automatic leasing and releasing of Cloud-based computational resources. For this, we present an integrated approach which addresses the cost-driven optimization of Cloud-based computational resources for business processes in order to realize so-called Elastic Processes. Through an evaluation, we show the practical applicability and benefits of our contributions. Specifically, we find that our approach substantially reduces the cost compared to an ad hoc approach.

Keywords: Elastic Processes, Cloud Computing, Business Process Execution

1. INTRODUCTION

Nowadays, IT support for the execution of business processes is an essential prerequisite in many industries. For example, in the finance industry, trade settlement or execution control processes are executed automatically (Gewald & Dibbern, 2009). In the energy domain, computational resources are needed to carry out essential decision processes, and a particular necessity is to support the processing of large amounts of data in so-called Smart Grids (Rohjans et al., 2012).

Especially in large companies, the number of different business process models available can become extensive (Breu et al., 2013; Jin et al., 2013). Correspondingly, a business process landscape may comprise a very large number of running business process instances, all of which are made up of single tasks (i.e., process steps) with differing computational resource demands. Over time, the invocation of new process instances and the completion of running process instances lead to ever-changing computational resource demands, which need to be met by a company. Apparently, computational resource demands during peak times (i.e., when many and/or resource-intensive tasks need to be carried out concurrently) will be much higher than in normal times – especially in volatile domains (Maurer et al., 2013).

On the one hand, permanently providing computational resources that can cover the demand during peak times not only leads to high fixed cost, but also means that the resources will not be utilized most of the time (overprovisioning). On the other hand, providing computational resources which can cover only part of the processes' resource demand will lead to lower fixed cost, but also to the risk that some processes cannot be carried out during peak times (underprovisioning) or will suffer from low Quality of Service (QoS).

To avoid the drawbacks arising from over- and underprovisioning, computational resources should be scalable, i.e., the available resources should be increased or decreased based on the demands of the running and future business process instances. Applying Cloud technologies to provide the needed resources allows exactly this: (i) leasing and releasing computational resources in an on-demand, utility-like fashion, (ii) rapid elasticity through scaling the infrastructure up and down if necessary, and (iii) pay-per-use through metered service (Armbrust et al., 2010; Buyya et al., 2009).

So far, only few researchers have provided methods and solutions to facilitate Elastic Processes, i.e., processes which are carried out using elastic Cloud resources (Dustdar et al., 2011; Andrikopoulos et al., 2013). Current Business Process Management Systems (BPMS) not only "lack the ability to learn, mine, and reason suitable resource allocation knowledge in business process execution" (Pesic and van der Aalst, 2007; Huang et al., 2011), but are also unable to make use of Cloud-based computational resources. In our former work (Schulte et al., 2013; Hoenisch et al., 2013; Hoenisch et al., 2013a), we have presented the Vienna Platform for Elastic Processes (ViePEP), which combines the functionalities of a BPMS with those of a Cloud resource management system. ViePEP is able to schedule complete processes as well as the involved single tasks, and to lease and release Cloud-based computational resources in terms of Virtual Machines (VMs) while taking into account Service Level Objectives (SLOs) defined by the process owners.

Within this paper, we extend our former work by addressing the problem of online Cloud resource allocation for Elastic Processes based on process requests from various clients (process owners). In this scenario, it is necessary to schedule task executions and to lease and release Cloud resources in order to carry out the single tasks under given SLOs. To encounter the complexity of this scenario, it is necessary to predict the resource demands of tasks, develop a cost model, predict the cost, and perform a cost/performance analysis. This has to be done continuously, as new process requests arrive, software services representing single process tasks do not behave as predicted, or a process instance is changed by the process owner. The goal is to provide cost-efficient process scheduling that takes into account the given SLOs and leases and releases Cloud-based computational resources (i.e., VMs) in order to optimize cost. Hence, the main contributions in this paper are:

- We formulate a model for scheduling and resource allocation for Elastic Processes.
- We design a heuristic based on the proposed model.
- We integrate this work into the Vienna Platform for Elastic Processes.

The remainder of this paper is organized as follows: In Section 2, we introduce the overall scenario, ViePEP, and some prerequisites for our research work. Afterwards, we present a scheduling and resource allocation solution for Elastic Processes in Section 3 – for this, we define an integer linear optimization problem and a corresponding heuristic. We evaluate the scheduling and resource allocation algorithms through ViePEP-based testbed experiments (Section 4). In Section 5, we discuss the related work. The paper closes with a summary and an outlook on our future work.

2. PRELIMINARIES

2.1 GENERAL SCENARIO

Within this paper, we assume that a business process landscape is made up of a large number of business processes, which can be carried out automatically. The part of a business process which can be executed using machine-based computational resources is also known as a workflow (Ludäscher et al., 2009). Therefore, in the remainder of this paper, we will use the term workflow in order to identify an executable process. The automated processing of such workflows is a prominent field of research and has resulted in various concepts, methodologies, and frameworks (Mutschler et al., 2008). In recent years, the focus of this research has been primarily on the composition of workflows from software services (Dustdar and Schreiner, 2005; Schuller et al., 2012).

While we also assume that workflows are composed of software services, we do not expect that companies are willing to fully outsource important services to external providers, since these services will then be outside the control domain of the process owner. In contrast, making use of private and public Cloud resources to host service instances, which are then invoked as workflow steps, leaves the control with the process owner: If particular service instances or VMs fail, the process owner or the administrator of the business process landscape is able to deploy further service instances.

Making use of VMs to host particular services allows sharing resources among workflows, as the same service instance might be invoked within different workflows at the same time. Notably, it is important to distinguish between services, which are the basic building blocks of workflows, service instances, which are hosted by VMs, and service invocations, which denote the unique execution of a service instance in order to serve a particular workflow request, i.e., a workflow instance.
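This three-way distinction can be made concrete with a few small types; the following is an illustrative data model under assumed names, not code taken from ViePEP:

import java.util.List;

// Illustrative data model for the three notions distinguished above; all names are
// hypothetical, not taken from ViePEP's code base.
public class Terminology {
    record Service(String id) {}                                   // basic building block of workflows
    record ServiceInstance(Service service, String backendVmId) {} // a Service deployed on a Backend VM
    record ServiceInvocation(ServiceInstance instance,             // one execution of a ServiceInstance,
                             String workflowInstanceId) {}         // serving a particular workflow instance

    public static void main(String[] args) {
        Service s = new Service("render-invoice");
        // The same service instance can serve invocations from different workflow instances.
        ServiceInstance si = new ServiceInstance(s, "backend-vm-1");
        List.of(new ServiceInvocation(si, "wf-42"), new ServiceInvocation(si, "wf-43"))
            .forEach(System.out::println);
    }
}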

In our scenario, workflows may be requested at any time by process owners and may be carried out at regular intervals or on a nonrecurring basis. It is the duty of a software framework (in our case, ViePEP), which combines the functionalities of a BPMS and a Cloud resource management system, to accept incoming workflow requests, schedule the workflow steps, and lease and release computational resources based on the workflow scheduling plan.

Process owners are able to define different QoS constraints as SLOs on the level of workflow instances. Without a doubt, the execution deadline is the most important SLO: Some business-uncritical workflows may have very loose deadlines, while other workflows need to be carried out immediately and finished as soon as possible. Usually, workflow instances will have a defined maximum execution time or a defined deadline. Process owners are able to define complex workflows which feature AND, XOR, or loop patterns. However, we assume that the next steps in a particular workflow instance are always known.

We assume that the business process landscape is volatile, i.e., ever-changing, since workflow requests may arrive at any time. Furthermore, changes may be necessary since services or VMs are not delivering the expected QoS, or the next steps in a workflow instance are not as planned.

For achieving an efficient scheduling and invocation of workflows and corresponding services, Cloud-based computational resources in terms of VMs required for invoking the respective services have to be leased and released such that the total cost arising from leasing the aforementioned Cloud resources is minimized. In addition, it has to be made sure that given QoS constraints, such as the deadlines until which corresponding workflows have to be finished, are satisfied. The closer the deadline for a certain workflow instance, the higher the importance to schedule and execute the corresponding services accomplishing its tasks. If not carefully considered and scheduled, additional Cloud resources will have to be leased and paid for in order to execute workflow instances that cannot be delayed any further. To avoid such situations, in which extra resources have to be leased due to an inefficient scheduling strategy, the execution of workflow instances along with the leasing and releasing of Cloud resources has to be optimized. This makes it necessary to facilitate self-adaptation and self-optimization of the overall business process landscape through replanning of workflow scheduling and resource allocation.


2.2 SELF-ADAPTATION FOR ELASTIC PROCESSES

Self-adaptation is a common concept from the field of Autonomic Computing and includes self-healing, i.e., the ability of a system to detect and recover from potential problems and continue to function smoothly; self-configuration, i.e., the ability of a system to configure and reconfigure itself under varying and unpredictable conditions; and self-optimization, i.e., the ability to detect suboptimal behavior and optimize itself to improve its execution (Kephart and Chess, 2003). The focus of this paper is on self-optimization.

In order to motivate the functionalities and components needed to provide self-optimization of a Cloud-based process landscape, we make use of the well-known MAPE-K cycle shown in Figure 1. As the scenario at hand is highly dynamic due to permanently arriving workflow requests and changing Cloud resource utilization, a continuous alignment to the new system status is necessary. Using the MAPE-K cycle, the process landscape is continuously monitored and optimized based on knowledge about the current system status. In the following, we briefly discuss the four phases of this cycle:

- Monitor: In order to adapt a system, it is first of all necessary to monitor the system status. In the scenario at hand, this includes the monitoring of the status of single VMs in terms of CPU, RAM, and network bandwidth utilization, and of the non-functional behavior of services in terms of response time and availability.

- Analyze: To achieve Elastic Process execution, it is necessary to analyze the monitored data and reason on the general knowledge about the system. In short, this analysis is done in order to find out whether there is currently under- or overprovisioning regarding the computational resources (VMs), and to detect Service Level Agreement (SLA) violations in order to carry out corresponding countermeasures (e.g., provide further VMs or re-invoke a service instance).

- Plan: While the analysis of the monitored data and further knowledge about the system aims at the current system status, the planning also takes into account the future resource needs derived from the knowledge about future workflow steps, their SLOs, and the estimated resource requirements and runtimes. For this, a workflow scheduling and resource allocation plan needs to be generated.

- Execute: As soon as the plan is set up, each workflow step is executed corresponding to this plan.

- Knowledge Base: While not really a part of the cycle, the Knowledge Base stores information about the system configuration. In our case, this is the knowledge about which service instances are running on which VMs, how many VMs are currently part of the system, and of course the knowledge about requested workflow and service instances.

In order to provide self-adaptation, ViePEP will have to support all four phases of the cycle and provide a Knowledge Base.
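A minimal sketch of such a control loop is shown below; all interfaces, names, and the knowledge-base representation are illustrative assumptions rather than ViePEP's actual components:

import java.util.HashMap;
import java.util.Map;

// Minimal MAPE-K loop sketch wired around a knowledge base (hypothetical interfaces).
public class MapeKLoop {
    record SystemStatus(Map<String, Double> vmUtilization, Map<String, Double> responseTimes) {}
    record SchedulingPlan(Map<String, String> serviceToVm) {}

    interface Monitor { SystemStatus observe(); }               // Monitor phase
    interface Analyzer { boolean needsAction(SystemStatus s); } // Analyze phase
    interface Planner { SchedulingPlan plan(SystemStatus s, Map<String, Object> kb); } // Plan phase
    interface Executor { void execute(SchedulingPlan p); }      // Execute phase

    private final Map<String, Object> knowledgeBase = new HashMap<>(); // Knowledge Base

    void runOnce(Monitor m, Analyzer a, Planner p, Executor e) {
        SystemStatus status = m.observe();            // observe VM utilization and service QoS
        knowledgeBase.put("lastStatus", status);      // persist observations in the knowledge base
        if (a.needsAction(status)) {                  // under-/overprovisioning or SLA violation?
            SchedulingPlan plan = p.plan(status, knowledgeBase); // schedule steps, size VM pool
            e.execute(plan);                          // lease/release VMs, invoke services
        }
    }

    public static void main(String[] args) {
        new MapeKLoop().runOnce(
            () -> new SystemStatus(Map.of("vm-1", 0.93), Map.of("svc-a", 1.8)),
            s -> s.vmUtilization().values().stream().anyMatch(u -> u > 0.9), // e.g., VM overloaded
            (s, kb) -> new SchedulingPlan(Map.of("svc-a", "vm-2")),
            plan -> System.out.println("executing plan: " + plan.serviceToVm()));
    }
}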

FIGURE 1: MAPE-K CYCLE (KEPHART AND CHESS, 2003)

2.3 THE VIENNA PLATFORM FOR ELASTIC PROCESSES

Figure 2 (using FMC notation, see http://www.fmc-modeling.org/) depicts the high-level components of ViePEP and the message flow between them. The core functionalities of ViePEP are:

- Provisioning of an interface to the Cloud, allowing leasing and releasing Cloud-based computational resources (VMs) on demand.
- Execution of workflow steps by instantiating services on VMs and invoking the service instances in workflow instances.
- Dynamic scheduling of incoming requests for workflows based on their QoS requirements, i.e., timeliness in terms of maximum execution time or a deadline.
- Monitoring of the deployed VMs in terms of resource utilization and of the QoS of service invocations.

ViePEP is made up of two major components and three helper components: The BPMS VM hosts the central functionalities of ViePEP, i.e., the functionalities for resource allocation and workflow scheduling. Resource allocation is done at the PaaS level, i.e., ViePEP is able to lease and release VMs from Cloud providers and allocate these resources to particular workflow steps. Workflows share resources, as they are able to concurrently invoke the same service instance.

The Workflow Manager is the subcomponent responsible both for receiving workflow requests from the process owners via the Client API (see below) and for invoking single service instances running in a Backend VM. Furthermore, it monitors service invocations in order to control whether the service instance delivers the expected QoS, and starts corresponding countermeasures if necessary.

The information about which services need to be invoked at what point in time is generated by the Scheduler and the Reasoner. The former subcomponent is responsible for deriving a detailed scheduling plan corresponding to service and workflow deadlines, while the latter subcomponent estimates the needed resources (VMs) based on the computed scheduling and sends corresponding requests to the Cloud providers. The functionality of these two subcomponents is defined through the optimization approach and heuristic presented in Section 3. The Reasoner interacts with the Load Balancer in order to estimate which Backend VM provides free resources for particular service instances and therefore could be used to invoke a service instance in the future.

FIGURE 2: THE VIENNA PLATFORM FOR ELASTIC PROCESSES (VIEPEP)

The subcomponents of the ViePEP BPMS are placed in a VM to prevent it from becoming a bottleneck. Through vertical scaling, it is possible to provide ViePEP with additional resources if the business process and Cloud landscape becomes increasingly complex.

Whenever the Reasoner issues a request to the Cloud, either a Backend VM will be started and, once its Action Engine is ready, a new service instance will be deployed, or (in case the Backend VM is already running) the Action Engine is directly able to execute the corresponding request. ViePEP allows the following requests (a compact encoding is sketched after the list):

- Start a new Backend VM, which includes deploying a new service instance on it.
- Duplicate a Backend VM including the service instance.
- Terminate a Backend VM, which marks the Backend VM as "phasing out" and prevents the Load Balancer from requesting further service invocations from this Backend VM. Once all previously assured service invocations have been finished, the VM is terminated.
- Exchange the hosted service instance on the Backend VM. This also prevents the Load Balancer from requesting any further invocations of this particular service instance. Once all assured service invocations have been finished, the instantiated service is replaced by another one.
- Move a service instance from one Backend VM to another one (with different computational resources).
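One way to encode these five request types is a sealed command hierarchy, as sketched below (hypothetical names, Java 21+ for the pattern switch; this is not ViePEP's actual interface):

// Illustrative encoding of the five Action Engine request types (hypothetical names).
public class ActionEngineRequests {
    sealed interface CloudRequest
            permits StartVm, DuplicateVm, TerminateVm, ExchangeService, MoveService {}
    record StartVm(String vmType, String serviceId) implements CloudRequest {}          // start VM, deploy service
    record DuplicateVm(String vmId) implements CloudRequest {}                          // clone VM incl. service instance
    record TerminateVm(String vmId) implements CloudRequest {}                          // mark VM "phasing out"
    record ExchangeService(String vmId, String newServiceId) implements CloudRequest {} // swap hosted service
    record MoveService(String serviceInstanceId, String targetVmId) implements CloudRequest {} // migrate instance

    static String describe(CloudRequest r) {
        return switch (r) { // exhaustive over the sealed hierarchy
            case StartVm s -> "start " + s.vmType() + " hosting " + s.serviceId();
            case DuplicateVm d -> "duplicate " + d.vmId();
            case TerminateVm t -> "phase out " + t.vmId();
            case ExchangeService e -> "exchange service on " + e.vmId() + " for " + e.newServiceId();
            case MoveService m -> "move " + m.serviceInstanceId() + " to " + m.targetVmId();
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new StartVm("m1.small", "render-invoice")));
    }
}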

Within the Backend VM, service instances are hosted in an Application Server, which also features a Monitor to observe the current load on the Backend VM and provide this information to the Load Balancer via the Shared Memory. This Monitor should not be confused with the monitoring capabilities of the Workflow Manager, which monitors the response time of service invocations.

The Shared Memory and the Service Repository are helper components. The latter hosts the service descriptions as well as their implementations as portable Web application ARchives (WAR). This repository allows searching for services and deploying them on a ViePEP Backend VM. The Shared Memory provides a distributed database which is used to send requests from the BPMS VM to the Backend VMs and which stores monitoring data. We chose MozartSpaces (Kühn et al., 2009) for this, as it allows easily deploying and accessing a peer-to-peer-based, distributed shared memory. In addition, MozartSpaces allows sending notifications with very low latency.

Finally, the Client API allows process owners to define workflows including SLOs. An owner may request many workflows consecutively or in parallel. When the request is submitted to the BPMS VM via the Workflow Manager, a new workflow instance is generated and taken into account in workflow scheduling and resource allocation.
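A workflow request carrying a deadline SLO could look as follows; the payload shape and field names are assumptions for illustration, not the actual Client API:

import java.time.Instant;
import java.util.List;

// Hypothetical shape of a workflow request submitted via the Client API.
public class ClientApiExample {
    record WorkflowRequest(String templateId, List<String> stepServiceIds, Instant deadline) {}

    public static void main(String[] args) {
        // The deadline is the SLO emphasized throughout the paper (cf. Section 3.2).
        WorkflowRequest req = new WorkflowRequest(
            "invoicing",
            List.of("fetchOrders", "renderInvoices", "archive"),
            Instant.parse("2013-10-23T14:10:00Z"));
        System.out.println("Requested workflow '" + req.templateId()
            + "' with deadline " + req.deadline());
    }
}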

2.4 PREREQUISITES

Before we formulate a model for workflow scheduling and resource allocation and define a corresponding heuristic, it is necessary to introduce some prerequisites in order to determine the scope of our work.

First, each Backend VM hosts exactly one service instance; a particular service may be instantiated arbitrarily often at different VMs. Second, defined deadlines are realistic, i.e., the deadlines can be met if corresponding resources are provided. However, defined deadlines may be violated because of faults in the Cloud or in the network. In this case, the service invocation in question will immediately be carried out again. Third, it is possible to derive the total computation time and resource utilization of a particular service on a particular VM type from historical data: Instances of the same service behave similarly with regard to resource consumption and runtime of single invocations. Fourth, computational resources in terms of additional VMs can be leased from different Cloud providers; it is also possible to combine resources from public and private Clouds. Indefinite resources are available. Hence, all resource demands of the business process landscape can be met by the available Cloud-based computational resources (i.e., the workflow scheduling and resource allocation are effective). Fifth, different types of VMs with different computational resources (CPUs, RAM, bandwidth, …) are available from different Cloud providers. Following Amazon's EC2 pricing scheme, the cost of these VMs is proportional, e.g., the cost of a VM with 2 cores is half the cost of a VM with 4 cores with the same specifications. Sixth, it is the goal of our model to minimize the overall cost arising from leasing VMs (i.e., scheduling and resource allocation are efficient).
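The fifth prerequisite can be made tangible with a tiny catalogue of VM types; all names and numbers below are hypothetical, chosen only to exhibit the proportional-pricing assumption, and the ascending sort order is the one the heuristic in Section 3.3 expects for its vmTypList:

import java.util.Comparator;
import java.util.List;

// Illustrative VM-type catalogue under the proportional-pricing prerequisite
// (hypothetical names and numbers; Amazon EC2 is the cited model, not an exact price list).
public class VmCatalog {
    record VmType(String name, int cores, double supply, double costPerHour) {}

    public static void main(String[] args) {
        List<VmType> vmTypList = List.of(
            new VmType("small",  1, 1.0, 0.10),  // baseline
            new VmType("medium", 2, 2.0, 0.20),  // 2x resources, 2x cost
            new VmType("large",  4, 4.0, 0.40)); // 4x resources, 4x cost
        // Sorted ascending by resource supply, i.e., the smallest VM comes first.
        vmTypList.stream()
            .sorted(Comparator.comparingDouble(VmType::supply))
            .forEach(v -> System.out.println(v.name() + ": " + v.costPerHour() + " per hour"));
    }
}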

3. SOLUTION APPROACH

We formulate the problem of scheduling workflow instances and their individual services, respectively, as an optimization problem. The general approach proposed in the work at hand for achieving an optimized scheduling and resource allocation is presented in Section 3.1. A formal specification of the corresponding optimization problem is provided in Section 3.2. Finally, in Section 3.3, we describe a heuristic solution method for efficiently solving the optimization problem.

3.1 GENERAL APPROACH

For achieving the necessary workflow scheduling and resource allocation, the deadlines indicating the time when the corresponding workflow instances have to be finished are considered. In addition, in order to minimize the total cost of (Backend) VMs leased for executing workflow requests and corresponding services, respectively, the leased VMs should be utilized as much as possible, i.e., leased but unused resource capacities should be minimized. Thus, in order to reduce the necessity of leasing additional VMs, we first try to invoke services that cannot afford further delays on already leased VMs, provided the remaining resource capacities of already leased VMs are sufficient. Only if the resource requirements of workflows cannot be covered by already leased VMs will additional VMs be leased. For instance, if the deadlines for certain workflow instances allow delaying their service invocations to another period, it can be beneficial to release leased resources and delay such service invocations. However, it has to be considered that those service invocations have to be scheduled for one of the subsequent optimization periods to ensure that the corresponding deadlines are not violated.

In addition, since VMs are leased only for a certain time period and the execution of already scheduled invocations will be finished at a future point in time, the scheduling strategy to be developed cannot be static, i.e., the optimization approach for scheduling service invocations should not be applied only once. It rather has to be applied multiple times, at different optimization points in time. In this respect, it has to be noted that potentially further requested workflow instances may have to be served, which additionally have to be considered when carrying out an optimization step.

Thus, an efficient scheduling strategy has to review allocated service instances and scheduled service invocations periodically and carry out further optimization steps in order to consider dynamically changing requirements and to keep the amount of leased but unused resource capacities low. Furthermore, the resource demand and average runtime of single service instances need to be known in advance – for this, an approach based on linear regression, as presented in our former work (Hoenisch et al., 2013), will be applied.
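The following is a minimal stand-in for such a prediction step, assuming a single load feature and ordinary least squares; the actual feature set and regression procedure are those of Hoenisch et al. (2013), which this sketch does not reproduce:

// Illustrative single-feature linear regression: fit runtime as a linear function of load.
public class RuntimePredictor {
    static double[] fitLine(double[] x, double[] y) {
        double n = x.length, sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < x.length; i++) {
            sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx); // OLS slope
        return new double[]{(sy - slope * sx) / n, slope};        // {intercept, slope}
    }

    public static void main(String[] args) {
        // Hypothetical historical observations: concurrent invocations vs. runtime in seconds.
        double[] load = {1, 2, 4, 8};
        double[] runtime = {10.2, 11.9, 16.1, 24.3};
        double[] c = fitLine(load, runtime);
        System.out.printf("predicted runtime at load 6: %.1f s%n", c[0] + c[1] * 6);
    }
}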

3.2 OPTIMIZATION PROBLEM

In this section, we model the scheduling and allocation of workflows and services as an optimization problem. Since Cloud resources are only leased for a limited time period, it has to be decided whether those resources, or even additional resources, have to be leased for another period, or whether leased resources can be released again. In this respect, we aim at utilizing leased Cloud resources, i.e., VMs in the context of this paper, to their full capacities. Further, additional resources will only be leased if the capacities of already leased VMs are not sufficient to cover and carry out further service invocations that cannot be delayed.

For considering the different (optimization) time periods, the index $t$ will be used. Depending on these periods, the parameter $\tau_t$ refers to the actual time and point in time, respectively, indicated by a time period $t$. For scheduling different workflows, multiple workflow templates are considered. The set of workflow templates is labeled with $W$, where $w \in \{1, \dots, \#W\}$ refers to a certain workflow template. The set of workflow instances that have to be considered during a certain period $t$ corresponding to a certain workflow template $w$ is indicated by $I_w$, where $i \in \{1, \dots, \#I_w\}$ refers to a certain workflow instance.

The total number of workflow instances that have to be considered in period $t$ is indicated by $\#I_t$. Please note that considering a certain instance in period $t$ does not necessarily result in invoking corresponding service instances in this period. It rather makes sure that it is considered for the optimization step conducted in period $t$, which may lead to the decision that this instance is further delayed – until another optimization step is carried out in a subsequent period. The remaining execution time for executing a certain instance $i$, which might involve invoking multiple service instances for accomplishing the different tasks of a certain workflow instance, is indicated by the parameter $e_{w,i}$.

Thus, by specifying the parameter $e_{w,i}$ as the remaining execution time of a certain workflow instance, we account for the fact that certain tasks of this instance might already have been accomplished by having invoked corresponding service instances in previous periods.

It has to be noted that executing a workflow instance refers to invoking a service that accomplishes a task of that workflow instance. The service executing the next task of a workflow instance is labeled with $s^*_{w,i}$. If, for instance, a workflow instance consists of five tasks, its remaining execution time is determined by the sum of the execution times $e_j$ the services require for accomplishing the different tasks of the workflow instance, i.e., $e_{w,i} = \sum_{j \in S_{w,i}} e_j$. The set $S_{w,i}$ thereby represents the set of services that have to be invoked for accomplishing all tasks of workflow instance $(w,i)$. Allocating and executing this instance refers to invoking the next service $s^*_{w,i}$ that accomplishes the next task, i.e., the first task in this example. Thus, four tasks of this workflow instance still remain unaccomplished. After having invoked service $s^*_{w,i}$, the remaining execution time for this workflow instance is reduced by the execution time $e^*_{w,i}$ of the invoked service. Thus, $e_{w,i}$ can be determined by adding up the services' execution times for the remaining four tasks of this workflow instance (Hoenisch et al., 2013a). The corresponding workflow instance needs to be executed again, i.e., a service instance accomplishing the second task has to be invoked. For accomplishing this second task, the corresponding service becomes the next service $s^*_{w,i}$.

The deadline at which the execution of a workflow instance has to be finished is indicated by the parameter $d_{w,i}$. Assuming a continuous flow of time, $d_{w,i}$ refers to an actual point in time as, for instance, "23.10.2013, 14:10:00". For executing a certain workflow instance, i.e., for invoking the next service $s^*_{w,i}$ of a workflow instance, a certain amount of computational resources from a VM is required. The resource requirement for the corresponding next service is indicated by $r_{w,i}$.

Regarding the leasing of Cloud resources, we assume different types of VMs. The set of VM types is indicated by the parameter $V$, where $v \in \{1, \dots, \#V\}$ refers to VM type $v$. The corresponding resource supply of a VM (in terms of CPU, RAM, bandwidth) of type $v$ is indicated by the parameter $s_v$. For counting and indexing leased VM instances of type $v$, the variable $k$ is used. Although in theory unlimited, we assume the number of leasable VM instances of type $v$ in a time period to be restricted by $\#K_v$ for modeling reasons, i.e., in order to make the idea of indefinite resources, provided by Clouds, tangible.

Model 1: Optimization Problem

Objective function:

(1) $\min \; \sum_{v \in V} c_v \cdot \gamma_{v,t} \; + \; \sum_{v \in V} \sum_{k \in K_v} f_{(v,k)}$

so that

(2) $e_{w,i} - e^*_{w,i} \cdot x_{w,i} + \tau_{t+1} \le d_{w,i} \quad \forall\, w \in W,\; i \in I_w$

(3) $\tau_{t+1} \le \tau_t + e^*_{w,i} \cdot x_{w,i} + (1 - x_{w,i}) \cdot \Omega \quad \forall\, w \in W,\; i \in I_w$

(4) $\tau_{t+1} \ge \tau_t + \epsilon$

(5) $\sum_{w \in W} \sum_{i \in I_w} r_{w,i} \cdot x_{(w,i),(v,k)} \le s_v \cdot y_{(v,k)} \quad \forall\, v \in V,\; k \in K_v$

(6) $\sum_{k \in K_v} y_{(v,k)} = \gamma_{v,t} \quad \forall\, v \in V$

(7) $f_{(v,k)} = y_{(v,k)} \cdot s_v - \sum_{w \in W} \sum_{i \in I_w} r_{w,i} \cdot x_{(w,i),(v,k)} \quad \forall\, v \in V,\; k \in K_v$

(8) $x_{(w,i),(v,k)} = 1 \quad \forall\, w \in W,\; i \in I_w,\; v \in V,\; k \in K_v$ such that $s^*_{w,i}$ is already running on VM instance $(v,k)$

where $x_{w,i} = \sum_{v \in V} \sum_{k \in K_v} x_{(w,i),(v,k)}$ abbreviates the scheduling decision for instance $(w,i)$, $\epsilon > 0$ is a small constant, and $\Omega = \max_{w \in W,\, i \in I_w} e^*_{w,i}$.

Thus, we assume a maximum number $\#K_v$ of leasable VMs. The set of leasable VM instances of type $v$ is indicated by $K_v$, where $k \in \{1, \dots, \#K_v\}$. The cost for leasing one VM instance of type $v$ is indicated by $c_v$.

For finally deciding at period $t$ which service to instantiate and invoke, binary decision variables $x_{(w,i),(v,k)} \in \{0,1\}$ are used. A value $x_{(w,i),(v,k)} = 1$ indicates that the next service $s^*_{w,i}$ of workflow instance $(w,i)$ should be allocated and invoked in period $t$ at VM instance $(v,k)$, whereas a value $x_{(w,i),(v,k)} = 0$ indicates that the invocation of the corresponding service should be delayed, i.e., no service of workflow instance $(w,i)$ should be invoked in period $t$. For indicating whether a certain instance $k$ of VM type $v$ should be leased in period $t$, another decision variable $y_{(v,k)} \in \{0,1\}$ is used. Similar to $x_{(w,i),(v,k)}$, a value $y_{(v,k)} = 1$ indicates that instance $k$ of VM type $v$ is leased. The total number of VMs of type $v$ to lease in period $t$ is labeled with $\gamma_{v,t}$.

Using these parameters and variables, we formulate the optimization problem in Model 1 for deciding which workflow instances and corresponding services $s^*_{w,i}$, respectively, should be allocated and invoked in period $t$.

The constraints in (2) make sure that the deadlines for workflow instances will not be violated. For this, the sum of the remaining execution time $e_{w,i}$ and the next optimization point in time $\tau_{t+1}$ has to be lower than or equal to the deadline $d_{w,i}$. By scheduling a service invocation, the corresponding remaining execution time is reduced, because the execution time $e^*_{w,i}$ of the next service $s^*_{w,i}$ will be subtracted in this case.

The constraints in (3) and (4) determine the next optimization point $\tau_{t+1}$. In order to avoid optimization deadlocks, which would result from not advancing the next optimization time, $\tau_{t+1}$ is restricted in (4) to be greater than or equal to $\tau_t$ plus a small value $\epsilon > 0$. In order to replan the scheduling and invocation of workflow instances as soon as a service invocation has been finished, $\tau_{t+1}$ should be lower than or equal to $\tau_t$ plus the minimum execution time of the services invoked in period $t$. Using the additional parameter $\Omega = \max_{w,i} e^*_{w,i}$, the last term in (3) thereby makes sure that services that are not invoked in this period do not restrict $\tau_{t+1}$. However, a replanning will anyhow be triggered if events such as the request of further workflow instances or the accomplishment of a certain invoked service occur.

The constraints in (5) make sure that the resource capacities required by the next services for the workflow instances assigned to instance $k$ of VM type $v$ are lower than or equal to the capacities these VM instances can offer. If services are assigned to a certain instance $k$ of type $v$, the corresponding decision variable $y_{(v,k)}$ will assume a value of 1. The sum of all decision variables $y_{(v,k)}$ determines the total number $\gamma_{v,t}$ of instances for VMs of type $v$, as indicated in (6). In (7), the amount of unused capacities for VM instance $(v,k)$, which is indicated by the variable $f_{(v,k)}$, is determined. In order to account for running service instances invoked in previous periods, the corresponding decision variables are set to 1, as indicated in (8).

The objective function, which is specified in (1), aims at minimizing the total cost for leasing VMs. In addition, by adding the amount of unused capacities of leased VMs to the total cost, the objective function also aims at minimizing the unused capacities of leased VM instances.
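To make the structure of Model 1 concrete, the following sketch encodes a single-period simplification: only objective (1) and constraints (5)-(7), for a set of critical next services that must all be placed now; the temporal constraints (2)-(4) and the warm-start constraints (8) are dropped. The paper does not prescribe a solver, so Google OR-Tools' MPSolver is an assumption here, and all resource and cost numbers are hypothetical.

import com.google.ortools.Loader;
import com.google.ortools.linearsolver.MPConstraint;
import com.google.ortools.linearsolver.MPObjective;
import com.google.ortools.linearsolver.MPSolver;
import com.google.ortools.linearsolver.MPVariable;

public class SinglePeriodAllocation {
    public static void main(String[] args) {
        Loader.loadNativeLibraries();
        MPSolver solver = MPSolver.createSolver("SCIP");

        double[] r = {1.5, 2.0, 0.5}; // resource demand of each critical next service (hypothetical)
        double[] s = {2.0, 4.0};      // resource supply per VM type
        double[] c = {0.20, 0.40};    // leasing cost per VM type (proportional, cf. Section 2.4)
        int K = 3;                    // max leasable instances per type (#K_v)

        MPVariable[][][] x = new MPVariable[r.length][s.length][K]; // x[i][v][k]: service i on VM (v,k)
        MPVariable[][] y = new MPVariable[s.length][K];             // y[v][k]: VM (v,k) leased
        MPVariable[][] f = new MPVariable[s.length][K];             // f[v][k]: unused capacity of (v,k)

        for (int v = 0; v < s.length; v++)
            for (int k = 0; k < K; k++) {
                y[v][k] = solver.makeBoolVar("y_" + v + "_" + k);
                f[v][k] = solver.makeNumVar(0, s[v], "f_" + v + "_" + k);
            }
        for (int i = 0; i < r.length; i++) {
            // Every critical service must be placed exactly once in this period.
            MPConstraint assignOnce = solver.makeConstraint(1, 1, "assign_" + i);
            for (int v = 0; v < s.length; v++)
                for (int k = 0; k < K; k++) {
                    x[i][v][k] = solver.makeBoolVar("x_" + i + "_" + v + "_" + k);
                    assignOnce.setCoefficient(x[i][v][k], 1);
                }
        }
        for (int v = 0; v < s.length; v++)
            for (int k = 0; k < K; k++) {
                // (7) as an equality: f = s_v*y - sum_i r_i*x; since f >= 0, this also enforces (5).
                MPConstraint unused = solver.makeConstraint(0, 0, "unused_" + v + "_" + k);
                unused.setCoefficient(f[v][k], 1);
                unused.setCoefficient(y[v][k], -s[v]);
                for (int i = 0; i < r.length; i++) unused.setCoefficient(x[i][v][k], r[i]);
            }

        MPObjective obj = solver.objective(); // (1): leasing cost plus unused capacity
        for (int v = 0; v < s.length; v++)
            for (int k = 0; k < K; k++) {
                obj.setCoefficient(y[v][k], c[v]);
                obj.setCoefficient(f[v][k], 1);
            }
        obj.setMinimization();

        if (solver.solve() == MPSolver.ResultStatus.OPTIMAL)
            System.out.println("leasing cost + unused capacity = " + obj.value());
    }
}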

3.3 HEURISTIC SOLUTION APPROACH

Due to its formulation as an integer linear program, the solution approach from the previous section features relatively high computational complexity, which renders its applicability to realistic, large-scale scenarios difficult. Hence, for efficiently solving the optimization problem presented in the last subsection, we develop a heuristic solution method.

This heuristic basically examines which workflow instances and services, respectively, have to be scheduled and invoked in the current optimization period in order to avoid violating the corresponding deadlines $d_{w,i}$. For this, the heuristic initially determines the point in time $\tau_{t+1}$ at which the next optimization step has to be carried out – at the latest. Corresponding to this (latest) subsequent optimization time, a virtual time buffer is calculated for each workflow instance, which will be referred to as slack, indicating the maximum time the corresponding instance may be delayed before a violation of the corresponding deadline $d_{w,i}$ takes place. Those instances for which the slack is lower than 0, i.e., those instances for which a further delay would result in violating a workflow deadline, are considered as critical.
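A worked example may help (the numbers are hypothetical; time is measured in seconds from a common origin). The slack computed in line 28 of Algorithm 1 is

$$ sl_{w,i} \;=\; d_{w,i} - e_{w,i} - \tau_{t+1} $$

For an instance with deadline $d_{w,i} = 1000$, remaining execution time $e_{w,i} = 300$, and next optimization point $\tau_{t+1} = 750$, the slack is $1000 - 300 - 750 = -50 < 0$: postponing the instance until $\tau_{t+1}$ would leave too little time to meet the deadline, so the instance is critical and its next service must be invoked in the current period.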

ALGORITHM 1: HEURISTIC SOLUTION APPROACH

1:  // Initialize variables
2:  d[w,i];          // Deadline for instance i of workflow w
3:  e[w,i];          // Remaining execution time for instance i of workflow w
4:  s[v];            // Resource supply for VM of type v
5:  k[v];            // Number of already leased VMs of type v
6:  leasedVM[v,k];   // kth VM instance of type v
7:  unusedRes[v,k];  // Unused resources for kth VM of type v
8:  sl[w,i] := new Double[w#,i#]; // Array for slack
9:  rcrit := 0;      // Aggregated resource demand for critical instances
10: vmTypList;       // List of available VM types, sorted by size ascending
11: sortList := new List(); // Sorted list corresponding to slack
12: critList := new List(); // List containing critical instances
13: t+1 = d[1,1];    // Initialize next optimization point in time
14: // Compute t+1
15: for (w=1; w≤w#; w=w+1) do
16:   for (i=1; i≤i#; i=i+1) do
17:     if d[w,i]-e[w,i] < t+1 then
18:       t+1 = d[w,i]-e[w,i]; // Get minimum t+1
19:     end if
20:     if t+1≤t then
21:       t+1 = t+ε; // Avoid deadlocks
22:     end if
23:   end for
24: end for
25: // Compute slack sl[w,i]
26: for (w=1; w≤w#; w=w+1) do
27:   for (i=1; i≤i#; i=i+1) do
28:     sl[w,i] = d[w,i]-e[w,i]-t+1; // Get slack
29:     if sl[w,i]<0 then
30:       critList.add(getInst(w,i));
31:       rcrit = rcrit+getInst(w,i).resNextService();
32:     else
33:       sortList.insert(getInst(w,i), sl[w,i]);
34:     end if
35:   end for
36: end for
37: // Invoke critical instances on leased but unused resources
38: usedRes = placeOnUnusedRes(critList);
39: rcrit = rcrit-usedRes;
40: // Lease new VMs until rcrit is satisfied
41: leaseNewVMs(rcrit);
42: // Place further non-critical instances on unused resources
43: placeOnUnusedRes(sortList);


Thus, at first, we try to invoke the critical service instances on such VM instances that are already leased and running. Afterwards, we lease new VM instances such that all remaining critical service instances are allocated and invoked in the current period. Finally, in order to minimize the unused resources of leased VM instances, we invoke further service instances on the leased VM instances corresponding to their slack.

This heuristic solution method is provided in Algorithm 1 using pseudocode. Corresponding methods used in Algorithm 1 are indicated in Algorithm 2, Algorithm 3, and Algorithm 4. In lines 1-13 of Algorithm 1, the required parameters and variables are initialized. For instance, the deadlines d[w,i] for workflow instances i of workflow template w are initialized in line 2. In this respect, it has to be noted that the corresponding parameter d[w,i] represents an array containing all deadlines of workflow instances that are considered at optimization period t. Analogously, e[w,i] represents an array for the remaining execution times of the workflow instances, which is initialized in line 3. The resource supply s[v] for a VM of type v is initialized in line 4, whereas the number k[v] of already leased VM instances of type v is initialized in line 5. Note that VM instances potentially have been leased in previous periods. Thus, the number of instances k[v] is not necessarily 0 when carrying out an optimization step.

Correspondingly, leased VMs as well as unused resources of already leased VMs, which are indicated by the arrays leasedVM[v,k] and unusedRes[v,k], have to be considered (cf. lines 6-7). For computing the slack of all workflow instances and aggregating the resource demands of the critical instances, the array sl[w,i] and the variable rcrit are used (cf. lines 8-9). Since in this heuristic different sizes of VMs are considered, i.e., they differ in the amount of available resources, a list of the available VM types is stored in vmTypList in line 10. This list is sorted in ascending order by the VMs' sizes, i.e., the smallest VM comes first. In lines 11-12, empty lists are created for storing the critical instances as well as the non-critical instances, which are sorted in the list corresponding to their slack. Finally, the point in time $\tau_{t+1}$, at which the next optimization step has to be carried out at the latest, is initialized with an arbitrary deadline, e.g., d[1,1] (line 13).

ALGORITHM 2: METHOD PLACEONUNUSEDRES(LIST)

1:  // Variable initialization
2:  usedRes = 0;
3:  removeList = new List();
4:  for (iter=1; iter≤list.size(); iter=iter+1) do
5:    inst = list.get(iter);
6:    r = inst.resNextService();
7:    placed = false;
8:    for (v=1; v≤v#; v=v+1) do
9:      if (!placed) then
10:       for (k=k[v]; k≥1; k=k-1) do
11:         if (r≤unusedRes[v,k]) then
12:           placeInst(inst, leasedVM[v,k]);
13:           placed = true;
14:           unusedRes[v,k] = unusedRes[v,k]-r;
15:           usedRes = usedRes+r;
16:           removeList.add(inst);
17:           break;
18:         end if
19:       end for
20:     end if
21:   end for
22: end for
23: list.remove(removeList);
24: return usedRes;

Having initialized required parameters and variables, is determined in lines 15-24 by computing the mini-

mum difference between deadlines and the remaining execution times for all workflow instances. For this point in time, the difference between remaining execution time and deadline, i.e., the slack, will be 0 for at least one workflow instance. In order to avoid deadlocks that will result if the subsequent optimization time is equal to the current time , a small value 0 is added (cf. lines 20-21).

In lines 26-36, the slack for each workflow instance is computed (cf. line 28), and the corresponding workflow instances are either added to the list of critical instances (cf. line 30) or inserted into a sorted list of (non-critical) instances (cf. line 33) corresponding to their slack. In addition, the resource requirements for the next services of the critical workflow instances are aggregated (cf. line 31).

The corresponding critical service instances need to be allocated and invoked in the current period – either on already leased and running VM instances or on further VM instances that have to be additionally leased in this period. Invoking critical service instances on already leased VM instances is accounted for in line 38 by calling the method placeOnUnusedRes, which is provided in Algorithm 2. Within Algorithm 2, the method placeInst is called, which is provided in Algorithm 3.

ALGORITHM 3: METHOD PLACEINST(LIST, VM, K)
 1: //Variable Initialization
 2: v = VM.getType();
 3: removeList = new List();
 4: unusedRes = supply[v];
 5: for (iter=1; iter≤list.size(); iter=iter+1) do
 6:   inst = list.get(iter);
 7:   r = inst.resNextService();
 8:   if (unusedRes≥r) then
 9:     placeInst(inst, leasedVM[v,k]);
10:     unusedRes = unusedRes-r;
11:     removeList.add(inst);
12:   end if
13: end for
14: list.remove(removeList);
15: return supply[v]-unusedRes;


ALGORITHM 4: METHOD LEASENEWVMS(RES)
 1: //Variable Initialization
 2: vm; //Temp variable for new VM
 3: while (rcrit>0) do
 4:   for (iter=1; iter≤vmTypList.size(); iter=iter+1) do
 5:     v = vmTypList.get(iter);
 6:     if (supply[v]≥rcrit OR iter==vmTypList.size()) then
 7:       //start a new VM of type v
 8:       vm = leaseVM(v);
 9:       k[v] = k[v]+1;
10:       leasedVM[v,k[v]-1] = vm;
11:       unusedRes[v,k[v]-1] = supply[v];
12:       usedRes = placeInst(critList, vm, k[v]-1);
13:       rcrit = rcrit-usedRes;
14:       break;
15:     end if
16:   end for
17: end while

In line 39, the resource requirements of the successfully invoked critical instances are subtracted from the aggregated resource demand of the remaining critical service instances. In line 41, additionally required resources are acquired; for this, the method leaseNewVMs (Algorithm 4) is called.

The types of the new VMs are chosen according to the following procedure: In general, we successively try to acquire VMs with a rather high resource supply, aiming at reducing the unused resource capacities that result from having a large number of rather small VMs. In addition, the cost of VMs in our scenario is proportional to their size (see Section 2.4), i.e., larger VMs incur a proportionately lower basic load that is not available to service instances. Therefore, as long as more resources are required, i.e., rcrit > 0 (cf. line 3 in Algorithm 4), a new VM having the least amount of provided resources, but still greater than or equal to the required resource demand (cf. line 6), will be leased. If no VM type fulfills this requirement, the biggest available VM type will be chosen (cf. line 6, i.e., if the end of vmTypList is reached). Subsequent to that, a new VM of this type will be leased (cf. line 8). After a new VM is leased, the resource demand rcrit is reduced by the amount of actually used resources, i.e., after having placed critical instances on this VM (cf. lines 12-13). Before this, we update in lines 9-11 the number of leased VM instances of the chosen type, i.e., k[v], store the corresponding VM instance, and set the value of its unused resource supply to the (maximum) supply of a VM of this type. Using the method placeInst, which is provided in Algorithm 3, we allocate critical service instances on the newly leased VM such that no further critical service instances can be placed on it due to its limited resource capacity. Subsequently, a "break" statement stops the for loop, and the procedure is repeated until enough resources are available.

Even when applying the presented algorithm for leasing additional resources (Algorithm 4), it is possible that resources are still available on the leased VMs, i.e., they are not fully utilized. Therefore, we invoke further scheduled service instances on (already) leased VM instances in order to reduce the amount of unused VM resources. This is realized in line 43 of Algorithm 1 by calling the method placeOnUnusedRes (cf. Algorithm 2) another time.

Based on the results of the algorithms, the Reasoner is able to lease/release resources corresponding to the calculated resource demand. In addition, the Workflow Manager gets the information at which point in time to invoke which service instance as part of which workflow instance.

4. EVALUATION

The Vienna Platform for Elastic Processes is a purely Java-based framework and was developed and tested in a Unix-based environment. The evaluation was done in a private Cloud running OpenStack. The individual services are deployed in an Apache Tomcat-based application server. In the following, we present our general evaluation approach (Section 4.1), i.e., the evaluation scenario including the evaluation criteria. The experiment results are presented in Section 4.2.

4.1 GENERAL EVALUATION APPROACH

While ViePEP and the presented reasoning approach are applicable in arbitrary process landscapes and industries, we evaluate the heuristic using a data analysis process from the finance industry. Choosing this particular process does not restrict the portability of our approach. We apply a testbed-driven evaluation approach, i.e., real Cloud resources are used. For the individual services, we simulate differing workloads regarding CPU and RAM utilization and service execution time (see below). However, real services are deployed and invoked during workflow executions.

To simplify the interpretation of the chosen evaluation settings, we decided to make use of one single workflow which will be processed 20,000 times. This sequential workflow consists of five individual service steps: The lightweight Dataloader Service simulates the loading of data from an arbitrary source; afterwards, the more resource-intensive Pre-Processing Service is invoked; next, the Calculation Service simulates data processing, which leads to high CPU load; then, the Reporting Service generates a simple report – it generates a load similar to that of the Pre-Processing Service. Last, the Mailing Service sends the report to different recipients – this is a lightweight service comparable to the Dataloader Service. The user-defined maximum execution time for workflow instances has been set to 5 minutes, commencing with the request.

In order to test our optimization approach against a baseline, we have implemented a very basic ad hoc approach. As the name implies, this approach is only able to take into account currently incoming workflow requests in an ad hoc way. While this includes the scheduling of workflow requests, the baseline approach does not take into account


future resource demands. Instead, whenever a Backend VM is utilized more than 80%, an additional single-core Backend VM for the corresponding service is leased. When the utilization is below 20%, the VM is released again. Notably, the baseline approach will only lease single-core VMs, as it is not able to take into account future resource demands.
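A minimal Java sketch of this threshold-based baseline policy follows; the VM type and the leaseSingleCoreVM() helper are illustrative assumptions, not part of the actual implementation:

import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the ad hoc baseline: purely threshold-based scaling
// with single-core VMs only.
class BaselineScaler {
    static class VM { double utilization; }

    List<VM> rescale(List<VM> backendVMs) {
        List<VM> next = new ArrayList<>();
        for (VM vm : backendVMs) {
            if (vm.utilization > 0.8) {          // over-utilized:
                next.add(vm);
                next.add(leaseSingleCoreVM());   // lease one more single-core VM
            } else if (vm.utilization >= 0.2) {  // keep VMs in the 20-80% band
                next.add(vm);
            }                                    // below 20%: release (drop) the VM
        }
        return next;
    }

    VM leaseSingleCoreVM() { return new VM(); }
}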

Arrival Patterns. We make use of two distinct workflow request arrival patterns: In the Constant Arrival pattern, the workflow requests arrive in a constant manner, i.e., the same amount of workflow requests arrives at a regular interval. In our evaluation, a burst of 200 workflow requests is sent to ViePEP every 20 seconds. In the Linear Arrival pattern, the workflow requests follow a linearly rising function: the burst size starts at 40 concurrent requests and is increased by 40 at an interval of 60 seconds.
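Expressed as simple burst-size functions, the two patterns look as follows; this sketch merely mirrors the description above (200 requests every 20 s; 40 requests growing by 40 every 60 s) and is illustrative only:

// Illustrative burst-size functions for the two arrival patterns;
// t is the elapsed time in seconds, bursts are sent every 20 s.
final class ArrivalPatterns {
    static int constantBurst() {
        return 200;                 // 200 requests per 20-second burst
    }
    static int linearBurst(int t) {
        return 40 * (1 + t / 60);   // 40 at t = 0, plus 40 for every full minute
    }
    private ArrivalPatterns() { }
}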

Metrics. In order to get reliable numbers, we executed each arrival pattern three times and evaluated the results against three quantitative metrics. First, we measure the overall execution duration needed to process all 20,000 workflow requests (Duration in Minutes); this is the timespan from the arrival of the first workflow request until the last step of the last workflow instance has been processed successfully. The second metric is the number of concurrently leased (CPU) cores, i.e., the sum of the cores of all leased VMs (Active Cores). The combination of the first two metrics results in the Cost in Core-Minutes, which tells us the resulting cost of the overall evaluation. The Core-Minutes are calculated following a pricing schema similar to Amazon's EC2, i.e., the VM cost increases proportionally with the number of provided resources. Our evaluation environment, i.e., the private Cloud we are running ViePEP in, provides four different VM types with 1-4 cores, respectively. In order to get the resulting cost, we sum up the active cores over time and obtain the overall Core-Minutes.
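Under the proportional pricing schema, the cost metric thus reduces to summing the active cores over time. A minimal sketch, assuming the number of active cores is sampled once per minute:

// Illustrative helper: Cost in Core-Minutes as the sum of active cores
// over per-minute samples (activeCoresPerMinute[i] = cores in minute i).
final class CoreMinutes {
    static int coreMinutes(int[] activeCoresPerMinute) {
        int sum = 0;
        for (int cores : activeCoresPerMinute) {
            sum += cores; // one core-minute per active core per minute
        }
        return sum;
    }
    private CoreMinutes() { }
}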

4.2 RESULTS AND DISCUSSION

Table 1 and Figures 3-4 present our evaluation results in terms of the average numbers from the conducted evaluation runs. Table 1 presents the observed metrics as discussed in the last section for both arrival patterns. For each pattern, the numbers of the evaluation runs are given for the baseline algorithm as well as for the deployed optimization approach. The table also states the standard deviation for each metric. In general, the observed standard deviation is low and therefore indicates a low dispersion in the results of the evaluation runs. Figures 3-4 complete the presentation of the average evaluation results by depicting the arrival patterns over time and the number of active cores until all workflow requests have been served. To combine numbers from different evaluation runs, we apply nearest-neighbor interpolation to the next full minute.

The numbers in Table 1 indicate a substantial performance difference between the baseline and the optimization approach. Most importantly, the cost in terms of Core-Minutes is lower in both cases, leading to almost 16.5% cost savings for the Constant Arrival pattern and 22.6% for the Linear Arrival pattern. Hence, we can deduce that the optimization approach helps to achieve a significantly better utilization of VMs, thus preventing additional cost arising from overprovisioning of Cloud-based computational resources. Also, the optimization approach is faster in absolute numbers, as it needs 25% less time to execute all workflow requests in the Constant Arrival pattern and 22.3% less in the Linear Arrival pattern.
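For instance, the reported savings follow directly from the Core-Minutes and Duration rows of Table 1:

1 − 370.33 / 443.67 ≈ 0.165 (16.5% cost savings, Constant Arrival)
1 − 243.59 / 314.71 ≈ 0.226 (22.6% cost savings, Linear Arrival)
1 − 39 / 52 = 0.25 (25% shorter duration, Constant Arrival)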

For both arrival patterns, the baseline approach is in many cases not able to comply with the workflow deadlines (5 minutes), as can be seen from the backlog after all workflow requests have arrived. This can be traced back to the applied ad hoc approach, i.e., it takes the baseline approach too long to react to new workflow requests and to adjust the number of leased VMs correspondingly.

TABLE 1: EVALUATION RESULTS

                                      Constant Arrival                      Linear Arrival
                                  Baseline         Reasoner+Scheduler  Baseline          Reasoner+Scheduler
Number of Workflow Requests                                   20,000
Interval between two
Request Bursts (in Seconds)       20                                   20
Number of Requests in one Burst   200                                  40 (increased by 40 every 60 s)
Duration in Minutes (Std. Dev.)   52 (σ = 2.16)    39 (σ = 0.81)       28.67 (σ = 1.25)  22 (σ = 0.82)
Max. Active Cores (Std. Dev.)     11 (σ = 0)       10 (σ = 0)          18 (σ = 0)        16.66 (σ = 0.47)
Cost in Core-Minutes (Std. Dev.)  443.67 (σ = 7.72) 370.33 (σ = 5.90)  314.71 (σ = 25.36) 243.59 (σ = 6.70)


FIGURE 3: CONSTANT ARRIVAL RESULTS (amount of parallel workflow request arrivals and amount of active cores over time in minutes; Reasoner + Scheduler vs. Baseline)

FIGURE 4: LINEAR ARRIVAL RESULTS (amount of parallel workflow request arrivals and amount of active cores over time in minutes; Reasoner + Scheduler vs. Baseline)

Interestingly, for the Constant Arrival pattern, Figure 3 clearly shows that ViePEP was able to optimize the system landscape almost perfectly, i.e., the number of active VMs varies only a few times during the experiment. It can also be clearly seen that the optimization approach (i.e., "Reasoner + Scheduler") is not only faster than the baseline, but also acquires fewer overall computing resources, i.e., VMs.

For the Linear Arrival pattern, the number of active cores increases quite similarly for the optimization approach and the baseline. The biggest difference, however, is that the "Reasoner + Scheduler" approach acquires VMs with more than one core, while the baseline approach acquires only single-core VMs. This results in a slower processing of the whole workflow queue, since the operating system overhead is comparatively higher in a single-core VM than in a quad-core VM.

To summarize, the evaluation results show that the proposed optimization approach indeed leads to a more efficient allocation of computational resources. As a result, ViePEP is able to provide a higher cost-efficiency than approaches that do not take the process perspective into account – in our evaluation, such approaches were represented by the baseline. In addition, ViePEP is able to decrease the risk of under- and overprovisioning and therefore adds an important functionality to BPMS.

5. RELATED WORK

Research on the utilization of Cloud-based computational resources for the execution of business processes is still at its beginning (Dustdar et al., 2011; Andrikopoulos et al., 2013). To the best of our knowledge, the number of approaches is still very small; nevertheless, there is related work from other fields of research which should be taken into account, i.e., resource allocation and service provisioning for single tasks (Section 5.1), for Scientific Workflows (Section 5.2), and for business processes (Section 5.3).

5.1 SINGLE TASKS

In the field of Cloud Computing, resource allocation and automated service provisioning is a major research challenge (Buyya et al., 2009), and many methods and algorithms to allocate or schedule single service requests in an ad hoc manner have been proposed in recent years. These approaches focus on different aspects, with cost optimization and resource utilization naturally being the most obvious ones. For instance, Lampe et al. (2011) define the Software Service Distribution Problem in order to appoint services on the Software as a Service (SaaS) level to particular VMs on the Infrastructure as a Service (IaaS) level. The authors make use of a Knapsack-based heuristic approach in order to solve the problem. Li and Venugopal (2011) provide mechanisms to automatically scale applications up and down on the IaaS level. For this, a reinforcement learning approach is followed, which learns the best server and application actions. QoS and SLA enforcement are also taken into account, e.g., by Buyya et al. (2010), who propose the federation of independent Cloud resources in order to deliver the needed QoS in a cost-efficient way, or by Cardellini et al. (2011), who model resource management in terms of VM allocation for services as a mixed integer linear optimization problem and propose heuristics to solve it. Wu et al. (2011) discuss dynamic resource allocation from the perspective of a SaaS provider, aiming at profit maximization. Scheduling of service requests is based on defined SLAs between the provider and its customers.

All approaches discussed so far lack a process perspective across utilized resources, but focus on the ad hoc allocation of Cloud resources for individual services and tasks.



5.2 SCIENTIFIC WORKFLOWS

There have been several approaches to utilize Cloud resources for the execution of Scientific Workflows (SWFs), e.g., by Hoffa et al. (2008) or Juve and Deelman (2010). Pandey et al. (2010) propose the usage of Particle Swarm Optimization for scheduling SWFs on Cloud resources. The authors especially take into account the cost of data transmissions and storage and focus on the minimization of total cost. SLAs or QoS aspects are not taken into account. Szabo and Kroeger (2012) apply evolutionary algorithms in order to solve scheduling for data-intensive SWFs on a fixed number of VMs. Deadlines are not explicitly regarded, and only one workflow is considered at a time. The latter constraint also applies to the works by Byun et al. (2011) and Abrishami et al. (2013), who both present resource allocation and scheduling approaches to optimize cost under a user-defined deadline. While these approaches offer interesting ideas and insights, there are certain differences between business processes and SWFs that prevent a direct adaptation of such approaches (Ludäscher et al., 2009).

5.3 BUSINESS PROCESSES

Approaches which directly address business processes are still scarce, but recently, a number of researchers have started to present corresponding work: Xu et al. (2009) provide some basic assumptions for the work at hand, most importantly that workflows are interdependent and share services. Optimization of scheduling is done with respect to cost and time, but SLAs are not taken into account. While not explicitly regarding business processes, Lee et al. (2010) allow the execution of applications composed from interdependent services running on different machines. The authors focus on maximizing the profit for an IaaS broker, who leases resources and provides VMs to service consumers.

Juhnke et al. (2011) provide an extension to a standard BPEL workflow engine, which allows making use of Cloud resources to execute business processes. As BPEL is applied, workflows are composed from services, which mirrors our approach. It is possible to execute several workflows in parallel and to optimize their scheduling and the resource allocation with respect to cost and overall execution time; apart from the cost for VMs, data transfer costs are also taken into account. A genetic algorithm is applied to solve the optimization problem. However, workflow deadlines are not regarded. Hence, this approach makes use of a resource model similar to the SWF approaches discussed above. The same applies to the work by Bessai et al. (2013), who also assume that workflows are composed from single software services. The authors propose different methods to optimize resource allocation and scheduling, aiming at cost or time optimization or at finding a Pareto-optimal solution covering both cost and time. Tasks may be shared among concurrent workflows, but in contrast to our work, tasks will not share the same VM (and service instance) concurrently. Deadlines are also not regarded. As the discussed approaches do not regard deadlines, they are not able to optimize resource allocation by postponing particular workflow steps to future timeslots.

Wei and Blake (2013) and Wei et al. (2013) propose a similar approach – again, workflows are built from single services and the authors focus on resource allocation. While service instances may be part of different workflows (Wei et al., 2013), the authors do not allow for parallel service invocations in different workflows, i.e., one service instance can only be invoked by a particular workflow at a time. In contrast, we follow the "classic" service composition model, which allows exactly this. The authors do not take into account SLAs or workflow deadlines, but a workflow owner may define some generic QoS constraints (Wei et al., 2013). Since deadlines are not taken into account, the authors do not provide scheduling mechanisms. Costs are also not regarded explicitly, but mechanisms are presented which aim at saving cost. Because workflows are not able to concurrently share service instances, the potential for optimization of resource allocation is not completely exploited. Similar to Bessai et al. (2013), Wei et al. also do not implement a testbed to test their algorithms, but use simulation in their evaluation. Despite the differences between our work and the work by Wei et al., there are also some commonalities, e.g., allowing different sizes of VMs at proportional cost. Furthermore, Wei and Blake (2013) also discuss the usage of resource demand prediction as a prerequisite for resource allocation.

Janiesch et al. (2014) provide an extensive conceptual model for Elastic Processes and implement a corresponding testbed which makes use of Amazon Web Services. The authors take into account SLAs (including workflow deadlines) and cost optimization, but do not provide automatic scheduling and resource allocation methods yet. In contrast to our work, the authors do not make use of workflow monitoring data to derive resource demands for upcoming services, but assume that there is a correlation between the resource demands of different tasks in a workflow. Applying a scenario complementary to the work presented within this paper, Gambi and Pautasso (2013) define design principles for RESTful business processes executed using Cloud resources. However, the authors propose to place complete processes on the same VM instead of allowing the distribution of services which belong to different workflows onto different VMs. Hence, it is not possible to share resources between workflows. Finally, Frincu et al. (2013) analyze the application of resource provisioning and scheduling approaches for Grid workflows to Cloud-based workflows.

6. CONCLUSION

Resource-intensive processes and their execution using workflow and service technologies play an increasingly important role in many industries. The usage of Cloud resources to allow the execution of such processes in an elastic way seems to be an obvious choice, but so far, BPMS do


lack the ability to lease and release Cloud resources and allocate them in order to execute workflows.

In this paper, we have presented the Vienna Platform for Elastic Processes, which combines the functionalities of a BPMS with those of a Cloud resource management system. We have also presented an extended optimization model and a heuristic for workflow scheduling and resource allocation for Elastic Process execution. As has been shown in our evaluation, the optimization approach leads to significant cost and time savings.

Research on Elastic Processes is just at the beginning. There are several research directions that should be pursued in the future. First of all, we would like to extend the basic model of our workflow scheduling and resource allocation approach by allowing several different service instances per VM, vertical and horizontal scaling of VMs, and a more complex VM model (e.g., non-proportional cost for VMs, minimum lease periods for VMs from public Clouds), by including data transfer cost when scheduling workflows, and by explicitly taking into account more complex workflow patterns. Second, while ViePEP was conceptualized for usage in hybrid Clouds, we are currently running it within a private Cloud environment. In the future, we will extend it by making it possible to combine public and private Cloud resources. Third, while the evaluation provides important results, we consider it preliminary. In our future work, we want to make use of a more realistic Elastic Process test collection; we will also provide this test collection to interested researchers. Last but not least, we are currently reengineering ViePEP in order to make it ready for distribution as Open Source software.

7. ACKNOWLEDGMENTS

This work is partially supported by the European Union within the SIMPLI-CITY FP7-ICT project (Grant agreement no. 318201) and by the E-Finance Lab e.V., Frankfurt am Main, Germany (www.efinancelab.de).

This paper is an extended version of Hoenisch et al. (2013).

8. REFERENCES

Abrishami, S., Naghibzadeh, M., Epema, D.H.J. (2013). Deadline-constrained workflow scheduling algorithms for Infrastructure as a Service Clouds, Future Generation Computer Systems, 29(1), 158-169.

Andrikopoulos, V., Binz, T., Leymann, F., Strauch, S. (2013). How to adapt applications for the Cloud environment – Challenges and solutions in migrating applications to the Cloud, Computing, 95(6), 493-535.

Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., Lee, G., Patterson, D., Rabkin, A., Stoica, I., Zaharia, M. (2010). A View of Cloud Computing, Communications of the ACM, 53(4), 50-58.

Bessai, K., Youcef, S., Oulamara, A., Godart, C. (2013). Bi-criteria Strategies for Business Processes Scheduling in Cloud Environments with Fairness Metrics, Proc. of IEEE 7th Intern. Conf. on Research Challenges in Information Science (RCIS 2013), Paris, France, 1-10.

Breu, R., Dustdar, S., Eder, J., Huemer, C., Kappel, G., Köpke, J., Langer, P., Mangler, J., Mendling, J., Neumann, G., Rinderle-Ma, S., Schulte, S., Sobernig, S., Weber, B. (2013). Towards Living Inter-Organizational Processes, Proc. of the 15th IEEE Conf. on Business Informatics (CBI 2013), Vienna, Austria, 363-366.

Buyya, R., Yeo, C. S., Venugopal, S., Broberg, J., Brandic, I. (2009). Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems, 25(6), 599-616.

Buyya, R., Ranjan, R., Calheiros, R. N. (2010). InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services, Proc. of 10th Intern. Conf. on Algorithms and Architectures for Parallel Processing (ICA3PP 2010), Busan, Korea, 13-31.

Byun, E.-K., Kee, Y.-S., Kim, J.-S., Maeng, S. (2011). Cost optimized provisioning of elastic resources for application workflows, Future Generation Computer Systems, 27(8), 1011-1026.

Cardellini, V., Casalicchio, E., Lo Presti, F., Silvestri, L. (2011). SLA-aware Resource Management for Application Service Providers in the Cloud, Proc. of the First Intern. Symposium on Network Cloud Computing and Applications (NCCA '11), Toulouse, France, 20-27.

Dustdar, S., Guo, Y., Satzger, B., Truong, H.-L. (2011). Principles of Elastic Processes, IEEE Internet Computing, 15(5), 66-71.

Dustdar, S., Schreiner, W. (2005). A survey on web services composition, Intern. J. of Web and Grid Services, 1(1), 1-30.

Frincu, M. E., Genaud, S., Gossa, J. (2013). Comparing Provisioning and Scheduling Strategies for Workflows on Clouds, Proc. of the 2013 IEEE 27th Intern. Symposium on Parallel and Distributed Processing (IPDPS 2013) Works. and PhD Forum – 2nd Intern. Works. on Workflow Models, Systems, Services and Applications in the Cloud (CloudFlow 2013), Boston, MA, USA, 2101-2110.

Gambi, A., Pautasso, C. (2013). RESTful Business Process Management in the Cloud, Proc. of the 5th Intern. Works. on Principles of Engineering Service-Oriented Systems (PESOS 2013) in conjunction with the 35th Intern. Conf. on Software Engineering (ICSE 2013), San Francisco, CA, USA, 1-10.

Gewald, H., Dibbern, J. (2009). Risks and benefits of business process outsourcing: A study of transaction services in the German banking industry, Information & Management, 46(4), 249-257.

Hoenisch, P., Schulte, S., Dustdar, S., Venugopal, S. (2013). Self-Adaptive Resource Allocation for Elastic Process Execution, Proc. of IEEE 6th Intern. Conf. on Cloud Computing (CLOUD 2013), Santa Clara, CA, USA, 220-227.

Hoenisch, P., Schulte, S., Dustdar, S. (2013a). Workflow Scheduling and Resource Allocation for Cloud-based Execution of Elastic Processes (forthcoming), Proc. of 6th IEEE Intern. Conf. on Service Oriented Computing and Applications (SOCA), Kauai, HI, USA, NN-NN.

Hoffa, C., Mehta, G., Freeman, T., Deelman, E., Keahey, K., Berriman, B., Good, J. (2008). On the Use of Cloud Computing for Scientific Workflows, Proc. of IEEE Fourth Intern. Conf. on e-Science (e-Science '08), Indianapolis, IN, USA, 640-645.

Huang, Z., van der Aalst, W. M. P., Lu, X., Duan, H. (2011). Reinforcement learning based resource allocation in business process management, Data & Knowledge Engineering, 70(1), 127-145.

Janiesch, C., Weber, I., Menzel, M., Kuhlenkamp, J. (2014). Optimizing the Performance of Automated Business Processes Executed on Virtualized Resources (forthcoming), Proc. of Hawaii Intern. Conf. on System Sciences (HICSS-47), Hawaii, USA, NN-NN.

Jin, T., Wang, J., Rosa, M. L., ter Hofstede, A. H. M., Wen, L. (2013). Efficient Querying of Large Process Model Repositories, Computers in Industry, 64(1), 41-49.

Juhnke, E., Dörnemann, T., Bock, D., Freisleben, B. (2011). Multi-objective Scheduling of BPEL Workflows in Geographically Distributed Clouds, Proc. of IEEE 4th Intern. Conf. on Cloud Computing (CLOUD 2011), Washington DC, USA, 412-419.

Juve, G., Deelman, E. (2010). Scientific Workflows and Clouds, ACM Crossroads, 16(3), 14-18.

Kephart, J. O., Chess, D. M. (2003). The Vision of Autonomic Computing, Computer, 36(1), 41-50.

Kühn, E., Mordinyi, R., Lang, M., Selimovic, A. (2009). Towards Zero-Delay Recovery of Agents in Production Automation Systems, Proc. of 2009 IEEE/WIC/ACM Conf. on Intelligent Agent Technology (IAT 2009), Milano, Italy, 307-310.

Lampe, U., Mayer, T., Hiemer, J., Schuller, D., Steinmetz, R. (2011). Enabling Cost-Efficient Software Service Distribution in Infrastructure Clouds at Run Time, Proc. of 4th IEEE Intern. Conf. on Service-Oriented Computing and Applications (SOCA 2011), Irvine, CA, USA, 1-8.

Lee, Y. C., Wang, C., Zomaya, A. Y., Zhou, B. B. (2010). Profit-Driven Service Request Scheduling in Clouds, Proc. of 10th IEEE/ACM Intern. Conf. on Cluster, Cloud and Grid Computing (CCGrid 2010), Melbourne, Australia, 15-24.

Li, H., Venugopal, S. (2011). Using Reinforcement Learning for Controlling an Elastic Web Application Hosting Platform, Proc. of 8th Intern. Conf. on Autonomic Computing (ICAC 2011), Karlsruhe, Germany, 205-208.

Ludäscher, B., Weske, M., McPhillips, T. M., Bowers, S. (2009). Scientific Workflows: Business as Usual?, Proc. of 7th Intern. Conf. on Business Process Management (BPM 2009), Ulm, Germany, 31-47.

Maurer, M., Brandic, I., Sakellariou, R. (2013). Adaptive resource configuration for Cloud infrastructure management, Future Generation Computer Systems, 29(2), 472-487.

Mutschler, B., Reichert, M., Bumiller, J. (2008). Unleashing the Effectiveness of Process-Oriented Information Systems: Problem Analysis, Critical Success Factors, and Implications, IEEE Transactions on Systems, Man, and Cybernetics, Part C, 38(3), 280-291.

Pandey, S., Wu, L., Guru, M., Buyya, R. (2010). A Particle Swarm Optimization-Based Heuristic for Scheduling Workflow Applications in Cloud Computing Environments, Proc. of 24th IEEE Intern. Conf. on Advanced Information Networking and Applications (AINA 2010), Perth, Australia, 400-407.

Pesic, M., van der Aalst, W. M. P. (2007). Modeling work distribution mechanisms using Colored Petri Nets, Intern. J. on Software Tools for Technology Transfer, 9(3-4), 327-352.

Rohjans, S., Dänekas, C., Uslar, M. (2012). Requirements for Smart Grid ICT Architectures, Proc. of Third IEEE PES Innovative Smart Grid Technologies (ISGT) Europe Conf., Berlin, Germany, 1-8.

Schuller, D., Lampe, U., Eckert, J., Steinmetz, R., Schulte, S. (2012). Cost-driven Optimization of Complex Service-based Workflows for Stochastic QoS Parameters, Proc. of 10th IEEE Intern. Conf. on Web Services (ICWS 2012), Honolulu, HI, USA, 66-74.

Schulte, S., Hoenisch, P., Venugopal, S., Dustdar, S. (2013). Introducing the Vienna Platform for Elastic Processes, Proc. of Performance Assessment and Auditing in Service Computing Works. (PAASC 2012) at 10th Intern. Conf. on Service Oriented Computing (ICSOC 2012), Shanghai, China, 179-190.

Szabo, C., Kroeger, T. (2012). Evolving Multi-objective Strategies for Task Allocation of Scientific Workflows on Public Clouds, Proc. of IEEE Congress on Evolutionary Computation (CEC 2012), Brisbane, Australia, 1-8.

Wei, Y., Blake, M. B. (2013). Decentralized Resource Coordination across Service Workflows in a Cloud Environment, Proc. of 22nd IEEE Intern. Conf. on Collaboration Technologies and Infrastructures (WETICE 2013), Hammamet, Tunisia, 15-20.

Wei, Y., Blake, M. B., Saleh, I. (2013). Adaptive Resource Management for Service Workflows in Cloud Environments, Proc. of the 2013 IEEE 27th Intern. Symposium on Parallel and Distributed Processing (IPDPS 2013) Works. and PhD Forum – 2nd Intern. Works. on Workflow Models, Systems, Services and Applications in the Cloud (CloudFlow 2013), Boston, MA, USA, NN-NN.

Wu, L., Garg, S. K., Buyya, R. (2011). SLA-based Resource Allocation for a Software-as-a-Service (SaaS) Provider in Cloud Computing Environments, Proc. of 11th IEEE/ACM Intern. Symposium on Cluster, Cloud and Grid Computing (CCGRID 2011), Newport Beach, CA, USA, 195-204.

Xu, M., Cui, L., Wang, H., Bi, Y. (2009). A Multiple QoS Constrained Scheduling Strategy of Multiple Workflows for Cloud Computing, Proc. of 2009 IEEE Intern. Symposium on Parallel and Distributed Processing with Applications (ISPA 2009), Chengdu, China, 629-634.

Authors

Dr.-Ing. Stefan Schulte is a Postdoctoral Researcher at the Distributed Systems Group at Vienna University of Technology and the project manager of the ongoing EU FP7 project SIMPLI-CITY – The Road User Information System of the Future (http://www.simpli-city.eu). His research interests span the areas of SOA and Cloud Computing, with a special focus on QoS aspects.

Dr.-Ing. Dieter Schuller is a Postdoctoral Researcher at the Multimedia Communications Lab of Technische Universität Darmstadt, Germany. Conjointly with Ulrich Lampe, he leads the research area on "Service-oriented Computing". Dieter's research interests are in the areas of Service-oriented Computing, specifically QoS and efficient service selection.

Dipl.-Ing. Philipp Hoenisch is a first-year PhD student at the Distributed Systems Group at Vienna University of Technology. Before starting his PhD, Philipp collected hands-on software development experience in several Open Source projects. His research interests cover the whole spectrum of Cloud Computing, with the main focus on cost-efficient automatic scaling in order to provide a high QoS.

Dr.-Ing. Ulrich Lampe is a Postdoctoral Researcher at the Multimedia Communications Lab of Technische Universität Darmstadt, Germany. Conjointly with Dieter Schuller, he leads the research area on "Service-oriented Computing". Ulrich's research interests are in the areas of Service-oriented Computing and Cloud Computing, specifically efficient software service distribution, auction-based capacity allocation, and Cloud-based multimedia services.

Prof. Dr.-Ing. Ralf Steinmetz is a professor in the Department of Electrical Engineering and Information Technology as well as in the Department of Computer Science at Technische Universität Darmstadt, Germany. Since 1996, he has been managing director of the "Multimedia Communications Lab". He is the author and co-author of more than 750 publications. He has served as editor of various IEEE, ACM, and other journals. He has been awarded Fellow of both the IEEE and the ACM.

Schahram Dustdar is a full professor of computer science with a focus on Internet technologies and heads the Distributed Systems Group at the Vienna University of Technology. He is an ACM Distinguished Scientist (2009) and a recipient of the IBM Faculty Award (2012). He is an Associate Editor of IEEE Transactions on Services Computing, ACM Transactions on the Web, and ACM Transactions on Internet Technology, and is on the editorial board of IEEE Internet Computing. He is the Editor-in-Chief of Computing (an SCI-ranked journal of Springer).


RECOMMENDING OPTIMAL CLOUD CONFIGURATION BASED ON BENCHMARKING IN BLACK-BOX CLOUDS

Gueyoung Jung, Naveen Sharma, Tridib Mukherjee, Frank Goetz, and Julien Bourdaillet
Xerox Research Center

{gueyoung.jung,naveen.sharma,tridib.mukherjee,frank.goetz,julien.bourdaillet}@xerox.com

Abstract

This paper focuses on recommending optimal cloud configuration for deploying complex user workloads. Such recommendation has become imperative in recent times for cloud users to reduce complexity in selecting a configuration. The number of different configuration options has increased many-fold due to the proliferation of cloud providers and different non-standardized offerings from these providers. Furthermore, the performance and price implications are unknown beforehand, when a cloud user deploys her workload into these different configurations. The problem gets exacerbated since cloud providers generally keep the underlying infrastructures and technologies non-transparent to users. In this paper, we present (i) a benchmark-based modeling approach to accurately estimate the performance of workloads on target black-box cloud configurations; and (ii) a search algorithm that first generates a capability vector consisting of relative performance scores of resource types (e.g., CPU, memory, and disk) for each configuration and then identifies a near optimal cloud configuration based on the capability vectors. Experiments show that our approach accurately estimates the performance capability, and performs an efficient search for the near optimal cloud configuration that has the minimum price while meeting the throughput goal.

Keywords: cloud, benchmarking, estimation, recommendation

1. INTRODUCTION

Recently, identifying the optimal cloud configuration from various options has become a critical but difficult problem for cloud users. We have seen a proliferation of cloud providers with the increasing popularity of cost-effective deployment of complex applications such as multi-tier web transactions and parallel data mining on the cloud. Indeed, there are more than a hundred cloud providers in the current market1. Further, each provider offers many different virtual machines (VMs). These VMs typically vary in terms of their types (e.g., small, medium, large, and extra large) and prices, which can be determined by resource capacities (e.g., the number of vCPUs, disk space, and memory size). Such VMs can have different performance among cloud providers, even if they offer the same VM type, since these VMs can be configured with different customized software (e.g., guest operating system) and deployed on different underlying virtualization technologies and infrastructures. Therefore, cloud users typically face numerous options on different VMs and their combinations to build a cloud configuration for their workloads.

It is non-trivial for cloud users to identify the best cloud configuration (i.e., the best VM or combination of VMs to achieve optimal performance and cost for a given workload). A cloud user can be overwhelmed by the number of cloud configurations when exploring clouds for her workload deployment. Moreover, the performance (e.g., throughput) and cost implications of choosing a cloud configuration for the workload are typically unknown to cloud users. The

1 Refer to CloudHarmony at http://cloudharmony.com 

problem gets exacerbated since clouds are typically black boxes to cloud users. Cloud providers generally keep the underlying infrastructure and technology details (e.g., server, cluster, storage, and network structures, and how VMs are managed) non-transparent (Voorsluys, 2011), but mainly open up the list of pre-defined VM types along with their corresponding prices. Additionally, cloud providers continually integrate new hardware and software artifacts into clouds.

A user may provision a large number of high-end VMs to avoid the risk of not meeting her performance objectives. This may lead to over-provisioning and unnecessarily high cost. Meanwhile, a cost-concerned user may want low-end VMs to save cost. This, however, leads to under-provisioning and undesirably low performance. The user may further try to find an optimal cloud configuration by blindly exploring different cloud configurations and evaluating them to see whether they meet her throughput goal. However, this trial-and-test approach will be very expensive and time-consuming, since, as mentioned earlier, she will find numerous different cloud configuration options. Different workloads and configurations have different performance characteristics (Jayasinghe, 2011). Moreover, the deployment of a workload for testing is typically very complicated (Jayasinghe, 2012). Although a user can figure out which VM has the fastest CPU, disk I/O, or memory separately using existing online services2, this is not sufficient for the user to understand the performance

2 Such as CloudVertical (https://www.cloudvertical.com/) and CloudyMetrics (http://www.cloudymetrics.com/) for VM price comparisons and CloudHarmony (http://cloudharmony.com) for VM performance comparison.


Figure 1. Overview of our recommender system

implications of the VM capabilities on her complex workloads. This is because resource types are usually inter-dependent while dealing with workloads. This is also because a resource can be bottlenecked for a certain amount of load, and the bottleneck can migrate between resources as the load changes (Malkowski, 2009).

1.1 CONTRIBUTIONS

This paper describes an approach to efficiently estimate the performance capabilities of black-box clouds for complex workloads, and to identify and recommend the near optimal cloud configuration that minimizes price while meeting the user's throughput goal. Specifically, our contributions are as follows.

Estimating the cloud capability requirement for a workload and its throughput goal. We develop an approach to characterize the performance of a given complex workload and then build a performance model for estimating the required capability of each resource type to achieve the throughput goal of the workload. These performance capabilities are then encoded into a capability vector.

Estimating performance capabilities of target clouds. We develop a benchmark-based performance scoring methodology to support the accurate estimation of a target cloud's capability for a workload via the performance model. The benchmark suite is much simpler to deploy than the application itself; thus, the performance modeling can be done in an efficient manner. These performance scores are encoded into relative performance capability vectors, each of which represents the capability (i.e., the maximum throughput) of a building block (i.e., a VM) of a target cloud configuration.

Efficiently identifying a near optimal cloud configuration. Using capability vectors, we cast the search problem into the Knapsack problem (Kellerer, 2004). We develop a heuristic search algorithm to identify a near optimal cloud configuration having the best price. Our approach reduces the search space and speeds up the search procedure.

We have evaluated our approach in public clouds with intensive web transaction workloads. Experiment results show that our approach accurately estimates the performance capabilities of cloud configurations. To evaluate the scalability of our search algorithm, we perform an extensive simulation with capability vectors collected from benchmark results.

The remainder of this paper is structured as follows. Section 2 outlines our recommendation procedure. Sections 3 and 4 describe the estimation and the search algorithm, respectively. Section 5 shows our evaluation results. Section 6 reviews related work. Section 7 concludes the paper.

2. SYSTEM OVERVIEW

Identifying the optimal cloud configuration is a crucial problem in the modern cloud era. In this regard, we are developing a cloud recommender system, referred to as Cloud Advisor. It aims to allow cloud users to explore various optimal cloud configurations based on their preferences and the requirements of their workload deployments. We achieve this goal in two steps. First, we accurately estimate the capability of the target cloud. Second, we enable comparison between potential offerings from different cloud providers in terms of performance and price for the given workload. Figure 1 outlines the approach of Cloud Advisor, which consists of offline modeling (solid arrows) and online recommendation (dashed arrows).

2.1 OFFLINE MODELING

For a given user workload, our system figures out its performance characteristics in terms of the workload's resource usage patterns in a white-box test-bed (i.e., profiling). For the profiling, we have developed Cloud Meter, which can capture the relative contribution of each resource type (e.g., CPU, memory, disk I/O, and network bandwidth) to the workload's throughput. Based on the profiling, our system can build an abstract performance model (see Sections 3.1 and 3.2). Meanwhile, Cloud Meter also captures the performance characteristics of target clouds by benchmarking their VMs, which are used as building blocks of target cloud configurations. The performance characteristics of a target VM can be encoded into a capability vector, where each element represents the relative benchmark score of each resource type against the white-box test-bed (see Section 3.3). Note that the benchmarking

Page 23:  · International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013 i IJCC Editorial Board Editors-in-Chief Hemant Jain, University of Wisconsin–Milwa

International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013

17

Figure 2. Comparison tables

Figure 3. The change rates before knee points of throughput and two resource types

process will be scheduled periodically, since the benchmarking results can change dynamically over time.

2.2 ONLINE RECOMMENDATION

Our system computes a near optimal configuration of each target cloud in the context of throughput and price. Further, the system interactively adjusts the configuration based on user preferences (e.g., the maximum budget, the throughput goal, estimated hourly usage, and load, as shown in the top right part of Figure 1). Please refer to (Jung, 2013) for the detailed implementation of the Cloud Advisor interface. To do this, our system (i) estimates a capability vector using the abstract model, where each element in the vector represents the required capacity of each resource type to meet the throughput goal (see Section 4.1); (ii) with the capability vectors of target clouds collected from the offline modeling and the capability vector computed in the previous step, searches for an optimal cloud configuration (i.e., combined VMs to run the target workload) until there is no further chance to minimize the price (see Section 4.2); and finally, (iii) provides comparison tables using the search results.

The top table in Figure 2 shows a price comparison among four different cloud configurations, including one extracted from an in-house cloud, when these cloud configurations have similar throughputs for a workload deployment. A user can thereby identify which configuration offers the best price while meeting her throughput goal. Similarly, as shown in the bottom table of Figure 2, Cloud Advisor can provide a throughput comparison where prices are similar. Note that those prices and throughputs can change dynamically over time due to the dynamics of the cloud market.

To achieve such comparisons, we develop two key components, capability estimation and search for an optimal cloud configuration, to be described in the following sections.

3. CAPABILITY ESTIMATION

One of the most important parts of the recommender system is accurately estimating the performance capability of each building block of a cloud configuration (i.e., a VM) for a given workload. Here, the performance capability of a VM is defined as the approximated maximum throughput that can be achieved using the VM for the workload.

We develop Cloud Meter, which estimates such performance capabilities of different VMs in a target cloud. Using Cloud Meter, our approach first builds an abstract performance model based on the resource usage patterns of the workload measured in an in-house test-bed (i.e., a white-box environment). Second, it computes relative performance scores of many different VMs of a target cloud against the in-house cloud using a benchmarking technique. Finally, it applies the collected performance scores to the abstract performance model to estimate the performance capabilities of those VMs. This approach needs little cost and time to estimate the performance, because we do not need to deploy the complex workload itself into all possible VMs and cloud configurations to evaluate their performance capabilities.

3.1 PERFORMANCE CHARACTERIZATION OF WORKLOAD

For a given workload, Cloud Meter first characterizes the workload in the context of its resource usage patterns by deploying the workload into an in-house test-bed and computing the correlation of resource usages to workload throughput while the load changes. In our approach, we capture the change rate of resource usage (i.e., the slope) of each resource type and the change rate of workload throughput until the capability is reached (i.e., the knee point), while the load increases.

The usage change rate before the knee point approximately indicates the degree of contribution of each resource type to the throughput change and the performance capability. These change rates are used as parameters of our performance model, which is described in the following section. Figure 3 shows the change rates of throughput and

Page 24:  · International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013 i IJCC Editorial Board Editors-in-Chief Hemant Jain, University of Wisconsin–Milwa

International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013

18

Figure 4. Figuring out knee point

two representative resource types, while the load increases over time. In this example, the change rate of CPU usage is higher than that of memory usage. This indicates that CPU contributes more to the workload throughput than memory, and that the CPU can become bottlenecked first, at its knee point. Note that the knee points of two resource types can occur at different points, since a bottlenecked resource type (e.g., CPU) can affect the usage of other resource types (e.g., memory usage).

To compute the change rates and then build a model, we have to identify the knee points. Figure 4 illustrates our approach. At the end of the measurement, Cloud Meter generates a linear line that connects the first measurement point to the last point and computes its length (i.e., z in the figure). At each measurement point, Cloud Meter can compute the length of the orthogonal line drawn from the linear line to the measurement point (i.e., the height h_k in the figure, where k is each measurement point). To compute the height of each measurement point, it generates two lines and computes their lengths (x_k and y_k in the figure). The first line is drawn from the first measurement point to the current measurement point, and the second line is from the current measurement point to the last point. Then, using the cosine rule and the sine rule, it computes the height as follows:

h_k = x_k · sin(cos^{-1}((x_k^2 + z^2 − y_k^2) / (2·x_k·z)))

Finally, the knee point among all measurement points is the one with the greatest height from the linear line.
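A compact Java implementation of this rule follows; it is an illustrative sketch assuming the measurements are given as parallel arrays of (load, usage) samples, and the method and class names are ours:

// Knee-point detection: the knee is the measurement point with the
// maximum height h_k above the chord from the first to the last point.
final class KneeDetector {
    static int kneePoint(double[] load, double[] usage) {
        int n = load.length;
        double z = dist(load[0], usage[0], load[n - 1], usage[n - 1]);
        int knee = 0;
        double best = -1.0;
        for (int k = 1; k < n - 1; k++) {
            double x = dist(load[0], usage[0], load[k], usage[k]);
            double y = dist(load[k], usage[k], load[n - 1], usage[n - 1]);
            // law of cosines gives the angle at the first point;
            // h_k = x_k * sin(theta) is the height above the chord
            double cos = (x * x + z * z - y * y) / (2.0 * x * z);
            cos = Math.max(-1.0, Math.min(1.0, cos)); // guard against rounding
            double h = x * Math.sin(Math.acos(cos));
            if (h > best) { best = h; knee = k; }
        }
        return knee;
    }
    static double dist(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }
    private KneeDetector() { }
}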

A workload simulator of Cloud Meter has been developed for the performance characterization as well. It generates synthetic loads with various data access patterns (e.g., the ratio of database write over read transactions and the ratio of CPU usage over I/O usage). If a historical load is available, the workload simulator can sort and re-play the workload to give systematic stress to the target application. We also ensure that the test-bed is highly capable of running any type of workload, such as CPU-intensive, memory-intensive, I/O-intensive, or network-intensive workloads.

3.2 BUILDING ABSTRACT PERFORMANCE MODEL

Based on the performance characterization, our approach defines performance models for resource types. As shown in Figure 3, throughput increases until the performance capability is reached. The performance capability is determined by the resource types that consume most of their available capacities (i.e., that are bottlenecked). Hence, our system defines a quantitative performance model for each individual resource type to identify its correlation to the performance capability. Specifically, for each resource type j, a quantitative performance model can be defined as

T_j = f(U_j | (C_j = c, j ∈ R) ∧ (C_{r'} = ∞, ∀ r' ∈ R, r' ≠ j))

where T_j is the throughput to be achieved with the normalized usage rate U_j over the given capacity (i.e., C_j = c) of resource type j. R is the set of resource types, and r' is a resource type in R that is not equal to j. We consider r' to have unlimited capacity so that we can compute the correlation of only j to T_j.

To compute T_j using function f, our system takes four steps. First, the system figures out the relation of load to the usage rate of the resource type. The relation can be defined as a linear function or, more generally, as a function that has a curve for a resource type j. The usage rates we consider in this paper are the total CPU usage, which consists of user and system CPU usage, as well as cache, memory, disk I/O, and network usage. More specifically, the function is as follows:

U_j = s_{i,j} (α_j (2L − L^p) + γ_j)     (1)

where L is the amount of load and p is used to minimize the square error (a linear function is the special case where p = 1). α_j is the change rate (e.g., the slope in a linear function), and γ_j is the initial resource consumption of the current configuration.

We can obtain α_j, γ_j, and p by calibrating the function to fit the actual curve. As mentioned in the previous section, we use the change rate before the knee point in this fitting. Then, s_{i,j} is computed, which is the relative performance score of a target VM i; this will be described in detail in the next section. In the white-box, s_{i,j} is set to 1.

Second, the relation of L to throughput T is defined as

T = β (2L − L^q)     (2)

where β is the change rate of throughput, and q is used to minimize the square error (linear when q = 1). Similarly, we can obtain β and q by calibrating the function to fit the actual curve.

Third, we compute the capability based on the correlation of resource type j to L. We can obtain the theoretical amount of load at which j reaches the full usage point using Equation 1 (i.e., by theoretically extending the curve beyond the knee point with the same change rate α_j until U_j is 1). Then, the obtained amount of load is applied to Equation 2.
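In the linear case (p = q = 1) – an assumption we make here purely for illustration – this step has a closed form: setting U_j = 1 in Equation 1 yields the theoretical full-usage load, which Equation 2 then maps to a throughput:

U_j = s_{i,j}(α_j L + γ_j) = 1  ⇒  L_max = (1/s_{i,j} − γ_j) / α_j,   T_j = β L_max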

Finally, the estimated capability of a target VM can be computed as follows.


Capability = min(T1, T2, T3, …, Tr)

where Tj is the maximum throughput computed from Equation 2 for each resource type j.

Tj is the capability if it is the minimum among the throughputs computed for all resource types (i.e., the min() above). This is because each Tj has been computed under the assumption that resource type j is fully consumed while the capacities of the other resources are unlimited. Intuitively, this means that the capability is determined by the bottlenecked resource: non-bottlenecked resources do not consume all of their available capacity, while only the bottlenecked resource is consumed completely.
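A compact sketch of this min() rule follows, assuming the linear case (p ≈ 1) when inverting Equation 1; the dictionary layout and helper name are our own, not the paper's.

def estimate_capability(models, beta, q, scores=None):
    """Estimate the performance capability as the min over resource types.
    models: {resource: (alpha, gamma)} fitted from Equation 1 (linear case, p ~ 1).
    scores: optional {resource: s_ij} relative performance scores (1.0 in the white-box).
    """
    scores = scores or {}
    throughputs = []
    for j, (alpha, gamma) in models.items():
        s_j = scores.get(j, 1.0)
        # extend the pre-knee trend until U_j = s_j * (alpha * L + gamma) reaches 1
        load_full = (1.0 / s_j - gamma) / alpha
        throughputs.append(beta * (2 * load_full - load_full**q))  # Equation 2
    return min(throughputs)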

3.3 COMPUTING RELATIVE PERFORMANCE SCORES

Although the same workload is deployed, target VMs (and target cloud configurations) differ in performance due to their various resource capacities and capabilities. To complete the abstract performance model (i.e., Equation 1) and estimate the capabilities of such cloud configurations, we have to capture the performance characteristics of each target VM i in terms of a relative performance score si,j for each resource type j. Using Cloud Meter, our approach collects relative performance scores based on resource capability measurements. These scores can later be reused for any different workload. In the current implementation, Cloud Meter contains a set of micro-benchmark workloads, including Dhrystone, Whetstone, cache capacity, system call, and context switch tests that are integrated into UnixBench for CPU benchmarking, CacheBench for memory benchmarking, IOZone for disk I/O benchmarking, and our own network benchmark application. Cloud Meter is very useful when a historical workload trace of an application in the target cloud is not available. Once such a trace becomes available after the application is deployed, the application itself can serve as a benchmark of Cloud Meter, and the historical data can be used to compute si,j for a new workload that has similar performance characteristics.

Using these measurements, our approach computes si,j as si,j = (bi,j / bj) (aj / ai,j), where aj and bj are a given allocation of j (e.g., the number of vCPUs or the memory size of a VM) and the benchmarking measurement for j, respectively, in the white-box cloud configuration. Similarly, ai,j and bi,j are those in the target VM i.

By applying si,j to Equation 1, we can obtain the performance capability of i. When we deal with CPU for si,cpu, the system CPU usage for si,sys and the user-level CPU usage for si,user must be considered separately within the total CPU usage, since the system CPU usage is related to context switches and system calls used for interrupts, memory allocation/deallocation, and file system communication, all of which can differ among cloud configurations. Thus, we extend si,j for CPU as si,cpu = (si,user αuser + si,sys αsys) / αcpu, where αuser is the increase rate of user CPU usage and αsys is the increase rate of system CPU usage, while αcpu is the increase rate of the total CPU usage. These rates are captured from the model fitting step in Equation 1.
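The score computations reduce to simple ratios; a sketch follows, with the CPU weighting reusing the increase rates from the fitted model (the numeric example plugs in the αuser, αsys, and αcpu values from Table 2; the scores themselves are hypothetical).

def perf_score(a_ref, b_ref, a_tgt, b_tgt):
    # s_{i,j} = (b_{i,j} / b_j) * (a_j / a_{i,j})
    return (b_tgt / b_ref) * (a_ref / a_tgt)

def cpu_score(s_user, s_sys, alpha_user, alpha_sys, alpha_cpu):
    # s_{i,cpu} = (s_{i,user} * alpha_user + s_{i,sys} * alpha_sys) / alpha_cpu
    return (s_user * alpha_user + s_sys * alpha_sys) / alpha_cpu

# hypothetical scores, weighted with the heavy-write rates from Table 2
s_cpu = cpu_score(s_user=1.4, s_sys=0.9, alpha_user=0.05, alpha_sys=0.03, alpha_cpu=0.08)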

4. SEARCH FOR OPTIMAL CLOUD CONFIGURATION

Using the performance model, Cloud Advisor identifies a near-optimal cloud configuration, which can be composed of multiple VMs, for the workload and its throughput goal. In particular, we focus on parallel workloads in this paper. To do this, our approach first generates capability vectors, each of which represents the performance of a target VM. Second, we encode the search problem as a knapsack problem (Kellerer, 2004) and then develop a search algorithm to solve the problem in an efficient way.

4.1 GENERATING CAPABILITY VECTORS FOR GIVEN WORKLOAD

The first step in identifying an optimal cloud configuration is generating a required capability vector using the performance model in the white-box test-bed. Each element of the required capability vector represents the performance capability value of each resource type needed to meet a throughput goal.

Specifically, the required performance capability value, denoted c*j, of a resource type j (i.e., the usage rate Uj required to achieve the throughput goal T*) can be given as c*j = (αj / β) T* + γj, when we consider linear functions in Equations 1 and 2. Here, (αj / β) indicates the normalized increase rate of the resource usage needed to increase throughput by one unit. Thus, the equation indicates how much resource capability is required to meet T*. Note that if c*j is more than 1, more resource capability is required to meet T* than is currently available in the test-bed configuration. With this equation, the required capability vector for the workload and its throughput goal is the set of such required performance capability values, represented as V* = [c*1, c*2, c*3, …, c*r].
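For the linear case, generating V* is a one-liner per resource type; a minimal sketch under our own naming conventions, with hypothetical fitted parameters:

def required_vector(models, beta, t_star):
    # c*_j = (alpha_j / beta) * T* + gamma_j for each resource type j (linear case)
    return {j: (alpha / beta) * t_star + gamma for j, (alpha, gamma) in models.items()}

# e.g., hypothetical {resource: (alpha_j, gamma_j)} and a throughput goal T*
v_star = required_vector({"cpu": (0.0008, 0.08), "mem": (0.0002, 0.21)}, beta=5.7, t_star=6000)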

We could obtain the required capability vector of a cloud configuration in a target cloud by directly deploying the given workload into it and performing the aforementioned measurements. However, this is an expensive and time-consuming task, since there are many different cloud configurations to be evaluated, and workload deployment is typically very complicated. Hence, our approach instead captures the relative performance capability value of each resource type j, denoted ci,j, of a target VM i in a cloud by using the benchmarking measurements that are used for the performance score si,j. This is much simpler than deploying the complex workload itself into the target cloud. Moreover, benchmarking measurements can be reused for other workloads as long as they are continually updated to reflect any change in the cloud.

Figure 5. Knapsack problem with capability vectors

Specifically, the relative performance capability value ci,j = (bi,j / bj) (ai,j / aj) cj, where (bi,j / bj) is the performance ratio based on benchmarking, (ai,j / aj) is the resource allocation ratio, and cj is the maximum resource usage rate for the given capacity of j (typically, cj = 1). Then, the capability vector of i can be represented as Vi = [ci,1, ci,2, ci,3, …, ci,r]. Finally, our approach computes all target capability vectors from the clouds of interest. Note that Vi is just the relative capability vector of i against the test-bed, while V* is the capability vector required to achieve T* and is computed in the test-bed.

4.2 SEARCH ALGORITHM

Cloud Advisor explores the various cloud configurations offered by each cloud provider to identify a near-optimal cloud configuration with the best price while meeting a throughput goal. If we could obtain the price per unit of resource usage, we could easily compute the total price. However, most cloud providers offer pre-determined small cloud configurations with specific prices as VM types that have different CPU, memory, disk, and network capacities (Cardosa, 2011). Some cloud providers such as Amazon Web Services use a dynamic pricing scheme for their virtual resources as well. In this paper, we assume a static pricing scheme, which is used by many cloud providers. In our current implementation of the search algorithm, we also assume a parallel workload such as clustered parallel database transactions or parallel data mining using MapReduce (Dean, 2004). For parallel workloads, the cloud configuration can have multiple heterogeneous VMs to handle loads in parallel with a load balancing technique such as the one introduced in (Jung, 2012).

The capability vector Vi of each specific VM type can be computed as described in the prior section. We can also capture V*, which represents the resource capacity requirement to meet a throughput goal T*. Then, the search procedure is to fit the numerical capability values of pre-determined VM types into the numerical capability values defined in V*. In particular, the search algorithm identifies a cloud configuration with a minimum price that fits into the required capability vector V*.

As shown in Figure 5, we illustrate V* as a set of containers, each of which has a specific size (i.e., the numerical capability c*j of resource type j as described in Section 4.1). Then, we can convert the problem into a knapsack problem (Kellerer, 2004), since the search algorithm tries to fill containers with items (i.e., capability vectors Vi) that have different values (i.e., prices) while aiming at a minimal total price. Figure 5 shows a snapshot of the search procedure when the containers are filled with two different VMs. Here, V* is [10.0, 9.8, 7.0, 5.0], representing the CPU, memory, disk, and network resource types. V1 and V2 are [1.0, 1.2, 0.8, 0.5] and [2.0, 1.2, 0.8, 1.0], respectively. After these two capability vectors are inserted, V* becomes [7.0, 7.4, 5.4, 3.5] and is further filled with more capability vectors.

The problem we tackle is finding the VMs that minimize the cumulative price while all containers are completely filled. More specifically, the optimal cloud configuration will have 0 distance for each resource type j between c*j of V* and the cumulated capabilities of the VMs (i.e., (c*j – Σi ci,j) = 0, where the sum runs over the m’ VM instances in a cloud configuration) while having the minimum cumulative price. The distance can be computed as follows.

D = Σj∈R max((c*j – Σi ci,j), 0) (3)

In this equation, if (c*j – Σi ci,j) < 0, its distance is considered 0, since it indicates enough capability. Otherwise, the containers have to be filled with more VMs, since the throughput goal is not met. Note that all containers must be completely filled even if a container’s size is very small (i.e., the required capability of a resource type is relatively small, such as network bandwidth in Figure 5). This is because a bottleneck can occur in any resource that does not have enough capability, even if that resource type is not critical to achieving the throughput goal.
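Equation 3 translates directly into code; a sketch with our own data layout (V* and the chosen VMs' capability vectors as dictionaries), checked against the Figure 5 snapshot:

def distance(v_star, chosen):
    # Equation 3: D = sum over j of max(c*_j - sum_i c_{i,j}, 0)
    totals = {j: sum(v.get(j, 0.0) for v in chosen) for j in v_star}
    return sum(max(v_star[j] - totals[j], 0.0) for j in v_star)

# the Figure 5 snapshot: V* minus V1 and V2 leaves [7.0, 7.4, 5.4, 3.5], total 23.3
d = distance({"cpu": 10.0, "mem": 9.8, "disk": 7.0, "net": 5.0},
             [{"cpu": 1.0, "mem": 1.2, "disk": 0.8, "net": 0.5},
              {"cpu": 2.0, "mem": 1.2, "disk": 0.8, "net": 1.0}])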

Identifying such a cloud configuration is not trivial, since there are many different combinations of VMs that can meet the throughput goal. We can formulate the problem as an Integer Linear Program (ILP) as follows.

Minimize Σi=1..m ni pi
subject to Σi=1..m ni ci,j ≥ c*j, ∀j ∈ R, with ni ≥ 0, pi ≥ 0, and ci,j ≥ 0

where m is the number of VM types, ni is the required number of VM instances of each VM type i, and pi is the price of i. To solve this ILP, we can use generic solutions such as Tabu search (Glover, 1989), simulated annealing (Granville, 1994), and hill climbing (Russell, 2003).
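For reference, the ILP is also small enough to hand to an off-the-shelf solver; below is a minimal sketch using the PuLP library (our choice for illustration, not something the paper uses), with hypothetical prices and capability vectors:

from pulp import LpProblem, LpMinimize, LpVariable, lpSum, PULP_CBC_CMD

# hypothetical VM types: hourly price and capability vector per resource type
price = {"small": 0.06, "medium": 0.12, "large": 0.48}
cap = {
    "small":  {"cpu": 1.0, "mem": 1.2, "disk": 0.8, "net": 0.5},
    "medium": {"cpu": 2.0, "mem": 1.2, "disk": 0.8, "net": 1.0},
    "large":  {"cpu": 4.1, "mem": 4.8, "disk": 3.0, "net": 2.2},
}
c_star = {"cpu": 10.0, "mem": 9.8, "disk": 7.0, "net": 5.0}  # required vector V*

prob = LpProblem("cloud_configuration", LpMinimize)
n = {i: LpVariable(f"n_{i}", lowBound=0, cat="Integer") for i in price}
prob += lpSum(n[i] * price[i] for i in price)                  # minimize total price
for j in c_star:                                               # fill every container
    prob += lpSum(n[i] * cap[i][j] for i in price) >= c_star[j]
prob.solve(PULP_CBC_CMD(msg=False))
config = {i: int(n[i].value()) for i in price if n[i].value()}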

In our approach, we develop an efficient search algorithm based on the detection of resource bottlenecks and the best-first search method, rather than blindly exploring the search space. Algorithm 1 shows the basic best-first search algorithm.


Input: V*  // required capability vector
       Γ   // a set of pre-determined VM types
Output: curr  // the cheapest combination of VMs
1  Q ← null;  // a queue for candidate combinations
2  curr ← null;
3  for each Vi ∈ Γ do
4      Q ← Q ∪ vmi;  // where vmi is an instance of Vi
5  end for
6  while forever do
7      Q ← SortByPrice(Q);  // ascending order
8      curr ← RemoveFirst(Q);
9      if (V*.c*j - curr.cj) ≤ 0, ∀j ∈ R then
10         return curr;  // i.e., the distance D in Equation 3 is 0
11     end if
12     for each Vi ∈ Γ do
13         V.cj ← curr.cj + vmi.cj, ∀j ∈ R;
14         V.p ← curr.p + vmi.p;
15         if ¬(V ∈ Q) then
16             Q ← Q ∪ V;
17         end if
18     end for
19 end while

Algorithm 1. Best-first search

while forever do
    curr ← RemoveFirst(Q);
    if |curr| < |prior| then
        break;  // where | | is the number of instances
    end if
    if (curr.cj < V*.c*j) ˄ (curr.cj ≥ prior.cj), ∃j ∈ R then
        break;
    end if
end while
prior ← curr;

Algorithm 2. Conservative Gradient Search

We extend this algorithm to search efficiently by reducing the search space.

The algorithm takes two inputs: the required capability vector V* and a set of pre-determined VM types Γ. All capability vectors have been computed as described in the previous section. To obtain the optimal combination of VM instances, curr, our algorithm first takes an instance of each VM type in Γ as a candidate (lines 3-5 in Algorithm 1) and then generates different combinations of VM types over the search procedure. In the main loop, our search algorithm uses the cheapest cumulated configuration (i.e., the cheapest combination of VMs) first to fill V* among all candidate combinations (lines 7 and 8). It keeps generating candidates (lines 12-18) and trying to fit each candidate combination into V* (lines 9-11) iteratively until it finds one that successfully fits into V* (i.e., the distance D in Equation 3 is 0).
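For concreteness, a runnable sketch of Algorithm 1 follows; it replaces the sort-then-remove of lines 7-8 with a price-ordered heap, which is equivalent but avoids re-sorting. The data layout (name mapped to (price, capabilities)) is our own, not the paper's.

import heapq
import itertools

def best_first_search(v_star, vm_types):
    """v_star: {resource: required capability}; vm_types: {name: (price, {resource: capability})}."""
    tick = itertools.count()                       # tie-breaker so the heap never compares dicts
    heap, seen = [], set()
    for name, (price, caps) in vm_types.items():   # lines 3-5: one instance of each type
        combo = (name,)
        heapq.heappush(heap, (price, next(tick), combo, dict(caps)))
        seen.add(combo)
    while heap:                                    # main loop
        price, _, combo, caps = heapq.heappop(heap)            # lines 7-8
        if all(caps.get(j, 0.0) >= v_star[j] for j in v_star):
            return combo, price                                # lines 9-11: D = 0
        for name, (p_i, c_i) in vm_types.items():              # lines 12-18
            new_combo = tuple(sorted(combo + (name,)))
            if new_combo in seen:
                continue
            seen.add(new_combo)
            new_caps = {j: caps.get(j, 0.0) + c_i.get(j, 0.0) for j in set(caps) | set(c_i)}
            heapq.heappush(heap, (price + p_i, next(tick), new_combo, new_caps))
    return None, float("inf")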

The overhead of the search procedure is caused by the large number of candidate combinations generated (i.e., a large search space) in Algorithm 1. Hence, we extend the algorithm to significantly reduce the search space using two pruning techniques.

Conservative Gradient Search (CG). Our algorithm checks the gradient of each resource capability (i.e., any improvement toward the goal from adding the capability) in the current cheapest candidate against the previously selected one by replacing line 8 of Algorithm 1 with Algorithm 2.

For the current cheapest combination curr, CG checks its size (i.e., the number of VM instances combined into it) and chooses it as the new search starting point if its size is less than the size of the candidate chosen in the prior iteration, prior. This is because the size and price of newly generated candidates increase monotonically in Algorithm 1, so (|curr| < |prior|) means that curr was generated earlier but has not yet had a chance to be explored. If that is not the case, our algorithm further checks whether curr has any resource type still required to be filled into V* (i.e., curr.cj < V*.c*j, ∃j ∈ R) while reducing the gap more than prior has done (i.e., curr.cj ≥ prior.cj). Alternatively, we could check this condition for all resource types or for the bottlenecked one (see below), instead of for any resource type meeting the condition. However, this alternative sometimes significantly degrades the optimality of our algorithm, even though it speeds up the search further. Therefore, we choose the conservative method to find the optimal combination.

Bottleneck-aware Proactive Search (BP). Our algorithm explores the search space along biased paths that have larger remaining distances (i.e., it focuses on resource types that are potentially bottlenecked). To do this, we insert the following two functions after line 11 in Algorithm 1.

The first function figures out the bottlenecked resource type j* and computes the remaining distance g* for j*. j* is the resource type of curr that has the maximum distance between V* and curr (i.e., V*.c*j – curr.cj). The second function selects the top K VM types based on j* and g* from the set of all VM types Γ and stores them in Γtemp to be used for generating the set of new candidate combinations. Our algorithm then uses this smaller set of candidate types (i.e., Γtemp) instead of Γ in line 12 of Algorithm 1. The top K VM types should fill the distance of the bottlenecked resource at a lower price than the other configuration types in Γ. Thus, this function computes the potential price of using each VM type in Γ to fill the distance of the bottlenecked resource as ⌈g* / Vi.cj*⌉ · Vi.p, where Vi ∈ Γ is the type, and Vi.p is the price of that type.

(j*, g*) ← max{(V*.c*j – curr.cj) for each j ∈ R};
// where j* is the bottlenecked resource,
// and g* is the remaining distance for j* to be filled.
Γtemp ← SelectTopKConfigTypes(j*, g*, K, Γ);
// where K is the number of candidates newly generated


Name      vCPU (No.)  Mem (GB)  Disk (GB)  Network (Mbps)  Price ($/hr)
B-1-1-X   1           1         40         60              0.06
B-2-2-X   2           2         80         120             0.12
B-2-4-X   2           4         160        200             0.24
B-4-8-X   4           8         320        300             0.48
B-6-15-X  6           15        620        400             0.90
B-8-30-X  8           30        1200       600             1.20

Table 1. Black-box configurations in Rackspace

This means that the potential price is computed by considering the required number of VM instances to fill g*. Then, the function sorts all VM types in Γ by their potential prices. Note that our algorithm checks the remaining distances of resource types and identifies a bottlenecked resource type in every iteration, since the potential bottleneck keeps migrating as resources are added. The number of VM types K in the algorithm can be determined empirically or dynamically over the iterations (i.e., reducing the number as the distance decreases). The higher K is, the better the chance of increasing the optimality, but the slower the search. In Section 5, we show the impact of the choice of K on the search speed and the optimality.
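A sketch of the top-K selection, under the same hypothetical layout as above (name mapped to (price, capabilities)); the ceiling term is the number of instances of a type needed to close the gap g* on the bottlenecked resource j*.

import math

def select_top_k_config_types(j_star, g_star, k, vm_types):
    # potential price of filling the bottleneck gap with one VM type:
    # ceil(g* / Vi.c_{j*}) * Vi.p
    def fill_price(item):
        price, caps = item[1]
        c = caps.get(j_star, 0.0)
        return math.ceil(g_star / c) * price if c > 0 else float("inf")
    return dict(sorted(vm_types.items(), key=fill_price)[:k])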

Using these techniques (i.e., CG and BP) on the best-first search (i.e., Algorithm 1), we can reduce the search space by pruning out candidate combinations that could achieve the throughput goal but would be more expensive than the optimal combination. Hence, we can compute the near-optimal combination (i.e., a cloud configuration) much more efficiently than blind exploration.

5. EXPERIMENTAL EVALUATION

5.1 EXPERIMENTAL SETUP

To evaluate our approach, we have used an online auction web transaction workload, RUBiS3, deployed on a servlet server (Tomcat) and a back-end database server (MySQL). The workload provided by the RUBiS package consists of 26 different transaction types. In our experiments, we focus on clustering of database servers, so we have modified the original workload to intentionally place loads on the database servers. This has been done by reducing the simple HTML transactions that lightly load the servlet server and by increasing the rate of database query transactions. We have then created two different workloads by changing the rate of database write transactions in the original workload. The light-write workload has 5% write transactions, while the heavy-write workload has 50% write transactions out of all transactions in the workload.

We have prepared a VM type in our test-bed cloud that is configured with 2 vCPUs, 4 GB memory, and 40 GB disk. The Ubuntu 10.04 operating system is loaded in the VM. Instances of this VM type have been deployed onto our Intel blade with KVM virtualization. We have used this VM type to build our simple white-box cloud configuration (i.e., a single VM instance) and our clustered white-box cloud configuration (i.e., two VM instances configured as a database cluster). We call these configurations W-2-4-K (i.e., white-box having 2 vCPUs and 4 GB memory with KVM virtualization) and W-4-8-K for the simple and clustered configurations, respectively.

For evaluating the capability estimation (Section 5.2), we have set up a VM type as a black-box. An instance of this VM type is called B-4-2-X (i.e., black-box having 4 vCPUs and 2 GB memory with Xen virtualization). It also has 80 GB of disk space and runs on an AMD server. We have set up another, smaller black-box VM type for comparison. An instance of this small VM type is called B-1-2-K (i.e., black-box having 1 vCPU and 2 GB memory with KVM). It has 40 GB of disk space and runs on an Intel server.

3 RUBiS is available at http://rubis.ow2.org

We have prepared other VM types obtained from Rackspace, a well-known cloud infrastructure provider. There are various VM types in Rackspace configured with different numbers of vCPUs, memory sizes, and disk sizes. We have used 6 VM types, varying from 1 to 8 vCPUs, 1 to 30 GB memory, 40 GB to 1.2 TB disk space, and 60 to 600 Mbps network. Their prices range from $0.06 to $1.20 per hour. Ubuntu 10.04 has been installed on all these VMs. These VMs have been used in our experiments for the search algorithm (Section 5.3). Table 1 outlines these VMs.

5.2 PERFORMANCE CAPABILITY ESTIMATION

To evaluate the capability estimation, we have deployed

two RUBiS workloads into W-2-4-K and built the abstract performance models (i.e., Equations 1 and 2 in Section 3.2). In this section, we show that our model can accurately estimate the maximum throughput of a target VM for any workload, using these two workloads, which have different performance characteristics from each other.

Heavy-Write Workload. As shown in Figure 6, our workload generator records throughput (i.e., the number of responses per 3 minutes) while increasing the number of concurrent users by 50 every 3 minutes. The throughput change and the maximum throughput of W-2-4-K are shown in the figure. Figure 6 also shows the maximum throughputs of the other two target configurations, B-1-2-K and B-4-2-X, which our approach estimates. We note that the change rates of all three configurations are almost identical under low load. This is because they all have enough capability to deal with such a low load. However, their maximum throughputs differ due to their different capabilities.

Figure 7. Resource usages of heavy-write workload

To compute such capabilities, the resource usage patterns (i.e., usage changes) of 5 resource types in W-2-4-K have been plotted as shown in Figure 7. We can observe that CPU user, CPU system, and memory have notably affected the throughput of the heavy-write workload, since their usage changes are much higher than those of the other two resource types (i.e., disk I/O and network I/O) in this configuration.

We have captured the abstract performance model of the heavy-write workload in Table 2. To compute the change rates and the parameters of the model in the table (i.e., αj, p, and γj), our approach first figures out the knee points of these trajectories (i.e., the red circles in Figures 6 and 7) using the technique described in Section 3.1. Note that the knee points of memory, disk I/O, and network I/O are located at the last measurement points, because these resource types have no obvious knee points. Similarly, our approach has figured out the knee point of the throughput graph and captured β = 5.7 and q = 0.98, used in Equation 2 as the throughput increase rate and square-error parameter, respectively.

Cloud Meter has been deployed into the W-2-4-K, B-1-2-K, and B-4-2-X configurations and has benchmarked the resource types to compute the performance scores of these configurations, as described in Section 3.3. Figure 8 shows the performance scores of 6 different resource types. Note that the scores of B-4-2-X are quite different from those of the other two configurations, since its ratio of CPU to memory allocation is different, and B-4-2-X is configured with a different type of architecture (i.e., AMD) and virtualization (i.e., Xen).

By applying these scores to the abstract performance model, we can estimate the capabilities of B-1-2-K and B-4-2-X, as shown in Table 3.

Based on our approach described in Section 3.2, CPU is bottlenecked in B-1-2-K (i.e., Tcpu is the minimum in its column of Table 3), similar to W-2-4-K as shown in Figure 7. In B-4-2-X, however, it turns out that memory is bottlenecked (different from the other two configurations). This is because it has enough CPU allocation, and the bottleneck migrates to memory in this case. Compared to the measured maximum throughputs, the error rate is less than 10% (8.75% for B-1-2-K and 6.98% for B-4-2-X).

Looking at the resource usage patterns in Figure 7, it seems that CPU is bottlenecked in W-2-4-K, but it turns out that memory can be bottlenecked in different configurations. Additionally, using B-4-2-X instances can amount to over-provisioning for only a small capability increase compared to W-2-4-K. Finally, we can also see that the bottleneck can migrate between resources as more allocation is added to bottlenecked resources. Hence, the recommender system must accurately capture such resource usage patterns and bottleneck migrations for the workload.

Light-Write Workload. Similarly, we have measured the throughput and resource usage patterns in W-2-4-K and then estimated the capabilities of the other two black-box configurations for this workload. Figure 9 shows the knee point and the throughput change rate, while Figure 10 shows the knee points and usage rates of the 5 resource types.

We note here that this workload has different performance characteristics from the heavy-write workload. As shown in Figure 9, the maximum throughput of W-2-4-K is higher than that of B-4-2-X in this workload. As shown

Figure 6. Throughputs of three configurations (VMs) in heavy-write workload

Resources  αj    p     γj
CPU        0.08  1.01  8
CPU User   0.05  1.02  5
CPU Sys    0.03  0.98  3
Memory     0.02  1.01  21

Table 2. Parameters of abstract performance models of heavy-write workload

            W-2-4-K  B-1-2-K     B-4-2-X
Measured T  6050.32  3552.12     7091.53
Tcpu                 3241.02     13681.25
Tmem                 8165.41     7586.55
Tdisk                4178010.09  3483818.44
Tnetwork             2097002.99  1715384.62

Table 3. Capability estimates of heavy-write workload


in Figure 10, this may be caused by the fact that the usage change rate of the memory resource in this workload is higher than in the previous workload, and W-2-4-K has more memory capacity than B-4-2-X, although B-4-2-X has more CPU capacity than W-2-4-K. However, we need to analyze the resource usage patterns further, since system CPU obviously has a higher change rate than user CPU, and B-4-2-X has a relatively better system CPU score than user CPU score, as shown in Figure 8.

To analyze this situation and estimate the capabilities, we have captured the performance model in Table 4, with β = 5.0 and q = 0.99. We have then computed the estimates of the maximum throughputs; Table 5 summarizes the results. Note that we can reuse the performance scores from the heavy-write workload for this estimation.

Although Table 4 indicates that the system CPU usage change rate is the highest in this workload, Table 5 shows that the bottleneck of B-4-2-X is still the memory resource, while the bottleneck of B-1-2-K is the CPU resource. Compared to the measured maximum throughputs, the error rate is less than 10% in this workload as well.

Our experimental results indicate that our performance modeling approach is accurate enough to be applied to the search for the optimal cloud configuration in our recommender system.

5.3 SCALABILITY OF THE SEARCH ALGORITHM

We have obtained all capability vectors of the 6 Rackspace VM types and then evaluated the accuracy of our search algorithm using the light-write and heavy-write workloads. The resulting configuration (i.e., a database cluster using multiple VMs) computed by our search algorithm (i.e., Bottleneck-aware Proactive Search (BP), described in Section 4.2) has been compared with the configuration computed by a brute-force search (i.e., exhaustively searching for the best configuration among all possible VM combinations). We have used a relatively low throughput goal (i.e., 20K) so that the best resulting configuration consists of only 5 low-end VM instances. We have deployed a simple load balancer in a separate VM that forwards user requests to the VM cluster based on their performance capabilities. We can see that BP returns exactly the same configuration (i.e., 5 low-end VM instances)

Figure 8. Performance scores of resource types in three configurations (lower is better)

Figure 9. Throughputs of three configurations in light-write workload

Figure 10. Resource usages of light-write workload

Resources  αj    p     γj
CPU        0.06  1.02  6
CPU User   0.02  1.02  2
CPU Sys    0.04  1.01  4
Memory     0.03  1.01  12

Table 4. Parameters of abstract performance models of light-write workload

            W-2-4-K  B-1-2-K  B-4-2-X
Measured T  7345.52  4509.92  6604.87
Tcpu                 4102.12  16012.05
Tmem                 7325.71  7266.28

Table 5. Capability estimates of light-write workload


as the brute-force search in this small setup. We have then deployed the resulting configuration into the Rackspace infrastructure to see if it meets the throughput goal (i.e., 20K). Although the VM cluster was slightly over-provisioned (i.e., 21.3K) against the goal, the error is still around 5% for the light-write workload. For the heavy-write workload, the resulting VM cluster was under-provisioned (i.e., 18.2K), with around a 10% error rate. We have analyzed the cause of the under-provisioning further with our clustered white-box configuration (i.e., W-4-8-K) and found that the network resource was consumed a little more than we had estimated, in order to synchronize database writes among servers. In our current ongoing work, we are improving the accuracy of our recommender system by accounting for such database synchronization.

We have conducted an extensive simulation to evaluate the potential scalability and optimality of our search algorithm. We currently plan to deploy a large-scale MapReduce cluster with a parallel data mining workload that may need up to several hundred VMs in the cluster; thus, the search algorithm has to be scalable enough to deal with such a large workload. In the experiments, we have run our search algorithm with all capability vectors of the 6 VM types and W-1-2-K to compute the configuration and its total price. We have increased the throughput goal from 60K to 240K while measuring the duration of our search algorithm and the total price of the resulting configuration. The higher the throughput goal, the more VMs are combined into the configuration, and the longer the search algorithm runs. Three different search algorithms have been compared in this evaluation.

Naive Best-First Search (NBF): It uses the basic best-first search algorithm, as shown in Algorithm 1 of Section 4.2. This algorithm is not very scalable, since it explores the cheapest candidate step by step while generating all possible candidates in each iteration. We have used this algorithm as a baseline.

Conservative Gradient Search (CG): It integrates only the conservative gradient search (described in Section 4.2) on top of NBF.

Bottleneck-aware Proactive Search (BP): It integrates the bottleneck-aware search, which prunes the search space, together with CG on top of NBF, as described in Section 4.2. To show the impact of the pruning parameter K on the optimality and the scalability of our algorithm, we have used different K values (i.e., K = 1, 3, or 5, indicating the number of top K VM types out of the total 6 VM types used to generate new candidate combinations).

Experimental results show that our BP algorithm with K = 3 is scalable (Figure 11) while maintaining reasonable optimality (Figure 12). NBF in Figure 11 shows an obvious exponential increase as the throughput goal increases. Although CG has better scalability than NBF, it still shows an exponential increase. This is mainly because the capabilities per unit price of the given VM types are not very different in our experimental setup, so the pruning rate is very low. When we set K to 5 in our BP algorithm, we do not obtain good scalability, since it still generates numerous candidates in the queue to be evaluated later in the search procedure. However, the BP algorithm with K = 3 starts to aggressively prune the search space based on potential bottlenecks. Hence, we achieve scalability close to that of BP with K = 1 (i.e., the case in which the search space increases linearly). Meanwhile, BP with K = 3 shows only a small loss of optimality. As shown in Figure 12, BP with K = 3 returns a price at most 4% higher than the price computed by NBF. CG and BP with K = 5 are almost identical to NBF, since they explore most of the candidates that NBF explores. However, BP with K = 1 shows a significant loss of optimality, caused by ignoring some candidates that could lead to the optimal configuration.

6. RELATED WORK

Cloud adoption has gathered pace, as most enterprises are moving toward the agile hosting of applications in public clouds. In this regard, many researchers have focused on three

Figure 11. Scalability comparison of three search algorithms

Figure 12. Price comparison of three search algorithms


different directions: (i) updating application architectures to move from legacy systems to clouds (Chauhan, 2011); (ii) evaluating different clouds’ functional and non-functional attributes for an informed decision on which cloud should host an application (Jayasinghe, 2011; Huang, 2010; Calheiros, 2011; Jayasinghe, 2012; Li, 2012; Cunha, 2013); and (iii) efficient orchestration of virtual appliances in a cloud (Keahey, 2008), which may also include negotiations with users (Venugopal, 2009). This paper complements these directions by enabling the recommendation of cloud configurations to suit application requirements. In this regard, a cloud capability estimation methodology has been developed using benchmark-based profiling. Previous work has principally focused on comparing rudimentary cloud capabilities using benchmarks (Jayasinghe, 2011; Huang, 2010) and automated performance testing (Malkowski, 2009). This paper focuses on the detailed characterization of cloud capabilities for user workloads in various target clouds.

Recently, many cloud testing efforts such as (Calheiros, 2011; Li, 2012) have further offered testing paradigms to enable the evaluation of various workload models on different resource management models. However, scaling such offerings requires identifying various cloud capability models (Gao, 2011). This paper fills the gap by developing a generic methodology to characterize and model cloud capabilities. The methodology is applicable to black-box clouds, since the modeling of cloud capabilities is based on experiments on clouds using only externally observable characteristics. The model is used to estimate the performance of applications in target clouds, and such estimates are then used to efficiently search for near-optimal cloud configurations.

Building performance models and diagnosing performance bottlenecks for Hadoop clusters have been explored in (Gupta, 2012). This paper further develops a methodology to recommend cloud configurations that meet user demands. The orchestration of virtual appliances in a cloud (Keahey, 2008) and the efficient allocation of instances in a cloud (Venugopal, 2009) have been addressed previously to meet user demands. However, such approaches are typically non-transparent to users. This paper, on the other hand, makes the recommendation transparent by allowing users to make choices based on cloud capability estimation.

The importance of estimating cloud configurations has been noted in (Cunha, 2013) as well, where an approach to collect the estimates from public clouds is proposed. However, in their approach, users need help from an application expert to deploy the application and collect estimates. Our approach can estimate the cloud configuration using simple benchmark workloads, rather than deploying the complex user application itself.

7. CONCLUSIONS

This paper has aimed to address one major barrier to building a recommender system that identifies an optimal cloud configuration for a complex user workload based on benchmark-based approximation. In particular, we have shown that such estimation and recommendation can be done even in black-box environments. To achieve this, our system generates a capability vector that consists of the relative performance scores of resource types, and our search algorithm then identifies a near-optimal cloud configuration based on these capability vectors. Our experiments show that our approach estimates the near-optimal cloud configuration for a workload within 10% error, and that our search algorithm is scalable enough to be applied to large-scale workload deployments. We are currently working on applying our approach to large-scale MapReduce jobs by extending the current approach. In this case, we are incorporating the time factor (e.g., the number of hours each VM is used) into the Integer Linear Programming problem defined in Section 4.2 for so-called bag-of-tasks applications, as introduced in (Gutierrez-Garcia, 2012).

8. REFERENCES

Calheiros, R., Ranjan, R., Beloglazov, A., De Rose, C., & Buyya, R. (2011). CloudSim: A toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Software: Practice and Experience, 41, 23-50.

Cardosa, M., Singh, A., Pucha, H., & Chandra, A. (2011). Exploiting spatio-temporal tradeoffs for energy-aware MapReduce in the cloud. Proceedings of International Conference on Cloud Computing, 251-258.

Cunha, M., Mendonca, N., & Sampaio, A. (2013). A declarative environment for automatic performance evaluation in IaaS clouds. Proceedings of International Conference on Cloud Computing, 285-292.

Chauhan, M. A. & Babar, M. A. (2011). Migrating service-oriented system to cloud computing: An experience report. Proceedings of International Conference on Cloud Computing, 404-411.

Dean, J. & Ghemawat, S. (2004). MapReduce: Simplified data processing on large clusters. Symposium on Operating Systems Design and Implementation, 10-19.

Gao, J., Bai, X., & Tsai, W. T. (2011). Cloud testing: Issues, challenges, needs and practice. International Journal on Software Engineering.

Glover, F. (1989). Tabu search-part II. ORSA, 1(1), 4-32.

Granville, V., Krivanek, M., & Rasson, J-P. (1994). Simulated annealing: A proof of convergence. Transactions on Pattern Analysis and Machine Intelligence, 16, 652-656.

Gupta, S., Fritz, C., de Kleer, J., & Witteveen, C. (2012). Diagnosing heterogeneous Hadoop clusters. Workshop on Principles of Diagnosis.

Gutierrez-Garcia, J. O. & Sim, K. M. (2012). GA-based cloud resource estimation for agent-based execution of bag-of-tasks applications. Information Systems Frontiers, 14 (4), 925-951.

Huang, S., Huang, J., Dai, J., Xie, T., & Huang, B. (2010). The HiBench benchmark suite: Characterization of the MapReduce-based data analysis. Proceedings of International Conference on Data Engineering Workshops, 41-51.

Jayasinghe, D., Malkowski, S., Wang, Q., Li, J., Xiong, P., & Pu, C. (2011). Variations in performance and scalability when migrating n-tier applications to different clouds. Proceedings of International Conference on Cloud Computing, 73-80.

Jayasinghe, D., Swint, G., Malkowski, S., Wang, Q., Li, J., & Pu, C. (2012). Expertus: A generator approach to automate performance testing in IaaS. Proceedings of International Conference on Cloud Computing, 115-122.

Jung, G., Gnanasambandam, N., & Mukherjee, T. (2012). Synchronous parallel processing of big-data analytics to optimize performance in


federated clouds. Proceedings of International Conference on Cloud Computing, 811-818.

Jung, G., Mukherjee, T., Kunde, S., Kim, H., Sharma, N., & Goetz, F. (2013). CloudAdvisor: A recommendation-as-a-service platform for cloud configuration and pricing. Proceedings of IEEE SERVICES, 456-463.

Keahey, K. & Freeman, T. (2008). Contextualization: Providing one-click virtual clusters. Proceedings of International Conference on eScience, 301-308.

Kellerer, H., Pferschy, U., & Pisinger, D. (2004). Knapsack Problems. Springer.

Li, J., Wang, Q., Kanemasa, Y., Jayasinghe, D., Malkowski, S., Xiong, P., Kawaba, M., & Pu, C. (2012). Profit-based experimental analysis of IaaS cloud performance: Impact of software resource allocation. Proceedings of International Conference on Services Computing, 344-351.

Malkowski, S., Hedwig, M., & Pu, C. (2009). Experimental evaluation of n-tier systems: Observation and analysis of multi-bottlenecks. Proceedings of International Symposium on Workload Characterization, 118-127.

Russell, S. & Norvig, P. (2003). Artificial Intelligence: A Modern Approach (pp. 111-114). Prentice Hall.

Venugopal, S., Broberg, J., & Buyya, R. (2009). OpenPEX: An open provisioning and execution system for virtual machines. Proceedings of International Conference on Advanced Computing and Communications.

Voorsluys, W., Broberg, J., & Buyya, R. (2011). Introduction to cloud computing. In Buyya, R. (Ed.), Cloud Computing: Principles and Paradigms (pp. 3-41). Hoboken: Wiley.

Authors

Gueyoung Jung received his Ph.D. in Computer Science from Georgia Institute of Technology. His research background is in Distributed Computing, Autonomic Computing, and Services Computing. In particular, his research interests are in designing systems and analytical modeling to predict and optimize QoS in large-scale, data-intensive distributed systems. He has authored numerous papers in prestigious conferences, journals, and book chapters. Before starting his graduate program, he was a system developer for five years. Since he joined the Xerox Research Center in 2010, he has participated in cutting-edge research projects in broad areas. His project portfolio includes Optimal Cloud Deployment, Services Composition, and Big Data Analytics. He has filed more than 20 US patent applications.

Naveen Sharma directs the Computing and Information Services Lab (CISL) at the Xerox Research Center Webster. His technical leadership in distributed and cloud computing and data-intensive computing played a significant role in the formation of CISL. Naveen joined the Xerox Research Center in 1994. He has extensive experience in technology transfer and the delivery of product-ready code to Xerox business groups. Prior to joining Xerox, Naveen was a research assistant professor at the Institute of Computational Science, Kent State University in Ohio. Naveen has a doctorate in computer science. He holds more than 25 U.S. patents and has published more than 20 research papers.

Tridib Mukherjee is a research scientist at Xerox Research Center, India and works in the broad areas of Distributed Computing, Cloud Computing, Green Computing, Sensor Networks, and Services Computing. Tridib has been involved in cutting-edge research in these areas since joining the Xerox Innovation Group in 2011. He has published numerous articles in reputed journals, conferences, book chapters, and books. He has also filed many US patent applications. Tridib has served on the technical program committees of many top conferences. He has also co-presented tutorials and been involved in organizing workshops at prestigious conferences. Tridib received his Ph.D. in Computer Science from Arizona State University. Tridib was a finalist for the IBM Ph.D. fellowship award in 2007.

Frank Goetz is responsible for overseeing research, development, and transition of technology options targeted for our Information Technology business. His early years were in product application software development for Xerox production. As a member of the Research Center, Frank led a team working on advanced distributed systems architecture and technology for enterprise output management. He has contributed to architecture and technology for Xerox products. Frank received his BS degree in Electrical Engineering from the University of Illinois at Urbana-Champaign and his MS degree in Computer Science from the Rochester Institute of Technology.

Julien Bourdaillet is a research scientist at Xerox Research. His research interests include artificial intelligence, social media, cloud computing, and natural language processing. He has filed more than 10 patent applications and published more than 20 papers in prestigious conferences and journals. He received his PhD from University Pierre and Marie Curie, Paris, France. He held a post-doctoral position at the University of Montreal, Canada, and worked in several cutting-edge startups.


TAMING THE UNCERTAINTY OF PUBLIC CLOUDS

Maxim Schnjakin, Christoph Meinel

Hasso Plattner Institute, Potsdam University, Germany

Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam, Germany
{maxim.schnjakin, office-meinel}@hpi.uni-potsdam.de

Abstract

Public cloud storage services enable organizations to manage data with low operational expenses. However, the benefits come along with challenges and open issues such as security, reliability, performance unpredictability, and the risk of becoming dependent on a provider for its service. In our work, we presented a system that improves the availability, confidentiality, and reliability of data stored in the cloud. To achieve this objective, we encrypt the user’s data and make use of the RAID technology principle to manage data distribution across cloud storage providers.

Recently, we conducted a proof-of-concept experiment for our application to evaluate the performance and cost effectiveness of our approach. We deployed our application using eight commercial cloud storage repositories in different countries. We observed that our implementation improved the perceived availability and, in most cases, the overall performance when compared with individual cloud providers. We also observed a general trend that cloud storage providers have constant throughput values, whereby the individual throughput performance differs strongly from one provider to another. With this, the throughput observed in past transmissions can be utilized to increase the throughput performance of upcoming data transfers. The aim is to distribute the data across providers according to their capabilities, utilizing the maximum of the available throughput capacity. To assess the feasibility of this approach, we have to understand how providers handle high simultaneous data transfers. Thus, we put an additional focus on the performance and scalability evaluation of those cloud storage providers that are supported by our application.

1. INTRODUCTION

Cloud computing is a concept of utilizing computing as an on-demand service. It fosters operating and economic efficiencies and promises to cause an unanticipated change in business. Using computing resources in a pay-as-you-go model enables service users to convert fixed IT costs into variable costs based on actual consumption. Therefore, numerous authors argue for the benefits of cloud computing, focusing on its economic value [12], [6].

Despite the incontestable financial advantages, cloud computing raises questions about privacy, security, and reliability. Among available cloud offerings, storage services reveal an increasing level of market competition. According to a recent study by IDC, IT spending for public and private cloud storage will be over $20 billion by 2015 [16]. One reason is the ever-increasing amount of data, which is expected to outpace the growth of storage capacity. Currently, it is very difficult to estimate the actual future volume of data, but various estimates have been published. One review [15] states that the amount of digital information created and replicated will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes, between 2005 and 2020. In addition, the authors estimate that the amount of digital information will double every two years.

However, for a customer (service) to depend solely on one cloud storage provider (in the following, provider) has its limitations and risks. In general, vendors do not provide far-reaching security guarantees regarding data retention [17]. Users have to rely on the effectiveness and experience of vendors in dealing with security and intrusion detection systems. In the absence of such guarantees, service users are merely advised to encrypt sensitive content before storing it in the cloud. Placement of data in the cloud removes the physical control that a data owner has over the data, so there is a risk that the service provider might share corporate data with a marketing company or use the data in a way the client never intended. Further, customers of a particular cloud service might experience vendor lock-in: in the context of cloud computing, it is a risk for a customer to become dependent on a provider for its services. Common pricing schemes foresee charging for inbound and outbound transfers and requests in addition to hosting the actual data. Changes in features or pricing might motivate a switch from one storage service to another. However, because of data inertia, customers may not be free to select the optimal vendor due to the immense costs associated with switching from one provider to another. The obvious solution is to make the switching and data placement decisions at a finer granularity than all-or-nothing. This can be achieved by distributing corporate data among multiple storage providers. Such an approach is pursued by content delivery networks (for example, in [10], [11]) and implies significantly higher storage and bandwidth costs, without taking into account the security concerns regarding the retention of data.

A more economical approach, which is presented in this paper, is to separate data into unrecognizable slices that are distributed to providers, whereby only a subset of the nodes needs to be available in order to reconstruct the


original data. This is indeed very similar to what has been done for years at the level of file systems and disks. In our work, we use Redundant Array of Independent Disks (RAID)-like techniques to overcome the mentioned limitations of cloud storage in the following ways:

1) Security. The provider might be trustworthy, but malicious insiders represent a well-known security problem. This is a serious threat for critical data such as medical records, as cloud provider staff has physical access to the hosted data. We tackle the problem by encrypting and encoding the original data and then distributing the fragments transparently across multiple providers. This way, none of the storage vendors is in absolute possession of the client’s data. Moreover, the usage of enhanced erasure algorithms enables us to improve the storage efficiency and thus also to reduce the total costs of the solution.

2) Service availability. Management of computing resources as a service by a single company implies the risk of a single point of failure. Such a failure can result from many factors, such as financial difficulties (bankruptcy) or software or network failure. In July 2008, for instance, the Amazon storage service S3 was down for 8 hours because of a single-bit error [29]. Our solution addresses this issue by storing the data on several clouds, whereby no single entire copy of the data resides in one location, and only a subset of providers needs to be available in order to reconstruct the data.

3) Reliability. Any technology can fail. According to a study conducted by Kroll Ontrack1, 65 percent of businesses and other organizations have frequently lost data from a virtual environment, a number that is up by 140 percent from the previous year. Admittedly, no spectacular outages have been observed in recent times. Nevertheless, failures do occur. For example, in October 2009, Danger Inc., a subsidiary of Microsoft, lost the contracts, notes, photos, etc. of a large number of users of the Sidekick service [24]. We deal with the problem by using erasure algorithms to separate data into packages, thus enabling the application to retrieve data correctly even if some of the providers corrupt or lose the entrusted data.

4) Data lock-in. To date, there are no API standards for data import and export in cloud computing. This limits the portability of data and applications between providers. For the customer, this means that he cannot seamlessly move a service to another provider if he becomes dissatisfied with the current one. This could be the case if a vendor increases fees, goes out of business, or degrades the quality of the provided services. As stated above, our solution does not depend on a single service provider. The data is balanced among several providers, taking into account user expectations regarding the price and availability of the

1 http://www.krollontrack.com/resource-library/case-studies/ 

hosted content. Moreover, with erasure codes we store only a fraction of the total amount of data with each cloud provider. In this way, switching one provider for another costs merely a fraction of what it would otherwise.

In recent months, we conducted an extensive experiment with our application to evaluate the overall performance and cost effectiveness of the approach. In this paper, we present the results of this experimental study. We show that, with an appropriate coding configuration, Cloud-RAID is able to significantly improve the performance of the data transmission process, while the monetary costs remain competitive with the cost of using a single cloud. Further, based on the in-depth evaluation of the performance and resilience qualities of individual clouds and the results obtained, we propose possible strategies to improve the overall performance of the Cloud-RAID system.

2. CLOUD-RAID ARCHITECTURE

The ground of our approach is to find a balance between benefiting from the cloud’s pay-per-use nature and ensuring the security of the company’s data. As mentioned above, the basic idea is not to depend solely on one storage provider but to spread the data across multiple providers, using redundancy to tolerate possible failures. The approach is similar to a service-oriented version of RAID. While RAID manages sector redundancy dynamically across hard drives, our approach manages file distribution across cloud storage providers. RAID 5, for example, stripes data across an array of disks and maintains parity data that can be used to restore the data in the event of a disk failure. We carry the principle of the RAID technology over to cloud infrastructure. In order to achieve our goal, we foster the usage of erasure coding techniques (see chapter IV). This enables us to tolerate the loss of one or more storage providers without suffering any loss of content [30], [14]. The system has a number of core components that contain the logic and management layers required to encapsulate the functionality of different storage providers. Our architecture includes the following main components:

User Interface Module. The interface presents the user with a

cohesive view of his data and the available features. Here users can manage their data and specify requirements regarding data retention (quality-of-service parameters).

Resource Management Module. This system component is responsible for the intelligent deployment of data based on users’ requirements. The component is supported by:

– a registry and matching service: assigns storage repositories based on users requirements (for example physical location of the service, costs and performance expectations). Monitors the performance of participating providers and ensures that they are meeting the agreed SLAs


– a resource management service: takes operational decisions regarding the content storage

– a task scheduler service: has the ability to schedule the launch of operations at off-peak hours or after specified time intervals.

Data Management Module. This component handles data management on behalf of the resource management module and is mainly supported by:

– a data encoding service: responsible for the striping and encoding of the user's content

– a data distribution service: spreads the encoded data packages across multiple providers. Since each storage service is only accessible through a unique API, the service utilizes storage "service connectors", which provide an abstraction layer for the communication with storage repositories

– a security service: manages the security functionality based on a user's requirements (encryption, secret key management).

Interested readers will find more background information in our previous work [19], [25]. The next section gives a more detailed overview of the implementation of our system.

3. DESIGN

Any application needs a model of storage, a model of computation and a model of communication. In this section we describe how we achieve the goal of providing a consistent, unified view of the data management system to the end user. The web portal is developed using Grails, JNI and C technologies, with a MySQL back-end to store user accounts, current deployments, metadata, and the capabilities and pricing of cloud storage providers. Keeping the metadata locally ensures that no individual provider has access to the stored data. In this way, only users that are authorized to access the data will be granted access to the shares on (at least) k different clouds and will be able to reconstruct the data. Further, our implementation makes use of AES for symmetric encryption, SHA-1 and MD5 for cryptographic hashes, and an improved version of the Jerasure library [20] for the Cauchy-Reed-Solomon and Liberation erasure codes. Our system communicates with providers via "storage connectors", which are discussed further in this section.

A. SERVICE INTERFACE

The graphical user interface provides two major functionalities to an end user: data administration and the specification of requirements regarding the data storage. Interested readers are directed to our previous work [26], which gives a more detailed background on the identification of suitable cloud providers in our approach. In short, the user interface enables users to specify their requirements (regarding the placement and storage of the user's data) manually in the form of options, for example:

– budget-oriented content deployment (based on the price models of available providers)

– data placement based on quality of service parameters (for example availability, throughput or average response time)

– storage of data based on geographical regions of the user's choice. The restriction of data storage to specific geographic areas can be reasonable in the case of legal restrictions (e.g. European data protection law).

B. STORAGE REPOSITORIES

1) CLOUD STORAGE PROVIDERS: Cloud storage providers are modeled as storage entities that support six basic operations, shown in table I. We require storage services to support no more than these operations. Further, the individual providers are not trusted. This means that the entrusted data can be corrupted, deleted or leaked to unauthorized parties [18]. This fault model encompasses both malicious attacks on a provider and arbitrary data corruption like the Sidekick case (section I). The protocols require n = k + m storage clouds, at most m of which can be faulty. At present, our prototype implementation supports the following storage repositories: Amazon's S3 (in all available regions: US west and east coast, Ireland, Singapore and Tokyo), Box, Rackspace Cloud Files, Microsoft Azure Storage, HP Cloud Object Storage, Google Cloud Storage (EU and US) and Nirvanix SND. Further providers can easily be added.

2) SERVICE REPOSITORY: Until now, the capability descriptions of storage providers have been created semi-automatically based on an analysis of the corresponding SLAs, which are usually written in plain natural language [5]. The claims stated in SLAs need to be translated into WSLA statements and updated manually (interested readers will find more background information in our previous work [26]). Subsequently, the formalized information is imported into the database of a system component named service repository. The database tracks logistical details regarding the capabilities of storage services such as their actual pricing, offered SLAs, and physical locations. With this, the service repository represents a pool of available storage services.

Function                                Description
create(ContainerName)                   creates a container for a new user
write(ContainerName, ObjectName)        writes a data object to a user container
read(ContainerName, ObjectName)         reads the specified data object
list(ContainerName)                     lists all data objects of the container
delete(ContainerName, ObjectName)       removes the data object from the container
getDigest(ContainerName, ObjectName)    returns the hash value of the specified data object

TABLE I STORAGE CONNECTOR FUNCTIONS
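
For illustration, the connector abstraction of table I can be captured in a small interface. The following sketch is in Java (the portal itself is Grails/JVM-based); the names and signatures are our illustrative assumptions, not the system's actual API:

import java.util.List;

// Minimal sketch of the storage connector abstraction from table I.
// Signatures are assumptions for illustration; a real connector maps each
// call onto the proprietary API of one provider (S3, Azure, Cloud Files, ...).
public interface StorageConnector {
    void create(String containerName) throws Exception;                    // container for a new user
    void write(String containerName, String objectName, byte[] data) throws Exception;
    byte[] read(String containerName, String objectName) throws Exception;
    List<String> list(String containerName) throws Exception;              // all objects of the container
    void delete(String containerName, String objectName) throws Exception;
    String getDigest(String containerName, String objectName) throws Exception; // provider-computed hash
}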


3) MATCHING: The selection of storage services for the data distribution occurs based on the user preferences set in the user interface. After matching user requirements and provider capabilities, we use the reputation of the providers to produce the final list of potential providers to host parts of the user's data. A provider's reputation holds the details of its historical performance plus the ratings in the service registries, and is saved in a Reputation Object (introduced in our previous work [3], [2], [4]). By reading this object, we know a provider's reputation concerning each performance parameter (e.g. has high response time, low price). Note that each performance parameter is the median value of the last 10,000 performed requests, as the performance of providers' APIs may change over time. The values are only valid for the location of our infrastructure. To capture all changes based on a) time of usage, b) possible infrastructure updates performed by cloud service providers, and c) the location of infrastructure, every user is expected to perform a set of experiments to determine those factors. These experiments would consume some time and are a subject of future work and development of the application.

With the available information the system creates a prioritized list of repositories for each user. In general, the number of storage repositories needed to ensure data striping depends on a user’s cost expectations, availability and performance requirements. The total number of repositories is limited by the number of implemented storage connectors.

C. DATA MANAGEMENT

1) DATA MODEL: In compliance with [1] we mimic the data model of Amazon's S3 in the implementation of our encoding and distribution service. All data objects are stored in containers. A container can contain further containers. Each container represents a flat namespace containing keys associated with objects. An object can be of arbitrary size, up to 5 gigabytes (limited by the file sizes supported by cloud providers). Objects must be uploaded in their entirety; partial writes are not allowed, as opposed to partial reads. Our system establishes a set of n repositories for each data object of the user. These represent different cloud storage repositories (see figure 1).

2) ENCODING: Upon receiving a write request, the system splits the incoming object into k data fragments of equal size, called chunks. These k data packages hold the original data. In the next step the system adds m additional packages whose contents are calculated from the k chunks, whereby k and m are variable parameters [20]. This means that the act of encoding takes the contents of k data packages and encodes them onto m coding packages. In turn, the act of decoding takes some subset of the collection of n = k + m total chunks and from them recalculates the original data. Note that any subset of k chunks is sufficient to reconstruct the original object of size s [23]. The total size of all chunks (after encoding) can be expressed as $s \cdot (1 + \frac{m}{k})$. With this, the usage of erasure codes increases the total storage by a factor of m/k. In summary, the overall overhead depends on the file size and the m and k parameters defined for the erasure configuration. Figure 3 visualizes the performance of our application using different erasure configurations. Competitive storage providers claim SLAs ranging from 99.9% to 99.999% uptime for their services. Therefore choosing m = 1, to tolerate one provider outage or failure at a time, will be sufficient in the majority of cases. Thus, it makes sense to increase k and spread the packages across more providers to lower the overhead costs. The automated determination of appropriate m and k values remains a subject of future work.
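
To make the encoding step concrete, the following minimal sketch (in Java) implements the special case m = 1 as a simple XOR parity code. It is an illustrative stand-in for the Cauchy-Reed-Solomon codes actually used via Jerasure, not the system's implementation:

import java.util.Arrays;

// m = 1 erasure sketch: k data chunks plus one XOR parity chunk.
// Any k of the k+1 packages suffice to rebuild the data; storage overhead is 1/k.
public class ParityCodec {

    static byte[][] encode(byte[] data, int k) {
        int chunkSize = (data.length + k - 1) / k;              // last chunk is zero-padded
        byte[][] pkg = new byte[k + 1][chunkSize];
        for (int i = 0; i < data.length; i++)
            pkg[i / chunkSize][i % chunkSize] = data[i];
        for (int i = 0; i < k; i++)                             // parity = XOR of all data chunks
            for (int j = 0; j < chunkSize; j++)
                pkg[k][j] ^= pkg[i][j];
        return pkg;
    }

    // Rebuild one lost package (data or parity) by XOR-ing the k survivors.
    static void repair(byte[][] pkg, int lost) {
        Arrays.fill(pkg[lost], (byte) 0);
        for (int i = 0; i < pkg.length; i++)
            if (i != lost)
                for (int j = 0; j < pkg[lost].length; j++)
                    pkg[lost][j] ^= pkg[i][j];
    }
}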

In the next step, the distribution service makes sure that each encoded data package is sent to a different storage repository. In general, our system follows a model of one thread per provider per data unit, so that encryption, decryption, and provider accesses can be executed in parallel.

However, most erasure codes have further parameters, for example w, the word size2. In addition, further parameters are required for reassembling the data (original file size, hash value, coding parameters, and the erasure algorithm used). This metadata is stored in the MySQL back-end database after a successful write request.

2 The description of a code views each data package as having w bits worth of data.

Page 38:  · International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013 i IJCC Editorial Board Editors-in-Chief Hemant Jain, University of Wisconsin–Milwa

International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013

32

3) DATA DISTRIBUTION: Each storage service is integrated into the system by means of a storage-service-connector (in the following, service-connector). These provide an intermediate layer for the communication between the resource management service (see section III-D) and the storage repositories hosted by storage vendors. This enables us to hide the complexity of dealing with the proprietary APIs of each service provider. The basic connector functionality covers operations like the creation, deletion or renaming of files and folders that are usually supported by every storage provider (see table I). Such a service-connector must be implemented for each storage service, as each provider offers a unique interface to its repository. In some cases a higher overhead is needed to ensure the basic file management functionality. As we mentioned above, services differ in their usage. For example, until recently Amazon didn't offer rename support, and the only way to rename an object was to let the service connector upload the object under a new name and delete the old one. Unfortunately there is still no way to rename a bucket (container); the S3 service-connector therefore has to create a new bucket and copy the contents of the old bucket into it. As discussed earlier in this chapter, all accesses to the cloud storage providers can be executed in parallel (see algorithm 1). As erasure codes alone do not satisfy the confidentiality guarantee, we enable our users to encrypt data prior to transmission3 (the functionality of the security component is described in our previous work [19]). Therefore, following the encoding, the system performs an initial encryption of the data packages based on one of the predefined algorithms (this feature is optional).

3 In general, if no security options are specified the encoded data packages are stored in plain text.

Algorithm 1 The workflow of the data distribution process.
Require: coding parameters user.codingParams, list of prioritized storage providers user.providerList
1: f ← getDataObject()
2: p ← user.codingParams()
3: packages[] ← encode(f, p) // encodes n = k+m data packages from a file f according to the defined parameters and stores them in a file array
4: sk ← generateSecretKey()
5: for all parallel dataPackage : packages[] do
6:   fenc[] ← encrypt(dataPackage, sk)
7: end for parallel
8: pl ← user.providerList
9: for all parallel file : fenc[] do
10:   storageConnector[pl_i].write(file)
11:   if getDigest(file) = storageConnector[pl_i].getDigest(file) then
12:     transmissionLog(pl_i, file) ← true
13:   else
14:     transmissionLog(pl_i, file) ← false
15:   end if
16: end for parallel
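
A compact Java sketch of this workflow (reusing the hypothetical StorageConnector interface from section III-B; the encryption step and error handling are elided) may clarify the one-thread-per-provider model:

import java.security.MessageDigest;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative parallel distribution with digest verification, one thread per provider.
class Distributor {
    static boolean[] distribute(byte[][] packages, List<StorageConnector> providers,
                                String container) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(packages.length);
        boolean[] log = new boolean[packages.length];           // transmission log, one flag per provider
        Future<?>[] tasks = new Future<?>[packages.length];
        for (int i = 0; i < packages.length; i++) {
            final int p = i;
            tasks[p] = pool.submit(() -> {
                StorageConnector c = providers.get(p);
                c.write(container, "chunk-" + p, packages[p]);
                String local = hex(MessageDigest.getInstance("MD5").digest(packages[p]));
                log[p] = local.equalsIgnoreCase(c.getDigest(container, "chunk-" + p));
                return null;
            });
        }
        for (Future<?> t : tasks) t.get();                      // wait until all transfers finish
        pool.shutdown();
        return log;
    }

    private static String hex(byte[] digest) {
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }
}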

4) REASSEMBLING THE DATA: When the service receives a read request, the service component fetches k of the n chunks (according to the list of prioritized service providers, which can differ from the prioritized write list, as providers differ in upload and download throughput as well as in cost structure) and reassembles the data. This is due to the fact that in the pay-per-use cloud model it is not economical to read all chunks from all clouds. The service is therefore supported by a load balancer component that retrieves the data units from the most appropriate repositories (see below).

Figure 1 Data unit model at different abstraction levels. At the physical layer (local directory) each data unit has a name (original file name) and the encoded k+m data packages. At the second level, Cloud-RAID perceives data objects as generic data units in abstract clouds. Data objects are represented as data units with the according meta information (original file name, cryptographic hash value, size, used coding configuration parameters m and k, word size, etc.). The database table "Repository Assignment" holds the information about particular data packages and their (physical) location in the cloud. At the third level, data objects are represented as containers in the cloud. Cloud-RAID supports various cloud-specific constructions (buckets, tree nodes, containers, etc.).


Different policies for load balancing and data retrieval are conceivable, as parts of the user's data are distributed among multiple providers. A read request can be directed to a random data share or to the physically closest service (latency-optimal approach). Another possible approach is to fetch data from service providers that meet certain performance criteria (e.g. response time or throughput). Finally, there is a minimal-cost-aware policy, which directs user requests to the cheapest sources (cost-optimal approach). The latter strategy is implemented as the default configuration in our system. Other, more sophisticated features, such as a mix of several complex criteria (e.g. faults and overall performance history), are currently under development. In any case, the read optimization has been implemented to save time and costs.
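
As an illustration of the default cost-optimal policy, the following sketch (in Java; the names and the per-GB price field are assumptions, not the system's actual data model) picks the k cheapest of the n chunk locations:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Cost-optimal read planning: any k chunks suffice, so fetch from the k cheapest sources.
class ReadPlanner {
    record ChunkLocation(String provider, double costPerGbRead) {}

    static List<ChunkLocation> costOptimal(List<ChunkLocation> locations, int k) {
        List<ChunkLocation> sorted = new ArrayList<>(locations);
        sorted.sort(Comparator.comparingDouble(ChunkLocation::costPerGbRead));
        return sorted.subList(0, k);
    }
}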

D. RESOURCE MANAGEMENT SERVICE

This component tracks each user's actual deployment and is responsible for various housekeeping tasks:

1) The service is equipped with a MySQL back-end database to store crucial information needed for the deployment and reassembly of users' data.

2) Further, it audits and tracks the performance of the participating providers and ensures that all current deployments meet the corresponding requirements specified by the user.

3) The management component is also responsible for the scheduling of non-time-critical tasks. This primarily concerns the deployment of content. Some providers may offer discounts for large volumes and lower bandwidth rates for off-peak hours (as is the case with unused computing capacity, for example Amazon's Spot instances4). In our approach we plan ahead and take advantage of these possible discounts to optimize the overall costs of data hosting. It also seems reasonable to suppose that a user might be interested in shifting the up- and download of content to his own off-peak hours (on weekends or at night). In this case, the management service would delegate the selected workloads to a system component named task scheduler.

4 http://aws.amazon.com/en/ec2/spot-instances/

4. EVALUATION

In this section we present an evaluation of our system that aims to clarify the main questions concerning the cost, performance and availability aspects when erasure codes are used to store data on public clouds.

A. METHODOLOGY

The experiment was run at the Hasso Plattner Institute (HPI), located close to Berlin, Germany, over a period of 377 hours (24x7) in the middle of February 2013. As the experiment covers all seven days of the week, localized peak times (time-of-day) are experienced in each geographical region. HPI has high-speed connectivity to an Internet backbone (1 Gb), which ensures that our test system is not a bottleneck during the testing. The global testbed spans eight cloud providers in five countries on three continents. The experiment comprises three rounds, with each round consisting of a set of predefined test configurations (in the following, sequences). Table II provides a summary of the conducted experiment. We used test files of different sizes from 100 kB up to 1 GB, deployed by the dedicated test clients.

Prior to each test round the client requires a persistent connection to the APIs of the relevant cloud storage providers, so that requests for an upload or download of test data can be sent. In general, providers will refuse a call for the establishment of a new connection after several back-to-back requests. Therefore we implemented an API connection holder: after two hours of an active connection, the old connection is replaced by a new one. Further, we set a timeout of one second between two unsuccessful requests; each client waits for a think time before the next request is generated.

1) MACHINES FOR EXPERIMENTATION: We employed three machines for experimentation. None is exceptionally high-end, but each represents a middle-range commodity machine, which should be able to encode, encrypt, decrypt and decode comfortably within the I/O speed limits of the fastest disks. These are Windows 7 Enterprise (64-bit) systems with an Intel Core 2 Duo E8400 @ 3 GHz, 4 GB of installed RAM and a 160 GB SATA Seagate Barracuda hard drive at 7,200 rpm.

FIGURE 2 WORKFLOW OF THE EXPERIMENT


Category                                    Description
Cloud storage providers                     8
Locations                                   Europe, USA, Asia
Total experiment time                       about 15d 9h (377h)
Total number of test rounds                 3
Number of requests (read/write) per round   281,900
Service timeout for each request            1 sec
Test file size                              100 kB – 100 MB
Coding method                               cauchy_good
Coding configuration [k,m]                  k = [2..4, 6, 10], m = [1..2], k >= m

TABLE II EXPERIMENT DETAILS

2) EXPERIMENT SETUP: Figure 2 presents the workflow of the experiment. In general we use two machines to transfer test data to cloud storage providers. The first machine (the upper part of the graph) uses erasure codes. This means that upon receiving a write request, the test system encodes the incoming object into n = k + m chunks (see III-C2). Again, the reconstruction of the original data requires any subset of k shares [23]. In the next step, each chunk is sent to a different storage repository, whereby all requests are executed in parallel.

The second machine (the lower part of the graph in figure 2) uploads the entire data object to a single provider without any modifications. As we are interested in a direct comparison between these two approaches, we want both data transmissions to start simultaneously. Therefore we used the third machine as a "sync instance" running a Tomcat 7 server with a self-written sync servlet which controls the workflow of the experiment.

3) ERASURE CONFIGURATION: In our experiment we make use of the Cauchy-Reed-Solomon algorithm for two reasons. First, according to Plank et al. [22] the algorithm has good performance characteristics in comparison to existing codes. In their work, the authors performed a head-to-head comparison of numerous open-source implementations of various coding techniques which are available to the public. Second, the algorithm allows free selection of the coding parameters k and m, whereas other algorithms restrict the choice of parameters. The Liberation code [21], for example, is a specification for storage systems with n = k + 2 nodes that tolerate the failure of any two nodes (the parameter m is fixed and equal to two).

In our test scenario we tested more than 2520 combinations of k and m. We denote them by [k,m] in the course of the paper, whereby the present evaluation focuses on the encoding configuration [4,1]. This means that the setting preserves data availability under one cloud failure at the time of a read or write request. The expected availability of a configuration, i.e. the probability that at least k of the n = k + m providers (each with an average monthly up-time percentage $A_p$) are reachable, can be calculated using the following formula:

$$A = \sum_{i=k}^{n} \binom{n}{i} A_p^{\,i} (1 - A_p)^{\,n-i}$$

With this, the configuration [4,1] results in a monthly up-time percentage of 99.999%, or a tolerated failure of 25.868 seconds per month (under the assumption that the average monthly up-time percentage $A_p$ is 99.9%).
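
As a cross-check, the formula can be evaluated directly; a short Java sketch for the [4,1] configuration (n = 5, k = 4, A_p = 0.999, assuming a 30-day month) reproduces the figures above:

// Evaluates A = sum_{i=k}^{n} C(n,i) * Ap^i * (1-Ap)^(n-i) for the [4,1] setting.
public class AvailabilityCheck {
    static double binom(int n, int i) {
        double r = 1;
        for (int j = 1; j <= i; j++) r = r * (n - i + j) / j;
        return r;
    }

    public static void main(String[] args) {
        int n = 5, k = 4;
        double ap = 0.999, a = 0;
        for (int i = k; i <= n; i++)
            a += binom(n, i) * Math.pow(ap, i) * Math.pow(1 - ap, n - i);
        double downtime = (1 - a) * 30 * 24 * 3600;             // tolerated seconds per 30-day month
        System.out.printf("A = %.5f%%, downtime = %.3f s/month%n", a * 100, downtime);
        // prints A = 99.99900%, downtime = 25.868 s/month
    }
}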

B. SCHEMES AND METRICS

The goal of our test is to evaluate the performance of our approach. Mainly we are interested in the effective availability of APIs, the overhead caused by erasure codes, and transmission rates. Therefore, we implemented a simple logger application to record the results of our measurements. In total we log 34 different events. For example, each state of the workflow depicted in figure 2 is captured with two log entries (START and END).

1) ERASURE OVERHEAD: Due to the nature of erasure codes, each file upload and download is associated with a certain overhead. On the one hand this overhead is caused by the redundant m packages, which have to be stored, uploaded and sometimes downloaded (in the event of failures). As stated in III-C2, the usage of erasure codes increases the total storage by a factor of m/k. On the other hand, we need to encode data prior to its upload and decode the downloaded packages back into the original file. Both operations cause additional computational expense.

2) TRANSMISSION PERFORMANCE AND THROUGHPUT: We measure the throughput obtained from each read and write request. In general, throughput is defined as the average rate of successful message delivery over a communication channel. In our work we link the success of the message delivery to the success of the delivery of the entire data object. In our approach, a data object is completely transferred when the last data package has been successfully transferred to the transfer destination. This means that in the case of a data upload, the transfer is only completed when (upon a write request) our client receives confirmation messages in the form of individual digest values that correspond with the results of the local computation (this applies to all transferred chunks). In the event of a mismatch, the system deletes the corrupted data and initiates a re-upload procedure. With this, the throughput value does not represent only the pure upload or download rate of the particular providers, as the measured time span also includes possible failures, latency and the bilateral processing of get-hash calls.
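
A minimal sketch of this metric (again in Java, on top of the hypothetical StorageConnector sketch): the clock covers the upload plus the digest confirmation, and a mismatch triggers a re-upload, so failures count against the measured rate:

// Effective write throughput as defined above; simplified (no retry limit).
class ThroughputMeter {
    static double writeThroughput(StorageConnector c, String container, String name,
                                  byte[] data, String localDigest) throws Exception {
        long start = System.nanoTime();
        do {
            c.write(container, name, data);                     // (re-)upload the data object
        } while (!localDigest.equalsIgnoreCase(c.getDigest(container, name)));
        double seconds = (System.nanoTime() - start) / 1e9;
        return data.length / seconds;                           // bytes per second
    }
}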

C. EMPIRICAL RESULTS

This section presents the results in terms of read and write performance, as well as throughput, response time and availability, based on over 281,000 requests. Due to space constraints, we present only selected results from the conducted experiment.


1) ERASURE OVERHEAD: As described in IV-B1, erasure coding leads to a storage overhead of factor m/k. For instance, a [k = 4, m = 1] encoding results in a storage overhead of 1/4 * 100% = 25%. In order to reduce the storage overhead, it is advisable to define high k and preferably low m values. For example, an encoding configuration [k = 10, m = 1] produces a storage overhead of only 1/10 * 100% = 10%. Erasure coding also causes a computational overhead. During the experiment we scrutinized 12 different configurations. A selection of the results is presented in figure 3. The figure illustrates that the computational expense increases with the file size regardless of the erasure configuration. As the encoding of a 100 MB data object takes approximately one second, the encoding overhead can be neglected in view of the significantly higher transmission times. In [19] we showed that the average performance overhead caused by data encoding is less than 1% of the entire data transfer process to a cloud provider.

With encryption enabled, the total performance decreases, as individual data packages have to be encrypted locally before moving them to the cloud. In our experiments the cost of encryption was less than 3% of the total time, which is also negligible in view of the overall transmission performance. This point has been addressed in our previous work [19] and [27].

Figure 3 The computational overhead caused by erasure coding with different configurations and file sizes. The overall overhead increases with growing file size regardless of the m and k parameters defined for the erasure configuration. In general, the encoding step requires no more than 0.5% of the entire data transmission process.

2) TRANSMISSION PERFORMANCE AND THROUGHPUT: Due to space constraints the current evaluation focuses on the Cloud-RAID configuration with k = 4 and m = 1. For performance comparison we experimented with different combinations among nine clouds: Amazon US, Amazon EU, Azure, Box, HP, Google EU, Google US, Nirvanix and Rackspace. The particular combinations are presented in table III.

In general, we observed that utilizing Cloud-RAID for data transfer improves the throughput significantly compared with individual cloud storages. This can be explained by the fact that Cloud-RAID reads and writes fractions of the original data (more specifically quarters, with the [4,1] setting, see IV-B1) from and to several clouds simultaneously. However, the total time of a data transfer depends on the throughput performance of each provider involved in the communication process. The throughput performance of Cloud-RAID increases with higher performance values of the cloud providers involved in the data distribution setting. During the performance evaluation we observed that storage providers differ extremely in their upload and download capabilities. Moreover, some vendors seem to have optimized their infrastructure for large files, while others have focused more on smaller data objects. In the following we clarify this point.

As we mentioned above, there is a striking difference in the up- and download capabilities of cloud services. With the exception of Microsoft Azure, all tested providers are much faster in download than in upload. This applies to smaller and larger data objects alike. At one extreme, with the Google EU or Google US services a write request for a 100 kB file takes up to 19 times longer than a read request (see figures 4a and 4d). This behavior can also be observed with larger data objects (although less pronounced). Here the difference in the throughput rate may range from 4 to 5 times, with the exception of the provider Rackspace, where the execution of a write request is up to 49 times slower than that of a read request (e.g. the upload of a 100 MB file takes on average 17.3 minutes, whereas the download of the same file is performed in less than 21 seconds, see figures 5b and 5d). Then again, the Google US service clearly improves its performance with the growing size of data objects (see figures 4a and 5a). The explanation could be that with larger files the relatively long reaction time of the service (due to the long distance between our test system and the service node) has less impact on the measuring results. Similar to the US service, Google EU performs rather modestly in comparison to other providers when it comes to read speeds for data objects up to 1 MB (see figures 4a and 4b). In terms of performance for writing larger files, Google EU becomes the clear leader and even outperforms the fastest Cloud-RAID setting, which consists of the five fastest providers: Amazon EU, Azure, Google EU, Google US and Nirvanix (see figure 5b). Similar phenomena have been observed for read requests. Microsoft Azure belongs to the leading providers for reading 100 kB data objects (see figure 4d) but falls back when reading 100 MB files (see figure 5d).

Hence, the performance of Cloud-RAID differs depending on the provider setting and file size. We observe that our system achieves better throughput values for read requests. The reason is that the test client fetches less data from the cloud (only k of n chunks) than in the case of a write request, where all n packages have to be moved to the cloud.

As expected, we observe that the fastest read and write settings consist of the fastest clouds. Concerning the writing of 100 kB data objects, the fastest Cloud-RAID setting CR-A improves the overall throughput by an average factor of 3 (compared to the average throughput performance of the providers in the respective Cloud-RAID setting). For reading 100 kB, CR-E achieves an improvement factor of 5.


In terms of performance for writing 1 MB and 10 MB objects, the Cloud-RAID settings CR-D and CR-E already achieve an average improvement factor of 7. Then again, for reading 10 MB, Cloud-RAID improves the average performance by a factor of 13 and even outperforms the fastest cloud providers (see figure 5c). For smaller data objects, the execution of both read and write requests is highly affected by erasure overhead, DNS lookup and API connection establishment time. This can lead to unusual behavior. For example, the transmission of a 100 kB data object to Google US can take our system more time than the transmission of a 500 kB or even a 1 MB file (see figures 4a, 4b and 4c). Hence, increasing the size of data objects improves the overall throughput of Cloud-RAID. Concerning read and write speeds for 100 MB data objects, Cloud-RAID increases the average performance by a factor of 36 for writes (despite the erasure overhead of 25 percent) and achieves an improvement factor of 55 for reads (see figures 5c and 5d).

There is also an observable connection between the throughput rate and the size of data objects. Charts 4a to 4f show results from performance tests on smaller files (up to 1 MB). Microsoft Azure and Amazon EU achieve the best results in terms of write requests. When writing 10 MB or 100 MB data objects, Amazon EU falls back to fourth place (see figures 5b and 5d). From these observations, we come to the following conclusions. The overall performance of Cloud-RAID is not only dependent on the selection of the k and m values, but also on the throughput performance of the particular storage providers. Cloud-RAID increases the overall transmission performance compared to the slower providers. Beyond that, we are able to estimate that the more providers are involved in the data distribution process, the less weight slower providers carry in terms of overall throughput performance. The underlying reason is again the size of the individual data units, which decreases with a growing number of k data packages (see chapter IV-B1).

Cloud-RAID   Provider Setting
CR-A         Amazon EU, Amazon US, Azure, Nirvanix, Rackspace
CR-B         Amazon EU, Amazon US, Azure, Google EU, Rackspace
CR-C         Amazon US, Azure, Google EU, Nirvanix, Rackspace
CR-D         Amazon EU, Amazon US, Azure, Google EU, Nirvanix
CR-E         Amazon EU, Azure, Google EU, Google US, Nirvanix
CR-F         Amazon EU, Google EU, Google US, Nirvanix, Rackspace
CR-G         Amazon EU, Amazon US, Azure, Google EU, Google US
CR-H         Amazon EU, Amazon US, Google EU, Google US, Nirvanix
CR-I         Amazon EU, Azure, Google EU, Google US, Rackspace
CR-K         Amazon EU, BoxNet, Google EU, Google US, Nirvanix
CR-L         Amazon EU, Amazon US, BoxNet, Google EU, Google US
CR-M         Amazon EU, Amazon US, Azure, BoxNet, Google EU

TABLE III CLOUD-RAID WITH K=4 AND M=1

D. OBSERVATIONS AND ECONOMIC CONSEQUENCES

Finally, based on the measured observations, we determine the users' benefits from using our system. In order to assess the feasibility of our application we have to examine the cost structure of cloud storage services.

Figure 4 Average throughput performance in milliseconds and seconds observed on all reads and writes executed for the [4,1] Cloud-RAID configuration (4 of 5 chunks are required to reconstruct the original data, m = 1). The Cloud-RAID bars (CR) correspond to the complete data processing cycle: the encoding of a data object into data packages and the subsequent transmission of individual chunks in parallel threads.


Vendors differ in pricing schemes and performance characteristics. Some providers charge a flat monthly fee, others negotiate contracts with individual clients. In general, however, pricing depends on the amount of data stored and the bandwidth consumed in transfers; higher consumption results in increased costs. As illustrated in tables IV and V, providers also charge per API request (such as read, write, get-hash, list, etc.) in addition to bandwidth and storage. The usage of erasure codes increases the total number of such requests, as we divide each data object into chunks and stripe them over multiple cloud vendors. The upload and download of data take on average two requests each. Considering this, our system needs (4+1)*2 = 10 requests for a single data upload with a [4,1] coding configuration. A download requires only 4*2 = 8 requests, as merely 4 packages have to be received to rebuild the original data. Thus, erasure coding [k,m] increases the number of requests by a factor of k + m for uploads and k for downloads. Consequently, the usage of erasure codes increases the total cost compared to a direct upload or download of data, due to the resulting storage and API request overhead. Tables IV and V summarize the costs in US dollars of executing 10,000 reads and 10,000 writes with our system, considering five data unit sizes: 100 kB, 500 kB, 1 MB, 10 MB and 100 MB. We observe that the usage of erasure coding is not significantly more expensive than using a single provider. In some cases the costs can even be reduced.
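
The request arithmetic above can be stated as a small helper; the factor of two per transfer is the average noted above, and the per-request prices one would multiply in are provider-specific:

// Request-count overhead of a [k,m] coding configuration (about two API calls per package).
public class RequestModel {
    static int uploadRequests(int k, int m) { return (k + m) * 2; }  // all n = k+m packages are written
    static int downloadRequests(int k)      { return k * 2; }        // only k packages are read

    public static void main(String[] args) {
        System.out.println(uploadRequests(4, 1));   // 10 requests per upload with [4,1]
        System.out.println(downloadRequests(4));    // 8 requests per download with [4,1]
    }
}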

Provider         File size in kB
                 100      500      1024     10240    102400
CR-B             0.15     0.55     1.07     10.21    101.61
CR-G             0.16     0.52     0.99     9.28     92.25
CR-I             0.15     0.55     1.07     10.21    101.61
CR [6,1]5        3.61     4.12     4.78     16.50    133.69
Azure            0.11     0.53     1.08     10.74    117.42
Amazon/Google    0.13     0.59     1.19     11.74    117.21
Rackspace        0.17     0.86     1.76     17.58    175.78
Nirvanix         4.14     4.72     5.46     18.65    150.48

TABLE IV COSTS IN $ FOR 10,000 READS

5 The setting CR [6,1] consists of nearly all providers involved in the test setting: Amazon EU, Amazon US, Azure, Boxnet, Google EU, Nirvanix, Rackspace.

Figure 5 Throughput observed in seconds on reads and writes executed for the [4,1] Cloud-RAID configuration. Here again, CR bars correspond to the complete data processing cycle.


Provider         File size in kB
                 100      500      1024     10240    102400
CR-B             0.12     0.12     0.12     0.12     0.12
CR-G             0.16     0.16     0.16     0.16     0.16
CR-I             0.12     0.12     0.12     0.12     0.12
CR [6,1]         8.14     8.20     8.29     9.75     24.40
Azure            0.00     0.00     0.00     0.00     0.00
Amazon/Google    0.02     0.02     0.02     0.02     0.02
Rackspace        0.00     0.00     0.00     0.00     0.00
Nirvanix         4.10     4.48     4.98     13.77    101.66

TABLE V COSTS IN $ FOR 10,000 WRITES

5. PERFORMANCE OPTIMIZATION

As stated in chapter IV, the involvement of providers with different throughput and response time capabilities can influence the overall performance of the Cloud-RAID application in a negative way. Once again, this can be attributed to the fact that in our approach the transmission of an individual data object depends on the capabilities (e.g. throughput or response time) of all providers involved in the data distribution process. With Cloud-RAID, a data object is completely transferred when the last data package has been successfully transferred to its destination (see figure 6a).

One possible solution to improve the overall transmission performance would be to stripe the original data (after the encoding step) into slices and to distribute the load across providers according to their capabilities. The aim would be to ensure that all transmissions end at around the same time; more specifically, the durations of the transmissions should be approximately equal. Figure 6b illustrates the proposed approach and compares it with the current implementation. The optimization is only applicable under the assumption that the transmission performance can be increased through simultaneous transfers. Applying this method, we would be able to increase the overall transmission performance by utilizing the maximum available throughput capacity of the participating providers.

Network traffic determines the speed of data movements; thus we cannot precisely predict the speed at the time of uploading or downloading data. However, in our tests we noticed that most providers have more or less constant throughput values. With this, past transmissions can be utilized to estimate the size of the individual data packages for upcoming data transfers.
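
A sketch of this estimation (in Java; the throughput figures are illustrative, not measurements from the paper) sizes each provider's slice proportionally to its previously observed throughput, so that all transfers are expected to finish at roughly the same time:

// Slice a package of `total` bytes across providers proportionally to their
// measured throughputs (bytes/s), equalizing the expected transfer durations.
public class SlicePlanner {
    static long[] sliceSizes(long total, double[] throughput) {
        double sum = 0;
        for (double t : throughput) sum += t;
        long[] sizes = new long[throughput.length];
        long assigned = 0;
        for (int i = 0; i < throughput.length; i++) {
            sizes[i] = (long) (total * throughput[i] / sum);
            assigned += sizes[i];
        }
        sizes[sizes.length - 1] += total - assigned;            // rounding remainder to the last slice
        return sizes;
    }

    public static void main(String[] args) {
        // e.g. three providers measured at 10, 5 and 1 MB/s sharing a 100 MB object
        for (long s : sliceSizes(100L << 20, new double[] {10e6, 5e6, 1e6}))
            System.out.println(s);                              // 62.5%, 31.25%, 6.25% of the object
    }
}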

As mentioned above, our system executes all API requests in parallel. Therefore, to be able to make assumptions about the feasibility of the approach we have to clarify three questions: First, can the transmission performance be increased through simultaneous write accesses to particular cloud providers? Second, does the performance of providers remain constant during the data transmission process (or does it degrade over time)? And third, how quickly can we interact with the APIs of the cloud storage services? To answer these questions we conducted a new set of experiments focusing on the performance abilities of individual clouds in terms of response time and resilience. The latter helps us understand how individual providers handle high object counts of different sizes.

A. EXPERIMENT SETUP

To measure the performance and resilience of the selected clouds we conducted two further experiments (utilizing the same infrastructure as in our prior experiments presented in chapter IV).

The first part of the experiment clarifies whether the transmission performance can be increased by dividing data objects into chunks and uploading them simultaneously to cloud providers.

FIGURE 6 OPTIMIZATION OF CLOUD-RAID.


To make assumptions about the reliability of the services, we let our test clients infrequently read 10% of the transferred data back from the provider (not the most recent objects) and compare the hash values against the expected values. The result of each run is the time elapsed between the execution of the first write request and the last write request.

With the second test we aimed to find out whether simultaneous uploads influence each other, and to what extent. In each sequence our test client generates n data objects of a fixed size and then transfers them simultaneously to a cloud provider. In this part of the experiment we are interested in the average duration of the data transfer operations (reads and writes). Then again, to check the integrity of the transferred data, the test client reads 10% of randomly selected data back from the cloud and compares it against the expected values.

Figure 7 presents the workflow of the second experiment. All transferred data objects are only deleted after the completion of the experiment, as the aim is to observe the performance of providers while the repositories fill up.

FIGURE 7 PROCESS WORKFLOW OF THE "STRESS TEST" EXPERIMENT (BPMN DIAGRAM)

B. SCHEMES AND METRICS

As mentioned above, we intend to evaluate the performance of the cloud storage providers which are currently supported by our application. More specifically, we want to observe the behavior of cloud providers when it comes to the parallel transmission of high data counts. In this context we are also interested in the response times and resilience properties of the APIs. Therefore, as in the first experiment, we capture each state of the workflow depicted in figure 7 with two log entries (START and END).

1) Response Time: In general, the measured response time (latency) depends on network proximity, congestion on the network path, and the traffic load on the target server. We define response time as the time delay a storage provider needs to react to an API call. More precisely, we want to measure the time interval between starting the API call to download a file and the receipt of its first byte. To this end we contacted the respective providers with a request for instructions on how to perform a correct measurement. We received nearly identical answers: we were recommended to send an API call and track the time span until the network adapter receives the first bits. The implementation of the recommended procedure seemed too complicated. Therefore we looked for a simpler approach to measure something that would reflect the workload and the response time of a provider. We concluded that we could use the getHash method, for two reasons: first, based on our observations (see below), the hash value is computed only once, after the data has been received by the storage provider. Secondly, the size of the data packet with the requested information is small enough to be ignored. Note that the procedure does not exclude the amount of time a storage provider spends processing the request. Nevertheless, as each provider is expected to process such requests in the same way, we can presume that the approach described above reflects the response time of providers with sufficient precision.
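
The probe itself reduces to timing a single getDigest round trip; a minimal sketch on top of the hypothetical StorageConnector interface:

// getHash-based response time probe: the reply is tiny, so the elapsed time
// approximates the provider's reaction time to an API call.
class LatencyProbe {
    static long responseTimeMillis(StorageConnector c, String container, String object)
            throws Exception {
        long start = System.nanoTime();
        c.getDigest(container, object);
        return (System.nanoTime() - start) / 1_000_000;
    }
}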

2) Resilience: In general, resilience is defined as the ability of a system (network, service, infrastructure, etc.) to provide and maintain an acceptable level of service in the face of various faults and challenges to normal operation. In the context of our experiment we assess the resilience of a provider based on the following:

1) The constancy of performance during a series of simultaneous transfers; and

2) The reliability of the provider over a long period of sustained operation rates.

For our test we executed a series of simultaneous write requests with data objects of various sizes (1 MB, 10 MB, 100 MB and 1 GB). We decided to start with the 1 MB file size due to our observations from the previous experiment (chapter IV): with Cloud-RAID, the transmission of smaller data objects (e.g. 64 kB, 100 kB, 500 kB) takes almost the same time as the transmission of a 1 MB file. The underlying reason is that the execution of both read and write requests is dominated by erasure overhead, DNS lookup and API connection establishment time.

3) Availability: Usually, availability is defined as the fraction of time a system is operational, $A = \frac{t_{up}}{t_{up} + t_{down}}$. Applied to cloud storage services, we define the perceived availability of a provider as the ratio of completed to issued requests, $A_p = \frac{r_{completed}}{r_{issued}}$. This definition of availability can be found in the SLAs of most storage providers. Indeed, some vendors use self-defined metrics for the calculation of the availability of their services. Rackspace, for example, perceives its network to be down if user requests fail during two or more consecutive 90-second intervals6. At the same time, Google defines downtime as when

6 http://www.rackspace.com/cloud/legal/sla/  

Page 46:  · International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013 i IJCC Editorial Board Editors-in-Chief Hemant Jain, University of Wisconsin–Milwa

International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013

40

more than 5% of requests fail in a certain time interval7. The latter availability metrics allow a higher margin for failures.

C. EMPIRICAL RESULTS

This section presents the results in terms of response time and resilience based on over one million requests. Due to space constraints, we present only selected results from the conducted experiment.

1) Response Time: In order to observe the behavior of the participating storage providers, we uploaded data units of various sizes (100 kB, 1 MB, 10 MB and 100 MB) to each provider. Each transferred object has a unique hash value, regardless of file size. After that, we performed a series of randomized download requests and measured the time span between the executed calls and the received responses. In addition, we measured the time our system needed to calculate the hash value of the data packages locally.

Provider      File size (kB)   API call (msec)   Local hash calculation (msec)
Amazon [EU]   100              485               0
              1024             417               4
              10240            463               54
              102400           512               547
Amazon [US]   100              1326              0
              1024             1069              5
              10240            1280              54
              102400           1390              550
Azure         100              28                0
              1024             36                4
              10240            26                54
              102400           26                547
Box           100              308               1
              1024             296               12
              10240            291               124
              102400           317               1258
Google        100              85                0
              1024             71                5
              10240            60                54
              102400           63                548
Nirvanix      100              380               0
              1024             393               4
              10240            390               54
              102400           408               547
Rackspace     100              171               0
              1024             152               5
              10240            169               54
              102400           194               548

TABLE VI THE COMPARISON OF HASH VALUE CALCULATIONS

The results of the experiment are presented in table VI. A few observations can be taken from the table:

7 https://developers.google.com/storage/docs/sla 

1) Box uses a different hash method; therefore it takes our system nearly twice as much time to calculate the hash values;

2) The measured time span is not affected by the size of a data unit;

3) API calls for receiving the hash values of larger data units (100 MB and above) are faster than their on-site calculation;

4) From 2) and 3) we conclude that each provider stores the information in a meta-object after computing the hash value of the received data unit.

Following this, we conclude that on any getHash API call the requested information is extracted from the meta-object and transferred to the caller.

Further, we observed a general trend that our test clients experienced consistent and constant response times in most cases, whereby the individual latency values differ extremely from provider to provider. After a first analysis we divided the providers into three clusters relative to our test location:

• Fast (response time < 200 ms);
• Medium (response time varies between 200 and 1500 ms); and
• Slow (response time > 1500 ms).

FIGURE 9 AVERAGE RESPONSE TIME OF THE PROVIDERS AMAZON AND HP OVER A PERIOD OF TWENTY DAYS

Figures 8 and 9 show how quickly providers react to a getHash request. At several time instances during the experiment we observed increased response times, which can be attributed to sudden increases in request traffic on the target server nodes (for example in the case of the Google-US service in figure 8a). Overall, the best and most consistent provider for our infrastructure location is Azure. The average time needed by the service to react to a request is about 50 milliseconds (see figure 8a). We measured the slowest reaction time for the Amazon-US service (see figure 9). The reason is obvious and refers to the large distance between the location of our test clients (in Germany) and the destination server. With a deviation of up to 100


milliseconds, the providers Google and Amazon do not provide results as constant as those of other providers. It would be speculative to explain the observed behavior. One reason could be that both Amazon and Google run more cloud services on the same nodes than other providers, which would result in additional traffic load on the servers. Then again, the behavior might also be related to the usage of different consistency models, which is the subject of our further analysis.

The empirical results summarized in this section are based on continuous monitoring over the course of more than 20 days. Overall, the measured values appear sufficiently comprehensive to effectively predict the response time of upcoming requests. Hence, the information can be used for intelligent data placement within the Cloud-RAID application.

2) Resilience: For the sake of brevity, we present only selected results from the conducted experiment. In the first instance, the performance comparison focuses on the upload performance of the six clouds presented earlier (Amazon, Box, Google, Nirvanix, HP and Rackspace). The evaluation of the download performance showed similar results.

The first part of the experiment can be briefly summarized as follows: a data unit F of size s(F) was split into multiple chunks of equal size (starting with 5 and ending with 100 segments, in intervals of 5) and transferred to a cloud. At the same time, we measured the time span between the first and the last read or write request within each segmentation interval. Here again, our system tries to execute all API requests in parallel.

Before the experimentation, we assumed that the transmission performance increases with simultaneous chunk transfers.

Figure 8 Average response times of cloud storage providers over a period of twenty days. The representation uses the trend line feature of Excel to calculate a polynomial trend line (grade 6), which "straightens" the values. In general, we attach less importance to outliers in our experimentation.

Table VII THE TABLE CAPTURES THE RESULTS OF THE EXPERIMENT. DATA OBJECTS WERE SPLIT INTO MULTIPLE CHUNKS OF EQUAL SIZE AND TRANSFERRED TO CLOUD STORAGE PROVIDERS.


Hence, the interpretation of the results of the conducted experiment focuses on the following values:

– the number of chunks at which the increase in performance stops; we denote this value as the (maximum) segmentation level x;

– the size of the chunks at the segmentation level with the best performance value, which we denote as s(Fx) (= s(F)/x);

– the average transmission performance of all chunks at the segmentation level x, which we denote as P(Fx);

– the transmission performance of chunks of size s(Fx) in the case of a native data transfer, denoted as P(s(Fx)). This means that an individual chunk is transferred as a single file with one single thread (the segmentation level equals one);

– the relation between the transmission performance of a native data transfer P(F) and P(Fx), expressed as an improvement factor;

– the relation between the transmission performance of a native transfer of a single chunk P(s(Fx)) and P(Fx).

Table VII captures the results of the experiment. The evaluation confirmed the assumption that the behavior of providers differs when it comes to simultaneous transfers of a large number of data objects. In general, an increase in segmentation level goes hand in hand with an increase in the transmission performance, at least to a certain extent. As we have discussed earlier, vendors have optimized their infrastructure for particular file sizes. Hence, the improvement factor depends on the file size and varies from one provider to another. More specifically, relatively slow providers with APIs optimized for transmissions of smaller data objects achieve significantly better performance with a higher segmentation rate, as the size of individual chunks decreases with an increasing number of segments. For example, a native transmission of a 100 MB data object to Nirvanix takes about 82.13 seconds (see figure 12e). The transmission of the same object in 75 segments (of 1.33 MB each) takes nearly 8.9 seconds, which improves the transmission rate by a factor of 9.2. In terms of performance when writing a 10 MB file, Nirvanix achieves an improvement factor of only 5.46 (see table VII). The HP Cloud Storage service shows similar behavior: a native transfer of a 100 MB file takes the service about 45 minutes, whereas the segmented upload takes only 41 seconds.

Similar behavior can also be observed for providers with higher throughput rates, although it is less pronounced. At one extreme, Amazon achieves its highest improvement factor of 32 when it comes to the upload of a 100 MB file in 85 segments. However, Google-EU provides consistently high data throughput for all data sizes and therefore achieves only an improvement factor of 5 for a segmented transmission of 100 MB data objects.

Also important to note is the relation between the performance of a native chunk transfer P(s(Fx)) and the performance of multiple simultaneous chunk transfers (cumulated transfer) of the same size, P(Fx). For fast providers (e.g. Amazon and Google), the values of P(s(Fx)) are approximately identical to P(Fx) (see table VII). This means the choice of chunk size determines the accumulated transmission performance of the original data. In order to minimize the transmission times of unsegmented data objects we have to identify the optimal chunk size. The experienced deviations can be attributed to the overhead associated with the establishment of API connections. In addition, with a high number of threads it cannot be guaranteed that all processes are executed exactly in parallel. Further, no assumptions can be made about the order in which individual API connections are processed on the providers' side. It is important to note that the transmission of chunks of small sizes takes only a few seconds, so that minor delays in thread processing affect the measurement results.

For other providers, the relation between P(s(Fx)) and P(Fx) may differ by up to 300% (see table VII). The observed behavior could be attributed to weaker load balancing capabilities. It could also be presumed that these providers limit the throughput performance beyond a certain number of simultaneously opened connections. Here again, the minimization of native transfer time requires the identification of an appropriate chunk size and, in this case, of the upper limit of simultaneous connections.

The evaluation of the second test provided insights into the constancy of performance during simultaneous data transfers. The results of the experiment are presented in figure 10. Following a preliminary analysis, the general behavior of cloud storage providers can be grouped into three categories:

• the number of simultaneous transfers has no impact on the average transmission performance of individual chunks (see figure 10a, 10b, 10c);

• additional connections decrease the average performance (e.g., in the case of Google or Nirvanix in figures 10e and 10f);

• the average performance decreases above a certain segmentation level; this behavior is shown in figure 10d.

An especially interesting point is the constancy of performance during simultaneous transfers observed at various segmentation levels during the experiment. Figure 11 shows that the transmission performance is relatively constant in most cases, except for the providers Box, Nirvanix, and HP. The average throughput performance of the latter services decreases above a certain number of simultaneous transfers. For example, at a segmentation level of 90, the transmission of threads 51 to 90 takes twice as much time as the transmission of threads 1 to 50 (see figure 11d). Again, this behavior might be related to a server-side limitation of throughput performance beyond a certain number of open connections and is subject to further analysis. However,


the decreasing per-thread performance as the number of threads increases, which is observed in individual cases, can be compared to processor contention on virtual machines. As the load rises above a certain limit, the amount of bandwidth each thread receives decreases. But as figure 12 shows, a higher number of threads still leads to a significantly higher cumulative bandwidth than individual threads.

From these observations, we come to the following decisive conclusions. Simultaneous transfers can significantly increase the transmission performance of individual data objects. The improvement factor depends on the capabilities of cloud providers when dealing with different file sizes and simultaneous transfers. A "one-size-fits-all" approach is not workable. The optimization of Cloud-RAID requires an intelligent identification of the chunk size, taking into account server-side limitations on open API connections. Nevertheless, the conducted experiments provide sufficient results for the identification of an appropriate transfer strategy based on the individual capabilities of cloud storage providers.

3) Availability: During the second part of the experiment (resilience testing) we performed over 660,000 read/write operations. In this time period we observed only a small number of failed requests. After a thorough evaluation of the occurred failures, we can safely say that nearly all exceptions can be attributed to implementation errors on our side. Nevertheless, we experienced a number of operations that could not be completed due to an error on the server side (e.g., readTimeOut or peerNotAuthenticated). Table VIII presents the observed availability of all experiments, calculated as the ratio of successfully completed operations to the total number of operations, i.e., $(N - N_{err})/N$, where $N$ is the total number of operations and $N_{err}$ the number of failed ones. Further, the table also captures the results of the infrequent hash value comparison, which was successful in nearly all cases, except for the providers Rackspace and Box. Note that the observed

Figure 10 Average transmission performance of simultaneous uploads with increasing number of chunks. Each chunk has a size of 10 MB.

Figure 11 Transmission performance of single threads at different segmentation levels. Each thread transfers a chunk of 10 MB. The per-thread performance for each thread decreases as the total number of threads increases from 1 to 100.


availability values represent only a short period of time of little more than twenty days. Actual values may differ from our observations.

TABLE VIII. THE OBSERVED AVAILABILITY DURING THE EXPERIMENT.

Provider    Number of writes   Errors   Availability   Wrong hash values
Google      72500              0        100%           0
Amazon      72500              16       99.978%        0
Nirvanix    42000              3        99.993%        0
Rackspace   42000              145      99.643%        5
Box         42000              0        99.905%        40
HP          42000              0        100%           0
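The availability figures in Table VIII can be reproduced directly from the raw counts; treating both server-side errors and wrong hash values as failed operations yields exactly the reported percentages. A minimal sketch in Python:

```python
# Availability = successful operations / total operations, where an operation
# counts as failed if it raised a server-side error or returned a wrong hash
# value (counts taken from Table VIII).
observations = {
    "Google":    (72500,   0,  0),
    "Amazon":    (72500,  16,  0),
    "Nirvanix":  (42000,   3,  0),
    "Rackspace": (42000, 145,  5),
    "Box":       (42000,   0, 40),
    "HP":        (42000,   0,  0),
}

for provider, (writes, errors, wrong_hash) in observations.items():
    availability = (writes - errors - wrong_hash) / writes
    print(f"{provider:10s} {availability:.3%}")  # e.g. Amazon -> 99.978%
```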

6. RELATED WORK

The main idea underlying our approach is to provide a RAID technique at the cloud storage level. In [9] the authors introduce the HAIL (High-Availability and Integrity Layer) system, which utilizes RAID-like methods to manage remote file integrity and availability across a collection of servers or independent storage services. The system makes use of challenge-response protocols, proofs of retrievability (POR) [7] and proofs of data possession (PDP) [7], and unifies these two approaches. In comparison to our work, HAIL requires storage providers to run some code, whereas our system deals with cloud storage repositories as they are. Further, HAIL does not provide confidentiality guarantees for stored data. In [13] Dabek et al. use RAID-like techniques to ensure the availability and durability of data in distributed systems. In contrast to the mentioned approaches, our system focuses on the economic problems of cloud computing described in chapter I.

Further, in [1] the authors introduce RACS, a proxy that spreads the storage load over several providers. This approach is similar to our work as it also employs erasure coding techniques to reduce overhead while still benefiting from the higher availability and durability of RAID-like systems. Our concept goes beyond a simple distribution of users' content. RACS lacks sophisticated capabilities such as intelligent file placement based on users' requirements or automatic replication. In addition, the RACS system does not try to solve the security issues of cloud storage, but focuses on vendor lock-in. Therefore, the system is not able to detect any data corruption or confidentiality violations.

The future of distributed computing has been a subject of interest for various researchers in recent years. The authors in [11] propose an architecture for market-oriented allocation of resources within clouds. They discuss some existing cloud platforms from the market-oriented perspective and present a vision for creating a global cloud

Figure 8 Average transmission performance in seconds observed on all writes at different segmentation levels. Each initial data object (an unsegmented file) has a size of 100 MB. The highlighted bars correspond to the best segmentation levels, which represent the highest factor of improvement for individual providers. The size of chunks decreases with growing segmentation level.


exchange for trading services. The authors consider cloud storage as a low-cost alternative to dedicated Content Delivery Networks (CDNs).

There are more similar approaches dealing with high availability of data through its distribution among several cloud providers. The DepSky-A [8] protocol improves availability and integrity of cloud-stored data by replicating it on several cloud providers using quorum techniques. This work has two main limitations. First, a data unit of size S consumes n × S of the storage capacity of the system and costs on average n times more than if it was stored on a single cloud. Second, the protocol does not provide any confidentiality guarantees, as it stores the data in clear text. In their later work the authors present DepSky-CA, which solves the mentioned problems by encrypting the data and optimizing the write and read processes. However, the monetary costs of using the system are still twice the costs of using a single cloud. On top of this, DepSky does not provide any means or metrics for user-centric data placement. In fact, our approach enables cloud storage users to place their data on the cloud based on their security policies as well as their quality of service expectations and budget preferences.

7. CONCLUSION

In this paper we outlined some general problems of cloud computing such as security, service availability, and the general risk for a customer of becoming dependent on a service provider. In the course of the paper we demonstrated how our system deals with the mentioned concerns. In a nutshell, we stripe users' data across multiple providers while integrating with each storage provider via appropriate service-connectors. These connectors provide an abstraction layer that hides the complexity and differences in the usage of storage services.

The main focus of the paper is an extensive evaluation of our application. From the results obtained, we conclude that our approach improves availability at costs similar to using a single commercial cloud storage provider (instead of 100% and more when full content replication is used). Our approach makes use of erasure coding techniques for striping data across multiple providers. The experiment proved that, given the speed of current disks and CPUs, the libraries used are fast enough to provide good performance, whereby the overall performance depends on the throughput performance of the particular storage providers. More specifically, the throughput performance of Cloud-RAID increases with the selection of providers with higher throughput performance values. Hence, with an appropriate coding configuration, Cloud-RAID is able to significantly improve the data transmission process when compared with individual cloud storage services. We also observed that slow vendors may influence the transmission performance in a negative way. On the lookout for possible performance optimizations of Cloud-RAID, we conducted further experiments which helped us to understand how individual clouds handle high object counts of different sizes. The tests focused on the performance evaluation in terms of the service providers' response times and their resilience (i.e., availability, performance).

The results clearly demonstrate that simultaneous transfers can significantly increase the transmission performance (depending on the individual capabilities of cloud providers when dealing with different file sizes and simultaneous transfers). Implementing the proposed optimization requires the identification of an appropriate chunk size (based on the individual throughput capabilities of each provider) and of a limit on open API connections (i.e., load balancing).

Nevertheless, we did not find one winning strategy to optimize the performance of Cloud-RAID. Rather, the optimization needs to be tackled individually per provider when it comes to simultaneous transfers of high object counts. Performance tests showed that our system is best utilized for the deployment of large files. The transmission of smaller data objects is highly affected by the overhead associated with DNS look-ups, API connection time, and API handling of multiple threads. With an increasing segmentation level (smaller chunk size), response time becomes significant, as it can dominate the overall transmission rate. Therefore, it is an important factor when deciding on a segmentation strategy.

In the long term, our approach might foster the provision of new and even more favorable cloud storage services. Today, storage providers surely use RAID-like methods to increase the reliability of the data entrusted to them by their customers. This procedure causes costs which are certainly covered by the providers' price structures. With our approach, the on-site backups might become redundant, as users' data is distributed among dozens of storage services.

Furthermore, we enable users of cloud storage services to control the availability and physical segregation of their data by themselves. Google's Durable Reduced Availability (DRA) storage could be a pioneer for such services, as it provides users with storage buckets at lower cost (up to 29% less than standard Google Cloud Storage) and lower availability (99% instead of 99.9%).

However, additional storage offerings are expected to become available in the next few years. Due to the flexible and adaptable nature of our approach, we are able to support any changes in existing storage services as well as to incorporate support for new providers as they appear.

8. FUTURE WORK

Our performance testing revealed that some vendors have optimized their systems for large data objects and high upload performance, while others have focused on smaller files and better download throughput. We will use these observations to optimize the read and write performance of our


application. During our experiment we also observed that the reaction time of read and get-hash requests may vary from provider to provider at different times of day. This behavior might be related to the usage of different consistency models and is subject to further analysis.

The performance of the introduced model heavily depends on the relative location of the cloud service providers and the uploading unit, Cloud-RAID. Therefore, the actual use is based on the assumption that the location of the clouds and of the uploading unit is always the same. In the course of future work, we aim to address an alternative scenario where the parties involved in the distribution process do not need to be at the same location (e.g., the uploading unit for saving data can be located in Europe whereas the downloading user can be in the US).

In addition, we are also planning to implement more service connectors and thus to integrate additional storage services. Any extra storage resource improves the performance and responsiveness of our system for end-users.

9. REFERENCES

[1] Hussam Abu-Libdeh, Lonnie Princehouse, and Hakim Weatherspoon. RACS: A case for cloud storage diversity. SoCC'10, June 2010.
[2] Rehab Alnemr, Justus Bross, and Christoph Meinel. Constructing a context-aware service-oriented reputation model using attention allocation points. Proceedings of the IEEE International Conference on Services Computing (SCC 2009), 2009.
[3] Rehab Alnemr and Christoph Meinel. Getting more from reputation systems: A context-aware reputation framework based on trust centers and agent lists. Computing in the Global Information Technology, International Multi-Conference, 2008.
[4] Rehab Alnemr, Maxim Schnjakin, and Christoph Meinel. Towards a context-aware service-oriented semantic reputation framework. International Joint Conference of IEEE TrustCom/IEEE ICESS/FCST, 0:362–372, 2011.
[5] Amazon. Amazon EC2 service level agreement. Online, 2009.
[6] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. A view of cloud computing. Commun. ACM, 53(4):50–58, April 2010.
[7] G. Ateniese, R. Burns, R. Curtmola, J. Herring, L. Kissner, Z. Peterson, and D. Song. Provable data possession at untrusted stores. 14th ACM CCS, 2007.
[8] Alysson Bessani, Miguel Correia, Bruno Quaresma, Fernando André, and Paulo Sousa. DepSky: Dependable and secure storage in a cloud-of-clouds. In Proceedings of the Sixth Conference on Computer Systems, EuroSys '11, pages 31–46, New York, NY, USA, 2011. ACM.
[9] Kevin D. Bowers, Ari Juels, and Alina Oprea. HAIL: A high-availability and integrity layer for cloud storage. CCS'09, November 2009.
[10] James Broberg, Rajkumar Buyya, and Zahir Tari. Creating a 'Cloud Storage' mashup for high performance, low cost content delivery. Service-Oriented Computing (Volume 5472), ICSOC'08 Workshops, April 2009.
[11] Rajkumar Buyya, Chee Shin Yeo, and Srikumar Venugopal. Market-oriented cloud computing: Vision, hype, and reality for delivering IT services as computing utilities. Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications, August 2008.
[12] Nicholas Carr. The Big Switch. Norton, 2008.
[13] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Wide-area cooperative storage with CFS. ACM SOSP, October 2001.
[14] R. Dingledine, M. Freedman, and D. Molnar. The Free Haven project: Distributed anonymous storage service. The Workshop on Design Issues in Anonymity and Unobservability, July 2000.
[15] John Gantz and David Reinsel. The digital universe in 2020: Big data, bigger digital shadows, and biggest growth in the Far East. Online, 2012.
[16] Dan Iacono and Laura DuBois. Worldwide enterprise storage for public and private cloud 2012–2016 forecast: Enabling public cloud service providers and private clouds. Online, 2012.
[17] Ponemon Institute. Security of cloud computing providers study. Online, April 2011.
[18] Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Trans. Program. Lang. Syst., 4(3):382–401, July 1982.
[19] Maxim Schnjakin and Christoph Meinel. Implementation of a secure and reliable storage above the untrusted clouds: Platform for a secure storage-infrastructure in the cloud. Proceedings of the 8th International Conference on Computer Science and Education (ICCSE 2013), 2013.
[20] J. S. Plank, S. Simmerman, and C. D. Schuman. Jerasure: A library in C/C++ facilitating erasure coding for storage applications – Version 1.2. Technical Report CS-08-627, University of Tennessee, August 2008.
[21] James S. Plank. The RAID-6 liberation codes. In Proceedings of the 6th USENIX Conference on File and Storage Technologies, FAST'08, pages 7:1–7:14, Berkeley, CA, USA, 2008. USENIX Association.
[22] James S. Plank, Jianqiang Luo, Catherine D. Schuman, Lihao Xu, and Zooko Wilcox-O'Hearn. A performance evaluation and examination of open-source erasure coding libraries for storage. In Proceedings of the 7th Conference on File and Storage Technologies, FAST '09, pages 253–265, Berkeley, CA, USA, 2009. USENIX Association.
[23] S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, and J. Kubiatowicz. Maintenance-free global storage in OceanStore. IEEE Internet Computing, September 2001.
[24] David Sarno. Microsoft says lost Sidekick data will be restored to users. Los Angeles Times, October 2009.
[25] Maxim Schnjakin, Rehab Alnemr, and Christoph Meinel. A security and high-availability layer for cloud storage. In Web Information Systems Engineering – WISE 2010 Workshops, volume 6724 of Lecture Notes in Computer Science, pages 449–462. Springer Berlin / Heidelberg, 2011.

Authors

Maxim Schnjakin studied computer science at the University of Trier and moved to the Hasso-Plattner-Institute after graduation. Since 2008 he has worked as a research assistant at the chair of Professor Christoph Meinel. The main focus of his research is the enforcement of high availability and security requirements in public clouds.

Christoph Meinel studied Mathematics and Computer Science at the Humboldt-University of Berlin from 1974 to 1979 and received his PhD degree (Dr. rer. nat.) in 1981.


Today, Christoph Meinel is President, Scientific Director, and CEO of the Hasso-Plattner-Institut for IT-Systems Engineering (HPI). Internet technologies and systems are at the center of research and teaching at his chair.


CLOUD STANDBY SYSTEM AND QUALITY MODEL

Alexander Lenk, Frank Pallas

FZI Forschungszentrum Informatik
Friedrichstr. 60, 10117 Berlin, Germany
{lenk,pallas}@fzi.de

Abstract
Contingency plans for disaster preparedness and concepts for resuming regular operation as quickly as possible have been an integral part of running a company for decades. Today, large portions of revenue generation are taking place over the Internet and it has to be ensured that the respective resources and processes are secured against disasters, too. Cloud-Standby-Systems are a way for replicating an IT infrastructure to the Cloud. In this work, the Cloud Standby approach and a Markov-based model is presented that can be used to analyze and configure Cloud Standby systems on a long-term basis. It is shown that by using a Cloud-Standby-System the availability can be increased, how configuration parameters like the replication interval can be optimized, and that the model can be used for supporting the decision whether the infrastructure should be replicated or not.

Keywords: Cloud-Standby, Warm-Standby, BCM, Cloud Computing, IaaS, Disaster Recovery

1. INTRODUCTION

The effort of companies to protect their production facilities, distribution channels, or critical business processes against possible risks is not a new phenomenon. Instead, contingency plans for disaster preparedness and concepts for resuming regular operation as quickly as possible have been an integral part of running a company since the times of the industrial revolution. In this context, disasters are fires, earthquakes, terrorist attacks, power outages, theft, illness, or similar circumstances. The respective measures that must be taken in order to be prepared for such disasters and to keep up critical business processes in the event of an emergency are commonly referred to in economics as Business Continuity Management (BCM) (Hiles, 2010). The effectiveness of BCM can be controlled via the key figures Recovery Time Objective (RTO) and Recovery Point Objective (RPO) (Hiles, 2010). The RTO refers to the allowed time for which the business process may be interrupted and the RPO relates to the accepted amount of produced units or data that may be lost due to an outage.

Today, with the Internet being production site as well as distribution channel, BCM faces different challenges. One of the most important tasks in IT-related emergency management is the redundant replication of critical systems. Depending on the system class, different mechanisms are used to secure a system against prolonged outages. In this regard, the RTO specifies the maximum allowed time within which the IT system must be up again and the RPO is the accepted period of data updates that may be lost, i.e., generally the time between two backups (Wood et al., 2010).

This work presents an approach for a Cloud Standby system and the modeling of its costs and availability based on Markov chains. Cloud Standby uses a meta model approach to describe distributed systems (Tanenbaum & Van Steen, 2002) in a machine-readable language that can be used to deploy the system on several cloud providers. For modeling the quality attributes, the basic idea is to carry out a random walk (Gilks, Richardson, & Spiegelhalter, 1996) on the system's state graph according to defined transition probabilities. The costs and the availability can then be calculated by means of the Markov chain and the probability distribution for staying in each state. The presented model is illustrated by means of a simple example and it is shown that the model can be used to calculate optimal configuration options, like the replication interval, of Cloud-Standby-Systems.

The remainder of this paper is structured as follows: First, the related work and a description of the Cloud-Standby-System are presented. Then the quality model itself is developed and it is shown how it can be used to make deliberate configuration decisions on the basis of a simple example. Finally, the conclusion sums up the paper and gives an outlook to future work in this field.¹

2. RELATED WORK

In this work we present a) the general approach of Cloud Standby, a warm standby for the Cloud, and b) based on the states of the proposed Cloud Standby System, a quality model for predicting the long-term availability and costs under certain parameters.

2.1 Cloud Standby.

Wood et al. (2010) describe a generic warm standby system in order to evaluate whether Cloud Computing is beneficial for this use case. This paper does, however, concentrate on the economic part and describes no real warm standby system.

Klems, Tai, Shwartz, and Grabarnik (2010) describe a running system that allows using the Cloud as a warm standby environment, using BPMN processes to orchestrate an IaaS provider's infrastructure services. In contrast to the

¹ This article is an extended version of the paper "Modeling Quality Attributes of Cloud-Standby-Systems" (Lenk & Pallas, 2013).


model presented herein, their approach focuses on single machines and does not allow securing whole distributed systems.

PipeCloud (Wood, Lagar-Cavilla, Ramakrishnan, Shenoy, & Van der Merwe, 2011) targets private Clouds where the user has access to the hypervisor and uses this access for capturing disk writes and replicating them to the Cloud. Remus and SecondSite (Cully et al., 2008; Rajagopalan, Cully, O'Connor, & Warfield, 2012) basically follow a similar approach: direct access to the hypervisor is used to dynamically replicate the whole virtual machine to another location. Due to the requirement of hypervisor access, these approaches can, however, not be applied in a public Cloud scenario.

2.2 Quality Model.

The calculation of the quality metrics addressed in this paper can generally be subdivided into the two fields of cost and availability calculation. Regarding these calculations, related work already exists in the fields of virtualized infrastructures, Cloud Computing, and warm standby systems.

The approach of Alhazmi and Malaiya (2012) describes a way of evaluating disaster recovery plans by calculating their costs. The approach is of generic nature and does not focus on the field of Cloud Computing with its own specific pricing models.

Wood et al. (2010) describe a way of replicating data from one virtual machine to a replica machine. The respective cost calculation is limited to this specific approach and cannot be adapted to Cloud-Standby-Systems like the ones considered herein.

Dantas, Matos, Araujo, and Maciel (2012) present a Markov-based approach to model the availability of a warm-standby Eucalyptus cluster. Even if this approach is related to the work presented herein with regard to the used mathematical model, and also shows that Markov chains can be used to model availabilities in Cloud Computing, it is not used to model the costs. Furthermore, the calculation of the availability is restricted to a single Eucalyptus installation with different clusters and does not consider settings with several cloud providers.

Klems et al. (2010) present an approach for calculating the downtime of a Cloud-Standby-System. This approach evaluates the system in general but is a rather simplistic short-term approach, comparing a Cloud-Standby-System with a manual replication approach.

2.3 IaaS Deployment Meta Model.

Over the years, several IaaS deployment standards have been proposed in industry and science. Some of them are related to the deployment model presented in this paper.

Amazon Web Services (AWS Inc., 2013) is an IaaS provider that has added additional services over time. Today, Amazon Web Services cannot just be used to start virtual machines but also to deploy whole scalable application stacks. These "Cloud Formations" allow constraint definition and sophisticated monitoring. As one of the industry leaders in IaaS Cloud computing, Amazon has a rich toolset and API. However, Amazon does not support provider independence since, as an industry leader, they are more interested in creating lock-in effects than in reducing them.

Chieu et al. (2010) introduce in their work the concept of a "Composite Appliance". A composite appliance is a collection of virtual machine images that have to be started together in order to run a multi-tier application. The proposed description language lacks support for runtime features like the current configuration of the virtual machines. The architecture proposed in this work requires the language to be implemented in the vendor's Cloud computing platform and thereby does not allow using different cloud providers.

Konstantinou et al. (2009) describe in their work a model-driven deployment description approach. This approach is based on different parties that participate in the deployment process. A deployment is first modeled as a "Virtual Solution Model" and then translated by another party into a vendor-specific "Virtual Deployment Model". Maximilien et al. (2009) introduce a model that allows deploying middleware systems on a Cloud infrastructure. In the aspects of deployment, both the meta model proposed by Konstantinou et al. and that of Maximilien et al. are related to our meta model. The approach of Maximilien et al., however, is focused on the deployment of middleware technologies, whereas we are aiming at a more holistic approach that has the final distributed system in focus and not just middleware technologies. The approach of Konstantinou et al. does not allow having different Cloud vendors in a single deployment. Nevertheless, some of the ideas of both Konstantinou et al. and Maximilien et al. have influenced our meta model.

Mietzner et al. (2009) orchestrate services by using workflows and web services, e.g., WS-BPEL. They concentrate on the high-level composition of services rather than proposing an actual language for deployments. The focus on WS-BPEL gives the approach great flexibility but also adds an overhead. The work we present in this paper is influenced by the work done and the general ideas developed by Mietzner et al. However, while Mietzner et al. focus on the BPEL deployment processes, our work focuses on the distributed application itself and enriches it with meta data that can also be used for deployment, thereby supporting higher-level processes like the ones described by Mietzner et al.

3. CLOUD STANDBY SYSTEM

The recovery of IT systems can be achieved through different replication mechanisms (Henderson, 2008; Schmidt, 2006). "Hot standby" is on the one side of the spectrum: A second data center with identical infrastructure is actively operated on another site, with relevant data being continuously and consistently mirrored in almost real-time from the first to the second data center. The operating costs of such a


hot standby system, however, amount to the operating costs of the secondary system plus the costs of the mirroring. On the other side of the spectrum is the "cold standby", the low-cost backup, e.g., on tape, without retaining a second site with backup infrastructure resources. A tape backup is not possible during productive operation and is usually done at times of low load, like at night or during weekends. In this case, an RPO of days or weeks is common. Due to the fact that the IT infrastructure has to be newly procured in case of a disaster, an RTO of several days to months is possible. Between these two extremes lies the concept of "warm standby". Although a backup infrastructure is kept at another location in this case, it is not fully active and must first be put into operation in case of a disaster. A warm standby system usually has an RPO and RTO between minutes and hours.

A common option for reducing the operating costs of only sporadically used IT infrastructure, such as in the case of the "warm standby" (Henderson, 2008; Schmidt, 2006), is Cloud Computing. Cloud Computing provides the user with simple, direct access to a pool of configurable, elastic computing resources (e.g., networks, servers, storage, applications, and other services) with a pay-per-use pricing model (Lenk, Klems, Nimis, Tai, & Sandholm, 2009; Mell & Grance, 2011). More specifically, this means that resources can be quickly (de-)provisioned by the user with minimal provider interaction and are billed on the basis of actual consumption. This pricing model makes Cloud Computing a well-suited platform for hosting a replication site offering high availability at a reasonable price. Such a warm standby system, with infrastructure resources (virtual machines, images, etc.) being located and updated in the Cloud, is herein referred to as a "Cloud-Standby-System". The relevance and potential of this cloud-based option for hosting standby systems gets even more obvious in the light of the current market situation. Only fifty percent of small and medium enterprises currently practice BCM with regard to their IT services, while downtime costs sum up to $12,500–23,000 per day for them (Symantec, 2011).

The calculation of quality properties, such as the costs or the availability of a standby system, and the comparison with a "no-standby system" without replication is an important basis for decision-making in terms of both the introduction and the configuration of Cloud-Standby-Systems. However, due to the structure and nature of standby systems, this calculation is not trivial, as in each replication state different kinds of costs (replication costs, breakdown costs, etc.) with different cost structures incur. Furthermore, determining the quality of the system is difficult due to the long periods of time and the low probability of disasters (e.g., only one total outage every 10 years). A purely experimental determination by observing a reference system over decades is therefore not feasible. Instead, a method for simulating and calculating the long-term quality characteristics of different configurations is needed.

In this work we introduce Cloud-Standby, a novel Cloud-based warm standby approach where a formally described primary distributed system (PS) running in Cloud 1 (C1) is periodically synced as a replica system (RS) to Cloud 2 (C2). The states of this Cloud-Standby-System are depicted in Fig. 1.

Fig. 1. State chart of a Cloud-Standby-System

3.1 States.

PS Deployment. The PS is deployed on C1 at first and goes into runtime after $t_{d1}$. The time the deployment takes depends highly on the system's structure. For each use case, the deployment time can be determined by means of experimentation.

PS Runtime. During PS runtime, the RS is turned off and generates no costs. The PS data are, however, backed up using standard backup tools. This ensures that the RPO can be met when a disaster occurs.

PS Runtime + RS Update. Periodically (after $t_b$), the RS is started and changes are updated on the RS. This ensures that the deployment time of the RS is reduced when an actual disaster occurs. The time the update process lasts is defined as $t_u$.

RS Deployment. When C1 fails, the RS is started and takes over the service. The time for the deployment is $t_{d2}$. The deployment time varies with the amount of data that needs to be installed or stored during the deployment process. This means that $t_{d2}$ decreases with a decreasing update interval $t_b$. The correlation between $t_{d2}$ and $t_b$ can be determined through experiments or monitoring over time.

RS Runtime. In case of an outage on C1, the RS takes over; only if an outage also takes place on C2 during this time is the whole system unavailable.

RS Runtime + PS Deployment. As soon as C1 is up again, the PS can be redeployed and then takes over the service.

Outage. If the systems on both Clouds are not available, an outage of the service could not be avoided. Now the whole system must be recovered from scratch by hand.

3.2 IaaS Deployment Meta Model.

For the automatic deployment of the PS and RS, a machine-readable deployment description is needed which


allows the deployment of distributed systems on different IaaS providers. In this section we present a modeling approach for formalizing distributed systems on top of federated IaaS Clouds. In this context, a distributed system is a software stack capable of providing a distributed application and all its artifacts (Object Management Group – OMG, 2011). Software artifacts are applications like a self-contained webserver binary or application packages like jar files. Even a whole operating system, including kernel, binaries, and other files, packaged in an image file, is considered an artifact. In the UML notation, the Component acts as a "DeploymentTarget" for the "DeployedArtifacts" (Object Management Group – OMG, 2011).

Deployment languages in federated Clouds have to meet several requirements (Lenk, Danschel, Klems, Bermbach, & Kurze, 2011). A Cloud service, by definition, needs to be elastic, meaning it scales up and down with changing demand within a short time (Mell & Grance, 2011). Since it should also be possible to deploy a service not only on a single Cloud provider, the model must also support Cloud federation (Kurze et al., 2011; Lenk et al., 2011).

Fig. 2. Distributed system deployment meta model

In Fig. 2 we depict a meta model for distributed systems in federated Clouds. A distributed system consists of several elastic components that have dependencies between each other ("component a" requires "component b"), representing the tiers in the distributed system. These components can be application servers, databases, key-value stores, etc. All components run on a single cloud provider at a time but can, in the case of a disaster, be deployed on another provider in the federated Cloud. Each of these components has several instances running a configured operating system with a predefined software stack. This software stack is either stored in a basic image held by the Cloud provider or is applied during the deployment process via installation of software packages on a single basic image or on all the images of a component.

This modeling approach of having the federated image, the federated virtual machine, and the installation tasks assigned to the component ensures that all instances of a component have the same configuration and are thereby horizontally scalable. We assume that the functionality of the load balancing is provided via an external service having access to the model, or by the load balancer being a component required by the component it controls.
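To make the structure of the meta model in Fig. 2 more concrete, the following sketch models its core entities in Python. All class and field names are illustrative assumptions, not the authors' actual description language:

```python
from dataclasses import dataclass, field

@dataclass
class FederatedImage:
    """Functionally equivalent configured images, one per provider."""
    name: str
    images_per_provider: dict  # e.g. {"ProviderA": "image-123", ...}

@dataclass
class FederatedVM:
    """VM sizes with similar non-functional properties, one per provider."""
    name: str
    vms_per_provider: dict     # e.g. {"ProviderA": "xlarge", ...}

@dataclass
class Component:
    """An elastic tier of the distributed system."""
    name: str
    image: FederatedImage
    vm: FederatedVM
    packages: list             # software installed on top of the basic image
    requires: list = field(default_factory=list)  # "component a requires b"
    instances: int = 1

# A two-tier example: the application tier requires the database tier.
db = Component("db", FederatedImage("db-img", {}), FederatedVM("large", {}),
               ["postgresql"])
app = Component("app", FederatedImage("app-img", {}), FederatedVM("medium", {}),
                ["tomcat"], requires=[db], instances=3)
```

Because all instances of a component share one federated image and VM, scaling a tier is just a matter of changing its `instances` count, which mirrors the elasticity argument made above.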

Infrastructure part. This part is modeled by a professional (the IaaS Admin) knowing the different Cloud providers, images, and software stacks. The task of the IaaS Admin is to select feasible virtual machines and basic images from the available infrastructure, as well as software packages that can be installed on top of the basic images for the preparation of configured images. Configured images are the functional design-time representations of instances. Differently configured images for different Cloud providers which are functionally equivalent are grouped into federated images. This ensures that the federated image has the same functionality even when instantiated on different Clouds.

In order to instantiate an instance, the image needs to be deployed on a runtime, or virtual machine (VM). The virtual machine represents the non-functional attributes of the instance. The selected VM determines properties like price, performance, etc. for the instance. By defining intervals for these properties and by grouping the VMs along these dimensions into the intervals, the IaaS Admin guarantees that a federated virtual machine has the same or at least similar non-functional qualities on each provider.

Furthermore, the IaaS Admin groups, for each Cloud, exactly one VM and one configured image into the corresponding federated element. Thus, during the deployment phase, the federated image and federated VM can be deployed on any Cloud provider represented in the infrastructure model.

One challenge in this approach is to determine the virtual machines and basic images of different Cloud providers that are grouped into federated virtual machines and federated images. Furthermore, determining the software that should be installed on a certain image is a highly manual task that should be automated in the future. Concepts like feature modeling for the selection of Cloud services (Wittern, Kuhlenkamp, & Menzel, 2012) could be used to automate this process.

Deployment part. In the deployment part, the System Administrator defines the structure of the final distributed system on the basis of the infrastructure model. Since it is not clear how the different components are orchestrated for the final service, this task (currently) has to be carried out manually by the system administrator. He defines the components, the dependencies between them, and the software packages installed on the components. By grouping instances into components, the model enables elasticity.²

² It is, however, not the subject of this work to describe mechanisms that are used to add/remove resources based on monitoring data or that deal with the problems of migrating the data and so on. By using this design we just make sure that distributed systems described with this language are able to be elastic.


Deployment parallelization. The relation "component requires component" allows calculating the dependency graph that is essential for the deployment process of the PS and the RS. How long the actual deployment takes depends on the structure of the deployment modeled with the description language. If there are many single software packages to be installed and the deployment graph allows no parallelization, this process takes longer than in cases with completely configured virtual machines that can all be deployed in parallel.

Without this graph it is not clear in which order the components have to be deployed. Once the dependency graph has been calculated, the sequential deployment order of the components can be derived with a topological sort. This sequential deployment order is not optimal when it comes to deployment time. In the deployment graph there are often parallel paths where deployments can be started at the same time without violating the constraints introduced by the "requires" relation.

Fig. 3. Cloud Standby Deployment Algorithm

In computer science there are several algorithms for the parallel execution of jobs or tasks. All these algorithms, however, have in common that they assume that the resources are limited. In Cloud computing, the assumption is that there are unlimited resources available. Therefore, we use in this work a modified version of the MPM net planning algorithm (Thulasiraman & Swamy, 2011) from the discipline of operations research, which is listed in Fig. 3.

We use the forward calculation of the MPM algorithm to get the Early Start Dates (ESD) and the Early Finish Dates


(EFD). The ESD gives us an idea of when approximately a component will be deployed, but it is no guarantee. With the ESD, however, the deployment scheduling can be calculated: components with the same start time can be deployed in parallel. The EFD gives an approximation of how long the whole process will take:

$$t_{deploy} = \max_{c \in C} \mathrm{EFD}(c)$$
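Under the unlimited-resources assumption, the forward calculation reduces to a longest-path computation over the dependency graph: a component's ESD is the maximum EFD of the components it requires, and the total deployment time is the maximum EFD overall. A minimal sketch of this idea (component names and durations are made up for illustration; this is not the authors' listed algorithm from Fig. 3):

```python
def forward_pass(components, duration, requires):
    """Forward calculation of Early Start/Finish Dates (ESD/EFD).
    `duration[c]` is the (measured) deployment time of component c,
    `requires[c]` lists the components c depends on."""
    esd, efd = {}, {}

    def visit(c):
        if c in efd:
            return efd[c]
        # A component can start once all required components are finished.
        esd[c] = max((visit(dep) for dep in requires[c]), default=0)
        efd[c] = esd[c] + duration[c]
        return efd[c]

    for c in components:
        visit(c)
    total = max(efd.values())  # t_deploy = max over all components' EFD
    return esd, efd, total

# Components with the same ESD can be deployed in parallel.
esd, efd, total = forward_pass(
    ["db", "app", "lb"],
    {"db": 10, "app": 15, "lb": 5},
    {"db": [], "app": ["db"], "lb": ["app"]},
)
print(esd, efd, total)  # esd {'db': 0, 'app': 10, 'lb': 25}, total 30
```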

4. CLOUD STANDBY QUALITY MODEL

Using the Cloud Standby approach on the one hand leads to additional costs but on the other hand increases availability. In order to provide decision support regarding the question whether the introduction of such a Cloud Standby System is useful or not, the states are transferred into a mathematical model. In this chapter we build such a quality model using a graph and a Markov chain, based on the UML chart in Fig. 1.

In order to facilitate the calculation of quality properties at all, some variables must be defined and parameterized. Some of the parameters are defined in the use case or are of experimental origin, others are taken from external sources, and some can only be estimated. Together with results from previous experiments, average start times can then be calculated. Table 1 presents the time variables to be parameterized as well as the underlying sources for their parameterization.

Table 1. Parameters

Type           Variable    Description                                   Unit
Time           $t_{d1}$    Duration of the initial deployment            min.
Time           $t_b$       Backup interval                               min.
Time           $t_u$       RS update time                                min.
Time           $t_{d2}$    Duration of the replica deployment            min.
Time           $t_r$       Transition from emergency to normal state     min.
Cost           $c_1$       Primary Cloud provider costs                  €/h/server
Cost           $c_2$       Secondary Cloud provider costs                €/h/server
Cost           $c_o$       Unavailability costs                          €/h
Availability   $a_1$       Primary Cloud availability                    years
Availability   $a_2$       Secondary Cloud availability                  years

To calculate the total costs, the costs for the runtime of each server must be known. These data can be found in the offers of the Cloud providers. For some evaluations, the costs / loss of profit faced by the company in the case of system unavailability must also be known or at least estimated. All types of costs included in the following analysis are summarized in Table 1. The availability of the Cloud provider is an important basis for the calculation of the overall availability of the system and thus also of the costs.


Many Cloud providers declare such availability levels in their SLAs. However, this availability is less interesting in the context of this calculation because this work focuses on global, long-term outages caused by disasters that cannot be handled by traditional backup techniques. The availability described in the third part of Table 1 indicates the average time period in which exactly one such global outage of the respective Cloud provider is to be expected.

Even if elasticity (Mell & Grance, 2011) is a key concept of Cloud Computing, and although the prices for Cloud resources have constantly changed during the past years, we use static values for the average number of servers and for the costs over the years. These dynamic aspects could nonetheless easily be added in future work by replacing the constant prices and server counts with functions representing these values. For a first step towards modeling the costs of Cloud-Standby-Systems, however, the use of static values appears acceptable.

4.1 Units.

The states for the state graph that represents the basis for further calculations can be directly derived from the different states of the UML state chart (Fig. 1). In that regard, $s_i$ corresponds to the description of the state $i$ from the state space $S = \{s_1, \ldots, s_7\}$. To calculate the quality properties of the system, stopping times $h_s$ must be assigned to each of the states (see Table 2). It is assumed that the step length of the Markov chain is one minute and that the stopping time in a state $s$ is $h_s$ minutes, $\forall s \in S$.

Table 2. Designation of the states from the process steps

Process Step                   Model State
PS Deployment                  $s_1$
PS Runtime                     $s_2$
PS Runtime + RS Update         $s_3$
RS Deployment                  $s_4$
RS Runtime                     $s_5$
RS Runtime + PS Deployment     $s_6$
Outage                         $s_7$

As shown in the definition of the stopping times $h_s$, all times except those of $s_2$ and $s_4$ can be determined directly from the previously set parameters (Table 1). The update interval $t_b$ is part of the configuration and has a major influence on the costs and the availability of the system. The time it takes to start the replica deployment ($t_{d2}$) strongly depends on when the server has last been updated. Consequently, the start time of the replica is increased by a long update interval. Hence, a reduction of the backup interval results in a reduction of the deployment time and, accordingly, the function $t_{d2}(t_b)$ is monotonically increasing. For $s_7$ it is assumed that the time is constant, regardless of the use of a standby system. The runtime of the standby system is therefore made up of the outage time less the replica deployment time ($t_{d2}$) and the time for the return to the production system ($t_r$).

4.2 Markov chain and transition graph.

The quality properties of the standby system can be calculated by modeling the states as a Markov chain and computing a long-term distribution of the stopping time probabilities in the states. Due to the lack of memory of the Markov chain (Markov property), it is not possible to model the stopping times directly. The stopping times must be transferred into recurrence probabilities. These must be designed so that, on average, in $h_s$ of the cases the state is maintained and in one case the state is left. It follows that the total number of possible cases is $h_s + 1$. Thus, the recurrence probabilities have to be calculated as:

$$\lambda_s = \frac{h_s}{h_s + 1} \quad \forall s \in S$$
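For example, a state with a stopping time of $h_s = 60$ minutes receives a recurrence probability of $\lambda_s = 60/61 \approx 0.984$; a chain with this self-transition probability remains in the state for $1/(1-\lambda_s) = 61$ one-minute steps on average.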

In addition to the recurrence probabilities, the probabilities of an outage are required. These are calculated analogously to the recurrence probabilities. On average, normalized to the one-minute iteration step of the Markov chain, one outage should occur within the period of $a_i$ years, $i \in \{1, 2\}$:

$$\varepsilon_i = \frac{1}{a_i \cdot 365 \cdot 24 \cdot 60}, \quad i \in \{1, 2\}$$
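For example, with the availability assumption used in the use case below ($a_i = 10$ years), this yields $\varepsilon_i = 1/(10 \cdot 525{,}600) \approx 1.9 \cdot 10^{-7}$ per one-minute step.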

Fig. 4. States of the standby system as a Markov chain ($M_1$)


Standby System. Considering these probabilities, the Markov chain $M_1$ for the standby system can now be established as shown in Fig. 4.

The transition matrix $P_1$ can be read directly from the Markov chain in Fig. 4:

$$P_1 = \begin{pmatrix}
\lambda_1 & 0 & 1-\lambda_1-\varepsilon_1 & 0 & 0 & \varepsilon_1 & 0 \\
0 & \lambda_2 & 1-\lambda_2-\varepsilon_1 & \varepsilon_1 & 0 & 0 & 0 \\
0 & 1-\lambda_3-\varepsilon_1 & \lambda_3 & \varepsilon_1 & 0 & 0 & 0 \\
0 & 0 & 0 & \lambda_4 & 1-\lambda_4-\varepsilon_2 & 0 & \varepsilon_2 \\
0 & 0 & 0 & 0 & \lambda_5 & 1-\lambda_5-\varepsilon_2 & \varepsilon_2 \\
\varepsilon_2 & 0 & 1-\lambda_6-\varepsilon_2 & 0 & 0 & \lambda_6 & 0 \\
1-\lambda_7 & 0 & 0 & 0 & 0 & 0 & \lambda_7
\end{pmatrix}$$

No-Standby System. As the properties of the standby system should in the end be compared to the original system, the Markov chain $M_2$ and the transition matrix $P_2$ must now be created as a reference for the system without replication. The two chains only differ in the fact that no update is performed, which means $t_b \to \infty$, the stopping times in the update and replica states are equal to zero, and no second provider exists; the probability of an outage $\varepsilon_2$ is therefore 1. In case these parameters are applied to $M_1$, the states $s_5$ and $s_6$ are no longer obtainable. With a probability of 1 the state $s_4$ merges directly with $s_7$ and can thus be combined with $s_7$.

Due to the fact that the update interval is infinite, the recurrence probability of $s_2$ is one³. This also results in a negative transition probability from $s_2$ to $s_3$. However, as the recurrence probability of $s_3$ is zero, this negative transition probability can be resolved by combining the vertices $s_2$ and $s_3$ into $s_2$. Eventually, this results in a new recurrence probability for $s_2$ of $1 - \varepsilon_1$.

The new Markov chain $M_2$ is therefore as shown in Fig. 5.

Fig. 5. States of the no-standby system as a Markov chain ($M_2$)

The transition matrix $P_2$ is created similarly to $P_1$ as a $7 \times 7$ matrix, so that the same algorithms are applicable to both matrices. The transitions to and from the states $s_3$ to $s_6$ have a probability of zero:

$$P_2 = \begin{pmatrix}
\lambda_1 & 1-\lambda_1-\varepsilon_1 & 0 & 0 & 0 & 0 & \varepsilon_1 \\
0 & 1-\varepsilon_1 & 0 & 0 & 0 & 0 & \varepsilon_1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1-\lambda_7 & 0 & 0 & 0 & 0 & 0 & \lambda_7
\end{pmatrix}$$

³ $\lim_{t_b \to \infty} \lambda_2 = \lim_{h_{s_2} \to \infty} \frac{h_{s_2}}{h_{s_2}+1} = 1$

4.3 Long-term distribution.

The stationary distribution $\pi$ of a Markov chain can be calculated in order to obtain the long-term distribution of the system. This distribution $\pi_s$, $s \in S$, states the probability of the system being in the state $s$ at any given step $t \in \mathbb{N}$. With the help of this probability distribution, long-term quality properties such as the costs $\gamma$ and the overall availability $A$ can easily be calculated. The algorithm for determining the stationary distribution is represented in shortened form as follows⁴. In this case, $E$ is the unit matrix and $e$ is the unit vector of rank $n = |S|$:

$$M := (P - E \mid e) = \begin{pmatrix}
p_{1,1}-1 & \cdots & p_{1,n} & 1 \\
\vdots & \ddots & \vdots & \vdots \\
p_{n,1} & \cdots & p_{n,n}-1 & 1
\end{pmatrix}$$

The result of the equation system

$$\pi \cdot M = (0, \ldots, 0, 1)$$

is the stationary distribution $\pi$. This distribution is a vector of which the entry $\pi_s$, $s \in S$, indicates the probability of being in the state $s$ at a given step $t$.
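Numerically, the over-determined but consistent system $\pi M = (0, \ldots, 0, 1)$ can be solved in the least-squares sense. A small NumPy sketch (the 2-state matrix at the end is only a toy example for verification, not one of the chains above):

```python
import numpy as np

def stationary(P):
    """Solve pi * (P - E | e) = (0, ..., 0, 1) for the stationary distribution."""
    n = P.shape[0]
    M = np.hstack([P - np.eye(n), np.ones((n, 1))])  # (P - E | e), shape (n, n+1)
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    # pi @ M = rhs is equivalent to M.T @ pi = rhs (over-determined, consistent).
    pi, *_ = np.linalg.lstsq(M.T, rhs, rcond=None)
    return pi

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(stationary(P))  # [0.8333... 0.1666...]
```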

5. QUALITY METRICS AND DECISION SUPPORT

After defining the stationary distribution $\pi_s$, $s \in S$, the quality properties of costs and availability can be determined.

5.1 Cost.

The costs for provider 1 result from the sum of the costs in the states $s_1$, $s_2$, $s_3$, and $s_6$. The costs for provider 2 incur during the update in $s_3$, in the emergency mode in $s_4$ and $s_5$, and during the recurrence via state $s_6$. The costs for the non-availability of the system incur in the states $s_1$, $s_4$, and $s_7$.

⁴ A detailed description of the calculation of the stationary distribution is given in [6].


$$\gamma(t_b, c_o, n) := c_1 \cdot n \cdot \sum_{s \in \{s_1, s_2, s_3, s_6\}} \pi_s \; + \; c_2 \cdot n \cdot \sum_{s \in \{s_3, s_4, s_5, s_6\}} \pi_s \; + \; c_o \cdot \sum_{s \in \{s_1, s_4, s_7\}} \pi_s$$

5.2 Availability.

The availability results from the sum of the probabilities of the states in which the system is available ($s_2$, $s_3$, $s_5$, $s_6$) or, equivalently, from the complement of the probabilities of the states in which the system is unavailable ($s_1$, $s_4$, $s_7$):

$$A = \sum_{s \in \{s_2, s_3, s_5, s_6\}} \pi_s = 1 - \sum_{s \in \{s_1, s_4, s_7\}} \pi_s$$
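Given the stationary distribution, both metrics reduce to simple index sums. A sketch using the state numbering of Table 2 (state $s_k$ maps to index $k-1$; the cost parameters follow Table 1, and the state groupings are those reconstructed above):

```python
import numpy as np

S1, S2, S3, S4, S5, S6, S7 = range(7)  # indices for states s1..s7
PROVIDER1 = [S1, S2, S3, S6]           # states in which C1 servers run
PROVIDER2 = [S3, S4, S5, S6]           # update, emergency mode, recurrence
DOWN      = [S1, S4, S7]               # states in which the service is down

def cost_rate(pi, c1, c2, c_o, n):
    """Long-term expected costs in EUR per hour for n servers."""
    pi = np.asarray(pi)
    return (c1 * n * pi[PROVIDER1].sum()
            + c2 * n * pi[PROVIDER2].sum()
            + c_o * pi[DOWN].sum())

def availability(pi):
    """Probability mass of the states in which the service is up."""
    return 1.0 - np.asarray(pi)[DOWN].sum()
```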

5.3 Decision support based on the quality metrics

When a decision has to be made whether the standby system should be used in a particular configuration or not, it is useful to compare the quality properties of the different options. Especially during the introduction phase, such a direct comparison between the no-standby and the standby system makes sense.

In many cases, companies cannot accurately predict certain parameters such as the costs of an outage ($c_o$) and can only estimate them within a specific interval. Therefore, it is appropriate to make the quality properties not only dependent on the update interval, but also on other parameters.

Ratio of outage costs to replication interval. To perform a comparison of the total costs in relation to the outage costs and the update interval, the total costs with variable outage costs ($c_o$) and update interval ($t_b$) have to be calculated first. We represent these total costs as:

$$\mathrm{cost}_i(t_b, c_o) := \gamma_i(t_b, c_o, n), \quad i \in \{1, 2\}$$

By using these variable cost calculation functions, the area in which the two systems have the same costs can be determined. This is achieved by intersecting the functions:

$$\mathrm{cost}_1(t_b, c_o) \cap \mathrm{cost}_2(t_b, c_o)$$

This function facilitates the consideration of limit values. In this case, the limits for the update interval are the value for continuous updates and an update interval tending to infinity. Due to the cost structure of the Cloud providers (billing period of an hour), continuous replication is to be equated with a replication interval of 1 hour or 60 minutes:

$$\mathrm{cost}_{max} := \lim_{t_b \to 60} \mathrm{cost}_1(t_b, c_o), \qquad \mathrm{cost}_{min} := \lim_{t_b \to \infty} \mathrm{cost}_1(t_b, c_o)$$

Outside of the interval $[\mathrm{cost}_{min}, \mathrm{cost}_{max}]$, a Cloud-Standby replication such as the one described in this work does not make sense. Should the costs fall below $\mathrm{cost}_{min}$, the no-standby system is always cheaper, and should the update interval be 60 minutes, two systems are operated in parallel. In this case, there would be a direct transition to a hot standby approach because it guarantees an even higher availability.

Ratio of availability to the replication interval. To establish a ratio between availability and replication interval, the availability is represented as a function dependent on $t_b$:

$$A_i: t_b \mapsto A_i(t_b), \quad i \in \{1, 2\}$$

This ratio allows a determination of the interval within which the standby system can ensure availability:

$$A_{max} := \lim_{t_b \to 60} A_1(t_b), \qquad A_{min} := \lim_{t_b \to \infty} A_1(t_b)$$

The availability of the system without replication is independent of $t_b$:

$$A_2(t_b) = A_2$$

As the availability function is convex, $A_{min} \le A_1(t_b) \le A_{max}$ always applies. Furthermore, it also applies:

$$A_{min} > A_2$$

This connection, which is surprising at first glance, can be explained by the fact that in case of an error the no-standby system will directly change to the state $s_7$, while in case of a replication the outage time can be bridged by using Cloud provider 2. Only in the case of $\varepsilon_1 = 0$ does equality apply; i.e., for $\varepsilon_1 > 0$ and any $t_b \ge 60$:

$$A_1(t_b) > A_2$$

Page 62:  · International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013 i IJCC Editorial Board Editors-in-Chief Hemant Jain, University of Wisconsin–Milwa

International Journal of Cloud Computing (ISSN 2326-7550) Vol. 1, No. 2, October-December 2013

56

Thus, it can be concluded that from an availability point of view a standby system should definitely be used whenever the outage time is greater than 0, even if a very large update interval is chosen.

Determining the cost neutral update interval. In order to decide on the length of the replication interval, it makes sense to compare the systems on a cost basis. It is assumed that the outage costs can be quantified. In order to perform a cost comparison, the two total cost functions are set up:

$cost_j(t) := cost_j(t, c_{out}), \quad j \in \{1, 2\}$

The maximum and minimum costs for the standby system can easily be determined by considering the limit values:

$cost_{1,max} := \lim_{t \to 60} cost_1(t)$

$cost_{1,min} := \lim_{t \to \infty} cost_1(t)$

The cost neutral update interval $t_{neutral}$ can be determined by the intersection of the two cost functions:

$t_{neutral} : cost_1(t) \cap cost_2(t)$
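A minimal Java sketch of this intersection search via bisection. The closed form of cost1 below is a stand-in chosen only to reproduce the limit values reported in the use case (99,772.07 euro/year at t = 60 and 59,650.34 euro/year for t tending to infinity), and the constant cost2 is picked illustratively so that the root lands near the paper's example; the real functions follow from the Markov model:

import java.util.function.DoubleUnaryOperator;

/** Minimal sketch: finds the cost-neutral update interval where
 *  cost1(t) = cost2 by bisection over t in minutes. */
public class CostNeutralInterval {

    static double bisect(DoubleUnaryOperator f, double lo, double hi) {
        // assumes f changes sign on [lo, hi]: the standby system starts
        // more expensive at t = 60 and becomes cheaper as t grows
        for (int i = 0; i < 100; i++) {
            double mid = (lo + hi) / 2.0;
            if (f.applyAsDouble(lo) * f.applyAsDouble(mid) <= 0) hi = mid; else lo = mid;
        }
        return (lo + hi) / 2.0;
    }

    public static void main(String[] args) {
        // stand-in: decays from 99,772.07 at t = 60 towards 59,650.34 as t grows
        DoubleUnaryOperator cost1 = t -> 59650.34 + 40121.73 * 60.0 / t;
        double cost2 = 60902.0;                  // illustrative no-standby yearly costs
        double tNeutral = bisect(t -> cost1.applyAsDouble(t) - cost2, 60.0, 1.0e7);
        System.out.printf("cost-neutral update interval: %.2f min%n", tNeutral);
    }
}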

6. USE CASE

In section 3 we motivated that a quality model is needed to evaluate whether a Cloud-Standby-System is useful in a given use case. In this chapter we evaluate the model by applying it to a concrete use case. We demonstrate how the quality model can be applied to a deployment of 10 servers with given or experimentally determined metrics (see Table 3). It is further illustrated how the administrator of the application can be supported in the decision whether to use Cloud Standby or not.

Table 3. Input Parameter Assumptions (see Tables 1-3)

Variable   Value
           60 min.
           30 min.
           1440 min.
           10
           0.68 €/h/server5
           0.68 €/h/server
           10 years
           10 years

5 "Extra Large" Amazon EC2 instance in the availability zone EU-West or a performance-wise comparable instance from another vendor [5]

For the calculation of the quality properties, it is necessary to determine the time for the deployment of the replica ($t_D$). As this time depends on the update frequency, it must be modeled as a function. We assume that 50% of the deployment process is fixed and 50% may be affected by the update interval. The strictly monotonically increasing function should have its lowest point at an update interval of 60 and approach the time of the initial deployment ($t_I$) as limit at infinity (see Fig. 6):

$t_D(t) := t_I \cdot \left(1 - 0.5 \cdot \frac{60}{t}\right), \quad t \in [60, \infty)$

This function will in our future work be determined by interpolation of data points from real experiments.

Fig. 6. RS deployment time
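A minimal Java sketch of this function under the reconstruction above; the initial deployment time tI = 60 min is an assumed reading of Table 3:

/** Minimal sketch of the replica deployment time function: 50% of the
 *  initial deployment time tI is fixed, the remainder shrinks towards
 *  continuous replication (t = 60 min). */
public class DeploymentTime {

    static double tD(double t, double tI) {
        if (t < 60) throw new IllegalArgumentException("t must be >= 60 min");
        return tI * (1.0 - 0.5 * 60.0 / t);      // tD(60) = 0.5 * tI; tD tends to tI as t grows
    }

    public static void main(String[] args) {
        double tI = 60.0;                        // assumed initial deployment time (Table 3)
        System.out.println(tD(60, tI));          // 30.0
        System.out.println(tD(1923, tI));        // ~59.1, close to the limit tI
    }
}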

6.1 Ratio of outage costs to the replication interval.

With the help of the stationary distributions (see Section 4.3) and the costs in Table 5, the cost functions can now be defined depending on $t$ and $c_{out}$ using the formula $cost_j(t, c_{out})$, with $j = 1$ (Cloud Standby System) and $j = 2$ (No-Standby System).

Representing the two functions in a graph (Fig. 7) reveals combinations in which one of the two systems has the lower function values (total costs). The intersection of the functions establishes a curve on which both systems have the same level of costs. This function is represented in Fig. 8. Besides the combinations leading to the same costs (grey line), the combinations in which the standby system is monetarily inferior to the normal system (grey area) as well as those in which the standby system is cheaper (white area) can be identified.


Fig. 7. Comparison of the total costs $cost_1$ (colored area) and $cost_2$ (grey area) at variable $t$ and $c_{out}$

The limits of the function result in the interval in which a Cloud-standby approach makes sense on the basis of total costs:

$cost_{min} = \lim_{t \to \infty} I(t, c_{out}) = 6.79\ \text{€/h}$

$cost_{max} = \lim_{t \to 60} I(t, c_{out}) = 8198.79\ \text{€/h}$

If, given the assumed values for server costs, outage durations, etc., the costs of an outage exceed 8198.79 € per hour, a standby system should be deployed in any case. However, such high outage costs suggest a hot-standby approach, as two systems can then be operated in parallel without cost concerns. Given the above-mentioned assumptions, the use of a standby system does not make sense when the outage costs are less than 6.79 € per hour. In this case, no matter how large the replication interval is selected, the use of a simple, unsecured system makes more sense from a cost perspective (but not in terms of availability).

6.2 Ratio of availability to the replication interval.

Applying the values from Table 5, the availability functions $a_1$ and $a_2$ can be calculated depending on $t$. The overall availability of the system increases noticeably by introducing the standby system. The limits of the function $a_1(t)$ and the value of $a_2$ are:

$a_{min} = \lim_{t \to \infty} a_1(t) = 0.9999883$

$a_{max} = \lim_{t \to 60} a_1(t) = 0.9999940$

$a_2 = 0.999988201$

Fig. 8. Combinations of $t$ and $c_{out}$ in which the standby system is more expensive (grey area), costs the same (grey line), or is cheaper (white area).

Since an outage time greater than 0 was assumed, it always makes sense in terms of availability to use the standby system, as already presumed.

6.3 Determining the cost neutral update interval.

Now the cost neutral update interval has to be determined, i.e. the update interval at which the no-standby system and the standby system produce the same costs. For this purpose it is exemplarily assumed that the outage costs have been determined as $c_{out} = 400\ \text{€/h}$. With the help of these outage costs, the new cost functions can be set up:

$cost_j(t) := cost_j(t, 400), \quad j \in \{1, 2\}$

Consideration of the limit values easily yields the minimal and maximal costs:

$cost_{1,min} = \lim_{t \to \infty} cost_1(t) = 59650.34\ \text{€/year}$

$cost_{1,max} = \lim_{t \to 60} cost_1(t) = 99772.07\ \text{€/year}$

The costs for the use of the system without replication can be calculated with the function $cost_2(t)$. These costs are independent of $t$ and thus constant. It is evident that the costs $cost_1(t)$ are reduced with an increasing update interval and at some point intersect with $cost_2(t)$. By solving the equation

$cost_1(t) = cost_2(t)$



for $t$, the update interval that can be selected without additional monetary expenses can be determined: $t_{neutral} = 1923.03$ min.

Considering the outage costs, the system assumed in the example can be made more available without higher costs at an update interval of 1923 minutes, which is a bit less frequent than a daily update (one update every 1.33 days). The following change in the availability arises from this: $a_1(1923) - a_2 = 0.000274$. Over 10 years this corresponds to $0.000274 \cdot 10 \cdot 525{,}600 \approx 1440$ minutes, i.e. the system in the given use case is one day more available within 10 years, and consequently the availability class will rise from 3 to 4 at the same costs6.

7. CONCLUSION

In this work we presented a novel approach for warm standby in the Cloud. Our Cloud Standby approach replicates the modeled primary system periodically to another Cloud provider. The quality attributes of this new Cloud Standby System are formalized by a novel Markov-chain-based approach. We showed that this formal model can be used to calculate the availability and long-term costs of a Cloud Standby System. It was also shown that a Cloud Standby System has an advantage over a no-standby system in terms of availability even if the replication is never actually performed, and it was demonstrated how the model can be used to configure a Cloud Standby System. Since a Cloud Standby System was shown to provide higher availability by design, future work is to develop a reference architecture for this kind of system. Future work might also concentrate on the support of the different roles when formalizing the distributed system, and on the introduction of more dynamic parameters regarding provider costs, outage costs, etc. into the model presented herein.

8. REFERENCES

Alhazmi, O. H., & Malaiya, Y. K. (2012). Assessing Disaster Recovery Alternatives: On-site, Colocation or Cloud. In Software Reliability Engineering Workshops (ISSREW), 2012 IEEE 23rd International Symposium on (pp. 19–20). IEEE.

AWS Inc. (2013). Amazon Web Services, Cloud Computing: Compute, Storage, Database. Retrieved May 30, 2013, from https://aws.amazon.com/

Cully, B., Lefebvre, G., Meyer, D., Feeley, M., Hutchinson, N., & Warfield, A. (2008). Remus: High Availability Via Asynchronous Virtual Machine Replication. In Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation (pp. 161–174).

Dantas, J., Matos, R., Araujo, J., & Maciel, P. (2012). An availability model for eucalyptus platform: An analysis of warm-standby replication mechanism. In Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on (pp. 1664–1669). Retrieved from http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6377976

6 The introduction of the Cloud-Standby-System may, however, introduce other costs that are not included herein but are subject to future work.

Gilks, W. R., Richardson, S., & Spiegelhalter, D. J. (1996). Markov chain Monte Carlo in practice (Vol. 2). CRC press.

Henderson, C. (2008). Building scalable web sites. O'Reilly.

Hiles, A. (2010). The definitive handbook of business continuity management. Wiley.

Klems, M., Tai, S., Shwartz, L., & Grabarnik, G. (2010). Automating the delivery of IT Service Continuity Management through cloud service orchestration. In Network Operations and Management Symposium (NOMS), 2010 IEEE (pp. 65–72).

Konstantinou, A. V., Eilam, T., Kalantar, M., Totok, A. A., Arnold, W., & Snible, E. (2009). An architecture for virtual solution composition and deployment in infrastructure clouds. In Proceedings of the 3rd international workshop on Virtualization technologies in distributed computing (pp. 9–18).

Kurze, T., Klems, M., Bermbach, D., Lenk, A., Tai, S., & Kunze, M. (2011). Cloud federation (pp. 32–38). Presented at CLOUD COMPUTING 2011, The Second International Conference on Cloud Computing, GRIDs, and Virtualization.

Lenk, A., Klems, M., Nimis, J., Tai, S., & Sandholm, T. (2009). What’s inside the Cloud? An architectural map of the Cloud landscape. In Software Engineering Challenges of Cloud Computing, 2009. CLOUD’09. ICSE Workshop on (pp. 23–31).

Lenk, Alexander, & Pallas, F. (2013). Modeling Quality Attributes of Cloud-Standby-Systems. In Service-Oriented and Cloud Computing (pp. 49–63). Springer. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-40651-5_5

Maximilien, E. M., Ranabahu, A., Engehausen, R., & Anderson, L. C. (2009). Toward cloud-agnostic middlewares. In Proceeding of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications - OOPSLA ’09 (p. 619). Orlando, Florida, USA. doi:10.1145/1639950.1639957

Mell, P., & Grance, T. (2011). The NIST definition of cloud computing. NIST special publication, 800, 145.

Mietzner, R., Unger, T., & Leymann, F. (2009). Cafe: A generic configurable customizable composite cloud application framework. On the Move to Meaningful Internet Systems: OTM 2009, 357–364.

Object Management Group, Inc. (OMG). (2011). Unified Modeling Language (UML), Superstructure Specification Version 2.4.1. Retrieved from http://www.omg.org/spec/UML/2.4.1/Superstructure/PDF

Rajagopalan, S., Cully, B., O'Connor, R., & Warfield, A. (2012). SecondSite: disaster tolerance as a service. In Proceedings of the 8th ACM SIGPLAN/SIGOPS conference on Virtual Execution Environments (pp. 97–108). Retrieved from http://dl.acm.org/citation.cfm?id=2151039

Schmidt, K. (2006). High availability and disaster recovery. Springer.

Symantec. (2011). 2011 SMB Disaster Preparedness Survey - Global Results. Retrieved from http://www.symantec.com/content/en/us/about/media/pdfs/symc_2011_SMB_DP_Survey_Report_Global.pdf?om_ext_cid=biz_socmed_twitter_facebook_marketwire_linkedin_2011Jan_worldwide_dpsurvey

Tanenbaum, A. S., & Van Steen, M. (2002). Distributed systems (Vol. 2). Prentice Hall.

Thulasiraman, K., & Swamy, M. N. (2011). Graphs: theory and algorithms. Wiley-Interscience.

Trieu Chieu, Karve, A., Mohindra, A., & Segal, A. (2010). Simplifying solution deployment on a Cloud through composite appliances. In Parallel & Distributed Processing, Workshops and Phd Forum (IPDPSW), 2010 IEEE International Symposium on (pp. 1–5). doi:10.1109/IPDPSW.2010.5470721


Wittern, E., Kuhlenkamp, J., & Menzel, M. (2012). Cloud service selection based on variability modeling. In Service-Oriented Computing (pp. 127–141). Springer.

Wood, T., Lagar-Cavilla, H. A., Ramakrishnan, K., Shenoy, P., & Van der Merwe, J. (2011). PipeCloud: using causality to overcome speed-of-light delays in cloud-based disaster recovery. In Proceedings of the 2nd ACM Symposium on Cloud Computing (p. 17).

Wood, Timothy, Cecchet, E., Ramakrishnan, K. K., Shenoy, P., Van der Merwe, J., & Venkataramani, A. (2010). Disaster recovery as a cloud service: Economic benefits & deployment challenges. In 2nd USENIX Workshop on Hot Topics in Cloud Computing. Retrieved from http://www.usenix.org/event/hotcloud10/tech/full_papers/Wood.pdf

Authors

Alexander Lenk is department manager at the FZI Research Center for Information Technology (Berlin office) and researcher at the Karlsruhe Institute of Technology. He has been focusing on Cloud Computing since 2008, with his main research interests in disaster recovery, deployment description, and distributed systems.

Frank Pallas is a postdoc researcher at the FZI Research Center for Information Technology (Berlin office) as well as at the Karlsruhe Institute of Technology. His main research areas include the manageability and governability of complex systems like Cloud Computing and Smart Grids. Furthermore, he holds a professorship for data protection and information economics at the TU Berlin.


EFFICIENT PRIVATE CLOUD OPERATION USING PROACTIVE MANAGEMENT SERVICE

Dapeng Dong and John Herbert
Mobile and Internet Systems Laboratory
University College Cork
{d.dong,j.herbert}@cs.ucc.ie

Abstract

Operation management for a private cloud infrastructure faces many challenges including efficient resource allocation, load-balancing, and quick response to real-time workload changes. Traditional manual IT operation management is inadequate for this highly dynamic and complex environment. This work presents a distributed service architecture which is designed to provide an automated, shared, and off-site operation management service for private clouds. The service architecture incorporates important concepts such as: Metric Templates for minimising the network overhead for transmission of cloud metrics; a Cloud Projection that provides a global view of the current status and structure of the cloud, supporting optimal decision making; and a Calendar-based Data Storage Model to reduce the storage required for cloud metric data and increase analysis performance. A proactive response to cloud events is generated based on statistical analysis of historical metrics and predicted usage. The architecture, functional components, and operation management strategies are described. A prototype implementation of the proposed architecture was deployed as a service on the IBM SmartCloud. The effectiveness and usability of the proposed proactive operation management solution has been comprehensively evaluated using a simulated private cloud with dynamic and real-world workloads.

Keywords: Architecture, Cloud, Operation Management

1. INTRODUCTION

Cloud computing introduces a new computing paradigm to IT organizations. The cloud deployment of services is maturing apace, and market momentum seems to make the widespread adoption of cloud computing inevitable. At the same time, the use of a public cloud raises concerns such as security, privacy, data confidentiality, infrastructure control, and vendor lock-in, as discussed, for example, in (Josyula, Orr, & Page, 2012), (Finn, Vredevoort, Lownds, & Flynn, 2012), and (Gartner, 2012). In this context, private and hybrid clouds become important alternatives for many organizations.

Acquiring a private or hybrid cloud brings IT management responsibilities back to the IT organizations. In particular, cloud operation management is different from traditional IT operation management. The new cloud concepts, such as: asynchronous architecture, virtualization, Virtual Machine migration, and resource fabric, etc., require IT personnel to gain new knowledge and skills in order to efficiently manage the cloud infrastructure. Most cloud vendors provide private cloud operation management suites (Finn, Vredevoort, Lownds, & Flynn, 2012), (VMware, 2011); these are essentially a set of tools given to IT personnel to ease operation management processes. Faced with the problem of optimal placement of several hundreds of Virtual Machines (VMs) and the need to respond to thousands of randomly occurring system events, it is easy to conclude that the reactive management approach is no longer suitable for cloud management. As well as

management complexity, tools for managing cloud infrastructure are often available only to those with large IT budgets. An off-site and multi-tenant operation management service can lower this cost as well as the costs of operations and facilities.

To better respond to business demands on IT resources, the term Proactive Management has been stressed by many industrial cloud management solution pioneers (Williams & Wolfe, 2011) (Netuitive, 2012) (CA Technologies, 2012). Proactive Management, in essence, deals with the management life-cycle of information collection, event detection and analysis, and response. Consideration must also be given to aspects such as the cost of transmission of metric data to the management service components, synchronization between cloud and management service components, metric data storage, anomaly detection and analysis, resource management, and appropriate timely event response. These challenges and their solutions characterize the proposed architecture and differentiate this work from others.

A prototype implementation of the proactive operation management service was deployed on the IBM SmartCloud, and a simulated private cloud was connected to this service. A set of real-world workloads were given to each simulated entity of the simulated private cloud. The important aspects of the proposed architecture were evaluated in terms of communication cost, Cloud Projection transmission cost, and the effectiveness of the Calendar-based Data Storage Model, as well as the mathematical analysis engine. The


evaluation demonstrated the effectiveness and usability of the architecture.

The remainder of the paper is organized as follows. Section 2 presents and discusses the proposed architecture. Section 3 evaluates a prototype implementation of the architecture. A discussion of related work follows in section 4, and the final section concludes this work and points out directions for further research.

2. ARCHITECTURE OVERVIEW

The proposed architecture (Figure 1) has two high-level components: a Service Delegator and an Operation Management Service.

2.1 THE SERVICE DELEGATOR.

The Service Delegator acts as middleware between a private cloud and the cloud Operation Management Service (OMS). A key design consideration is that the Service Delegator must not provide any publicly accessible endpoint. Network traffic between the Service Delegator and the OMS can be bidirectional, but a communication session can only be initiated from the Service Delegator, to ensure security. To satisfy this design goal, the components of the Service Delegator actively and periodically check with the management services. Each component of the Service Delegator is a self-contained program; the components can also be gathered together and provided as a VM image.
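A minimal Java sketch of this pull-only pattern; all names are illustrative, and the actual transport between Service Delegator and OMS is not prescribed here:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Minimal sketch: the Service Delegator never listens on a public
 *  endpoint; it initiates every session itself by polling the OMS on
 *  a fixed schedule. */
public class SuggestionPoller {

    public static void main(String[] args) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // poll well below the MTPI so that no Suggestion is missed or reordered
        timer.scheduleAtFixedRate(SuggestionPoller::pollOnce, 0, 10, TimeUnit.SECONDS);
    }

    static void pollOnce() {
        // an outbound-only request to the OMS Suggestion queue would go here;
        // any returned Suggestions are handed to the Action Manager
    }
}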

Essential for the operation of this architecture are the pre-deployed metric monitors on each VM and hypervisor, and a cloud model. Once the Service Delegator is in place, the metric monitors will periodically emit measured metrics to a central point -- the Metric Publisher. The Metric Publisher acts as a hub; its main role is to keep a subscribed private cloud and the OMS synchronized. Cloud metrics sent from metric monitors are often raw data and usually contain large amounts of redundant and useless information.

In order to minimize the impact on the local (private cloud) network of sending metric data to the management service, the Metric Publisher filters the received cloud metrics and stores only the necessary information in the Cloud Model Storage. The Cloud Model Storage is essentially an object store held in memory by the Metric Publisher component, rather than an independent persistent storage. Its structure is shown in Figure 3 (detailed in Section 2.2).

In order to keep the two parties (private cloud and OMS) synchronized, the Metric Publisher periodically sends the cached cloud metric data from the Cloud Model Storage to the OMS (the Cloud Modeler component). Firstly, the Metric Publisher extracts information from the Cloud Model Storage to fill Metric Templates. A Metric Template is essentially a compact data structure which contains a set of ID tags of cloud entities (servers and VMs); each ID tag is associated with a series of floating point numbers (metrics), and the sequence of the metrics is known to both the Service Delegator and the OMS. The number of metrics of interest and the sequence of the metrics are defined by the Metric Template meta-data. The Metric Template meta-data also contains other auxiliary information, including the compression scheme, the Metric Template Publishing Interval (MTPI), etc., that keeps the Service Delegator and the OMS synchronized. It is the responsibility of the OMS to generate the Metric Template meta-data and to implement it through interaction with the Service Delegator. The Metric Template meta-data is also used to control the subscription service level (such as bronze, silver, and gold) by manipulating the number of metrics of interest, the MTPI, etc.

There are four types of Metric Template defined in the prototype implementation:

1. Metric Template for Server Configuration (MTSC), which contains a list of physical servers and their configuration information (e.g., installed memory size, local storage size, CPU architecture, speed, server model, and manufacturer; the server model and manufacturer information are especially important for the Analyzer component of the OMS to determine the power consumption model of a server, described in Section 2.2), current status, and a Server_ID.

Figure 1. The Proactive Operation Management Architecture Component Diagram

2. Metric Template for VM Configuration (MTVC) which contains a list of VMs and their configuration information, current status, a VM/Server_ID, and a Service_ID.

3. Metric Template for VM Utilization (MTVU) which contains a list of VMs with current utilization status of each VM component (e.g., current memory, CPU, and storage utilization readings, etc.), current status, and a VM/Server_ID.

4. Metric Template for Server Utilization (MTSU) which contains a list of servers with I/O related information, such as memory read/write throughputs, storage read/write throughputs, and a Server_ID.

Within each Metric Template, entities (VMs/servers) and elements (metrics) are separated by selected delimiters. After filling a Metric Template, the Metric Publisher compresses it; prefixes a message-type tag, a time stamp, and a Subscriber_ID to the compressed Metric Template; then encapsulates everything into a message using Base64 encoding and sends it to the management service Exchange.
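A minimal Java sketch of this pipeline; the delimiter layout and field order are illustrative assumptions, while the compression and Base64 steps mirror the description above:

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;

/** Minimal sketch of the Metric Publisher's encoding step: compress a
 *  delimiter-separated Metric Template, then prefix the message-type
 *  tag, time stamp, and Subscriber_ID, and Base64-encode the payload. */
public class MetricTemplateEncoder {

    static String encode(String messageType, String subscriberId, String template) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream zip = new DeflaterOutputStream(buf)) {
            zip.write(template.getBytes(StandardCharsets.UTF_8));   // compress the template
        }
        String payload = Base64.getEncoder().encodeToString(buf.toByteArray());
        // message-type tag | time stamp | Subscriber_ID | compressed template
        return messageType + "|" + System.currentTimeMillis() + "|" + subscriberId + "|" + payload;
    }

    public static void main(String[] args) throws Exception {
        String mtvu = "vm_1:0.42,0.37,0.81;vm_2:0.11,0.09,0.10";     // hypothetical MTVU body
        System.out.println(encode("MTVU", "sub_42", mtvu));
    }
}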

The Metric Publisher publishes Metric Templates at a regular time interval -- the Metric Template Publishing Interval (MTPI). For the purpose of bandwidth conservation, and because configuration information rarely changes, the MTPI for MTSC and MTVC templates is set to be much longer than the one for MTVU and MTSU templates. Notice that metric monitors may emit their measurements at different points in time; therefore, within an MTPI, a Metric Template may not be complete. For instance, an MTVU may not contain all active VMs in the cloud, and any missing datum (e.g., the CPU utilization of a VM listed in the Metric Template) is indicated by a special character in the Metric Template. The Request Publisher performs a similar function but deals with customized requests, such as requests for suggestions for a new VM placement; these customized requests are sent immediately. In this work, the MTPI is a fixed time interval. Ideally it would be dynamically adjusted by the activity level of the private cloud, but this is primarily limited by the statistical-analysis-based optimization engine and will be investigated further in future work.

Figure 2. The Proactive Operation Management Architecture Communication Diagram

Figure 3. Cloud Model Hierarchy


The Suggestion Subscriber component actively and periodically checks with the management service provider whether there is any information available. The frequency of checking the Suggestion queues shall be much higher than the MTPI, to avoid missed and/or disordered Suggestions. It only receives Suggestions. Suggestions are encapsulated in the payload of the subscribed messages in JSON (JavaScript Object Notation) format, so that received Suggestions can be embedded into Action Templates directly (or with minimal changes). The code segment in Listing 1 shows a fragment of a Suggestion for migrating a VM to another host (different actions are associated with different sets of pre-defined attributes; furthermore, each action is associated with a list of reasons which are used to identify the causes of the action). In order to achieve automation in the operation management life-cycle, an Action Manager component is also provided. It contains a set of Action Templates which are written in RESTful (Representational State Transfer) style. Upon receiving a Suggestion, the Suggestion Subscriber will first check the validity of the Suggestion (using the "reason" field). If the Suggestion is still valid, it is passed on to the Action Manager, which uses the Suggestion to complete a corresponding Action Template and carries out the suggested actions in the private cloud. Otherwise, the Suggestion will be discarded.

Listing 1. Example of Suggestion for VM Migration

2.2 THE OPERATION MANAGEMENT SERVICE.

The Operation Management Service (OMS) is provided

as a multi-tenant service. The Service Engine (Figure 1) is the core of the OMS, and it is supported by a sophisticated Mathematical Analysis Engine.

The Service Engine. The Service Engine of the OMS receives requests and cloud metric data from subscribers through Exchange (Input). The Exchange (Input) module acts as a common communication interface among subscribers. It essentially is a queuing system which buffers incoming messages (Metric Templates and Requests). Messages are directly consumed by the Event Monitor. The Event Monitor decodes messages, filters expired and disordered messages based on the time stamp and

Subscriber_ID, then dispatches decoded messages to the designated event-group queue according to the message type. The Event Monitor defines three groups of events by default (Figure 2): CM (Cloud Modeling), DM (Data Modeling), and Req (Requests); each group is called a "Topic", and Topics are sent to corresponding topic-specific queues. In addition, the Event Monitor is also responsible for dispatching control events and scheduled events (e.g., triggering the data modeling process, or generating Suggestions for consolidation of VMs). Behind each topic queue, four compulsory modules (Cloud Modeler, Optimizer, Data Modeler, and Analyzer) are built into the architecture. They are functionally independent.

The Cloud Modeler builds a cloud model for each subscribed private cloud. In order to make correct and accurate decisions on cloud operations, such as resource provisioning, VM placement, and consolidation of VMs, a global view of a subscriber (private cloud) is absolutely necessary. The Cloud Modeler organizes cloud objects in a hierarchical way, and cloud objects are stored in the Cloud Model Storage (the same concept as defined in Section 2.1). There are four levels (Cloud, Server, VM, and Component) in the hierarchy, illustrated in Figure 3. Ideally, a full cloud model would be built at the beginning of a service subscription. In a real industrial deployment, however, private clouds may already be up and running, and it is often hard to obtain all information about a private cloud at once. For these reasons, a cloud model can be built gradually. In other words, a private cloud needs to be connected to the OMS for a certain period of time (mainly driven by the size of the private cloud) to ensure that the cloud model is consistent with the actual private cloud. The consistency level is based on the number of occurrences of Server creation processes in the cloud model, and is influenced by the MTPI parameter.

The Cloud object is created at the service registration phase. The Server, VM, and Component objects are created upon receiving MTSCs and MTVCs, respectively. If a server/VM has already been created in the cloud model, the received data is used for updating purposes. For example, if a received MTSC contains a server ID (svr_1) and the same server ID cannot be found in the current Cloud Model Storage, then the Cloud Modeler will create a Server object with a server ID equal to svr_1, and the configuration information of the newly created server is given by svr_1 and its associated metrics in the corresponding MTSC. Upon receiving MTSUs or MTVUs, the utilization information of servers/VMs is temporarily stored in the Utilization_Cache (Figure 3). The Utilization_Cache is a FIFO (First In First Out) queue; it is used to cache a certain length (a day) of utilization history, which will be used by the Optimizer and the Data Modeler. If a server/VM listed in an MTSU/MTVU does not exist in the current cloud model, it is ignored, because MTSUs and MTVUs do not contain any server/VM configuration information, and creating host/VM objects without configuration information is

……
{
  "migration": {
    "server_id": "vm_id",
    "host": "Host_B",
    "block_migration": false,
    "disk_over_commit": false,
    "reason": "server_over_utilized"
  }
}
……


meaningless in the cloud model. This can be remedied by subsequently received MTSCs/MTVCs. If both the MTVC and the MTVU for a VM have not been received for a certain period of time, the VM is considered to be in sleep mode and will eventually be removed. The cloud model is used directly by the Optimizer.

The Optimizer is event-driven: it is triggered upon receiving Requests, MTSUs, or MTVUs. The Optimizer is tightly coupled with the Cloud Modeler. At the beginning of the service subscription, a dedicated Optimizer is assigned to a subscriber (in fact, it is assigned to the cloud model built specifically for that subscriber). The Optimizer and the Cloud Modeler run in the same program process but in separate threads, and listen on their own topic-specific queues. The number of types of cloud events defines the scope of the Optimizer; example cloud events are server/VM over/under-utilization and server/VM crashes. A proactive response to these cloud events is generated based on statistical analysis of the historical metrics cached in the Utilization_Cache (Figure 3) and guided by policies. Various constraints are defined in the Policy, including VM affinity, thresholds for triggering load-balancing events, etc. For instance, given a load-balance policy of triggering the load-balance process when the CPU utilization of a VM reaches 85% of its full capacity for at least 10 minutes, and assuming the MTPI for MTVU is two minutes, then upon receiving an MTVU the Optimizer will check each VM in the Cloud Model Storage as to whether it has been over-utilized for at least 10 minutes, by seeking at least five consecutive CPU utilization readings above 85% in the CPU Utilization_Cache of the VM. If a match is found, a load-balance response is generated for that VM. The generated responses are called Suggestions. A Suggestion consists of three parts: Action, Attributes, and Reasons, as described in Section 2.1. There are six fundamental Actions defined in the current implementation: VM Creation/Deletion, VM Suspension/Resumption, VM Load-balancing, and VM Migration. The core algorithms for these actions are VM placement algorithms, covering both new VM placement and VM replacement. Suggestions are formatted in JSON style, Base64-encoded, and sent to the Exchange (Output). The Exchange (Output) routes Suggestions based on the Subscriber_ID (each subscriber has dedicated Suggestion queues).
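A minimal Java sketch of this check, assuming an MTPI of two minutes and the 85%/10-minute policy from the example; class and method names are illustrative:

import java.util.ArrayDeque;
import java.util.Deque;

/** Minimal sketch of the Optimizer's over-utilization test: with an
 *  MTPI of two minutes, five consecutive CPU readings above 85% mean
 *  the VM has been over-utilized for at least 10 minutes, so a
 *  load-balance Suggestion should be generated. */
public class OverUtilizationCheck {

    static final double THRESHOLD = 0.85;
    static final int WINDOW = 5;                 // 5 readings x 2 min MTPI = 10 min

    static boolean overUtilized(Deque<Double> cpuCache) {
        if (cpuCache.size() < WINDOW) return false;
        int run = 0;
        for (double u : cpuCache) {              // scan the FIFO utilization cache
            if (u > THRESHOLD) { if (++run >= WINDOW) return true; }
            else run = 0;                        // a single dip resets the streak
        }
        return false;
    }

    public static void main(String[] args) {
        Deque<Double> cache = new ArrayDeque<>();
        for (double u : new double[] { 0.90, 0.88, 0.91, 0.87, 0.93 }) cache.addFirst(u);
        if (overUtilized(cache)) System.out.println("emit load-balance Suggestion");
    }
}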

The Data Modeler builds resource usage models for each VM and service. Data models are stored and organized in a Calendar-based Storage Model (CBSM). In simple terms, the CBSM provides storage for program objects. Objects (data models) stored in the CBSM are indexed by calendar date, so that data models can be associated with calendar events (such as weekends and public holidays). There are three main reasons for storing resource usage data models rather than the original data. The first reason is to reduce the storage required for cloud metric data: the OMS continuously receives cloud metrics from subscribers, and storing this accumulated data has serious cost implications.

The Data Modeler builds resource usage models for VMs/services on a daily, weekly, monthly, seasonal, and yearly basis. Data models are in fact program objects, specifically generic Java objects. Because there are many choices for modeling data, data models are cast to generic objects and tagged (identifying the modeling technique used), then stored in the CBSM. The compressed data model objects are much smaller than the compressed original data (as discussed in Section 3). The second reason is to improve the performance of analysis through model reuse. Modeling data is often a CPU-intensive and time-consuming process; using pre-built data models can significantly boost the performance of the Analyzer component. The third reason is more accurate decision-making (e.g., more accurate forecast results for resource provisioning). Because usage models are indexed by calendar date, the corresponding usage models can be selected depending on the current date when forecasting (e.g., a daily usage model built on a Sunday may not be useful when forecasting for a Monday).

The schemes used for the organization of the data models in the CBSM in the current implementation are as follows. 1) Utilization data models for each VM are built daily. 2) Utilization data models for each service are built weekly, monthly, seasonally, and yearly. 3) Generalized utilization data models are built for each VM in such a way that the daily data models of each VM within a season (to minimize seasonal effects on a generalized model) are collected together to build a generalized model for each week day (Monday to Friday) and for weekend days (Saturday and Sunday), except for special days such as public holidays. For instance, the total of twelve daily data models for Mondays in the first season of 2013 are grouped together to build a generalized data model for Monday; the same procedure applies to the other week days, as well as to weekend days. After building the generalized data models, the individual daily models within the season are removed from CBSM storage permanently. 4) Raw data is purged monthly.
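A minimal Java sketch of one possible calendar-based key for stored model objects; the key layout (weekday class, season, year) is an illustrative assumption, not the paper's schema:

import java.io.Serializable;
import java.time.DayOfWeek;
import java.time.LocalDate;

/** Minimal sketch: a calendar-based index key so that, e.g., a Sunday
 *  model is never selected when forecasting for a Monday. */
public class CbsmKey implements Serializable {

    final String subscriberId, entityId;   // cloud, VM, or service identifier
    final String calendarClass;            // e.g. "MONDAY", "WEEKEND", "HOLIDAY"
    final int year, season;

    CbsmKey(String subscriberId, String entityId, LocalDate date) {
        this.subscriberId = subscriberId;
        this.entityId = entityId;
        this.year = date.getYear();
        this.season = (date.getMonthValue() - 1) / 3 + 1;   // quarter as season
        DayOfWeek d = date.getDayOfWeek();
        this.calendarClass = (d == DayOfWeek.SATURDAY || d == DayOfWeek.SUNDAY)
                ? "WEEKEND" : d.name();
    }

    public static void main(String[] args) {
        CbsmKey k = new CbsmKey("sub_42", "vm_1", LocalDate.of(2013, 10, 7));
        System.out.println(k.calendarClass + "/season-" + k.season);  // MONDAY/season-4
    }
}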

Two points should be noted. 1) A service is identified by the Service_ID (Figure 3). The Service_ID only exists in the cloud model. If a VM is load-balanced, the same Service_ID is shared among the resulting VMs; conversely, a Service_ID can be used to determine whether a VM is load-balanced. If a service is load-balanced, the resource usage of the service is the sum of the resource usages of its instances. 2) The source of the original data is the cloud model (Utilization_Cache). The cloud model caches resource utilization data for a day in the Utilization_Cache; when the Utilization_Cache is full, its content is sent to the Data Modeler to build the daily data model. Rather than sending thousands of Utilization_Cache data sets for each cloud entity (server/VM) individually, the OMS sends the most recent Cloud Projection to the Data Modeler. A Cloud Projection is simply a serialized cloud model object, which contains a projection of the current cloud, including any cached data, the structure of the cloud, and the organization of cloud entities.


The cached utilization information will also be temporarily stored for a longer period (a month). After building a monthly data model, the raw data will be removed permanently (a yearly model can also be built based on the weekly/monthly model).

There are no resource usage data models built for physical servers. The cloud environment is highly dynamic: events of VM creation, deletion, migration, load-balancing, resizing, etc., occur frequently and randomly on physical servers. In such a dynamic environment, long-term utilization patterns and trends of physical servers contribute no explicit insight for improving QoS (Quality of Service).

The data modeling process is triggered by the Scheduler as well as by the analysis process.

The Analyzer has two built-in functions: 1) Resource Provisioning: resource demands are forecast for each service using the service usage data models, and the aggregated forecast results across services give the recommended total resources required by a subscriber in the future. 2) Consolidation of VMs: in general, the Analyzer analyzes the global status of the cloud using the most recent Cloud Projection to determine whether VMs are sparsely distributed in the cloud or too densely grouped on some physical servers, and calculates the optimal VM-to-server arrangement. Again, the core algorithm for this process is a VM placement algorithm, realized by the N-step ahead Forecast-based Power Aware Best Fit Decreasing (FnPABFD) heuristic described in a previous work (Dong & Herbert, 2013). In this particular implementation, the FnPABFD has been modified: whereas the original FnPABFD uses forecasted values as reference points for preventing performance degradation of VMs, this implementation uses experienced values (superimposed resource usage models) for the same purpose, i.e., N-step backward Experience-based Power Aware Best Fit Decreasing (EnPABFD). The algorithm is explained as follows.

The main considerations for VM placement are minimizing both power consumption and performance degradation. Research indicates that in many organizations the average utilization level of servers is often below 30% of their full capacity (Barroso & Holzle, 2007) (Sargeant, 2010). Therefore, allocating more VMs on fewer hosts has the potential to greatly reduce the power consumption of the hardware (servers). Based on the research results of Beloglazov & Buyya (2012), due to the heterogeneity of hardware and the power consumption characteristics of various types of servers, the energy consumption footprint in a data center may vary largely with different VM/server arrangements. Performance degradation is mainly due to the fact that requests for CPU resources (extra storage space can easily be mounted on demand, and static memory assignment for VMs is assumed) from VMs cannot be satisfied by the server. In order to systematically describe the problem, its mathematical formulation is outlined in Equation 1.

where $S$ and $V$ are the sets of servers and VMs in a given private cloud, respectively; $r_v$ indicates the configured hardware requirements (including memory and storage size) for VM $v$; $u_v$ indicates the generalized CPU utilization data model for VM $v$ (retrieved from the Data Modeler; if there is no generalized data model, the most recently built model is used); $C_s$ indicates the resource capacity of server $s$; and $P_s$ is the power consumption, calculated from the power consumption model of each type of server. The power consumption model of a server can be determined by the Model and Manufacturer fields of a Server object contained in the Cloud Projection. $y_s = 1$ represents that server $s$ is on, and $y_s = 0$ otherwise; $x_{v,s} = 1$ represents the allocation of VM $v$ to server $s$, and $x_{v,s} = 0$ otherwise.

The first constraint ensures that the sum of the configured resources of the VMs on server $s$ does not exceed the resource capacity of server $s$; the second constraint tries to avoid performance degradation of all VMs on server $s$; the third statement expresses that a single VM can be allocated to only one server at a time; the fourth statement forces a server to be switched on exactly if there are active VM(s) on that server; the last two statements state the possible status (on or off) of a VM/server.
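Under the symbol assumptions above, a plausible LaTeX reconstruction of Equation 1 reads:

\begin{align*}
\min \quad & \sum_{s \in S} y_s \, P_s \\
\text{s.t.} \quad
  & \sum_{v \in V} x_{v,s}\, r_v \le C_s \, y_s            && \forall s \in S \\
  & \sum_{v \in V} x_{v,s}\, u_v(k) \le C_s                && \forall s \in S,\ \forall k \\
  & \sum_{s \in S} x_{v,s} = 1                             && \forall v \in V \\
  & y_s \ge x_{v,s}                                        && \forall v \in V,\ \forall s \in S \\
  & x_{v,s} \in \{0,1\}, \quad y_s \in \{0,1\}
\end{align*}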

Equation 1 ensures the optimal placement of VMs on servers. In addition, according to research (Clark et al., 2005) (Liu et al., 2011) (Voorsluys, Broberg, Venugopal, & Buyya, 2009), VM migration results in VM and server performance degradation, a heavy burden on the cloud network, and notable energy consumption. Therefore, reducing the frequency of VM migration events is another primary goal of the Analyzer component. In this work, we superimpose the data model of each VM on a particular physical server, so that each sampling point from the


superimposed model will not exceed the capacity of the physical server. Mathematically, let $n_s$ denote the number of VMs assigned to host $s$ and $m_j$ the data model for VM $j$. Given a sampling interval $i$ and the length of the model $l$ ($l$ steps backward), $m_j$ can be represented by a finite number of discrete values $\{m_j(1), m_j(2), \ldots, m_j(l)\}$. For all VMs that run on host $s$, construct the matrix $M = (m_j(k))$ with $j = 1, \ldots, n_s$ and $k = 1, \ldots, l$. Find:

$\sum_{j=1}^{n_s} m_j(k) \le \alpha \cdot C_s \quad \forall k \in \{1, \ldots, l\}$

The $\alpha$ ($0 < \alpha < 1$) value is a factor that is used to tolerate burst requests and to compensate for inaccuracies in the data models.
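A minimal Java sketch of this feasibility test; the matrix layout follows the reconstruction above and all names are illustrative:

/** Minimal sketch of the superimposition test behind EnPABFD: the sum
 *  of the VMs' modeled demands at every sampling step k must stay
 *  below alpha times the host capacity. */
public class SuperimpositionCheck {

    /** m[j][k] = modeled utilization of VM j at step k; capacity in the same unit. */
    static boolean fitsOnHost(double[][] m, double capacity, double alpha) {
        int l = m[0].length;
        for (int k = 0; k < l; k++) {
            double sum = 0.0;
            for (double[] vm : m) sum += vm[k];       // superimpose all VM models at step k
            if (sum > alpha * capacity) return false; // would risk performance degradation
        }
        return true;
    }

    public static void main(String[] args) {
        double[][] models = { { 0.3, 0.5, 0.4 }, { 0.2, 0.3, 0.6 } };
        System.out.println(fitsOnHost(models, 1.6, 0.8));  // peak 1.0 <= 1.28 -> true
    }
}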

The Mathematical Analysis Engine. The Mathematical Analysis Engine (MAE) supplies a set of sophisticated statistical and mathematical functions to the Analyzer, Optimizer, and Data Modeler components. Because the consumer components require a wide range of functions across branches of mathematics (e.g., the Structured Time Series modeling and forecast technique used by the Analyzer; sampling techniques used by the Optimizer; and the Auto Regressive Integrated Moving Average data modeling technique used by the Data Modeler), an extensible and comprehensive mathematical analysis platform is needed. A cluster of R frameworks (R Development Core Team, 2010) was employed at the heart of the MAE. R is an open-source statistical framework widely used in the field of data analytics. Its flexible and extensible architecture allows packages (various types of functions) to be installed in a plug-and-play style, which best meets our design requirements. Note that the R framework works with the R language, while all components of the OMS were written in Java; therefore, an R-Java language bridge is needed. The Rserve package (Urbanek, 2003) was installed on each

member of the R cluster, and acts as the language bridge. It allows a client (written in a different language, such as Java, C/C++, etc.) to call R functions remotely via socket connections.
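A minimal sketch of such a remote call using the Rserve Java client (org.rosuda.REngine); the host name stands in for an address handed out by the MAE load balancer:

import org.rosuda.REngine.REXP;
import org.rosuda.REngine.Rserve.RConnection;

/** Minimal sketch of the R-Java bridge: a Java client opens a socket
 *  connection to an Rserve instance and evaluates an R expression
 *  remotely. */
public class RserveClientExample {

    public static void main(String[] args) throws Exception {
        RConnection r = new RConnection("mae-node-1", 6311);  // default Rserve port
        try {
            // evaluate a trivial expression server-side and pull one number back
            REXP result = r.eval("mean(rnorm(100))");
            System.out.println("R returned: " + result.asDouble());
        } finally {
            r.close();
        }
    }
}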

Because the OMS service is shared among subscribers, there is a potentially vast number of requests accessing the MAE (R cluster), and so a software load balancer was placed in front of the MAE. It works as follows:

1. After joining or leaving the MAE, an R instance will register/deregister its IP (Internet Protocol) address with the load balancer.

2. The load balancer maintains a pool of IP addresses for all members of MAE.

3. Before consuming MAE functions, each requester receives the IP address of an R instance from the load balancer, and then uses the assigned IP address to establish a socket connection.

4. The software load balancer works in a round-robin fashion for the current implementation.

If the OMS service is deployed on a private cloud, the system itself is also a subscriber of its own services. Figure 2 illustrates the communication diagram of the proposed architecture, which is:

1. Scalable -- each topic subscriber (a functional module) can have multiple instances listening on the same topic queue, and tasks can then be load balanced on multiple topic subscriber instances which perform the same function.

2. Extensible -- as long as new topic definitions are configured at the Event Monitor and topic-specific queues are in place, new functional modules can be added at any time without interfering with other modules. For instance, a Web-based cloud monitoring service was added to the OMS (the service takes the most recent Cloud Projection as data source to visualize the status and organization of a private cloud). The only additional tasks for this newly added service were the creation of a dedicated queue between the Cloud Modeler and the new service component, and a scheduling policy in the Scheduler component (Figure 4).

3. Flexible -- introducing and removing any functional modules has no effect on the operation of other modules.

The Cloud Monitoring Service. Although the OMS organizes and controls a private cloud automatically,

Figure 5. OMS Experimental Deployment

Figure 4. Cloud Monitoring Service


providing the current status of the private cloud (visually) to subscribers is also important. The visualization service is Web-based; its data source comes from the Cloud Modeler. A new scheduler policy is also created, which tells the Cloud Modeler to generate Cloud Projections more frequently with only the recent status (rather than the entire information cached in the Utilization_Cache) and to send them to the Cloud Monitoring queue. On the other side, a dedicated Java Servlet constantly listens on the same queue. Whenever a Cloud Projection object is received, the Java Servlet de-serializes and interprets it. (Note: the Java major version must match on both sides, otherwise the de-serialization process may crash.) The private cloud structure layout (visual layout, e.g., which VMs are assigned to which server, which server is assigned to which rack, etc.) can be constructed from the de-serialized Cloud Projection object, because the Cloud Projection contains not only the data but also the structure information about the private cloud. Clients access this service through browsers by providing account information and the associated subscriber identifier. In the current implementation, users of the monitoring service are not allowed to make changes to the private cloud; in future work, this interactive management will be supported.

3. EVALUATION

A prototype implementation of the architecture has been

deployed on the IBM SmartCloud platform (Infrastructure as a Service - IaaS) (Figure 5). It is a full implementation of the architecture with all essential core functionalities. Five VM instances were employed for the OMS deployment. All VM instances were configured with two virtual CPUs (2.4GHz), 4GB memory, and 60GB local storage. Redhat 6 Enterprise (64-bit) Linux operating system and JRE (Java Runtime Environment) v1.7.0_25 were installed on all VM instances. They were located in the IBM Data Centre, Ehningen, Germany. VMware RabbitMQ v3.1.2-1 queuing system was deployed on instance-1 acting as the Exchange server. Both Analyzer and Data Modeler were deployed on instance-4, but they run as separate programs. The R framework v3.0.1 was deployed on instance-5 acting as the Mathematical Analysis Engine.

The Service Delegator components ran on a VM configured with a single virtual CPU (2.2GHz), 512MB memory, 10GB local storage, and Ubuntu Linux 12.10 server (64-bit). The private cloud simulator ran on a Windows 7 Pro system with a quad-core CPU (2.2GHz), 8GB memory, and 500GB local storage. It simulated 260 servers and 50 ~ 500 VMs, depending on the purpose of the experiment. Each simulated VM had a memory assignment uniformly distributed between 256MB ~ 1GB and a storage assignment uniformly distributed between 10GB ~ 20GB; CPU frequencies were randomly selected from {1.4, 1.6, 2.2} GHz; server configurations and power consumption models were set based on the specifications of HP ProLiant ML110 {G4, G5} servers, and the number of servers of each model was randomly selected to build a heterogeneous environment; the MTPI for MTSC, MTVC, and MTVU was set to one minute across all experiments. A collection of mixed (including DHCP, DNS, Web server, etc.) real-world server workloads was given to the VMs during simulation. The complexity of the architecture was fully exercised, and its important aspects were evaluated.

Figure 6. Network Bandwidth Consumption for Service Delegator and OMS Communication over 30 Minutes

Figure 7. Comparison of Metric Template Size


3.1 SERVICE DELEGATOR AND OMS COMMUNICATION COST.

Figure 6 shows the network bandwidth consumption for communication between the Service Delegator and the OMS (indicated by the dashed-line ellipse A in Figure 5) over 30 minutes. In this experiment, only MTSC, MTVC, and MTVU were used. The simulator simulated 260 servers and 300 VMs. Each Metric Template was compressed using the ZIP stream algorithm provided by the standard Java package before being sent to the OMS. In Figure 6, the blue solid line indicates the bandwidth consumed by sending Metric Templates to the OMS; the cost for transmission of Metric Templates is found to be relatively small, and it increases linearly with the number of VMs (Figure 7). The red solid line indicates the bandwidth consumed by receiving Suggestions. The received data is much larger than the sent data. This is mainly driven by the number of Suggestions received, and Suggestions are not compressed in the current implementation; Suggestion compression and encryption will be implemented in future work. Notice that the received data size varies over time, because the number of Suggestions received is influenced by the number of abnormal events detected and by how many types of abnormal events are defined in the system. For instance, if the CPU utilization of a host reaches 85% of its capacity, an abnormal event will be triggered and a VM migration Suggestion may be generated, depending on the policy defined. A VM migration Suggestion (Listing 1) will be sent to the Suggestion queue of the subscriber accordingly and eventually received by the Service Delegator. We also found that after 15 minutes the subscriber (private cloud) becomes more stable (fewer abnormal events detected), which indicates that the service and the strategies built into it are effective.

With the scale of 260 servers and 300 VMs, the average bandwidth consumption is approximately 15 KB/s (this figure is expected to grow as more information is included in the Metric Template to enrich the functionality of the OMS). The required network bandwidth can be lowered significantly when the MTPI is set to be longer (for instance, five minutes). However, increasing the MTPI will affect the quality of the service, as less information about the private cloud will be received. There is also a slight time delay between sending Metric Templates and receiving the corresponding Suggestions, because the responses (Suggestions) are generated by the Optimizer based on the defined policies and the abnormal events detected in the system. As the scale of the private cloud is relatively small, the time used for the generation of responses was measured in seconds and is hard to observe in the figure. Scheduled tasks, on the other hand, take much longer to generate optimal results; e.g., depending on the size of a private cloud, generating an optimal VM-to-server arrangement may take up to several minutes.

3.2 CLOUD PROJECTION TRANSMISSION COST.

The distributed deployment of the Analyzer, Web-based monitoring service, and Data Modeler modules requires a snapshot of the current status of the cloud as well as of the organization of its structure, supporting optimal decision making and data modeling. One of the main design concerns was the Cloud Projection transmission overhead between the Cloud Modeler and other components (indicated by the dashed-line ellipse B in Figure 5). Figure 8 shows that the serialized, compressed, and encoded Cloud Projection size increases linearly and slowly with the number of VMs. The timer for measuring the Cloud Projection transmission cost starts at the beginning of the cloud model object serialization process at the Cloud Modeler, and stops at the end of the Cloud Projection de-serialization process at its consumer components. Figure 9 shows the Cloud Projection transmission cost for 50 ~ 500 VMs; the cost in terms of time increases rapidly with the number of VMs. Such a time delay is tolerable, however, because the consumers (Analyzer, Data Modeler, and Web-based monitoring service) of Cloud Projections are scheduled processes, and they are primarily used for consolidation of VMs, resource

Figure 8. Comparison of Cloud Projection Size

Figure 9. Cloud Projection Transmission Cost



Intuitively, sending complete Cloud Projections to the various components is not an optimal solution, because each Cloud Projection object contains repeated information (e.g., Server/VM_ID, server model, manufacturer, etc.). Sending only changed data would result in a much lower transmission cost. However, parsing changed data at each receiver is a complex and error-prone task. It also affects the extensibility of the architecture: for example, adding additional metric readings would require new parsers at each receiver.
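For clarity, the following is a minimal sketch of the serialize, compress, and encode pipeline described above, assuming standard Java object serialization, GZIP compression, and Base64 encoding; the CloudProjectionCodec name is hypothetical, and the actual Cloud Projection wire format is not specified in the paper.

    import java.io.*;
    import java.util.Base64;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    /** Sketch of the serialize -> compress -> encode pipeline for a Cloud Projection. */
    public final class CloudProjectionCodec {

        /** Serializes, GZIP-compresses, and Base64-encodes a projection object for transmission. */
        public static String encode(Serializable projection) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
                out.writeObject(projection);
            }
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        }

        /** Reverses the pipeline at a consumer (Analyzer, Data Modeler, or monitoring service). */
        public static Object decode(String payload) throws IOException, ClassNotFoundException {
            byte[] raw = Base64.getDecoder().decode(payload);
            try (ObjectInputStream in = new ObjectInputStream(
                    new GZIPInputStream(new ByteArrayInputStream(raw)))) {
                return in.readObject();
            }
        }
    }

The transmission-cost timer described above would start just before encode() at the Cloud Modeler and stop just after decode() at a consumer.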

3.3 CALENDAR-BASED STORAGE MODEL

Figure 10 shows the comparison of the original data size and its data model object size. With larger data sets, the CBSM reduces the required storage space by roughly a factor of three. The original data was one week of CPU utilization for 300 VMs. Sampling intervals were set to 1–5 minutes (corresponding to the MTPI); a larger MTPI means fewer metric readings. Both the original data and the data model objects were compressed using the ZIP stream algorithm provided by the standard Java package. Data models were built using a Local Polynomial Fitting algorithm provided by the R framework (the size of a data model varies slightly with different modeling techniques). Because all data models were built with a fixed sampling interval (one hour), the size of the data models does not change with the MTPI.
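Since the data models are built with R, and Rserve is cited among the references, the sketch below shows how a modeling request might be delegated to an R instance. Base R's lowess(), a locally weighted polynomial regression, is used here as a stand-in for the Local Polynomial Fitting routine; the class name and connection details are assumptions.

    import org.rosuda.REngine.REXP;
    import org.rosuda.REngine.Rserve.RConnection;

    /** Sketch: fit a local-polynomial-style smoother to CPU readings via Rserve. */
    public final class LocalPolynomialModeler {

        public static double[] fit(double[] minutes, double[] cpuUtil) throws Exception {
            RConnection r = new RConnection();   // assumes an Rserve daemon on localhost:6311
            try {
                r.assign("x", minutes);
                r.assign("y", cpuUtil);
                // lowess() performs locally weighted polynomial regression in base R.
                REXP fitted = r.eval("lowess(x, y)$y");
                return fitted.asDoubles();
            } finally {
                r.close();
            }
        }
    }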

In the initial configuration, data models are serialized and stored in cloud-based object storage, OpenStack Swift (Cabrera & Long, 1991; OpenStack, 2013). It was later found that the RESTful style of accessing (searching, writing, and retrieving) data models is very inefficient, especially when a large number of data models are stored in a single container (Figure 11). Note that the test was conducted with a single user connection; the Swift object storage was configured with a single node and a single replica. A hybrid storage architecture, which combines a traditional SQL database with cloud-based object storage, is under development as part of ongoing work.
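For reference, storing a serialized data model in Swift is a plain REST PUT against /v1/{account}/{container}/{object}; a minimal sketch follows, with a hypothetical storage URL and token (in a real deployment both are obtained from the authentication service).

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    /** Sketch: store a serialized data model as an object in OpenStack Swift over REST. */
    public final class SwiftModelStore {

        // Hypothetical endpoint and token; real values come from Keystone authentication.
        private static final String STORAGE_URL = "http://swift.example.org:8080/v1/AUTH_demo";
        private static final String AUTH_TOKEN  = "AUTH_tk_replace_me";

        /** Issues a PUT request, the standard Swift object-upload call. */
        public static int put(String container, String objectName, byte[] model) throws Exception {
            URL url = new URL(STORAGE_URL + "/" + container + "/" + objectName);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("PUT");
            conn.setRequestProperty("X-Auth-Token", AUTH_TOKEN);
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(model);
            }
            return conn.getResponseCode();   // 201 Created on success
        }
    }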

3.4 MAE PERFORMANCE TEST

Because the Mathematical Analysis Engine carries out most of the computation-intensive and time-consuming tasks, it is a potential bottleneck of the architecture. Effective elasticity of the MAE and load balancing between MAE members are therefore important for the smooth running of the system. The MAE performance test (Figure 12) was conducted on a single R instance. Each connection represents an independent task (data modeling via ARIMA, Auto-Regressive Integrated Moving Average). It was found that the performance of the R instance degraded linearly at first, then exponentially, as the number of connections increased. With more than 110 concurrent connections, the R instance becomes extremely slow and unstable; in general, 80 concurrent connections appear acceptable for each MAE member. The size of the raw data is another factor that affects the performance of an MAE node. The performance test results were used to determine the size of the initial MAE deployment, as well as the auto-scaling and load-balancing of the MAE. Other modeling and forecasting techniques were evaluated in previous work (Dong & Herbert, 2013).
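The concurrency test can be pictured as a fixed-size pool of tasks, each opening its own connection to the R instance and fitting an ARIMA model. The sketch below is an illustrative reconstruction under the assumption that Rserve mediates access to R; it is not the paper's actual test harness.

    import java.util.Collections;
    import java.util.concurrent.*;
    import org.rosuda.REngine.Rserve.RConnection;

    /** Sketch of the MAE load test: N concurrent ARIMA fits against one R instance. */
    public final class MaeLoadTest {

        public static void main(String[] args) throws Exception {
            int connections = 80;   // the level the test found acceptable per MAE member
            ExecutorService pool = Executors.newFixedThreadPool(connections);
            long start = System.nanoTime();

            Callable<Void> task = () -> {
                RConnection r = new RConnection();   // each task opens its own Rserve connection
                try {
                    // Fit a base-R ARIMA(1,1,1) model to synthetic one-week, per-minute readings.
                    r.eval("fit <- arima(rnorm(10080), order = c(1, 1, 1))");
                } finally {
                    r.close();
                }
                return null;
            };

            // invokeAll blocks until every task completes; per-task errors stay in the futures.
            pool.invokeAll(Collections.nCopies(connections, task));
            pool.shutdown();
            System.out.printf("%d concurrent fits in %.1f s%n",
                    connections, (System.nanoTime() - start) / 1e9);
        }
    }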

Figure 10. Comparison of Raw Data Size and Data Model Object Size

Figure 12. MAE Performance Test

Figure 11. CBSM Searching Performance using OpenStack Swift as Back-end


4. RELATED WORK

Clouds and their services operate in a virtualized environment. The adoption of virtualization technology decouples the traditional relationship between operating systems and physical machines, and it offers opportunities for inserting layers of infrastructure management and operation automation.

Cloud technology vendors, such as Cisco Systems, Microsoft, and VMware, provide their own on-site, proprietary cloud system management suites. Cisco Systems has outlined a notable cloud capacity management strategy based on the ITIL v3 (Information Technology Infrastructure Library, version 3) reference architecture. Its key concept is to build a Cloud Capacity Model (Josyula, Orr, & Page, 2012). The capacity model consists of three planes: Component, Service/Domain, and Business. The Component plane contains all available resources, which are the building blocks of the Service/Domain plane. These resource building blocks are divided into a number of component catalogues; example catalogues are network, storage, and compute. In the Service/Domain plane, each component catalogue is associated with a Service Model, a Demand Model, and a Service Forecast. The Business plane consists of a Service Catalogue and a Business Forecast. Capacity plans are produced with the Business Forecast and the Service Forecast as the two primary inputs. The authors outlined a full cloud management framework, whereas our solution focuses on cloud infrastructure management. Microsoft, as a major cloud player, also provides a private cloud management solution, VMM (Virtual Machine Manager) (Finn, Vredevoort, Lownds, & Flynn, 2012).

A noteworthy component of VMM is the Library. A Library acts as a resource repository; it contains various resources, including VM images, scripts, and best-practice templates. Leveraging the Library maximizes resource reusability and avoids error-prone tasks. This is one of the ideas we want to draw on in our future work. CapacityIQ (VMware, 2011) is another cloud infrastructure management solution, offered by VMware. Its basic function is to collect statistical/historical information about cloud objects for management personnel. Its unique capability is modeling potential changes to the virtualized environment of clouds. These solutions are categorized as passive management: they must be operated by IT personnel and lack automation. In contrast, this work aims to provide an automated operation management solution.

There are also third parties providing cloud operation management solutions. BMC Software (Williams & Wolfe, 2011) provides comprehensive solutions for managing cloud services and infrastructures. Service performance is proactively analyzed by an Application Behavior Learning Engine, which is based on statistical analytic techniques, and cloud resources are continuously optimized. Netuitive (Netuitive, 2012) is a similar commercially available solution. Architecturally, it consists of three tiers: Aggregation, Correlation, and Presentation. The Aggregation tier transmits cloud information to the service (Correlation and Presentation) tiers. The Correlation tier provides self-learning mechanisms that learn cloud service behaviors; abnormal events are then predicted based on advanced statistical analytic techniques. The Presentation tier provides a visualized presentation of current cloud status and reports. CA Technologies (CA Technologies, 2012) is another third-party cloud management solution provider. Its Virtual Placement and Balancing solution automatically optimizes cloud resource usage based on both statistical and optimization techniques.

Besides industrial solutions, Sotomayor, Montero, Llorente, and Foster (2009) presented a private/hybrid cloud management suite, OpenNebula. OpenNebula provides essential tools for managing cloud infrastructure but, unlike this work, does not respond reactively to cloud events. Vasić, Novaković, Miučin, Kostić, and Bianchini (2012) introduced the DejaVu framework for virtual resource management. It classifies workloads into a small number of categories using signatures. Categories are distinguished by resource usage patterns learned from the past. As workloads change over time, virtual resources are automatically adjusted based on the usage patterns of the workload category into which the workloads fall.

5. CONCLUSION AND FUTURE WORK

This work aimed to provide an automated and cost-efficient solution for modern private cloud operation management. An innovative distributed service architecture, designed to provide an off-site, automated, and shared operation management service for private clouds, has been presented. The architecture introduces several mechanisms to address various challenges in the field, including the Metric Template, the Action Template, the Calendar-based Storage Model, and the Cloud Projection. A prototype implementation of the service architecture was developed, and its important aspects were evaluated under simulated, realistic workload conditions. The evaluation demonstrated the effectiveness and usability of the architecture. One issue for future work is to develop a comprehensive coordination solution to handle the complexity introduced by an increased number of functional modules. Schemes for ensuring consistency between the private cloud and the cloud model will also be investigated in detail in future work.

6. ACKNOWLEDGMENT

This work is supported by the Telecommunications Graduate Initiative (TGI) program, which is funded by the Higher Education Authority under the Programme for Research in Third-Level Institutions (PRTLI) Cycle 5 and co-funded by the European Regional Development Fund (ERDF).



7. REFERENCES

Josyula, V., Orr, M., & Page, G. (2012). Cloud computing: Automating the virtualized data center (pp. 263–276). Indianapolis: Cisco Press.
Finn, A., Vredevoort, H., Lownds, P., & Flynn, D. (2012). Microsoft private cloud computing (pp. 89–116). Indiana: John Wiley & Sons, Inc.
Gartner, Inc. (2012). Gartner outlines five cloud computing trends that will affect cloud strategy through 2015. Retrieved June 12, 2013, from http://www.gartner.com/it/page.jsp?id=197151.
VMware, Inc. (2011). vCenter CapacityIQ installation guide. Retrieved June 12, 2013, from https://www.vmware.com/pdf/ciq151-installguide.pdf.
Williams, D., & Wolfe, M. L. (2011). Why your applications need behavior learning "therapy". Retrieved June 12, 2013, from http://documents.bmc.com/products/documents/88/81/208881/208881.pdf.
Netuitive, Inc. (2012). Netuitive private cloud management. Retrieved June 14, 2013, from http://www.netuitive.com/resources/pdf/ds-netuitive-for-cloud-management.pdf.
CA Technologies, Inc. (2012). Can you optimize your use of your virtualization and cloud resources both now and in the future? Retrieved June 14, 2013, from http://www.ca.com/us/~/media/files/solutionbriefs/cs1565-virt-place-bal-sol-sb-0711d.aspx.
Dong, D., & Herbert, J. (2013). Energy efficient VM placement algorithm supported by data analytic service. In Proceedings of the 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (pp. 648–655).
Barroso, L. A., & Hölzle, U. (2007). The case for energy-proportional computing. Computer, 40(12), 33–37.
Sargeant, P. (2010). Data centre transformation: How mature is your IT? Gartner, Inc.
Beloglazov, A., & Buyya, R. (2012). Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Generation Computer Systems, 28, 755–768.
Clark, C., Fraser, K., Hand, S., Hansen, J. G., Jul, E., Limpach, C., Pratt, I., & Warfield, A. (2005). Live migration of virtual machines. In Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation (Vol. 2, pp. 273–286). USENIX Association.
Liu, H., Xu, C. Z., Jin, H., Gong, J., & Liao, X. (2011). Performance and energy modeling for live migration of virtual machines. In Proceedings of the 20th International Symposium on High Performance Distributed Computing (pp. 117–182). ACM.
Voorsluys, W., Broberg, J., Venugopal, S., & Buyya, R. (2009). Cost of virtual machine live migration in clouds: A performance evaluation. In Proceedings of the 1st International Conference on Cloud Computing (pp. 254–265). Springer.
Cabrera, L. F., & Long, D. D. E. (1991). Swift: A storage architecture for large objects. In Proceedings of the 11th IEEE Symposium on Mass Storage Systems (pp. 123–128).
OpenStack. (2013). OpenStack Swift. Retrieved November 25, 2013, from http://www.openstack.org/software/openstack-storage/.
R Development Core Team. (2010). R: A language and environment for statistical computing. Retrieved June 14, 2013, from http://www.R-project.org/.
Urbanek, S. (2003, March). Rserve: A fast way to provide R functionality to applications. In Proceedings of the 3rd International Workshop on Distributed Statistical Computing.
Cherkasova, L., Ozonat, K., Mi, N., Symons, J., & Smirni, E. (2009, November). Automated anomaly detection and performance modeling of enterprise applications. ACM Transactions on Computer Systems, 27, 6:1–6:32.
Sotomayor, B., Montero, R. S., Llorente, I. M., & Foster, I. (2009, October). Virtual infrastructure management in private and hybrid clouds. IEEE Internet Computing, 13, 14–22.
Vasić, N., Novaković, D., Miučin, S., Kostić, D., & Bianchini, R. (2012, March). DejaVu: Accelerating resource allocation in virtualized environments. SIGARCH Computer Architecture News, 40, 423–436.

Authors

Dapeng Dong obtained an M.Sc. in Software and Systems for Mobile Networks from University College Cork. He is currently pursuing a Ph.D. in the Mobile and Internet Systems Laboratory at University College Cork, Ireland. Prior to his current position, he worked for Cisco Systems as a Software Engineer. His research interests are broadly in the areas of cloud computing and Big Data analytics.

John Herbert obtained a Ph.D. in computer science from the University of Cambridge, and an M.Sc. in Physics and a B.Sc. in Physics and Mathematics from University College Cork. He is currently a senior lecturer in the Department of Computer Science, University College Cork. Prior to his present position, he worked for SRI International (USA and Cambridge, UK) and the University of Cambridge Computer Laboratory.


IEEE 11th International Conference on Services Computing (SCC 2014). SCC 2014 will focus on the services innovation lifecycle, e.g., enterprise modeling, business consulting, solution creation, services orchestration, optimization, management, and BPM. Visit http://conferences.computer.org/scc/.

IEEE 3rd International Conference on Mobile Services (MS 2014). MS 2014 will feature all aspects of mobile services, including modeling, construction, deployment, middleware, and user experience, with a special emphasis on context-awareness in mobile settings. Visit http://themobileservices.org/2014.

International Congress on Big Data (BigData 2014). BigData 2014 aims to explore various aspects of Big Data, including modeling, storage architecture (NoSQL), enterprise transformation, text mining, social networks, applied analytics and various applications, and Big Data as a Service. Visit http://ieeebigdata.org/2014.

IEEE 7th International Conference on Cloud Computing (CLOUD 2014) Cloud Computing is becoming a scalable services delivery and consumption platform in the field of Services Computing. The technical foundations of Cloud Computing include Service-Oriented Architecture and Virtualizations. Major topics cover Infrastructure Cloud, Software Cloud, Application Cloud, Social Cloud, & Business Cloud. Visit http://thecloudcomputing.org.

IEEE 21st International Conference on Web Services (ICWS 2014). ICWS 2014 will feature web-based services modeling, design, development, publishing, discovery, composition, testing, QoS assurance, adaptation, and delivery technologies and standards. Visit http://icws.org.

Sponsored by the IEEE Technical Committee on Services Computing (TC-SVC, http://tab.computer.org/tcsc). Conference proceedings are EI indexed. Extended versions of invited ICWS/SCC/CLOUD/MS/BigData papers will be published in IEEE Transactions on Services Computing (TSC, SCI & EI indexed), the International Journal of Web Services Research (JWSR, SCI & EI indexed), the International Journal of Business Process Integration and Management (IJBPIM), and IEEE IT Pro (SCI & EI indexed).

Submission Deadlines
ICWS 2014: 1/15/2014
CLOUD 2014: 1/15/2014
SCC 2014: 1/31/2014
MS 2014: 1/31/2014
BigData 2014: 2/15/2014
SERVICES 2014: 2/15/2014

Contact: Liang-Jie Zhang (LJ), [email protected] (Steering Committee Chair)

IEEE 10th World Congress on Services (SERVICES 2014). June 27 – July 2, 2014, Anchorage, Alaska, USA (http://www.servicescongress.org). A federation of five service-centric conferences from different angles. MS 2014 will also feature wearable technology and applications; topics cover, but are not limited to, all innovative aspects of wearable devices, programming models, integration, and domain-specific solutions. Visit http://themobileservices.org/2014.


Call for Articles International Journal of Services Computing

Mission The International Journal of Services Computing (IJSC) aims to be a reputable resource providing leading technologies, development, ideas, and trends to an international readership of researchers and engineers in the field of Services Computing. To ensure quality, IJSC only considers extended versions of papers published at reputable international conferences such as IEEE ICWS. From the technology foundation perspective, Services Computing covers the science and technology needed to bridge the gap between Business Services and IT Services, from theory to development and deployment. All topics regarding the lifecycle study and management of Web-based services align with the theme of IJSC. Specifically, we focus on: 1) Web-based services, featuring Web services modeling, development, publishing, discovery, composition, testing, adaptation, and delivery, as well as Web services technologies and standards; 2) the services innovation lifecycle, which includes enterprise modeling, business consulting, solution creation, services orchestration, services optimization, services management, services marketing, and business process integration and management; 3) cloud services, featuring modeling, developing, publishing, monitoring, managing, and delivering XaaS (everything as a service) in the context of various types of cloud environments; and 4) mobile services, featuring development, publication, discovery, orchestration, invocation, testing, delivery, and certification of mobile applications and services.

Topics The International Journal of Services Computing (IJSC) covers state-of-the-art technologies and best practices of Services Computing, as well as emerging standards and research topics which would define the future of Services Computing. Topics of interest include, but are not limited to, the following:
- Services Engineering
- XaaS (everything as a service)
- Cloud Computing for Internet-based services
- Big Data services
- Internet of Things (IoT) services
- Pervasive and Mobile services
- Social Networks and Services
- Wearable services
- Web 2.0 and Web X.0 in Web services
- Service-Oriented Architecture (SOA)
- RESTful Web Services
- Service modeling and publishing
- Service discovery, composition, and recommendation
- Service operations, management, and governance
- Services validation and testing
- Service privacy, security, and trust
- Service deployment and evolution
- Semantic Web services
- Scientific workflows
- Business Process Integration and management
- Service applications and implementations
- Business intelligence, analytics and economics for Services


Call for Articles International Journal of Big Data

Mission Big Data has become a valuable resource and mechanism for practitioners and researchers to explore the value of data sets in all kinds of business scenarios and scientific investigations. New computing platforms such as cloud computing, the mobile Internet, and social networks are driving the innovations of big data. From a government initiative perspective, the Obama Administration in the United States launched a "Big Data" initiative announcing $200 million in new R&D investments on March 29, 2012, and the European Union announced "Big Data at your service" on July 25, 2012. From an industry perspective, IBM, SAP, Oracle, Google, Microsoft, Yahoo, and other leading software and Internet service companies have also launched their own innovation initiatives around big data. The International Journal of Big Data (IJBD) aims to provide the first Open Access publication channel for all authors working in the field of all aspects of Big Data. Big Data is a dynamic discipline, and one of the objectives of IJBD is to promote research accomplishments and new directions. Therefore, IJBD welcomes special issues in any emerging areas of big data.

Topics IJBD includes topics related to advancements in the state-of-the-art standards and practices of Big Data, as well as emerging research topics which are going to define the future of Big Data. Topics of interest include, but are not limited to, the following: Big Data Models and Algorithms (Foundational Models for Big Data, Algorithms and Programming Techniques for Big Data Processing, Big Data Analytics and Metrics, Representation Formats for Multimedia Big Data)

Big Data Architectures (Cloud Computing Techniques for Big Data, Big Data as a Service, Big Data Open Platforms, Big Data in Mobile and Pervasive Computing)

Big Data Management (Big Data Persistence and Preservation, Big Data Quality and Provenance Control, Management Issues of Social Network enabled Big Data)

Big Data Protection, Integrity and Privacy (Models and Languages for Big Data Protection, Privacy Preserving Big Data Analytics, Big Data Encryption)

Security Applications of Big Data (Anomaly Detection in Very Large Scale Systems, Collaborative Threat Detection using Big Data Analytics)

Big Data Search and Mining (Algorithms and Systems for Big Data Search, Distributed, and Peer-to-peer Search, Machine learning based on Big Data, Visualization Analytics for Big Data)

Big Data for Enterprise, Government and Society (Big Data Economics, Real-life Case Studies, Big Data for Business Model Innovation, Big Data Toolkits, Big Data in Business Performance Management, SME-centric Big Data Analytics, Big Data for Vertical Industries (including Government, Healthcare, and Environment), Scientific Applications of Big Data, Large-scale Social Media and Recommendation Systems, Experiences with Big Data Project Deployments, Big Data in Enterprise Management Models and Practices, Big Data in Government Management Models and Practices, Big Data in Smart Planet Solutions, Big Data for Enterprise Transformation)
