Towards energy-aware VM
scheduling in IaaS clouds through
empirical studies
Qingwen Chen
Grid Computing
University of Amsterdam
A thesis submitted for the degree of
Master of Science (MSc)
Supervised by
Dr. Paola Grosso
System and Network Engineering research group
Amsterdam, the Netherlands
August 29, 2011
Abstract
Energy-efficient computing has become increasingly important to mod-
ern HPC systems such as clouds. In this thesis we explore the ’green’
opportunities with virtualization technologies in clouds through system-
level optimizations, and specifically focus on energy-savings by energy-
aware scheduling of virtual machines.
A system-level approach to optimization for green cloud computing requires
an in-depth understanding of the power characteristics of virtual
machines with respect to the patterns of the workloads running on them.
The first step we took in this direction was to deploy a private cloud system
with facilities provided by the DAS-4 clusters, and to thoroughly characterize
its power behavior.
We broadly identified three power metrics, i.e. power, power efficiency
and energy. By executing several types of high-performance computing
workloads on both the VM and the host, we compare their performance
with respect to these three metrics. In addition to that, we also
analyze the composition of the total power consumption of a single
work node, and evaluate the contributions of individual components,
i.e., CPU, memory and HDD.
As a result of these profiling experiments, a linear power model is
derived to represent the power behavior of a single work node, which
is further extended to describe an entire cloud. With the help of
this model, a novel energy-savvy scheduler is proposed to make use
of the monitoring system to support system-level optimization with
on-the-fly VM scheduling and dynamic resource adaptation.
Acknowledgements
I would firstly like to thank my supervisor Paola Grosso for her in-
valuable guidance and insights throughout this project, without which
this thesis couldn’t have been done.
Furthermore, I would also like to thank Kees Verstoep and Rutger
Hofman for always providing me with suggestions on my work.
Moreover, I am also grateful to Cosmin Dumitru, Ralph Koning and
all other members from System & Network Engineering (SNE) Group
for their valuable advice and feedback on my initial work.
Finally I would like to thank Vesselin Hadjitodorov for paving the way
for the work presented here, and DAS-4 for offering me their facilities.
Contents
1 Introduction
  1.1 Green computing
  1.2 Research opportunity
  1.3 Structure
2 Background
  2.1 Cloud computing
  2.2 IaaS Cloud managers
  2.3 Virtualization: Hypervisors
3 Experimental Setup
  3.1 Power measurement method
  3.2 Power metrics
  3.3 Hardware & Software environment
  3.4 Data logger and Monitoring
  3.5 Workload generator
    3.5.1 Linpack
    3.5.2 Dhrystone and Fhourstones
    3.5.3 Stress
    3.5.4 Customized scripts
  3.6 Virtualization
4 Profiling VMs' power consumption
  4.1 Component benchmarks
    4.1.1 CPU
    4.1.2 Memory
    4.1.3 HDD
  4.2 Overall benchmarks
    4.2.1 Floating-point operation performance
    4.2.2 Integer operation benchmark
    4.2.3 Impact of Hyper-Threading
  4.3 Summary
5 Towards energy-efficient scheduling
  5.1 The power model
    5.1.1 Prerequisites & Assumptions
    5.1.2 Power model for a single work node
    5.1.3 Power model for a cloud system
    5.1.4 The opportunity revisited
  5.2 Energy-aware scheduler
    5.2.1 Symbolic description
    5.2.2 Placement scheduler
    5.2.3 Migration scheduler
6 Related Work
  6.1 Performance and energy profiling
  6.2 Energy-aware clouds and power models
7 Conclusions
References
List of Figures
List of Tables
Chapter 1
Introduction
1.1 Green computing
Cloud computing has emerged as a new paradigm of computing and has gained
increasing attention from both the academic and business communities. Its
utility-based usage model allows users to pay per use, similar to other public
utilities such as electricity, with relatively low investment in the end devices
that access the cloud computing resources. From the environmental perspective
this new computing model is already a great improvement [1], since the computing
resources are shared among all users and provisioned on-demand. This puts
tremendous pressure on the cloud service providers who manage these resources;
however, it also opens up many possibilities for further energy savings.
Energy consciousness and energy efficiency are two increasingly important
aspects in designing and operating ICT infrastructures. As Belady pointed out in
[2], the annual energy cost of a 1U server surpassed its purchase cost in 2008.
A US EPA report [3] on server and data center energy efficiency also indicated
that the energy consumed by the nation's servers and data centers was around
61 billion kilowatt-hours (kWh) (1.5% of total electricity consumption) in 2006,
and that it could double to more than 100 billion kWh by 2011. It also predicted
that annual savings of about 10% in energy consumption could be achieved by
2011 through state-of-the-art hardware and virtualization technologies.
The state-of-the-art hardware, on one hand, may improve the energy-efficiency
of data centers' computing infrastructures; on the other hand, upgrading current
infrastructures for the purpose of green computing is also a great investment,
and it is not always economically feasible for all data centers, especially if
the investment surpasses the benefits. It is therefore interesting to look for
energy-savvy alternatives that can satisfy users' performance requirements on
the current computing infrastructure. The answer, in our opinion, lies in the
use of virtualization technologies in distributed systems such as clouds,
augmented with global system-level optimizations which focus on providing
energy-efficient optimizations for the entire data center while taking the
characteristics of each individual node as input parameters.
The Dutch Organization for Scientific Research (NWO), as a source of funds
for many Dutch research programs, has taken the initiative to explore the 'green'
opportunities in Dutch data centers (e.g. DAS-4, SARA etc.) and has therefore
issued the GreenClouds project. The GreenClouds research, from which this work
originated, stems from the following three promising ideas:
• Hardware diversity: computations should run on the architectures (e.g.
GPUs, multicores) that execute them in the most energy-efficient way;
• Elastic scalability: the number of resources should dynamically adapt to
the application needs, taking both computational and energy efficiency into
account;
• Hybrid networks: optical and photonic networks can transport data and
computations in a more energy-efficient way.
The combination of hardware diversity, elastic scalability and hybrid networks
in a distributed setting provides the basic components for a system-level approach,
where we are not limited to local optimizations (e.g., reducing clock speeds), but
we look at the behavior of the overall system.
1.2 Research opportunity
Virtualization technologies expose several new opportunities for green computing
in clouds which we have not experienced in non-virtualized environments such as
Grid Computing.
Firstly, with the help of virtualization technologies, it is possible to
partially or fully migrate running applications to more energy-efficient, i.e.
'greener', hardware. The research issues arising here are how energy efficiency
should be appropriately defined and how to evaluate the energy efficiency of
various types of hardware within a heterogeneous data center.
Another opportunity for green clouds lies in minimizing the number of running
work nodes within a data center. This opportunity is two-fold:
a) When creating a VM, a VM placement scheduler may be involved to place
the VM on the most appropriate work node so that the number of active
work nodes is minimized while still getting reasonable performance;
b) A VM migration scheduler may make sophisticated decisions on migrating
applications from two or more lightly loaded work nodes to one work node
so that the other idle nodes may be powered off to save energy.
Nevertheless, a thorough understanding of the power behavior of the underlying
hardware is needed for both opportunities. The challenges in these opportunities
are:
• What is the power behavior of each individual virtual machine (VM) in a
virtualized environment, and how is it affected by the different patterns of
applications running on it and/or the different hardware architectures hosting
it?
• How could this power characteristic information be incorporated into the
system-level optimization?
This thesis strives to address the first challenge and provides heuristic
proposals on a system-level optimization for the second challenge. The work in
this thesis is broken down into two sub-tasks. We first carry out comprehensive
power benchmarks to thoroughly characterize the energy consumption of both
VMs and the hardware components used in virtualized environments. Within this
part, the state-of-the-art facilities provided by the DAS-4 clusters [4] are
extensively utilized as the hardware of our test cloud system. Secondly, we
provide our power
models for a cloud system based on the profiling results, and propose heuristic
recommendations on how these results and power models could be included in
an energy-conscious elastic scheduler. Overall we also aim to provide the reader
with an analysis of the results which is of general applicability to virtual ma-
chines running in clouds, and not specific to the hardware on which we obtained
the results.
1.3 Structure
The rest of this thesis is organized as follows: we briefly introduce the concepts
and technologies included in our thesis work in Chapter 2, and further elaborate
our experimental setup in Chapter 3. Chapter 4 presents the results of the power
benchmarks, corresponding to the first challenge described above. Our power
model for a cloud system and our heuristic proposal on system-level optimization
for energy-efficient computing are presented in Chapter 5. Following the related
work in Chapter 6, we conclude with a brief discussion and future directions.
Chapter 2
Background
Cloud computing has emerged as a computing paradigm in which computing
resources can be provisioned on-demand to deliver services in a scalable manner.
In this chapter, we briefly introduce its concept from a technical perspective
and focus specifically on Infrastructure as a Service (IaaS) cloud systems.
2.1 Cloud computing
A number of researchers have attempted to define cloud computing from different
perspectives [5, 6, 7, 8, 9]. The most widely accepted and accurate technical
definition comes from the recommendation of the (U.S.) National Institute of
Standards and Technology (NIST) in [10], where cloud computing is defined as a
model for enabling ubiquitous, convenient, on-demand network access to a shared
pool of configurable computing resources that can be rapidly provisioned and
released with minimal management effort or service provider interaction.
Although classifying the details of cloud models has been a subject of debate
[11, 12, 13], it is generally agreed that a cloud framework may include five
layers, as presented in Figure 2.1. It consists of three service layers,
depending on the type of resources provided by the cloud, and two underlying
framework layers:

Figure 2.1: Typical stack of the cloud model

• Server layer consists of hardware resources (such as computing, storage and
network devices) and some basic software products which are used to manage
these hardware resources.
• Virtualization layer connects the server layer with the higher-level service
layers by virtualizing the hardware resources and providing the service layers
with on-demand provisioning.
• Infrastructure as a Service (IaaS) is built directly on top of the
virtualization layer and provides users with processing, storage, networks,
and other fundamental computing resources, typically in the form of virtual
machines (VMs). Cloud users may deploy their applications on these virtualized
resources and customize the runtime environment (including the operating
systems of the virtualized resources) as they would normally do on their local
hardware resources.
• Platform as a Service (PaaS) provides users with a pre-configured runtime
environment (e.g. a Java runtime environment) and enables users to deploy
their own applications within it. At this layer, users only have full control
over their own applications, and may customize the runtime environment
configuration within a limited range.
• Software as a Service (SaaS) provides users with ready-to-use applications
running on a cloud infrastructure. These applications are developed by the
cloud service providers, and users at this layer may only have a limited
ability to modify user-specific application configuration settings.
Even though the higher-level service layers may build their underlying
infrastructures on top of the lower-level layers, each layer is able to provide
its corresponding services to the public separately and independently. Table 2.1
summarizes some of the major players/technologies at each layer.
Table 2.1: Summary of cloud layers and their major players/technologies

Layer           Major players and/or technologies
SaaS            Gmail, Google Docs, Salesforce, public cloud storage services
                such as Dropbox
PaaS            Google App Engine, Windows Azure
IaaS            Amazon EC2, Rackspace, Eucalyptus, OpenNebula, OpenStack,
                Nimbus, etc.
Virtualization  KVM, Xen, Microsoft Hyper-V, VMware ESX, etc.
Server          Multi-core CPUs, storage, memory, accelerators such as GPUs
                and FPGAs, etc.
Another criterion to classify cloud infrastructures is the deployment model.
Under this classification we have four categories of clouds: private, public,
community, and hybrid. Private and public clouds are self-explanatory, as they
are available to a single organization and to the general public respectively.
In a community cloud, the cloud infrastructure is shared by several
organizations. A hybrid cloud is a combination of two or more of the other three
types. Amazon, Google and Microsoft are three major players in the area of
public clouds, while open-source cloud infrastructures such as OpenNebula [14],
Eucalyptus [15], and Nimbus [16] play significant roles in the other fields.
Within this thesis work, we will fully utilize the state-of-the-art hardware
provided by the DAS-4 cluster and focus on the IaaS service model deployed
with OpenNebula.
2.2 IaaS Cloud managers
Cloud managers at the IaaS layer provide easy-to-use management interfaces to
make the virtualized resources (i.e. VMs) accessible to users. There are plenty
of cloud managers at the IaaS layer in the open-source community, among which
Eucalyptus, OpenNebula and Nimbus are dominant.

Table 2.2: The cloud managers compared

                 OpenNebula              Eucalyptus               Nimbus
Philosophy       Private, highly         Mimics Amazon EC2        Cloud resources tailored
                 customizable cloud                               for scientific researchers
Compatibility    Open, multi-platform    Compatible with EC2,     Compatible with EC2
                                         S3 and EBS
Customizability  Basically everything    Some for admin,          Everything except image
                                         less for users           storage and Globus
                                                                  credentials
Hypervisors      Xen, KVM, VMware        Xen, KVM (VMware in      Xen, KVM
supported                                the non-open-source
                                         version)
Unique features  VM migration            User management          Nimbus context broker
                 supported               web interface
Eucalyptus [17] is designed to be a private cloud computing platform that
implements the APIs of Amazon EC2, S3 and EBS. Nimbus [18, 19] is an open-source
toolkit which is built on top of its predecessor from Grid Computing, the Globus
Toolkit. OpenNebula aims to be the industry-standard open-source cloud computing
tool for managing the complexity and heterogeneity of distributed data center
infrastructures. Sempolinski et al. have provided a thorough comparison of
OpenNebula, Eucalyptus, and Nimbus in [20], which we summarize in Table 2.2.
Despite the differences in their detailed implementations, they all share the
common purpose of managing VMs in an easy-to-use way, as well as the procedure
for provisioning them, as described in Figure 2.2. A typical procedure for
provisioning virtual resources (specifically VMs) follows the steps below:

Figure 2.2: Typical procedure of VM provisioning in an IaaS cloud

1) Cloud users access the head node of the cloud through a client application
(or a web management interface, depending on which is provided by the cloud
provider), and issue a request for a VM with additional user-specified
configuration (if any).
2) The head node then pushes a fresh VM image from the pre-configured image
repository to one of the work nodes. Sometimes a VM scheduler is involved
in this step to make sophisticated decisions about which work node the image
should be pushed to.
3) After receiving the image, the hypervisor on the work node creates and starts
the VM requested by the user.
4) During the startup of the VM, a bridged network is configured, and the VM
then requests a network address from the DHCP server (dhcpd).
5) In the end, the user may access the VM as if it were a regular remote
machine.
Since OpenNebula has been deployed by DAS-4 and SARA on their HPC clus-
ters, we follow their choice and use OpenNebula as our cloud manager. Moreover,
OpenNebula’s high customizability also makes it a good candidate for us.
2.3 Virtualization: Hypervisors
Virtualization generally refers to the process of creating one or more virtual
versions (as opposed to the actual version) of computer resources such as
hardware platforms, operating systems, storage/memory devices or network
resources. Within the scope of IaaS cloud platforms, it usually refers to the
technology of creating multiple virtual machines, specifically system virtual
machines¹. A system VM is a complete and isolated guest OS installation within
the host OS. With system VM technology, multiple OS environments can co-exist
on the same computer; in this case the underlying physical resources are shared
among all VMs, and the resources available to applications running inside a VM
are limited to the resources provided by that VM.
VMs and their interactions with the underlying physical resources are managed
by a software layer called the virtual machine monitor (VMM), or hypervisor.
Depending on where they run, hypervisors are generally classified into two
categories [21]:
• Type-I (or native) hypervisor runs directly on the host’s hardware (see Fig-
ure 2.3(a)) to manage the hardware and VMs, as well as their interactions.
• Type-II (or hosted) hypervisor runs as a normal application on top of the
host's OS (see Figure 2.3(b)). The communication between the hypervisor and the
hardware has to pass through the host's OS.
Theoretically, the performance differences between a native OS and a guest OS
running in a Type-I VM are, in general, barely noticeable; a guest OS running in
a Type-II VM, however, suffers significant performance degradation even compared
to Type-I VMs, because with a hosted hypervisor the guest OS has to pass through
more software layers to reach the hardware, as described in Figure 2.3.
Therefore, to meet the requirements of scientific applications, our work will
focus on Type-I hypervisors only².
¹ Sometimes called hardware virtual machines. Another type of virtual machine is
the process VM, which runs a single process as a normal application inside a
host OS, with the Java Virtual Machine (JVM) as a widely-known example.
² Another practical reason is that only native hypervisors are supported by
OpenNebula. Therefore Type-II hypervisors are beyond our concern.
Figure 2.3: Architecture comparison of (a) Type-I (native) hypervisor and
(b) Type-II (hosted) hypervisor
We restrict our discussion to two widely-known open-source Type-I hypervisors,
i.e. the Kernel-based Virtual Machine (KVM) [22] and Xen [23], even though there
are many others, such as VMware ESX/ESXi¹ and Microsoft Hyper-V. Both KVM and
Xen are virtual machine monitors for x86, x86-64 and IA-64. The most significant
difference between them is the degree to which the hardware is virtualized. KVM
is a full virtualization technology which supports guest operating systems
running unmodified; guest operating systems running on Xen need to be
specifically modified, which is called para-virtualization. As of Xen version
3.0, full virtualization is also supported, provided that the CPU supports x86
hardware virtualization (i.e. Intel VT-x or AMD-V).
There are debates on whether KVM is a native or a hosted hypervisor. On one
hand, KVM exists in Linux as a kernel module, which would qualify it as a
Type-II hypervisor; on the other hand, it exposes the hardware virtualization
extensions (such as Intel VT-x and AMD-V) to the Linux kernel, which effectively
turns the kernel into a Type-I hypervisor. Therefore, we classify it as a Type-I
hypervisor. Besides, its performance is also comparable with other Type-I
hypervisors such as Xen. Table 2.3 summarizes the similarities and differences
between Xen and KVM.
¹ These should be distinguished from other products of VMware Inc., such as
VMware Workstation and VMware Server, which are basically Type-II hypervisors
since they run as normal applications within a Linux or Windows operating
system.
Table 2.3: A comparison between Xen and KVM

                      Xen                          KVM
Para-virtualization   Yes                          No
Full virtualization   Yes                          Yes
Host CPU              x86, x86-64, Itanium         x86, x86-64, IA-64,
                      (IA-64), ARM                 PowerPC
Host OS               Modified versions of Linux,  Linux
                      NetBSD and Solaris
Guest OS              Modified Unix-like OSes,     Linux, Windows, Unix
                      Windows
Intel VT-x / AMD-V    Optional                     Required
Live migration        Yes                          Yes
There has been a lot of research on benchmarking and comparing the performance
of these two hypervisors [24, 25, 26, 27, 28]. To summarize this work: Xen
performs slightly better in network virtualization, while for other workloads,
such as Linpack, the Fast Fourier Transform (FFT) and IOzone, its performance is
much worse than KVM's. This led to our decision to adopt KVM as the hypervisor
of our test cloud platform.
Chapter 3
Experimental Setup
As planned in Chapter 1, we will first characterize the power consumption of
both the hardware and the VMs. This knowledge will then be used to develop an
energy-aware VM scheduler for the purpose of energy-efficient computing. A power
measurement environment, used to carry out the power benchmarks and to verify
the effectiveness of our energy-aware scheduler, is essential to both tasks.
In this chapter, we describe our power measurement setup, followed by more
details on both the hardware and software configurations of our power benchmark
test-bed, as well as the various types of workload generators used in our
experiments.
3.1 Power measurement method
The commonly used approach to measuring the power consumption of a system is
the one adopted by the Green500 list [29, 30], as described in [31]. This
approach consists of the following three basic entities:
• a System Under Test (SUT);
• a power meter, which provides the value of the power consumed by the SUT
and usually resides between the power supply and the SUT;
• a data logger to record and analyze the power data.
The power measurement setup is illustrated in Figure 3.1. Depending on the use
scenario, a SUT could be either a single work node within a cluster, a
collection of several work nodes, or even the entire cluster. However, with more
work nodes included within a single SUT, the measurement results become less
meaningful, due to the lower granularity of the system. By sitting between the
SUT and the power supply, the power meter is able to provide the actual power
consumed by the SUT. The data logger is a piece of software which normally runs
on a device other than the SUT, to eliminate its impact on the power consumption
of the SUT.
Figure 3.1: Power measurement setup
This approach is simple and intuitive, but it has a limitation: since the SUT is
measured as a single unit and the minimal unit is a single work node, it cannot
provide detailed information on the power consumed by each component of a work
node. Therefore, we propose a workaround (see Chapter 4) by identifying workload
patterns.
3.2 Power metrics
There are in total five metrics within our power benchmarks, classified into two
categories: measurement metrics and derived metrics. Measurement metrics are
those that can be read directly from hardware monitoring tools or software
applications without further calculation, such as the runtime of workloads, the
power consumption and the performance. The derived metrics are computed from the
measurement metrics. Table 3.1 describes each metric in detail.
Table 3.1: Definition of benchmark metrics

Metric                Definition                          Value
Performance*          evaluation of how well the SUT      value as reported by the
                      performs benchmarks                 benchmark tools themselves
Power consumption*    average power consumed by the       average of the values reported
                      SUT during benchmarks               by the power meter
Execution time*       time duration of benchmarks         wall clock time reported by
                                                          the data logger
Power efficiency‡     evaluation of how efficiently       performance divided by power
                      the power is used
Energy consumption‡   cumulative power consumption        power multiplied by execution
                      over execution time                 time

* Measurement metrics, whose values are directly obtained from devices or from
benchmark tools.
‡ Derived metrics, which are computed from the measurement metrics.
Depending on the purpose of the performance benchmarks, the Performance metric
may take various forms. For instance, the throughput of a system is a good
performance metric for system I/O benchmarks, but it would be less desirable for
computational benchmarks, which may be better represented by, for example, the
number of unit operations per unit time. Our benchmarks will focus on the SUT's
computational performance. A typical performance metric for this is
FLoating-point OPerations per Second, namely FLOPS or Flops. It is widely
accepted as a performance metric for ranking supercomputers on the TOP500 list
[32]. Besides Flops, the performance metric of integer operations per second,
usually measured in Million Instructions per Second (MIPS), will also be
examined in our benchmarks.
The power consumption metric in our benchmarks refers to the average power over
the execution time of a benchmark, as opposed to the instantaneous power, which
is defined as the power consumed by the SUT at a specific point in time.
The power efficiency metric is defined as performance per Watt, as described in
Eq. 3.1:

\[ e_{\text{power}} = \frac{\text{Performance}}{\text{Power}} \tag{3.1} \]
where performance and power are averaged values for static analysis. In [33],
Hsu et al. discussed several possible types of power-efficiency metrics and
proposed GFlops/W as an appropriate one, where GFlops is short for GigaFlops.
Since it is well accepted as the power-efficiency metric of the Green500 list,
we will also use it as one of our power-efficiency metrics. Besides this, we
also take MIPS/W as a complement to GFlops/W. The energy consumption metric is
calculated by multiplying the average power consumption by the execution time.
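To make the relationship between the measured and derived metrics concrete, the
following minimal Python sketch (our own illustration; the function and variable
names are not from any benchmark tool) computes the two derived metrics from a
run's measurements:

    def derived_metrics(performance, avg_power_w, runtime_s):
        """Derive power efficiency and energy from the measured metrics.

        `performance` may be in any unit (GFlops, MIPS, ...); the
        efficiency is then expressed in that unit per Watt.
        """
        power_efficiency = performance / avg_power_w    # Eq. 3.1
        energy_kj = avg_power_w * runtime_s / 1000.0    # E = P * t, in kJ
        return power_efficiency, energy_kj

    # E.g. the Fhourstones host run reported later in Table 4.4
    # (8013 KPOS/s at 95.83 W for 209.5 s) yields ~83.6 KPOS/s/W
    # and ~20.1 kJ.
    print(derived_metrics(8013, 95.83, 209.5))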
Among the five metrics, power, power efficiency and energy are our major
concerns, as they are directly related to the green aspect of computing. The
purpose of research on energy-aware computing is to reduce the energy
consumption of applications without sacrificing their performance, or with a
reasonable sacrifice as long as the users' or applications' requirements are
still met. Thus the energy consumption metric turns out to be the appropriate
metric for applications with a definite execution time; for applications with
unbounded execution time (e.g. hosting a web server), however, it is less
suitable than the power-efficiency metric.
3.3 Hardware & Software environment
The power meter and the SUT are the two basic hardware entities in our test-bed.
Depending on the use scenario, the SUT could be a single server (work node) or a
cluster of servers. In order to make our work more generic and applicable to
production cloud systems, we fully utilize the hardware provided by the
Distributed ASCI Supercomputer 4 (DAS-4 [4]), which is designed as a six-cluster
wide-area distributed system providing a common computational infrastructure for
researchers within ASCI in the Netherlands. The features of the six clusters are
described in Table 3.2¹. The two clusters hosted by VU University Amsterdam (VU)
and the University of Amsterdam (UvA) have almost the same features, except that
VU's cluster has been equipped with GPUs.
Table 3.2: Heterogeneous design of DAS-4 clusters

Cluster  Nodes  Type            Speed   Memory  Storage  Node HDDs          Network   Accelerators
VU       74     dual quad-core  2.4GHz  24GB    30TB     2*1TB              IB & GbE  16*GTX480 + 2*C2050
UvA      16     dual quad-core  2.4GHz  24GB    30TB     1TB                IB & GbE  future upgrade
LU       16     dual quad-core  2.4GHz  48GB    50TB     5*2TB + 0.5TB SSD  IB & GbE  future upgrade
TUD      32     dual quad-core  2.4GHz  24GB    18TB     2*1TB              IB & GbE  8*GTX480
UvA-MN   36     dual quad-core  2.4GHz  24GB    30TB     2*1TB              IB & GbE  8*GTX480 + 7*C2050 + 2*GTX480
ASTRON   24     dual quad-core  2.4GHz  24GB    24TB     1*1TB              IB & GbE  8*GTX580 + 1*C2050 + 1*HD6970
In order to get better granularity in our benchmarks, our SUT is configured to
be a single standard DAS-4 work node with a dual quad-core CPU (Intel E5620),
24 GB of memory and roughly 1 TB of storage. Table 3.3 lists the part of the
Intel E5620's specifications² that is vital to our benchmarks. The operating
system running on the SUT is a fresh CentOS 5.6 (kernel version
2.6.18-238.9.1.el5) with Dynamic Voltage and Frequency Scaling (DVFS) enabled.
The power meter used in our experiments is a 32A PDU gateway from
Schleifenbauer. It provides power data through public APIs in PHP, Perl and
SNMP, with a precision of 1 V in voltage and 0.01 A in current. The
instantaneous power consumption is calculated by multiplying the voltage, the
current and the power factor together.
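As a concrete illustration of this calculation, the sketch below computes the
instantaneous power from the three quantities the PDU reports; the function and
argument names are ours, not part of the Schleifenbauer API:

    def instant_power(voltage_v, current_a, power_factor):
        """Instantaneous power P = V * I * pf, in Watts.

        With the PDU's precision of 1 V and 0.01 A, a reading of e.g.
        230 V and 0.42 A at a power factor of 0.95 yields ~91.8 W,
        in the range of the ~90 W idle power we report in Section 4.1.1.
        """
        return voltage_v * current_a * power_factor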
¹ Cited from http://www.cs.vu.nl/das4/clusters.shtml
² For the full list of specifications of the Intel E5620, please refer to
http://ark.intel.com/products/47925/Intel-Xeon-Processor-E5620-(12M-Cache-2_40-GHz-5_86-GTs-Intel-QPI)
Table 3.3: Specifications of Intel E5620

Essentials                       Advanced features
# of Cores            4          Intel Turbo Boost Technology
# of Threads          8          Intel Hyper-Threading Technology
Clock Speed           2.4GHz     Intel Virtualization Technology (VT-x)
Max Turbo Frequency   2.66GHz    Idle State
Max TDP               80W        Enhanced Intel SpeedStep Technology
3.4 Data logger and Monitoring
Our long-term goal is to deploy a monitoring system that collects and presents,
in an easy and user-friendly way, the status of both the entire cluster
(overview) and each single work node. We therefore chose Ganglia [34] and
deployed it on the front-end of our private cloud system. The data logger was
implemented as an extension to it.
Ganglia is a scalable distributed monitoring system for HPC systems. It was born
from the UC Berkeley Millennium Project [35], and is widely deployed on many
other HPC systems. As shown in Figure 3.2, it consists of two types of daemons,
namely the Ganglia Meta Daemon (gmetad) and the Ganglia Monitor Daemon (gmond),
which run on the front-end and on the work nodes respectively, a Round-Robin
Database (RRD) to store the data, and a web front-end to visualize them.
• gmond resides on each work node that is being monitored; it collects the local
system's status metrics, such as CPU and memory usage, and then sends them to
the gmetads.
• gmetad is a daemon running on the front-end of a cluster. It periodically
polls the gmonds and stores their metrics in a storage engine such as RRD.
• The RRD, on one hand, is adopted by gmetad as a database to store all the
metrics gmetad collects; on the other hand, the metrics stored in the RRD are
retrieved and visualized on the web front-end.
Figure 3.2: The Ganglia monitoring system

Though Ganglia has plenty of built-in monitoring metrics, a power metric is not
included, due to its dependency on the APIs provided by the manufacturer of the
power meter. In our experiments, we integrated the power metric into Ganglia
using the Perl APIs provided by Schleifenbauer¹.
For stress tests, which normally run for more than 10 minutes and during which
the power consumption of the work node does not vary much, we collect the power
data every 5 seconds, in order to avoid adding too much overhead on the work
node; for instant workload tests, where both the resource usage and the power
consumption of the work node vary dramatically with time, the power data is
collected every second, to obtain accurate and reliable data.
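The data logger itself can then be a simple polling loop. The sketch below is
our own simplification in Python (in our setup the PDU is queried through the
Schleifenbauer Perl APIs; read_power() here is a hypothetical stand-in),
publishing each sample to Ganglia through the standard gmetric command:

    import subprocess
    import time

    def read_power():
        """Hypothetical stand-in for the PDU query (Perl/SNMP in our setup)."""
        raise NotImplementedError

    def log_power(interval_s):
        """Poll the PDU every `interval_s` seconds (5 s for stress tests,
        1 s for instant workload tests) and feed the sample to Ganglia."""
        while True:
            watts = read_power()
            subprocess.run(["gmetric", "--name", "power",
                            "--value", str(watts),
                            "--type", "float", "--units", "Watts"],
                           check=True)
            time.sleep(interval_s)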
3.5 Workload generator
Our workload generators are carefully selected to profile the power consumption
of the SUT in generic and extreme use cases. The generic use case corresponds to
applications that keep running but do not fully utilize the resources, e.g. a
database server or a web server. The extreme case corresponds to applications
that stress resource usage to its limits.
Within a SUT, we broadly identified the CPU, memory, hard disk drive (HDD) and
GPUs (if present) as the major components that consume most of the power of a
work node in high-performance computing environments. Since no GPU is present on
our test machines, we focus our work on the energy profiles of the CPU, memory
and HDD for the moment.

¹ Available at http://sdc.sourceforge.net/index.htm
Our goal in generating the workloads is to separately profile the impact of each
component on the total power consumption. We therefore chose Stress [36], the
Intel-optimized LINPACK benchmark [37] and the Fhourstones benchmark as our
three major workload generators. Besides them, we also wrote our own scripts to
generate other specific workload patterns as complements.
3.5.1 Linpack
The Linpack benchmark [38], a tool to evaluate a system's floating-point
computing power by letting the SUT solve a dense N by N system of linear
equations, is both CPU- and memory-intensive, but performs few operations on the
HDD. Its computational complexity depends heavily on the number of equations,
i.e. N, which is practically restricted solely by the memory available to the
system. Normally, a larger N results in better performance as reported by
Linpack; we therefore set N to its maximum possible value in our experiments to
get the best performance. During each run, Linpack internally records the CPU
time, instead of the wall clock time, as the execution time, and at the end
reports its result in millions of floating-point operations per second (MFLOPS),
or sometimes in GFLOPS. The way Linpack calculates the execution time may be
reasonable for traditional supercomputers and clusters; it may, however, provide
misleading information in virtualized environments, where applications run on a
host's physical CPUs through virtual CPUs (vCPUs). This is especially obvious in
over-committed environments, where a VM has more vCPUs than there are physical
CPUs available on the host. We will explain this in more detail in
Section 4.2.1.
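The difference between the two clocks is easy to demonstrate. The minimal Python
sketch below (our own illustration, not part of Linpack) measures the same busy
loop with both clocks; on an over-committed VM the wall clock time can be far
larger than the CPU time that Linpack effectively reports:

    import time

    def busy_loop(iterations):
        x = 0.0
        for i in range(iterations):
            x += i * 0.5  # arbitrary floating-point work
        return x

    wall_start, cpu_start = time.time(), time.process_time()
    busy_loop(10_000_000)
    wall = time.time() - wall_start        # what our data logger measures
    cpu = time.process_time() - cpu_start  # what Linpack effectively reports
    print(f"wall clock: {wall:.2f} s, CPU time: {cpu:.2f} s")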
3.5.2 Dhrystone and Fhourstones
Another set of benchmarks, similar to Linpack, are the integer operation
benchmarks, which we performed with Dhrystone and Fhourstones.

As described in [39], the Dhrystone benchmark is a synthetic integer benchmark
tool which is carefully designed to statistically mimic the processor usage of a
common set of programs. Reinhold P. Weicker released its first version in 1984,
after carefully characterizing a broad range of software in terms of various
common constructs such as procedure calls, pointer indirections, assignments,
etc. The benchmark result is reported as a number of Dhrystones per second,
which is basically the number of iterations of the main code loop per second.
Named as a pun on Dhrystone but unlike it, Fhourstones is a problem-oriented
integer benchmark which aims to efficiently solve positions in the game of
Connect-4¹, as played on a vertical 7x6 board. The benchmark result is expressed
as a number of fhourstones, where one fhourstone corresponds to a thousand
positions searched per second.
Note that neither Dhrystone nor Fhourstones uses the intuitive and
straightforward MIPS as the unit for reporting its results. This is to hide the
details of the underlying instruction set and make the results comparable even
between machines with different instruction sets (e.g. RISC vs. CISC). However,
since we are benchmarking on the same hardware, the way the results are reported
is of less importance to us. Moreover, it is thus also meaningless to compare
Dhrystone and Fhourstones results with each other. Therefore we will continue to
use their own units, instead of MIPS, as the performance metrics of the SUT's
integer operation capability².
3.5.3 Stress
Stress [36] is a simple stress tool which is designed to spawn one or more
processes, named workers, for a pre-specified amount of time. Each worker is
dedicated to stretching either the CPU, memory or HDD usage of a single work
node. Basically, Stress works as follows (a minimal sketch of these worker loops
is given after the list):
• a CPU-stress worker persistently carries out sqrt() operations on random
variables;
¹ Connect-4 is a two-player game which is normally played on a 7x6, 8x7, 9x7, or
10x7 board. See http://en.wikipedia.org/wiki/Connect_Four for more details.
² Another popular MIPS-normalized representation of the Dhrystone benchmark's
result is the DMIPS (Dhrystone MIPS). It is calculated by normalizing the result
against the number of Dhrystones per second (1757) of a 1 MIPS machine (the
VAX 11/780), i.e. dividing the Dhrystone score by 1757.
• a memory-stress worker repeatedly malloc()s a certain amount of memory, as
specified by the user, and then fills it with random data before free()ing it;
• an HDD-stress worker frequently writes random data to the disk.
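The following minimal Python sketch is our own approximation of these three
worker loops (Stress itself is implemented in C; the names and sizes here are
illustrative):

    import math
    import os
    import random

    def cpu_worker():
        """Spin on sqrt() of random values, like a CPU-stress worker."""
        while True:
            math.sqrt(random.random())

    def memory_worker(size_bytes):
        """Repeatedly allocate a buffer, fill it with random data, free it."""
        while True:
            buf = bytearray(os.urandom(size_bytes))  # malloc() + fill
            del buf                                  # free()

    def hdd_worker(chunk_bytes=1 << 20):
        """Keep writing random 1MB chunks to disk, then unlink the file."""
        while True:
            with open("stress.tmp", "wb") as f:
                for _ in range(1024):
                    f.write(os.urandom(chunk_bytes))
            os.unlink("stress.tmp")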
3.5.4 Customized scripts
Besides the two stress workload generators described above, we also wrote our
own scripts, which perform divisions on random numbers to mimic generic,
non-stressful workloads. The time interval between operations is automatically
adjusted so that the workload, i.e. the CPU usage, increases gradually and then,
after it reaches 100%, decreases in a similar way.
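A minimal Python version of such a ramping script could look as follows (our
reconstruction; the original scripts are not reproduced in this thesis):

    import random
    import time

    def ramp_load(step_s=10.0, steps=20, cycle_s=0.1):
        """Ramp the CPU load up to ~100% and back down by adjusting the
        busy/idle ratio within a fixed duty cycle."""
        targets = [i / steps for i in range(steps + 1)]
        for target in targets + targets[::-1]:
            step_end = time.time() + step_s
            while time.time() < step_end:
                busy_until = time.time() + cycle_s * target
                while time.time() < busy_until:
                    _ = random.random() / (random.random() + 1e-9)  # divisions
                time.sleep(cycle_s * (1.0 - target))  # idle for the rest

    if __name__ == "__main__":
        ramp_load()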
3.6 Virtualization
Our virtualization is done with the Kernel-based Virtual Machine (KVM) [22][40],
which is a full virtualization solution for Linux on supported x86 hardware. A
guest VM running with KVM lives, in principle, as a regular Linux process on its
host.
In our experiments, all of the VMs are configured with the same amount of
virtual memory but with different numbers of virtual CPUs (vCPUs). For each VM
we allocate 20GB of memory out of the 24GB of available physical memory, in
order to get the best performance from Linpack. The number of vCPUs of a VM
varies over 1, 2, 4, 8 and 16, where the last case is an over-committed VM,
since it has more vCPUs than available physical cores.
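For reference, the resulting set of VM configurations can be written down
compactly (our own Python notation, not an OpenNebula template):

    # Each VM: 20GB of virtual memory out of 24GB physical, varying vCPU count.
    VM_CONFIGS = [
        {"name": f"vm-{n}vcpu", "vcpus": n, "memory_mb": 20 * 1024}
        for n in (1, 2, 4, 8, 16)  # 16 vCPUs over-commits the 8 physical cores
    ]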
Chapter 4
Profiling VMs' power consumption
In the previous chapter, we elaborated on the test environment, which is used in
this chapter to profile the power behavior of VMs and hosts. As presented in
Section 3.2, there are three power metrics to express the energy profile of a
system:
• Power (W), which is the consumed wattage as reported by the power meter;
• Power efficiency (GFlops/W), which is the system performance expressed in
GFlops divided by the power;
• Energy (kJ), which is the power integrated over the execution time.
The ultimate goal of green computing is to reduce the total energy consumed
while running applications; however, not all metrics can be measured for all
applications. For instance, it is usually not feasible to obtain the energy
information for applications that run indefinitely on a resource. In such cases,
an energy-aware scheduler will have to base its decisions on power, and possibly
power efficiency, alone. For applications with a limited execution time, a
scheduler can use the energy metric in its optimization process.
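Expressed as a decision rule (a minimal sketch with our own hypothetical names):

    def optimization_metric(execution_time_known):
        """Which metric an energy-aware scheduler should optimize for.

        Jobs with a definite execution time can be optimized for total
        energy; indefinitely running services (e.g. a hosted web server)
        fall back to power and power efficiency.
        """
        return "energy" if execution_time_known else "power efficiency"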
All of the tests we performed can be classified into two categories, component
tests and overall tests, as summarized in Tables 4.1 and 4.2 respectively. Each
benchmark runs on both the SUT and virtual machines with different numbers of
vCPUs, as explained in Section 3.6. We call a VM a Guest VM when it is
configured with the same number of vCPUs as the number of available physical
cores, i.e. 8 in our case.
Table 4.1: Summary of component benchmarks

Component  Test type      Workload  Metric
CPU        CPU usage      Script    Power consumption
           Freq. scaling  Linpack   Performance, power efficiency, energy consumption
           Different VMs  Linpack   Performance, power efficiency, energy consumption
Memory     # of workers   Stress    Power consumption
           Memory usage   Stress    Power consumption
HDD        Timeline       Stress    CPU usage, power consumption
Table 4.2: Summary of overall benchmarks

Test type                 Workload                 Metric
Floating-point operation  Linpack                  Performance, power efficiency, energy consumption
Integer operation         Dhrystone & Fhourstones  Performance, power efficiency, energy consumption
Hyper-Threading           Linpack                  Performance, power efficiency, energy consumption
In the remainder of this chapter, we first profile the contribution of each
major hardware component, i.e. the CPU, memory and HDD, to the total power
consumption of the SUT in Section 4.1. We then continue with the overall tests,
to explore the 'green-ness' of both floating-point and integer operations on the
SUT. Finally, we study the impact of Hyper-Threading and finish with a
discussion of all benchmark results.
4.1 Component benchmarks
In this section, we separately examine the impact of each individual component
of the SUT (i.e. CPU, memory or HDD) on the total power consumption. As this
knowledge will be used to build our power model in Chapter 5, we will
particularly focus on their variations.
4.1.1 CPU
The process of power profiling for the CPU is divided into two parts: the CPU
usage test and the CPU frequency scaling test. The CPU usage test characterizes
the variation of the total power consumption with respect to the CPU usage,
while the frequency scaling test examines the energy efficiency of the SUT at
different CPU frequencies.
CPU usage test
The CPU usage test measures the total power consumption of the SUT with respect
to its CPU usage. For our test machine, a symmetric multiprocessor (SMP) system
with 8 cores, we vary the CPU usage in the following two ways:
• Case I: vary the workload on all available cores and take their average value
as the CPU usage;
• Case II: change the number of cores being used and stretch each used core to
its maximum usage immediately when it starts up.
Dynamic CPU frequency scaling on our SUT is handled by the Enhanced Intel
SpeedStep technology; the Linux kernel switches to the highest frequency
immediately when the load reaches a threshold. With frequency scaling enabled,
we observed a clear turning point for Case I when the CPU usage reached ∼65%, as
shown in Figure 4.1(a).
For the case with a fixed CPU frequency, we chose Turbo mode, in which the CPU
frequency is fixed at its maximum value, i.e. 2.66GHz for the Intel E5620 in our
case. Turbo mode uses Intel's TurboBoost technology [41], which enables the
processor to run above its base operating frequency (2.40GHz in our case) via
dynamic control of the CPU's clock rate. In this case the total power
consumption is nearly linear in the CPU load, except for a sharp increase when
the CPU load goes from idle to ∼5%, as shown in Figure 4.1(a).

Figure 4.1: Power consumption versus CPU usage. (a) Case I: gradual increase of
the CPU loads on all available cores. (b) Case II: gradual increase of the
number of cores, where each core is at its maximum usage.
Figure 4.1(b) shows the results for Case II. In this case, the CPU frequency of
each used core jumps to Turbo mode immediately when it starts up, because it is
stretched to 100% usage. The remaining idle cores stay at the lowest frequency
(i.e. 1.60GHz in our case). According to the results shown in Figure 4.1(b), the
power consumption also grows linearly with the number of threads, up to the
point where we start to have more than 8 threads, because both the number of
physical cores and the number of vCPUs of the Guest VM are 8. We observe a
negligible difference in power consumption between the host and the Guest VM,
since both of them stretch the usage of each used core to its limit.
CPU frequency scaling test
In this series of tests we disabled dynamic CPU frequency scaling and manually
varied the CPU frequency among the frequencies available on our SUT. The
benchmark results are shown in Figure 4.2. The maximum available frequency
supported by our SUT, namely 2.66GHz, corresponds to Turbo mode.

Figure 4.2(a) presents the power consumption and performance of our experiments
on the SUT and the guest VM. We observe that the guest VM has nearly the same
power consumption as the host, but worse performance. As explained in
Section 3.5, the complexity of the Linpack benchmark is determined by its
problem size (i.e. N) and limited by the amount of memory available on the
machine; the results of these two sets of benchmarks are therefore comparable,
because they have the same problem size (N = 45000) and a memory usage of
∼16GB. Since the CPU is fully utilized, we conclude that the performance
degradation in the Guest VM comes from the virtualization alone. Other studies
[42][28] have shown a similar pattern for the Linpack benchmark on KVM VMs: the
processing efficiency of KVM for floating-point operations is lower than the
host's, as KVM checks for every executing instruction whether it is an
interrupt, a page fault, I/O or a common instruction, in order to decide whether
to exit from guest mode or to stay in it.
(a) Power & Performance
(b) Power efficiency (c) Total energy consumption
Figure 4.2: CPU frequecy scaling benchmark
Another important result of this series of benchmarks concerns the total energy
consumed in each experiment. As shown in Figures 4.2(a) and 4.2(b), performance
and power efficiency improve almost linearly as the frequency scales up, while
the total energy consumption decreases (see Figure 4.2(c)) in a similar manner.
Notice that with the CPU set to Turbo mode, the SUT consumes more power than in
all other cases; however, it takes less time to complete the Linpack benchmark,
which ultimately results in less energy used. We therefore come to the general
conclusion that it is 'greener' to set the CPU to Turbo mode for CPU-bound
workloads.
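The arithmetic behind this conclusion is simply \(E = P \cdot t\). With purely
illustrative numbers (not measurements from our SUT), a Turbo-mode run drawing
250 W for 950 s consumes less energy than the same job at a low frequency
drawing 200 W for 1500 s:

\[
250\,\mathrm{W} \times 950\,\mathrm{s} = 237.5\,\mathrm{kJ}
\;<\;
200\,\mathrm{W} \times 1500\,\mathrm{s} = 300\,\mathrm{kJ}.
\]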
We also measured the idle power consumption of the SUT at different CPU
frequencies. Through our experiments we have seen that the idle power
consumption remains constant, regardless of the CPU frequency; the value for our
SUT is ∼90W across the whole frequency range. However, as we also observed in
the CPU usage test in Figure 4.1(a), there is a sharp increase in power
consumption when the CPU usage increases up to ∼5% at a higher frequency. It is
therefore advisable to scale down the CPU frequency when the work node is idle
or lightly loaded, in order to be more energy-efficient.
4.1.2 Memory
We performed two types of tests to quantify the impact of the memory on the
energy profile of the SUT and the guest VM:
• Worker tests, where we vary the number of workers spawned by the Stress
benchmark (see Section 3.5);
• Memory usage tests, where we gradually increase the size of the memory
allocated by the malloc() call of each Stress worker, from 1GB to 18GB in total.
In the worker test, we observe again in Figure 4.3(a) that the increase in power
consumption levels off when the number of threads equals the number of physical
cores on the host. In the memory usage test, reported in Figure 4.3(b), less
power is consumed by the VM than by the host for the same test, and the total
power consumption remains nearly constant, regardless of the memory usage. From
Figure 4.3(b) we also see that the variation of the total power consumption with
respect to the memory usage is less than ∼5W.
Figure 4.3: Memory stress tests: (a) with various numbers of workers; (b) with
different memory usage.
Furthermore, we have tried to separate the CPU's and the memory's contributions
to the total power consumed by the host, using the Stress tool described in
Section 3.5. During the memory usage tests the CPU is fully occupied by system
threads performing malloc(), reading/writing and then free() operations. By
combining the measurements reported by the CPU usage test and the memory usage
test for the host, we can estimate the power consumed by the memory, as shown in
Figure 4.4. In the figure we see a separation of no more than 5W over the whole
range of numbers of threads.

Figure 4.4: CPU and memory stress tests on the host
A recent DARPA-commissioned study on the challenges of ExaFLOP computing reports
in [43] that the power needed by memory systems remains constant regardless of
the workload, but that this power is proportional to the number of memory chips.
Our benchmark results presented above verify that the variation in the power
consumption of the memory system is negligible compared to the total power
consumption. Therefore, we will treat it as a constant throughout our benchmarks
and incorporate it into the idle power consumption in future research.
4.1.3 HDD
For our HDD stress tests, 8 workers are spawned spinning on write()/unlink()
operations, with each worker writing chunks of 1MB of random data to a temporary
file until it reaches 1GB, and then unlink()ing it. In these tests we observed
little memory usage but high (system) CPU usage. Figure 4.5 shows the SUT's CPU
usage and total power consumption with respect to time.
Figure 4.5: HDD stress tests on the host
In the timeline plot in Figure 4.5, we observe a strong relation between the
total power consumption and the CPU usage, i.e. the total power consumption
scales up when high CPU usage is observed. However, both the CPU usage and the
power consumption change quickly and dramatically, which makes it difficult to
quantify the impact of the HDD on the total power consumption. Therefore, when
building the power models in Chapter 5, we will put aside the impact of the HDD
and focus on the CPU and memory, by restricting our analysis to the scope of
CPU- and memory-intensive applications.
4.2 Overall benchmarks
The overall benchmarks examine the overall performance of VMs, including the
performance of both floating-point and integer operations. At the end, we also
examine how the Hyper-Threading (HT) technology affects the SUT's overall
performance.
4.2.1 Floating-point operation performance
In our experiments, VMs are configured with different numbers of vCPUs, as
explained in Section 3.6. We profiled the performance, the power consumption,
the power efficiency and the total energy of the different VMs by varying the
number of threads used in the Linpack benchmarks. Within this series of tests,
there are thus two variables:
• the number of threads running the Linpack benchmark;
• the number of vCPUs of a VM.
The actual number of physical cores involved in a benchmark is determined by the
minimum of these two values, bounded by the number of available physical cores
(i.e. 8 in our case).
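In other words, writing \(n_t\) for the number of Linpack threads and \(n_v\)
for the number of vCPUs of the VM under test:

\[
n_{\mathrm{cores}} = \min(n_t,\; n_v,\; 8).
\]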
Figure 4.6 shows our results. We see that all measured parameters increase until
they reach a plateau when the number of threads equals the number of vCPUs, for
all non-overcommitted cases. Besides that, the performance increases linearly
with the number of involved physical cores (see Figure 4.6(a)); so does the
power consumption (Figure 4.6(b)). However, this is not the case for the power
efficiency (Figure 4.6(c)) and the energy (Figure 4.6(d)). We will explain this
phenomenon in detail with our power model in Chapter 5.
Our outputs also indicate that virtualization results in a fixed amount (∼30%)
of overall performance degradation with respect to the Linpack benchmark.

Figure 4.6: Linpack tests on different VMs. (a) Performance. (b) Power
consumption. (c) Power efficiency. (d) Energy.

Another interesting result in our experiments comes from the over-committed VM
with 16 vCPUs. When 16 threads are used to run the Linpack benchmark, it
performs less satisfactorily than the same case on the VM with 8 vCPUs. Even
though it consumes less power, its execution time (see Table 4.3) is around 13
times longer, which further leads to much more energy being consumed in the
test, as shown in Figure 4.6(d). This is in line with what is known about the
performance degradation when over-committing symmetric multiprocessing guests
with KVM, which is caused by dropped requests and unusable response times [44].
Table 4.3: Execution time of the Linpack benchmark on different VMs

# of threads     1     2     4     8      16
Host           6279  3236  1764   951     964
VM 16 vCPUs    9102  4601  2387  1342  109683
VM 8 vCPUs     8992  4529  2346  1307    1321
VM 4 vCPUs     8982  4523  2340  2356    2365
VM 2 vCPUs     8961  4516  4544  4543    4543
VM 1 vCPU      9146  8992  8965  8994    8975
4.2.2 Integer operation benchmark
In this section we examine the integer operation performance of our SUT, as a
complement to its floating-point operation performance evaluated in the previous
section.
Dhrystone
Specifically, Dhrystone v2.1 from the UnixBench benchmark suite [45] is used in
this series of benchmarks. Even though Dhrystone v2.1 was originally designed as
a single-threaded application, we varied the number of Dhrystone instances
running concurrently on either the host or the Guest VM.
Figure 4.7: Dhrystone benchmark on the host and the Guest VM with different
numbers of threads: (a) on the host; (b) on the Guest VM.
Figure 4.7 presents the power consumption and performance of Dhrystone running
on the host and on the Guest VM respectively. As shown in Figure 4.7(a), the
power usage is unstable, which makes it less meaningful to go on and calculate
the power efficiency in the usual way.
Fhourstones
The Fhourstones benchmark is also a single-threaded application with a small
code size. Thus, in this section, we perform the benchmark only on the host and
on the Guest VM. The benchmark results are presented in Table 4.4.

Even though we still observed ∼7% performance degradation in the Guest VM, this
is much less than in the floating-point operation (i.e. Linpack) benchmarks
described in Section 4.2.1. Virtualization is also ∼7% less energy-efficient,
according to the statistics in Table 4.4.
Table 4.4: Performance of the Fhourstones benchmark on the host and guest VM

          Performance  Execution  Power   Power efficiency  Energy
          (KPOS/s)     time (s)   (W)     (KPOS/s/W)        (kJ)
Host      8013         209.5      95.83   83.62             20.08
Guest VM  7480         224        96.42   77.58             21.60
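As a consistency check, the derived columns follow directly from the measured
ones; for the host row:

\[
\frac{8013\ \mathrm{KPOS/s}}{95.83\ \mathrm{W}} \approx 83.62\ \mathrm{KPOS/s/W},
\qquad
95.83\ \mathrm{W} \times 209.5\ \mathrm{s} \approx 20.08\ \mathrm{kJ}.
\]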
4.2.3 Impact of Hyper-Threading
By enabling multiple threads to run on each core simultaneously, Hyper-Threading
(HT) technology improves the overall performance of the CPU and uses it more
efficiently, especially for threaded applications. In the previous benchmarks,
HT was disabled by default on our SUT. In this section, we enable HT and explore
its impact on the overall performance of the SUT.
We examined its impact for both non- and over-committed VMs and focused
on their floating-point operation performance. The same VMs are used in this
series of power benchmarks in order to make the results comparable with our previous findings.
Non-overcommitted case
In this experiment the guest VM is used to perform the Linpack benchmark while HT is enabled on the host. Figure 4.8 presents the results of the Linpack benchmark running on the guest VM and on the host.
With HT enabled on the host, the host's Linpack performance differs significantly, as shown in Figure 4.8(a). Intel suggests in [46] that HT is better disabled for compute-efficient applications, because there is little to be gained from HT technology if the processor's execution resources are already well used. What is worse, spawning a second process on the same physical core forces physical resources such as the cache to be shared. If that happens, more cache misses may occur, further degrading performance. This issue has also been discussed in [47], which reaches the same conclusion. Another possible explanation is that the host OS is not aware of the HT technology on the underlying hardware. In that case, the thread scheduler of the host OS may treat the doubled virtual cores equally and schedule, for example, 8 application threads onto 4 physical cores, which halves the performance.
Figure 4.8: Impact of Hyper-Threading for Linpack tests on the non-overcommitted VM (with 8 vCPUs) and the host: (a) performance, (b) power consumption, (c) power efficiency, (d) energy
However, the guest VM is surprisingly unaffected by HT technology according to the results presented in Figure 4.8, where the two data series of the guest VM almost overlap for all four metrics.
Over-committed case
Figure 4.9 presents the results both with and without HT on the host. With HT enabled, the over-committed VM (with 16 vCPUs) shows significant increases in performance, power consumption and power efficiency compared with the case where HT is disabled. Moreover, HT technology enables much more efficient scheduling of vCPUs, which results in a great improvement in total energy consumption, as shown in Figure 4.9(d).
Moreover, comparing the cases where Linpack runs fewer than 16 threads, we also observed a slight improvement in performance and power efficiency when HT is enabled on the host, even though almost the same power is consumed (see Figure 4.9(b)).
Figure 4.9: Impact of Hyper-Threading for Linpack tests on the overcommitted VM (with 16 vCPUs): (a) performance, (b) power consumption, (c) power efficiency, (d) energy
When running Linpack with more than 16 threads (e.g. 24 threads), significant performance degradation was observed when HT is disabled on the host, but not when HT is enabled. However, when over-committing the host with more than 16 vCPUs, we experienced dramatic performance degradation regardless of whether HT is on or off. We therefore conclude that, on an 8-core machine with HT, any number of threads can be handled, but no more than 16 vCPUs should be committed.
Integer operation performance
In this series of experiments, we performed Dhrystone benchmarks on the host, a guest VM with 8 vCPUs and an over-committed VM with 16 vCPUs, and varied the number of parallel Dhrystone instances from 1 up to 16. Figure 4.10 presents the results with HT on and off for the Dhrystone benchmarks.

Figure 4.10: Impact of HT technology for Dhrystone benchmarks
We observed that performance increases linearly up to 8 Dhrystone instances and then reaches a plateau. HT technology has no significant impact on performance. Moreover, no significant performance degradation was experienced in the virtualized environments.
4.3 Summary
Our experiments showed that the idle power consumption of the SUT remained
flat with respect to the CPU frequency. However we observed a steep rise when
the CPU load went from idle to 5% (see Figure 4.1(a)). It is therefore our primary
recommendation to maintain CPU frequency scaling in all systems. The effect
of scaling is significant until the CPU load reaches ∼65%. Furthermore, we consider variations of clock speed to be local optimizations, which are not of great interest when aiming, as we do, for system-level optimization.
The performance of floating-point operations (as shown by the Linpack benchmarks) and integer operations (as shown by the Dhrystone benchmarks) is linear in the amount of CPU resources used by applications. Thus we arrive at a performance model that can be represented as:

P = cp·Ucpu    (4.1)

where Ucpu is the CPU usage and cp is the performance parameter.
We also observed that the total power consumption of the SUT is linear in the CPU load (see Figure 4.1). The contribution of the memory to the total power consumption is ∼5W for all applications, which is negligible since it accounts for only ∼5.5% of the idle power consumption (90W) or ∼3.5% of the maximum power consumption (140W). As for the HDD, we keep it in the power model for the moment, since it is difficult to quantify the HDD's impact on the total power consumption, as presented in Section 4.1.3. Therefore, by integrating the power consumption of memory into c0 in Equation 6.1 and calling it Pidle, we believe that the power model proposed by Bohra et al. (see Equation 6.1) can be modified to:

Ptotal = Pidle + c1·Ucpu + c3·Uhdd    (4.2)
where Pidle is the idle power consumption of the host, Ucpu and Uhdd are the usage of the CPU and HDD respectively, and c1 and c3 are the power parameters for the CPU and HDD. A next step is to add the contribution of hardware accelerators, such as GPUs, to this simplified formula.
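To make Equation 4.2 concrete, the minimal Python sketch below evaluates it. The value of c3 is a hypothetical placeholder; in practice both c1 and c3 have to be fitted per node from measurements such as ours.

def node_power(u_cpu, u_hdd, p_idle=90.0, c1=7.5, c3=5.0):
    """Equation 4.2: total node power as a linear function of CPU and HDD usage.

    u_cpu is the CPU usage summed over all cores (0..Ncpu);
    u_hdd is the HDD utilization (0..1). c1 and c3 are illustrative values.
    """
    return p_idle + c1 * u_cpu + c3 * u_hdd

# A fully loaded 8-core node with light disk I/O:
print(node_power(u_cpu=8.0, u_hdd=0.2))  # 90 + 7.5*8 + 5.0*0.2 = 151.0 W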
Though HT technology is of little help for systems with fewer running threads than physical cores, since there are enough cores to host the running threads, it is of great help in virtualized environments, especially for over-committed VMs. Therefore we recommend keeping HT enabled so that we can take advantage of over-committing. When over-committing VMs on lightly-loaded hosts, fewer work nodes are needed to host all applications, and energy can be saved by powering off the unneeded idle machines. However, it is not wise to overload a work node, since this will significantly degrade performance. We will discuss this in detail in the next chapter.
Chapter 5
Towards energy-efficient scheduling
After having thoroughly profiled the power characteristics of both VMs and hard-
ware components in the previous chapter, we are able to provide our novel power
models for green VM scheduling in this chapter.
5.1 The power model
A power model is a mathematical description of power behavior. We specifically focus on power models for CPU- and/or memory-intensive applications, in which case the CPU and memory are the only two subcomponents whose resource usage may vary.
5.1.1 Prerequisites & Assumptions
We start the development of our power models from several prerequisites, which come from the profiling results in Chapter 4, and from some basic assumptions. The following essential conclusions, which we call prerequisites later on, come from the work in Chapter 4:
Prerequisite 1. The performance of a work node (in terms of GFlops) is linear in the CPU usage, as expressed in Equation 5.1:

P = cp·Ucpu    (5.1)

where Ucpu is the CPU usage and cp is the per-core performance parameter. For a multi-core system with Ncpu cores, we calculate Ucpu as the sum of the usage of all cores, i.e. 0 < Ucpu ≤ Ncpu. To calculate the value of cp, suppose the maximum performance Pmax is achieved when Ucpu = Ncpu; then cp = Pmax/Ncpu.
Prerequisite 2. The variation in the power consumed by memory is negligible; therefore the total power consumption of a work node (host) running CPU- and/or memory-intensive applications is linear in its CPU usage, as shown in Equation 5.2:

P = Pidle + ce·Ucpu    (5.2)

where Pidle and Ucpu are the idle power consumption and the CPU usage respectively, and ce is the per-core power parameter. Since the maximum power consumption Pmax is reached when Ucpu = Ncpu, we have ce = (Pmax − Pidle)/Ncpu.
Prerequisite 3. In virtualized environments (i.e. VMs), a VM has nearly identical power characteristics to its host, but achieves less performance. Therefore for VMs

c′e = ce and c′p < cp

where c′e and c′p are the per-core power and performance parameters of VMs respectively. Note that the value of c′p depends solely on the host and has no significant relation to the VM's configuration, because c′p is also a per-core performance parameter. For example, a VM with 2 vCPUs has the same c′p as another VM with 4 vCPUs if they run on the same host.
Prerequisite 4. Over-committing on a work node will not cause any additional
performance degradation for applications running on it if the work node is not
overloaded, especially when HT technology is enabled on the work node.
To connect with our work in Chapter 4, Table 5.1 provides the sample values of Pidle, ce and cp for our test machine.

Table 5.1: Sample values of Pidle, ce and cp for our test machine

        Pidle   Pmax (power)   Pmax (host perf.)   P′max (VM perf.)   cp          ce
Value   90W     150W           75GFlops            52GFlops           9.4GFlops   7.5W
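The per-core parameters follow directly from these measured extremes. A small sketch reproducing the table's values for our 8-core SUT (a sanity check, not part of the scheduler):

N_CPU = 8                             # physical cores of the SUT
P_IDLE, P_MAX_POWER = 90.0, 150.0     # idle and maximum power (W)
PERF_MAX_HOST = 75.0                  # maximum host performance (GFlops)

c_p = PERF_MAX_HOST / N_CPU           # per-core performance: 75/8 ≈ 9.4 GFlops
c_e = (P_MAX_POWER - P_IDLE) / N_CPU  # per-core power: (150-90)/8 = 7.5 W
print(c_p, c_e)                       # 9.375 7.5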
Besides the prerequisites that are obtained from the power profiling, we also
have several practical assumptions to establish our power models.
Assumption 1. A VM is shut down immediately when it is idle, i.e. when no applications are running within it. A work node is powered off (i.e. put into a sleep state) immediately when no VMs are running on it. We also assume that the cost of 'waking up' a work node is negligible.
Assumption 2. We assume that the idle power consumption Pidle and the performance and power parameters (cp and ce respectively) mentioned in Prerequisites 1 and 2 are hardware-dependent only. In other words, each work node has its own values of Pidle, cp and ce, but all applications running within a single work node share the same cp and ce.
When establishing the power models below, we do not distinguish between VMs and other applications running on the host, because under KVM virtualization a VM lives as a normal application. The difference is that a VM's performance is defined by how efficiently, compared with the host, an application can run within the VM. While applications run within a VM, the total computations (e.g. in terms of GFlops) are determined by the computations of the applications plus the overhead caused by the VM. In the previous chapter we observed significant but steady performance degradation for applications running in VMs. Therefore, to simplify matters, we calculate the computations of a VM (together with all applications running within it) as follows.
Definition 1. The computations of a VM, together with all applications running within it, are defined as the total computations of all applications running on the VM scaled by cp/c′p, where c′p and cp are the performance parameters of the VM and of its host respectively, and c′p < cp. For example, if the total computations of all applications running on a VM are G, then the computations of the VM are

G′ = (cp/c′p)·G
since the performance of the VM is discounted by c′p/cp compared to the host. Therefore, if an application with (original) computations G runs on a VM, the VM's computations are equivalent to those of an application with computations G·cp/c′p running on the host, because their execution times are identical. Conversely, if a VM has computations G, all applications running within it have total computations G·c′p/cp. In this way we can treat the VM (together with all applications running on it) as a normal application running on the host. With this definition, we can uniformly establish the models for a single work node and for a cloud system, regardless of whether the applications are VMs or not.
5.1.2 Power model for a single work node
Theorem 1. For an application (e.g. a VM) with a fixed amount of computations G (e.g. in terms of Giga floating-point operations) running exclusively on a single work node with idle power consumption Pidle, power parameter ce and performance parameter cp, the total energy consumption of the work node during the application's runtime T can be expressed in the form of Equation 5.3, regardless of the dynamic CPU usage during the runtime:

Enode = Pidle·T + (ce/cp)·G    (5.3)
Proof. Suppose we start the application at t = 0 and the CPU usage at time t is Ucpu(t). According to Prerequisite 1, the total amount of computations G is

G = ∫_0^T cp·Ucpu(t) dt  ⇒  ∫_0^T Ucpu(t) dt = G/cp    (5.4)

where T is the runtime of the application. According to Prerequisite 2, the total energy consumption of the work node during the application's lifetime is

Enode = ∫_0^T (Pidle + ce·Ucpu(t)) dt = Pidle·T + ce·∫_0^T Ucpu(t) dt

Substituting Equation 5.4, we finally get

Enode = Pidle·T + (ce/cp)·G
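As a sanity check with the sample values of Table 5.1 (an illustrative calculation, not a measurement): an application of G = 4500 GFlop that keeps all 8 cores busy finishes in T = G/(cp·Ncpu) = 4500/75 = 60 s, giving Enode = 90·60 + (7.5/9.4)·4500 ≈ 9.0 kJ. If the same application only ever uses four cores, T doubles while the second term stays fixed, giving Enode ≈ 90·120 + 3590 ≈ 14.4 kJ; the difference is exactly the Pidle·T penalty that the theorem isolates.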
When multiple applications (e.g. multiple VMs) run on the same work node, the Pidle component spans the lifetimes of all applications, and the power model evolves into the following one.

Corollary 1. When N applications (e.g. N VMs) run on a single work node, where each application has Gi computations during its runtime (t_0^i, t_1^i] (0 < i ≤ N), the total energy consumption of the work node has the form expressed in Equation 5.5:

Enode = Pidle·T + (ce/cp)·Σ_{i=1}^{N} Gi    (5.5)

where T = |⋃_{0<i≤N} (t_0^i, t_1^i]| is the joint lifetime of all applications running within this work node.
Proof. Suppose that at time t application i has CPU usage U_cpu^i(t) if t ∈ (t_0^i, t_1^i] (and U_cpu^i(t) = 0 otherwise); then the total CPU usage at time t is

Ucpu(t) = Σ_{i=1}^{N} U_cpu^i(t)

Similarly, with Prerequisite 1 we have

Gi = ∫_{t_0^i}^{t_1^i} cp·U_cpu^i(t) dt

Therefore, with Prerequisites 1 and 2 and Assumptions 1 and 2,

Enode = ∫_{⋃_{0<i≤N}(t_0^i, t_1^i]} (Pidle + ce·Ucpu(t)) dt
      = ∫_{⋃_{0<i≤N}(t_0^i, t_1^i]} (Pidle + ce·Σ_{i=1}^{N} U_cpu^i(t)) dt
      = Pidle·T + (ce/cp)·Σ_{i=1}^{N} Gi

where T = |⋃_{0<i≤N}(t_0^i, t_1^i]|.
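The joint lifetime T is just the measure of a union of half-open intervals, which a simple sweep over the sorted intervals computes. Below is a minimal Python sketch of Corollary 1; the parameter values come from Table 5.1, and the two workloads are hypothetical.

def joint_lifetime(intervals):
    """Length |U (t0, t1]| of a union of (t0, t1] intervals."""
    total, cur_start, cur_end = 0.0, None, None
    for t0, t1 in sorted(intervals):
        if cur_end is None or t0 > cur_end:   # disjoint: close the current run
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = t0, t1
        else:                                 # overlapping: extend the current run
            cur_end = max(cur_end, t1)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

def node_energy(p_idle, c_e, c_p, apps):
    """Corollary 1: E = Pidle*T + (ce/cp)*sum(Gi); apps = [(G_i, (t0_i, t1_i)), ...]."""
    T = joint_lifetime([t for _, t in apps])
    return p_idle * T + (c_e / c_p) * sum(g for g, _ in apps)

# Two overlapping VMs on the sample machine of Table 5.1:
print(node_energy(90.0, 7.5, 9.4, [(4500.0, (0.0, 60.0)), (2250.0, (30.0, 90.0))]))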
Discussion
From the power model for a single work node above, we notice that the total energy consumption of a work node decomposes into two components: a static and an application-dependent energy consumption.

The first corresponds to the idle power consumption of the work node across the joint lifetime of all applications (i.e. Pidle·T). It is the minimum energy that a work node has to consume when running applications. Though it has no direct relation with the work node's performance and power efficiency, the joint lifetime T is implicitly determined by the work node's performance.

The second component, the application-dependent energy consumption, is determined by the total computations of all applications, regardless of their dynamic CPU usage during their runtime. This also means that overloading the work node neither benefits nor harms this component of the energy consumption; however, overloading may degrade the performance of each application, which results in a longer T in the static component mentioned above. Overloading should therefore be avoided when making scheduling decisions.
Therefore the energy efficiency (i.e. ’green-ness’) of a work node is jointly
defined by the collection of (Pidle, ce, cp). It is greener to run applications on work
nodes with high cp and low Pidle and ce/cp.
5.1.3 Power model for a cloud system
In the analysis above, we notice that the energy-efficiency characteristics of a work node i are directly identified (with Assumption 2) by

wn_i = (Pidle_i, c_i, cp_i)    (5.6)

where c_i = ce_i/cp_i. cp_i is included as the third element because it has a direct impact on the execution time of an application. Similarly, an application i can be identified as

app_i = (G_i, T_i)    (5.7)

where T_i = (t_0^i, t_1^i] represents the lifetime of the application, i.e. the application starts at t = t_0^i and finishes at t = t_1^i; it is influenced by the host's cp and the amount of CPU resources the application occupies. Therefore a cloud with M work nodes and a collection of N applications can be represented as

C = {wn_i | 0 < i ≤ M} and APPS = {app_i | 0 < i ≤ N}

respectively. The placement of applications on the cloud then becomes a many-to-one mapping between APPS and C:

f : app_i(G_i, T_i) → wn_i(Pidle_i, c_i)    (5.8)

With these symbols, the total energy consumption of a cloud with M work nodes and N applications running on them is calculated as

Ecloud = Σ_{i=1}^{M} Enode_i
       = Σ_{i=1}^{M} ( Pidle_i · |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| + c_i · Σ_{j ∈ {j | f(app_j) = wn_i}} G_j )    (5.9)

where |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| represents the joint lifetime of all applications running on work node wn_i.
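Equation 5.9 translates directly into code. The sketch below reuses the joint_lifetime helper from the previous section and represents the placement f as a dictionary from application index to work node index; all names and numbers are illustrative.

def cloud_energy(nodes, apps, placement):
    """Equation 5.9: nodes = [(Pidle, c), ...], apps = [(G, (t0, t1)), ...],
    placement[j] = index of the work node hosting application j."""
    total = 0.0
    for i, (p_idle, c) in enumerate(nodes):
        hosted = [apps[j] for j, wn in placement.items() if wn == i]
        if not hosted:
            continue  # idle nodes are powered off (Assumption 1) and cost nothing
        T = joint_lifetime([t for _, t in hosted])
        total += p_idle * T + c * sum(g for g, _ in hosted)
    return total

# Two nodes, three applications; node 0 hosts apps 0 and 2, node 1 hosts app 1.
nodes = [(90.0, 7.5 / 9.4), (70.0, 6.0 / 8.0)]
apps = [(4500.0, (0.0, 60.0)), (3000.0, (0.0, 45.0)), (1500.0, (30.0, 50.0))]
print(cloud_energy(nodes, apps, {0: 0, 1: 1, 2: 0}))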
5.1.4 The opportunity revisited
With the power model for a cloud system in Equation 5.9, the energy optimization for a cloud with M work nodes becomes

min(Ecloud) = min_f Σ_{i=1}^{M} ( Pidle_i · |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| + c_i · Σ_{j ∈ {j | f(app_j) = wn_i}} G_j )    (5.10)

Notice that the power models elaborated above rest on one practical assumption: a work node is powered off immediately when it is idle (i.e. when no applications or VMs run on it). A green work node is then characterized by a small Pidle and c (i.e. ce/cp), and a large cp.
In Section 1.2 we briefly presented several opportunities for energy-aware computing. Combining them with the mathematical descriptions above, we rephrase them into three basic conceptual directions for placing and migrating VMs across heterogeneous hardware within a data center in an energy-efficient way:

a) Schedule VMs to work nodes with higher energy efficiency, i.e. with smaller ci.

b) Minimize the number of active nodes (i.e. M) by grouping multiple VMs onto a smaller number of work nodes through live migration and/or by exploring the possibility of over-committing lightly-loaded work nodes (see the comparison sketched below).

c) Do not overload a work node: overloading comes with a loss in performance, with which applications may run longer and cost more energy.
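To illustrate direction b), the snippet below compares a spread placement against a consolidated one using the cloud_energy sketch from Section 5.1.3; all numbers are hypothetical, and we assume consolidation does not overload the node, per Prerequisite 4.

# Two identical nodes (Pidle = 90 W, c = 7.5/9.4) and two concurrent applications.
nodes = [(90.0, 7.5 / 9.4), (90.0, 7.5 / 9.4)]
apps = [(3000.0, (0.0, 40.0)), (3000.0, (0.0, 40.0))]

spread = cloud_energy(nodes, apps, {0: 0, 1: 1})        # one app per node
consolidated = cloud_energy(nodes, apps, {0: 0, 1: 0})  # both on node 0, node 1 sleeps
print(spread, consolidated)  # consolidation saves one node's idle energy, 90*40 J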
5.2 Energy-aware scheduler
There are two sub-schedulers which complement each other to achieve the goal of energy-aware scheduling:
Placement scheduler deals with incoming provision requests and distributes them to either active work nodes or newly activated work nodes from the resource pool, according to the energy-aware placement scheduling algorithm.
Migration scheduler makes system-level optimizations according to the energy-aware migration scheduling algorithm to discover energy-saving opportunities through live migration.
A typical working scenario of the energy-aware scheduler is depicted in Figure 5.1. The scenario consists of a service queue of VM provision requests, the placement and migration schedulers, a set of active work nodes, and a resource pool of idle work nodes. The idle work nodes in the resource pool are powered off but ready for provision.
Figure 5.1: Working scenario of energy-aware scheduler
When a sequence of VM provision requests arrives at the front-end of the cloud system, the requests are put into a service queue, and the placement scheduler is triggered to decide the mapping of VMs to work nodes for the purpose of green computing. The details of the placement scheduler are explained in Section 5.2.2. During placement scheduling, new VMs are placed without interfering with existing VMs, and no live migration is involved. The opportunity offered by live migration is explored by the migration scheduler (see Section 5.2.3), where a global optimization is issued on all active VMs across the entire cloud.
5.2.1 Symbolic description
In the working scenario of the scheduler, there are two basic types of roles: a) VMs being scheduled and b) work nodes (or workers) that the VMs are scheduled to. Their static capabilities and dynamic status can be mathematically described as collections of resources, as presented in Table 5.2.
Table 5.2: Mathematical description of a Work Node (WN) and a VM

          Work Node (WN)           VM
Static    (Ncpu, Pidle, c, cp)     (Nvcpu)
Dynamic   (nvcpu, Ucpu)            (nvcpu, Ucpu)

For a work node: Ncpu is the number of physical CPUs; Pidle is the idle power consumption; c = ce/cp is the energy-efficiency parameter, with ce the power parameter defined in Equation 5.2 and cp the performance parameter; nvcpu is the number of active vCPUs; Ucpu is the current physical CPU usage, 0 ≤ Ucpu ≤ Ncpu.
For a VM: Nvcpu is the number of vCPUs requested; nvcpu is the number of vCPUs actually used by the VM, nvcpu ≤ Nvcpu; Ucpu is the physical CPU usage of this VM, Ucpu ≤ nvcpu ≤ Nvcpu.
With the placement mapping of VMs defined in Formula 5.8, we can calculate the number of active vCPUs on a work node wn as

wn.nvcpu = Σ_{vm ∈ {vm | f(vm) = wn}} vm.nvcpu
All work nodes of the cloud system can be divided into two sets:

C = Cactive ∪ Cidle and Cactive ∩ Cidle = ∅

where Cactive and Cidle are, respectively, the set of active work nodes and the set of idle work nodes in the sleep state (i.e. powered off but ready for provision). They are described as

Cactive = {wn_i | wn_i.Ucpu ≠ 0, wn_i ∈ C} and Cidle = {wn_i | wn_i.Ucpu = 0, wn_i ∈ C}
With Hyper-Threading enabled on a host, the number of (virtual) cores exposed to the hypervisor is doubled, but the maximum performance is limited by the number of physical cores, as profiled in Chapter 4. Though the extra virtual cores are of little help for CPU-intensive applications (and of little harm either), lightly-loaded applications benefit a lot from them through over-committing. Furthermore, the performance differences of over-committed VMs, compared with non-overcommitted ones, are barely noticeable.

Therefore, when designing the scheduling algorithm, we assume HT is enabled on all work nodes, and the maximum number of vCPUs that can still be committed is limited by [Ncpu − Ucpu].
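To make the bookkeeping of Table 5.2 concrete, here is a possible Python rendering of the two roles; the field names follow the table, but this is only a sketch under our assumptions, not the implementation used in this thesis.

from dataclasses import dataclass, field
from typing import List

@dataclass
class VM:
    n_vcpu_req: int        # Nvcpu: number of vCPUs requested
    n_vcpu: int = 0        # nvcpu: vCPUs actually in use, n_vcpu <= n_vcpu_req
    u_cpu: float = 0.0     # Ucpu: physical CPU usage of this VM, u_cpu <= n_vcpu

@dataclass
class WorkNode:
    n_cpu: int             # Ncpu: number of physical cores
    p_idle: float          # Pidle: idle power consumption (W)
    c: float               # c = ce/cp: energy-efficiency parameter
    c_p: float             # cp: per-core performance parameter
    vms: List[VM] = field(default_factory=list)

    @property
    def u_cpu(self) -> float:   # current physical CPU usage, 0 <= u_cpu <= n_cpu
        return sum(vm.u_cpu for vm in self.vms)

    @property
    def n_vcpu(self) -> int:    # number of active vCPUs hosted on this node
        return sum(vm.n_vcpu for vm in self.vms)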
5.2.2 Placement scheduler
The algorithm for the placement scheduler is given as Algorithm 1. The job of this scheduler is to schedule a VM (vm(Nvcpu)) to a suitable work node (wn). We first sort all active work nodes in ascending order of c and all sleeping work nodes in ascending order of Pidle, so that the greenest candidates come first. We then check whether there is an active work node that can host this VM. If so, that work node is returned, since it is the most energy-efficient candidate. Theoretically there is a chance that a sleeping work node with small Pidle and c and large cp exists, such that scheduling the VM to that node instead of an active one would be more energy-efficient; exploiting this is technically unfeasible at the moment, and even if such a node exists, it will be selected with higher priority by the migration scheduler discussed in the next section. If there is no suitable active node to host the VM, we resort to the sleeping work nodes.
If no work node is returned at the end, it may be that all resources are in use. Another possible reason is that the work nodes are partially loaded but none of them meets the VM's requirement. In that case, it is better to trigger the migration scheduler described in the next section to carry out a global optimization.

Algorithm 1 Energy-aware placement scheduling algorithm
Input: vm(Nvcpu), the VM being scheduled
Output: wn, the work node the VM is scheduled to
 1: Sort all wn ∈ Cactive in ascending order of c
 2: Sort all wn ∈ Cidle in ascending order of Pidle
 3: for all wn_i ∈ Cactive do
 4:   if wn_i.Ncpu − wn_i.Ucpu ≥ vm.Nvcpu then
 5:     return wn_i
 6:   end if
 7: end for
 8: for all wn_i ∈ Cidle do
 9:   if wn_i.Ncpu ≥ vm.Nvcpu then
10:     return wn_i
11:   end if
12: end for
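A straightforward Python sketch of Algorithm 1, building on the WorkNode and VM classes sketched in Section 5.2.1; the helper name place and its list arguments are our own illustrative choices.

from typing import Optional

def place(vm: VM, c_active: List[WorkNode], c_idle: List[WorkNode]) -> Optional[WorkNode]:
    """Algorithm 1: energy-aware placement of a single VM."""
    # Greenest active nodes first: ascending energy-efficiency parameter c.
    for wn in sorted(c_active, key=lambda n: n.c):
        if wn.n_cpu - wn.u_cpu >= vm.n_vcpu_req:
            return wn
    # Otherwise wake the sleeping node with the lowest idle power consumption.
    for wn in sorted(c_idle, key=lambda n: n.p_idle):
        if wn.n_cpu >= vm.n_vcpu_req:
            return wn
    return None  # no candidate: the caller may trigger the migration scheduler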
More sophisticated decisions may be made (as future work) to determine the value of Ucpu; one suggestion is to take the one-minute or five-minute average CPU load as its value.
An over-committed work node may become overloaded at some point, because every VM running on it has the right to use as many physical cores as it has requested in terms of vCPUs. Therefore another algorithm is needed to reschedule workloads to other lightly-loaded work nodes by means of Algorithm 1.
Before we go into the details of the algorithm, we first determine in which situation a work node is overloaded in terms of CPU usage. Intuitively, a work node is overloaded if its physical CPU usage exceeds a pre-defined threshold T. But we should also be aware that it is useless to reschedule the workloads of a non-overcommitted work node even if it is overloaded. For example, 8 single-threaded CPU-intensive applications (VMs) running on an 8-core machine will definitely overload the work node, but it is unnecessary to migrate one or more VMs to other work nodes.
Algorithm 2 describes the process of rescheduling workloads.

Algorithm 2 Replacement scheduling algorithm for overloaded work nodes
Input: wn, the work node that is overloaded
Input: T, the threshold that defines overloading
1: Let VM = {vm_i | f(vm_i) = wn} be the collection of all VMs running within wn
2: Sort all vm ∈ VM in ascending order of vm.Ucpu
3: while wn.Ucpu ≥ T do
4:   Let vm = VM[1] be the first element of VM
5:   Apply Algorithm 1 to vm with input parameter vm.Ucpu
6:   wn.Ucpu ← wn.Ucpu − vm.Ucpu
7:   VM ← VM \ {vm}
8: end while

The VM with the least CPU usage has the highest priority to be migrated to other work nodes,
even though this may require multiple migrations compared to the other way
around. The reason is that this VM is more likely to fit into other work nodes
because it requires fewer vCPUs, especially when we consider its actual CPU usage instead.
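A sketch of Algorithm 2 in the same style, reusing the place function above. Sizing the migrated VM by its rounded actual CPU usage is an assumption on our part, and the live migration itself is abstracted away.

def reschedule_overloaded(wn: WorkNode, threshold: float,
                          c_active: List[WorkNode], c_idle: List[WorkNode]) -> None:
    """Algorithm 2: migrate the lightest VMs away until wn drops below the threshold."""
    wn.vms.sort(key=lambda vm: vm.u_cpu)            # lightest VM first
    while wn.u_cpu >= threshold and wn.vms:
        vm = wn.vms[0]
        # Re-place the VM, sized by its actual CPU usage rather than Nvcpu.
        probe = VM(n_vcpu_req=max(1, round(vm.u_cpu)))
        target = place(probe, [n for n in c_active if n is not wn], c_idle)
        if target is None:
            break                                   # nowhere to migrate; give up
        target.vms.append(wn.vms.pop(0))            # live migration abstracted away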
5.2.3 Migration scheduler
The migration scheduler performs a global optimization of the mapping of all VMs onto the entire cloud system. Our algorithm (described in Algorithm 3) is built on top of two practical assumptions:

• If a work node is activated in this round, it will stay up and running for a long time. This is reasonable because, if it is selected now, it is likely also a preferable candidate for other VMs.

• If a work node is activated, it will become fully or nearly fully loaded, since it is a greener work node and will likely be selected to host further VMs whenever a new provision request arrives that it can meet.
Based on the two assumptions above, we define the energy efficiency of a work node as a combination of Pidle and c, in that order. Since the idle power consumptions Pidle of different work nodes may differ only subtly, we classify the nodes into power levels and treat nodes within the same power level equally; this is where the sorting according to c comes into effect. The green sorting is described in Algorithm 4. cp is reserved for maximizing performance, which is not our major concern here.
Algorithm 3 Energy-aware migration scheduling algorithm
1: Sort all wn ∈ C according to Algorithm 4
2: Sort all VMs being provisioned in descending order of their Nvcpu
3: for all vm_i ∈ VMs do
4:   for all wn_j ∈ C do
5:     if wn_j.Ncpu − wn_j.Ucpu ≥ vm_i.Nvcpu then
6:       Schedule vm_i on wn_j
7:       break      ▷ proceed with the next VM
8:     end if
9:   end for
10: end for

Algorithm 4 Green sorting of work nodes
1: Sort all wn ∈ C in ascending order of Pidle
2: Group them according to their power levels
3: for all power levels do
4:   Sort all work nodes in this level in ascending order of c
5: end for
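The two algorithms combine into a compact Python sketch; the 10 W power-level width and the simplified capacity accounting are illustrative assumptions.

def green_sort(nodes: List[WorkNode], level_width: float = 10.0) -> List[WorkNode]:
    """Algorithm 4: group nodes into Pidle power levels, then sort each level by c."""
    # level_width (in Watts) is an illustrative choice of what counts as one level.
    return sorted(nodes, key=lambda n: (int(n.p_idle // level_width), n.c))

def migrate_all(vms: List[VM], nodes: List[WorkNode]) -> None:
    """Algorithm 3: globally re-place all VMs, largest first, on green-sorted nodes."""
    ordered = green_sort(nodes)
    for vm in sorted(vms, key=lambda v: v.n_vcpu_req, reverse=True):
        for wn in ordered:
            if wn.n_cpu - wn.u_cpu >= vm.n_vcpu_req:
                wn.vms.append(vm)   # schedule vm on wn (capacity accounting simplified)
                break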
Even though the migration scheduler may optimize the energy consumption of a cloud system from a global perspective, it involves many migrations across the cloud, which all come with costs in both energy and performance. These penalties should be thoroughly profiled before deploying this scheduler. For the same reason, it is also recommended that migration scheduling not be triggered frequently.
Chapter 6
Related Work
Previous work related to our project can be classified into two categories. The first is energy and performance profiling, which corresponds to our work in Chapter 4; the other is frameworks for green clouds, especially those focusing on energy savings through sophisticated VM scheduling. The rest of this chapter elaborates on related work in these two areas.
6.1 Performance and energy profiling
A lot of research has been done on profiling the performance of scientific applications on either public cloud systems such as Amazon EC2 [48, 49, 50] or private clouds [51, 52, 53, 54]. Although these works focus solely on performance, their approaches to identifying performance metrics are valuable to our work. We borrowed some of these ideas but emphasized the 'green' aspect of computing without losing the focus on performance.
The most famous and active work on energy profiling of HPC systems is done by green500.org, which maintains the Green500 List [29]. The Green500 List ranks supercomputers from the TOP500 List [32] according to their energy efficiency, expressed in terms of performance per Watt. It is still an active project, and the ranking is updated twice a year. However, this work mainly focuses on ranking each supercomputer as a single unit, not on improving the energy efficiency of supercomputers. Moreover, the energy efficiency of virtualized environments is outside its scope, which makes it less relevant to the work presented in this thesis.
Hackenberg et al. used SPEC MPI benchmarks to quantify the variations in energy consumption of HPC systems in [55]. Linpack was used to obtain the peak power consumption of their test systems. They identified that the idle power consumption is 26% of the peak power consumption, and observed that 70%–83% of the peak power was consumed while running the SPEC MPI benchmarks. However, they focused solely on profiling the power consumption of HPC systems, while both performance and power consumption are our major concerns.
6.2 Energy-aware clouds and power models
Dynamic Voltage/Frequency Scaling (DVFS) [56] provides handles to adjust server power states. Together with turning servers on and off, or putting them to sleep, it is among the most basic power management techniques that can be applied to servers operating in a cloud [57]. Still, all these methods cannot solve the power consumption optimization problem in the presence of virtual machines, unless one combines them with forced migration of VMs to concentrate them on fewer servers. Such migration can clearly be undesirable when application performance must be guaranteed.
One therefore needs to look at the behavior of the individual VMs, and con-
sequently at the availability of correct power models and energy profiles. These
can be obtained by actively using power benchmarks or by closely monitoring the
energy profile of individual system components such as CPU, cache, disk, and
memory at run time. We have been looking at previous work to determine if and
which power model we could use in GreenClouds.
Stoess et al. [58] were among the first to present methods for determining the energy usage of VMs. They relied on the availability of models for each hardware component to create a framework for power optimization and for the development of energy-aware OSes.
Kansal et al. [59] proposed Joulemeter, a power meter for virtual machines. Joulemeter also makes use of power models of individual hardware resources; at runtime, software components monitor the resource usage of VMs and convert it to energy usage using the available model.
The power modeling technique vMeter, proposed by Bohra et al. [60], is the most relevant for us. They observed a correlation between the total system power consumption and component utilization, and created a four-dimensional linear weighted power model for the total power consumed Ptotal:

Ptotal = c0 + c1·PCPU + c2·Pcache + c3·PDRAM + c4·Pdisk    (6.1)

where PCPU, Pcache, PDRAM and Pdisk are specific performance parameters for the CPU, cache, DRAM and disk, and c0, c1, c2, c3, c4 are weights. The weights are calculated per workflow. They refined the power model by separating the contribution of each active domain in a node, either a VM or dom0:

Ptotal = Pbaseline + Σ_{k=1}^{N} Pdomain(k)    (6.2)
where Pdomain(k) is the power consumed by an active domain k, and N is the
number of active domains (including dom0). We reuse this model and provide
our results of empirical studies on it in Chapter 4.
The work done by Liu et al. [61] on the GreenCloud architecture, by Dhiman et al. [62] on vGreen, and by Srikantaiah et al. [63] on energy-aware consolidation is also relevant for us. The GreenCloud architecture utilizes live migration of VMs based on power information about the physical nodes. With this technique, Liu and his colleagues show a significant energy reduction for applications running in clouds, specifically for online gaming. They define an integrated approach similar to the one we are setting out to follow ourselves. vGreen also consists of a multi-tiered software system, where policies are used to schedule across the available machines.
Younge discusses a novel green framework for cloud data centers in [64]. He studied the power behavior of virtual machines and incorporated DVFS-enabled scheduling within the framework. Moreover, the size of the VM image is also taken into consideration as one more technique to reduce the energy consumption of data centers.
Van et al. proposed a utility-based VM provisioning and placement policy in [65] to maximize the profit of a data center by trading off application performance against energy consumption. They took the response time of web applications, hosted within a cluster of Apache servers on multiple homogeneous VMs, as the performance metric. Instead of Van's profit-based VM scheduling policy, Garg et al. proposed a carbon/energy-based scheduling policy in [66]. We took a similar approach but focus on the 'green' aspect of computing. Besides that, we also explored the diversity of VMs.
Our efforts differ from the above as we aim to explore the benefits of using
very heterogeneous hardware in creating, managing, and when needed migrating,
application-tailored VMs.
Chapter 7
Conclusions
The research presented here aimed to explore the green opportunities offered by virtualization technologies in clouds through system-level optimizations, focusing on energy-saving opportunities through energy-aware scheduling of virtual machines.
As preliminary research toward deploying a system-level energy-savvy scheduling policy on HPC clouds, we characterized the power consumption of a single work node of our DAS-4 clusters and quantified the relation between the node's total power consumption and the resource usage of its components, i.e. CPU, memory and HDD. To make our work close to reality, we mimicked both lightly-loaded and stressful workloads with the LINPACK benchmark, the Stress tool and our customized scripts.
We identified that the CPU, as the core component of a SUT, makes the major contribution to the total power consumption and dominates its variable part when the resource usage varies. However, the situation may change when accelerators such as GPUs and FPGAs are present, which would be an interesting subject for future work. Moreover, the CPU's power consumption is almost linear in its usage. Memory, on the other hand, makes a constant and least significant contribution to the total power consumption, which can be treated as part of the SUT's idle power consumption.
Moreover, we studied the power characteristics of different VMs. VMs consume nearly the same amount of power as the host when both run the same CPU-intensive workloads; however, VMs normally perform ∼30% worse, especially when over-committed.
Finally, we provided our power model for a cloud system based on the power profiling, and proposed two novel energy-aware scheduling policies: one for instant placement of VMs and one for global optimization through live migration.

We believe the power characteristics of both the host and the VMs presented in this thesis are not restricted to specific hardware; they generally apply to other HPC nodes with similar hardware components, even though the values of the parameters in the power model may differ slightly. Furthermore, by integrating the power characterization module into the clusters' monitoring system, our measurement environment could be extended to an on-line analysis system that supports system-level optimization with on-the-fly VM scheduling and dynamic resource adaptation.
References
[1] A. Berl, E. Gelenbe, M. Di Girolamo, G. Giuliani, H. De Meer, M. Q. Dang, and K. Pentikousis, "Energy-efficient cloud computing," The Computer Journal, vol. 53, no. 7, pp. 1045–1051, 2010.

[2] C. L. Belady, "In the data center, power and cooling costs more than the IT equipment it supports," Electronics Cooling Magazine, February 2007.

[3] ENERGY STAR program, "Report to congress on server and data center energy efficiency," U.S. Environmental Protection Agency, August 2007, in response to Public Law 109-431. [Online]. Available: http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf

[4] DAS-4 website. [Online]. Available: http://www.cs.vu.nl/das4/

[5] J. Geelan, "Twenty-one experts define cloud computing," Cloud Computing Journal, pp. 1–5, 2009. [Online]. Available: http://cloudcomputing.sys-con.com/node/612375

[6] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, H. Lee, D. Patterson, A. Rabkin, I. Stoica, et al., "Above the clouds: A Berkeley view of cloud computing (TR UCB/EECS-2009-28)," 2009. [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf

[7] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, "Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility," Future Generation Computer Systems, vol. 25, no. 6, pp. 599–616, 2009.

[8] M. A. Vouk, "Cloud computing — issues, research and implementations," ITI 2008 30th International Conference on Information Technology Interfaces, vol. 16, no. 4, pp. 31–40, 2008.

[9] I. Foster, Y. Zhao, I. Raicu, and S. Lu, "Cloud computing and grid computing 360-degree compared," in Grid Computing Environments Workshop, 2008. GCE '08, Nov. 2008, pp. 1–10.

[10] P. Mell and T. Grance, "The NIST definition of cloud computing (draft)," National Institute of Standards and Technology, January 2011. [Online]. Available: http://csrc.nist.gov/publications/drafts/800-145/Draft-SP-800-145_cloud-definition.pdf

[11] K. Stanoevska-Slabeva and T. Wozniak, "Cloud basics – an introduction to cloud computing," in Grid and Cloud Computing. Springer Berlin Heidelberg, 2010, pp. 47–61.

[12] A. Lenk, M. Klems, J. Nimis, S. Tai, and T. Sandholm, "What's inside the cloud? An architectural map of the cloud landscape," in Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing, ser. CLOUD '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 23–31.

[13] E. D. Leon, "The five layers within cloud computing," Cloud Computing Journal, 2009. [Online]. Available: http://cloudcomputing.sys-con.com/node/1200642

[14] OpenNebula: the open source toolkit for cloud computing. [Online]. Available: http://opennebula.org/

[15] Eucalyptus homepage. [Online]. Available: http://www.eucalyptus.com/

[16] Nimbus project. [Online]. Available: http://www.nimbusproject.org/

[17] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus open-source cloud-computing system," in Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, ser. CCGRID '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 124–131.

[18] K. Keahey and T. Freeman, "Science clouds: Early experiences in cloud computing for scientific applications," Cloud Computing and Its Applications 2008 (CCA-08), October 2008.

[19] K. Keahey, T. Freeman, J. Lauret, and D. Olson, "Virtual workspaces for scientific applications," Journal of Physics: Conference Series, vol. 78, no. 1, p. 012038, 2007.

[20] P. Sempolinski and D. Thain, "A comparison and critique of Eucalyptus, OpenNebula and Nimbus," Cloud Computing Technology and Science, IEEE International Conference on, vol. 0, pp. 417–426, 2010.

[21] R. P. Goldberg, "Architectural principles for virtual computer systems," Harvard University, Tech. Rep., February 1973. [Online]. Available: http://www.dtic.mil/cgi-bin/GetTRDoc?AD=AD772809&Location=U2&doc=GetTRDoc.pdf

[22] KVM homepage. [Online]. Available: http://www.linux-kvm.org/

[23] Xen project homepage. [Online]. Available: http://www.xen.org/

[24] T. Deshane, Z. Shepherd, J. Matthews, M. Ben-Yehuda, A. Shah, and B. Rao, "Quantitative comparison of Xen and KVM," in Xen Summit. Berkeley, CA, USA: USENIX Association, June 2008.

[25] A. Heissler, "Performance analysis of Xen virtual machines in real-world scenarios," University of Applied Sciences Technikum Wien, Tech. Rep. arXiv:1009.5878, Sep. 2010.

[26] X. Xu, F. Zhou, J. Wan, and Y. Jiang, "Quantifying performance properties of virtual machine," in Information Science and Engineering, 2008. ISISE '08. International Symposium on, vol. 1, Dec. 2008, pp. 24–28.

[27] A. J. Younge, R. Henschel, J. T. Brown, G. von Laszewski, J. Qiu, and G. C. Fox, "Analysis of virtualization technologies for high performance computing environments," in Proceedings of The Fourth IEEE International Conference on Cloud Computing, ser. CLOUD '11, April 2011.

[28] M. Fenn, M. A. Murphy, and S. Goasguen, "A study of a KVM-based cluster for grid computing," in Proceedings of the 47th Annual Southeast Regional Conference, ser. ACM-SE 47. New York, NY, USA: ACM, 2009, pp. 34:1–34:6.

[29] The Green500 List website. [Online]. Available: http://www.green500.org/

[30] S. Sharma, C.-H. Hsu, and W.-C. Feng, "Making a case for a Green500 list," in IEEE International Parallel and Distributed Processing Symposium (IPDPS 2006) / Workshop on High Performance – Power Aware Computing, 2006.

[31] R. Ge, X. Feng, H. Pyla, K. Cameron, and W. Feng, "Power measurement tutorial for the Green500 list," June 2007. [Online]. Available: http://www.green500.org/docs/pubs/tutorial.pdf

[32] The TOP500 List website. [Online]. Available: http://www.top500.org/

[33] C.-H. Hsu, W.-C. Feng, and J. S. Archuleta, "Towards efficient supercomputing: A quest for the right metric," Parallel and Distributed Processing Symposium, International, vol. 12, p. 230a, 2005.

[34] Ganglia homepage. [Online]. Available: http://ganglia.sourceforge.net/

[35] University of California, Berkeley Millennium Project website. [Online]. Available: https://www.millennium.berkeley.edu/

[36] Homepage of the Stress tool. [Online]. Available: http://weather.ou.edu/~apw/projects/stress/

[37] Intel optimized LINPACK benchmark. [Online]. Available: http://software.intel.com/en-us/articles/intel-math-kernel-library-linpack-download/

[38] Homepage of the LINPACK benchmark. [Online]. Available: http://www.netlib.org/linpack/

[39] A. R. Weiss, "ECL Dhrystone white paper," The EEMBC Certification Laboratories, LLC (ECL), November 2002. [Online]. Available: http://www.johnloomis.org/NiosII/dhrystone/ECLDhrystoneWhitePaper.pdf

[40] A. Kivity, "KVM: The Linux virtual machine monitor," in Proceedings of the Linux Symposium, Ottawa, Ontario, 2007.

[41] Intel Turbo Boost Technology 2.0. [Online]. Available: http://www.intel.com/technology/turboboost/

[42] J. Che, Q. He, Q. Gao, and D. Huang, "Performance measuring and comparing of virtual machine monitors," in Embedded and Ubiquitous Computing, 2008. EUC '08. IEEE/IFIP International Conference on, vol. 2, Dec. 2008, pp. 381–386.

[43] P. Kogge, K. Bergman, S. Borkar, et al., "ExaScale computing study: Technology challenges in achieving exascale systems," September 2008.

[44] Fedora Documentation Project, Fedora 13 Virtualization Guide. Fultus Corporation, 2010, pp. 180–182.

[45] UnixBench benchmark suite. [Online]. Available: http://code.google.com/p/byte-unixbench/

[46] G. Drysdale, A. C. Valles, and M. Gillespie, "Performance insights to Intel Hyper-Threading technology." [Online]. Available: http://software.intel.com/en-us/articles/performance-insights-to-intel-hyper-threading-technology/

[47] O. Celebioglu, A. Saify, T. Leng, J. Hsieh, V. Mashayekhi, and R. Rooholamini, "The performance impact of computational efficiency on HPC clusters with Hyper-Threading technology," Parallel and Distributed Processing Symposium, International, vol. 15, p. 250b, 2004.

[48] K. R. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. J. Wasserman, and N. J. Wright, "Performance analysis of high performance computing applications on the Amazon Web Services cloud," Cloud Computing Technology and Science, IEEE International Conference on, vol. 0, pp. 159–168, 2010.

[49] S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema, "A performance analysis of EC2 cloud computing services for scientific computing," in Cloud Computing, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Springer Berlin Heidelberg, 2010, vol. 34, pp. 115–131.

[50] A. Iosup, S. Ostermann, M. N. Yigitbasi, R. Prodan, T. Fahringer, and D. H. Epema, "Performance analysis of cloud computing services for many-tasks scientific computing," IEEE Transactions on Parallel and Distributed Systems, vol. 22, pp. 931–945, 2011.

[51] J. Ekanayake and G. Fox, "High performance parallel computing with clouds and cloud technologies," in Cloud Computing, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Springer Berlin Heidelberg, 2010, vol. 34, pp. 20–38.

[52] C. Baun and M. Kunze, "Performance measurement of a private cloud in the OpenCirrus testbed," in Euro-Par 2009 Parallel Processing Workshops, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010, vol. 6043, pp. 434–443.

[53] G. V. Mc Evoy, B. Schulze, and E. L. M. Garcia, "Performance and deployment evaluation of a parallel application on a private cloud," Concurrency and Computation: Practice and Experience, 2011.

[54] J. Tao, K. Furlinger, and H. Marten, "Performance evaluation of OpenMP applications on virtualized multicore machines," in OpenMP in the Petascale Era, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2011, vol. 6665, pp. 138–150.

[55] D. Hackenberg, R. Schone, D. Molka, M. Muller, and A. Knupfer, "Quantifying power consumption variations of HPC systems using SPEC MPI benchmarks," Computer Science – Research and Development, vol. 25, pp. 155–163, 2010. doi: 10.1007/s00450-010-0118-0.

[56] G. Magklis, G. Semeraro, D. Albonesi, S. Dropsho, S. Dwarkadas, and M. Scott, "Dynamic frequency and voltage scaling for a multiple-clock-domain microprocessor," Micro, IEEE, vol. 23, no. 6, pp. 62–68, Nov.–Dec. 2003.

[57] G. von Laszewski, L. Wang, A. Younge, and X. He, "Power-aware scheduling of virtual machines in DVFS-enabled clusters," in Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on, Aug.–Sept. 2009, pp. 1–10.

[58] J. Stoess, C. Lang, and F. Bellosa, "Energy management for hypervisor-based virtual machines," in USENIX Annual Technical Conference, 2007, p. 114.

[59] A. Kansal, F. Zhao, J. Liu, N. Kothari, and A. A. Bhattacharya, "Virtual machine power metering and provisioning," in SoCC '10, 2010, pp. 39–50.

[60] A. Bohra and V. Chaudhary, "VMeter: Power modelling for virtualized clouds," in Parallel Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on, April 2010, pp. 1–8.

[61] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, "GreenCloud: a new architecture for green data center," in Proceedings of the 6th International Conference on Autonomic Computing and Communications, Industry Session, ser. ICAC-INDST '09. New York, NY, USA: ACM, 2009, pp. 29–38.

[62] G. Dhiman, G. Marchetti, and T. Rosing, "vGreen: a system for energy efficient computing in virtualized environments," in Proceedings of the 14th ACM/IEEE International Symposium on Low Power Electronics and Design, ser. ISLPED '09. New York, NY, USA: ACM, 2009, pp. 243–248.

[63] S. Srikantaiah, A. Kansal, and F. Zhao, "Energy aware consolidation for cloud computing," in Proceedings of the 2008 Conference on Power Aware Computing and Systems, ser. HotPower '08. Berkeley, CA, USA: USENIX Association, 2008, pp. 10–10.

[64] A. J. Younge, "Towards a green framework for cloud data centers," Master's thesis, Rochester Institute of Technology, May 2010. [Online]. Available: http://cyberaide.googlecode.com/svn-history/r5110/trunk/papers/thesis-younge/ajy-thesis.pdf

[65] H. N. Van, F. D. Tran, and J.-M. Menaud, "Performance and power management for cloud infrastructures," Cloud Computing, IEEE International Conference on, vol. 0, pp. 329–336, 2010.

[66] S. K. Garg, C. S. Yeo, A. Anandasivam, and R. Buyya, "Environment-conscious scheduling of HPC applications on distributed cloud-oriented data centers," Journal of Parallel and Distributed Computing, vol. 71, no. 6, pp. 732–749, 2011, special issue on Cloud Computing.
List of Figures
2.1 Typical stack of the cloud model
2.2 Typical procedure of VM provisions in IaaS cloud
2.3 Architecture comparison: native hypervisor vs. hosted hypervisor
3.1 Power measurement setup
3.2 The Ganglia monitoring system
4.1 Power consumption versus CPU usage
4.2 CPU frequency scaling benchmark
4.3 Memory stress tests
4.4 CPU and Memory Stress tests on the host
4.5 HDD stress tests on the host
4.6 Linpack tests on different VMs
4.7 Dhrystone benchmark on the host and the Guest VM with different number of threads
4.8 Impact of Hyper-Threading for Linpack tests on non-overcommitted VM (with 8 vCPUs) and the host
4.9 Impact of Hyper-Threading for Linpack tests on overcommitted VM (with 16 vCPUs)
4.10 Impact of HT technology for Dhrystone benchmarks
5.1 Working scenario of energy-aware scheduler
List of Tables
2.1 Summary of cloud layers and their major players/technologies
2.2 The cloud managers compared
2.3 A comparison between Xen and KVM
3.1 Definition of benchmark metrics
3.2 Heterogeneous design of DAS-4 clusters
3.3 Specifications of Intel E5620
4.1 Summary of component benchmarks
4.2 Summary of overall benchmarks
4.3 Execution time of Linpack benchmark on different VMs
4.4 Performance of the Fhourstones benchmark on the host and guest VM
5.1 Sample values of Pidle, ce and cp for our test machine
5.2 Mathematical description of a Work Node (WN) and a VM