Towards energy-aware VM
scheduling in IaaS clouds through
empirical studies
Qingwen Chen
Grid Computing
University of Amsterdam
A thesis submitted for the degree of
Master of Science (MSc)
Supervised by
Dr. Paola Grosso
System and Network Engineering research group
Amsterdam, the Netherlands
August 29, 2011
Abstract
Energy-efficient computing has become increasingly important to mod-
ern HPC systems such as clouds. In this thesis we explore the ’green’
opportunities with virtualization technologies in clouds through system-
level optimizations, and specifically focus on energy-savings by energy-
aware scheduling of virtual machines.
A system-level approach to optimization for green cloud computing requires
an in-depth understanding of the power characteristics of virtual
machines with respect to the patterns of the workloads running on them.
The first step we took in this direction was to deploy a private cloud system
with facilities provided by the DAS-4 clusters, and to thoroughly characterize
its power behavior.
We broadly identified three power metrics, i.e. power, power efficiency
and energy. By executing several types of high-performance computing
workloads on both the VM and the host, we compare their performance
with respect to these three metrics. In addition to that, we also
analyze the composition of the total power consumption of a single
work node, and evaluate the contributions of individual components,
i.e., CPU, memory and HDD.
As a result of these profiling experiments, a linear power model is
derived to represent the power behavior of a single work node, which
is further extended to describe an entire cloud. With the help of
this model, a novel energy-savvy scheduler is proposed to make use
of the monitoring system to support system-level optimization with
on-the-fly VM scheduling and dynamic resource adaptation.
Acknowledgements
I would firstly like to thank my supervisor Paola Grosso for her in-
valuable guidance and insights throughout this project, without which
this thesis couldn’t have been done.
Furthermore, I would also like to thank Kees Verstoep and Rutger
Hofman for always providing me with suggestions on my work.
Moreover, I am also grateful to Cosmin Dumitru, Ralph Koning and
all other members from System & Network Engineering (SNE) Group
for their valuable advice and feedback on my initial work.
Finally I would like to thank Vesselin Hadjitodorov for paving the way
for the work presented here, and DAS-4 for offering me their facilities.
Contents
1 Introduction
  1.1 Green computing
  1.2 Research opportunity
  1.3 Structure
2 Background
  2.1 Cloud computing
  2.2 IaaS Cloud managers
  2.3 Virtualization: Hypervisors
3 Experimental Setup
  3.1 Power measurement method
  3.2 Power metrics
  3.3 Hardware & Software environment
  3.4 Data logger and Monitoring
  3.5 Workload generator
    3.5.1 Linpack
    3.5.2 Dhrystone and Fhourstones
    3.5.3 Stress
    3.5.4 Customized scripts
  3.6 Virtualization
4 Profiling VMs' power consumption
  4.1 Component benchmarks
    4.1.1 CPU
    4.1.2 Memory
    4.1.3 HDD
  4.2 Overall benchmarks
    4.2.1 Floating-point operation performance
    4.2.2 Integer operation benchmark
    4.2.3 Impact of Hyper-Threading
  4.3 Summary
5 Towards energy-efficient scheduling
  5.1 The power model
    5.1.1 Prerequisites & Assumptions
    5.1.2 Power model for a single work node
    5.1.3 Power model for a cloud system
    5.1.4 The opportunity revisited
  5.2 Energy-aware scheduler
    5.2.1 Symbolic description
    5.2.2 Placement scheduler
    5.2.3 Migration scheduler
6 Related Work
  6.1 Performance and energy profiling
  6.2 Energy-aware clouds and power models
7 Conclusions
References
List of Figures
List of Tables
Chapter 1
Introduction
1.1 Green computing
Cloud computing has emerged as a new paradigm of computing and has gained
increasing attention from both the academic and business communities. Its
utility-based usage model allows users to pay per use, similar to other public
utilities such as electricity, with relatively low investment in the end devices
that access the cloud computing resources. From the environmental perspective
this new computing model is already a great improvement [1], since the computing
resources are shared among all users and provisioned on-demand. This puts
tremendous pressure on the cloud service providers who manage these resources;
however, it also opens up many possibilities for further energy savings.
Energy consciousness and energy efficiency are two increasingly important
aspects in designing and operating ICT infrastructures. As Belady pointed out in
[2], the annual energy cost of a 1U server surpassed its purchase cost in 2008.
A US EPA report [3] on server and data center energy efficiency also indicated
that the energy consumed by the nation's servers and data centers was around
61 billion kilowatt-hours (kWh) (1.5% of total electricity consumption) in 2006,
and that it could double to more than 100 billion kWh by 2011. It also predicted
that annual savings of about 10% in energy consumption could be achieved by
2011 through state-of-the-art hardware and virtualization technologies.
The state-of-the-art hardware, on one hand, may improve the energy-efficiency
of data centers' computing infrastructures; on the other hand, upgrading current
infrastructures for the purpose of green computing is also a great investment,
and it is not always economically feasible for all data centers, especially if
the investment surpasses the benefits. It is therefore interesting to look for
energy-savvy alternatives that can satisfy users' performance requirements on
the current computing infrastructure. The answer, in our opinion, lies in the
use of virtualization technologies in distributed systems such as clouds,
augmented with global system-level optimizations which focus on providing
energy-efficient optimizations for the entire data center while taking the
characteristics of each individual node as input parameters.
The Dutch Organization for Scientific Research (NWO), as a source of funds
for many Dutch research programs, has taken the initiative to explore the 'green'
opportunities in Dutch data centers (e.g. DAS-4, SARA etc.) and has therefore
issued the GreenClouds project. The GreenClouds research, from which this work
originated, stems from the following three promising ideas:
• Hardware diversity: computations should run on the architectures (e.g.
GPUs, multicores) that execute them in the most energy-efficient way;
• Elastic scalability: the number of resources should dynamically adapt to
the application needs, taking both computational and energy efficiency into
account;
• Hybrid networks: optical and photonic networks can transport data and
computations in a more energy-efficient way.
The combination of hardware diversity, elastic scalability and hybrid networks
in a distributed setting provides the basic components for a system-level approach,
where we are not limited to local optimizations (e.g., reducing clock speeds), but
we look at the behavior of the overall system.
1.2 Research opportunity
Virtualization technologies expose several new opportunities for green computing
in clouds which we have not experienced in non-virtualized environments such as
Grid Computing.
Firstly, with the help of virtualization technologies, it is possible to
partially or fully migrate running applications to more energy-efficient, i.e.
'greener', hardware. The research issues arising here are how energy efficiency
should be appropriately defined and how to evaluate the energy efficiency of
various types of hardware within a heterogeneous data center.
Another opportunity for green clouds lies in minimizing the number of running
work nodes within a data center. This opportunity is two-fold:
a) When creating a VM, a VM placement scheduler may be involved to place
the VM on the most appropriate work node so that the number of active
work nodes is minimized while still getting reasonable performance;
b) A VM migration scheduler may make sophisticated decisions on migrating
applications from two or more lightly loaded work nodes to one work node
so that the other idle nodes may be powered off to save energy.
Nevertheless, a thorough understanding of the power behavior of the underlying
hardware is needed for both opportunities. The challenges in these opportunities
are:
• What is the power behavior of each individual virtual machine (VM) in a
virtualized environment, and how is it affected by the different patterns of
applications running on it and/or the different hardware architectures hosting
it?
• How could this power characteristic information be incorporated into the
system-level optimization?
This thesis strives to address the first challenge and provides heuristic
proposals on a system-level optimization for the second challenge. The work in
this thesis is broken down into two sub-tasks. We first carry out comprehensive
power benchmarks to thoroughly characterize the energy consumption of both
VMs and the hardware components used in virtualized environments. Within this
part, the state-of-the-art facilities provided by the DAS-4 clusters [4] are
extensively utilized as the hardware of our test cloud system. Secondly, we
provide our power
models for a cloud system based on the profiling results, and propose heuristic
recommendations on how these results and power models could be included in
an energy-conscious elastic scheduler. Overall we also aim to provide the reader
with an analysis of the results which is of general applicability to virtual ma-
chines running in clouds, and not specific to the hardware on which we obtained
the results.
1.3 Structure
The rest of this thesis is organized as follows: we briefly introduce the concepts
and technologies included in our thesis work in Chapter 2, and further elaborate
our experimental setup in Chapter 3. Chapter 4 presents the results of the power
benchmarks, corresponding to the first challenge described above. Our power
model for a cloud system and our heuristic proposal on system-level optimization
for energy-efficient computing are presented in Chapter 5. Following the related
work in Chapter 6, we conclude with a brief discussion and future directions.
Chapter 2
Background
Cloud computing has emerged as a computing paradigm in which computing
resources can be provisioned on-demand to deliver services in a scalable manner.
In this chapter, we briefly introduce its concept from a technical perspective
and focus specifically on Infrastructure as a Service (IaaS) cloud systems.
2.1 Cloud computing
A number of researchers have attempted to define cloud computing from different
perspectives [5, 6, 7, 8, 9]. The most widely accepted and accurate technical
definition comes from the recommendation of the (U.S.) National Institute of
Standards and Technology (NIST) in [10], where cloud computing is defined as a
model for enabling ubiquitous, convenient, on-demand network access to a shared
pool of configurable computing resources that can be rapidly provisioned and
released with minimal management effort or service provider interaction.
Although classifying the details of cloud models has been a subject of debate
[11, 12, 13], it is generally agreed that a cloud framework may include five
layers, as presented in Figure 2.1. It consists of three service layers,
depending on the type of resources provided by the cloud, and two underlying
framework layers:

Figure 2.1: Typical stack of the cloud model

• Server layer consists of hardware resources (such as computing, storage and
network devices) and some basic software products which are used to manage
these hardware resources.
• Virtualization layer connects the server layer with the higher-level service
layers by virtualizing the hardware resources and providing the service layers
with on-demand provisioning.
• Infrastructure as a Service (IaaS) is built directly on top of the
virtualization layer and provides users with processing, storage, networks,
and other fundamental computing resources, typically in the form of virtual
machines (VMs). Cloud users may deploy their applications on these virtualized
resources and customize the runtime environment (including the operating
systems of the virtualized resources) as they would normally do on their local
hardware resources.
• Platform as a Service (PaaS) provides users with a pre-configured runtime
environment (e.g. a Java runtime environment) and enables users to deploy
their own applications within it. At this layer, users only have full control
over their own applications, and may customize the runtime environment
configuration within a limited range.
• Software as a Service (SaaS) provides users with ready-to-use applications
running on a cloud infrastructure. These applications are developed by the
cloud service providers, and users at this layer may only have a limited
ability to modify user-specific application configuration settings.
Even though the higher-level service layers may build their underlying
infrastructures on top of the lower-level layers, each layer is able to provide
its corresponding services to the public separately and independently. Table 2.1
summarizes some of the major players/technologies at each layer.
Table 2.1: Summary of cloud layers and their major players/technologies

Layer           Major players and/or technologies
SaaS            Gmail, Google Docs, Salesforce, public cloud storage services
                such as Dropbox
PaaS            Google App Engine, Windows Azure
IaaS            Amazon EC2, Rackspace, Eucalyptus, OpenNebula, OpenStack,
                Nimbus, etc.
Virtualization  KVM, Xen, Microsoft Hyper-V, VMware ESX, etc.
Server          Multi-core CPUs, storage, memory, accelerators such as GPUs
                and FPGAs, etc.
Another criterion to classify cloud infrastructures is the deployment model.
Under this classification we have four categories of clouds: private, public,
community, and hybrid. Private and public clouds are self-explanatory, as they
are available to a single organization and to the general public respectively.
In a community cloud, the cloud infrastructure is shared by several
organizations. A hybrid cloud is a combination of two or more of the other three
types. Amazon, Google and Microsoft are three major players in the area of
public clouds, while open-source cloud infrastructures such as OpenNebula [14],
Eucalyptus [15], and Nimbus [16] play significant roles in the other fields.
Within this thesis work, we will fully utilize the state-of-the-art hardware
provided by the DAS-4 cluster and focus on the IaaS service model deployed
with OpenNebula.
2.2 IaaS Cloud managers
Cloud managers at the IaaS layer provide easy-to-use management interfaces to
make the virtualized resources (i.e. VMs) accessible to users. There are plenty
of cloud managers at the IaaS layer in the open-source community, among which
Eucalyptus, OpenNebula and Nimbus are dominant.

Table 2.2: The cloud managers compared

                 OpenNebula              Eucalyptus               Nimbus
Philosophy       Private, highly         Mimics Amazon EC2        Cloud resources tailored
                 customizable cloud                               for scientific researchers
Compatibility    Open, multi-platform    Compatible with EC2,     Compatible with EC2
                                         S3 and EBS
Customizability  Basically everything    Some for admin,          Everything except image
                                         less for users           storage and Globus
                                                                  credentials
Hypervisors      Xen, KVM, VMware        Xen, KVM (VMware in      Xen, KVM
supported                                the non-open-source
                                         version)
Unique features  VM migration            User management          Nimbus context broker
                 supported               web interface
Eucalyptus [17] is designed to be a private cloud computing platform that
implements the APIs of Amazon EC2, S3 and EBS. Nimbus [18, 19] is an open-source
toolkit which is built on top of its predecessor from Grid Computing, the Globus
Toolkit. OpenNebula aims to be the industry-standard open-source cloud computing
tool for managing the complexity and heterogeneity of distributed data center
infrastructures. Sempolinski et al. have provided a thorough comparison of
OpenNebula, Eucalyptus, and Nimbus in [20], which we summarize in Table 2.2.
Despite the differences in their detailed implementations, they all share the
common purpose of managing VMs in an easy-to-use way, as well as the procedure
for provisioning them, as described in Figure 2.2. A typical procedure for
provisioning virtual resources (specifically VMs) follows the steps below:

Figure 2.2: Typical procedure of VM provisioning in an IaaS cloud

1) Cloud users access the head node of the cloud through a client application
(or a web management interface, depending on which is provided by the cloud
provider), and issue a request for a VM with additional user-specified
configuration (if any).
2) The head node then pushes a fresh VM image from the pre-configured image
repository to one of the work nodes. Sometimes a VM scheduler is involved
in this step to make sophisticated decisions about which work node the image
should be pushed to.
3) After receiving the image, the hypervisor on the work node creates and starts
the VM requested by the user.
4) During the startup of the VM, a bridged network is configured, and the VM
then requests a network address from the DHCP server (dhcpd).
5) In the end, the user may access the VM as if it were a regular remote
machine.
Since OpenNebula has been deployed by DAS-4 and SARA on their HPC clus-
ters, we follow their choice and use OpenNebula as our cloud manager. Moreover,
OpenNebula’s high customizability also makes it a good candidate for us.
2.3 Virtualization: Hypervisors
Virtualization generally refers to the process of creating one or more virtual
versions (as opposed to the actual version) of computer resources such as
hardware platforms, operating systems, storage/memory devices or network
resources. Within the scope of IaaS cloud platforms, it usually refers to the
technology of creating multiple virtual machines, specifically system virtual
machines¹. A system VM is a complete and isolated guest OS installation within
the host OS. With system VM technology, multiple OS environments can co-exist
on the same computer; in this case the underlying physical resources are shared
among all VMs, and the resources available to applications running inside a VM
are limited to the resources provided by that VM.
VMs and their interactions with the underlying physical resources are managed
by a software layer called the virtual machine monitor (VMM), or hypervisor.
Depending on where they run, hypervisors are generally classified into two
categories [21]:
• Type-I (or native) hypervisor runs directly on the host’s hardware (see Fig-
ure 2.3(a)) to manage the hardware and VMs, as well as their interactions.
• Type-II (or hosted) hypervisor runs as a normal application on top of the
host's OS (see Figure 2.3(b)). The communication between the hypervisor and the
hardware has to pass through the host's OS.
Theoretically, the performance differences between a native OS and a guest OS
running in a Type-I VM are, in general, barely noticeable; a guest OS running in
a Type-II VM, however, suffers significant performance degradation even compared
to Type-I VMs, because with a hosted hypervisor the guest OS has to pass through
more software layers to reach the hardware, as described in Figure 2.3.
Therefore, to meet the requirements of scientific applications, our work will
focus on Type-I hypervisors only².
¹ Sometimes called hardware virtual machines. Another type of virtual machine is
the process VM, which runs a single process as a normal application inside a
host OS, with the Java Virtual Machine (JVM) as a widely-known example.
² Another practical reason is that only native hypervisors are supported by
OpenNebula. Therefore Type-II hypervisors are beyond our concern.
Figure 2.3: Architecture comparison of (a) Type-I (native) hypervisor and
(b) Type-II (hosted) hypervisor
We restrict our discussion to two widely-known open-source Type-I hypervisors,
i.e. the Kernel-based Virtual Machine (KVM) [22] and Xen [23], even though there
are many others, such as VMware ESX/ESXi¹ and Microsoft Hyper-V. Both KVM and
Xen are virtual machine monitors for x86, x86-64 and IA-64. The most significant
difference between them is the degree to which the hardware is virtualized. KVM
is a full virtualization technology which supports guest operating systems
running unmodified; guest operating systems running on Xen need to be
specifically modified, which is called para-virtualization. As of Xen version
3.0, full virtualization is also supported, provided that the CPU supports x86
hardware virtualization (i.e. Intel VT-x or AMD-V).
There are debates on whether KVM is a native or a hosted hypervisor. On one
hand, KVM exists in Linux as a kernel module, which would qualify it as a
Type-II hypervisor; on the other hand, it exposes the hardware virtualization
extensions (such as Intel VT-x and AMD-V) to the Linux kernel, which effectively
turns the kernel into a Type-I hypervisor. Therefore, we classify it as a Type-I
hypervisor. Besides, its performance is also comparable with other Type-I
hypervisors such as Xen. Table 2.3 summarizes the similarities and differences
between Xen and KVM.
¹ These should be distinguished from other products of VMware Inc., such as
VMware Workstation and VMware Server, which are basically Type-II hypervisors
since they run as normal applications within a Linux or Windows operating
system.
Table 2.3: A comparison between Xen and KVM

                      Xen                          KVM
Para-virtualization   Yes                          No
Full virtualization   Yes                          Yes
Host CPU              x86, x86-64, Itanium         x86, x86-64, IA-64,
                      (IA-64), ARM                 PowerPC
Host OS               Modified versions of Linux,  Linux
                      NetBSD and Solaris
Guest OS              Modified Unix-like OSes,     Linux, Windows, Unix
                      Windows
Intel VT-x / AMD-V    Optional                     Required
Live migration        Yes                          Yes
There has been a lot of research on benchmarking and comparing the performance
of these two hypervisors [24, 25, 26, 27, 28]. To summarize this work: Xen
performs slightly better in network virtualization, while for other workloads,
such as Linpack, the Fast Fourier Transform (FFT) and IOzone, its performance is
much worse than KVM's. This led to our decision to adopt KVM as the hypervisor
of our test cloud platform.
Chapter 3
Experimental Setup
As planned in Chapter 1, we will first characterize the power consumption of
both the hardware and the VMs. This knowledge will then be used to develop an
energy-aware VM scheduler for the purpose of energy-efficient computing. A power
measurement environment, used to carry out the power benchmarks and to verify
the effectiveness of our energy-aware scheduler, is essential to both tasks.
In this chapter, we describe our power measurement setup, followed by more
details on both the hardware and software configurations of our power benchmark
test-bed, as well as the various types of workload generators used in our
experiments.
3.1 Power measurement method
The commonly used approach to measuring the power consumption of a system is
the one adopted by the Green500 list [29, 30], as described in [31]. This
approach consists of the following three basic entities:
• a System Under Test (SUT);
• a power meter, which provides the value of the power consumed by the SUT
and usually resides between the power supply and the SUT;
• a data logger to record and analyze the power data.
The power measurement setup is illustrated in Figure 3.1. Depending on the use
scenario, a SUT could be either a single work node within a cluster, a
collection of several work nodes, or even the entire cluster. However, with more
work nodes included within a single SUT, the measurement results become less
meaningful, due to the lower granularity of the system. By sitting between the
SUT and the power supply, the power meter is able to provide the actual power
consumed by the SUT. The data logger is a piece of software which normally runs
on a device other than the SUT, to eliminate its impact on the power consumption
of the SUT.
Figure 3.1: Power measurement setup
This approach is simple and intuitive, but it has a limitation: since the SUT is
measured as a single unit and the minimal unit is a single work node, it cannot
provide detailed information on the power consumed by each component of a work
node. Therefore, we propose a workaround (see Chapter 4) by identifying workload
patterns.
3.2 Power metrics
There are in total five metrics within our power benchmarks, classified into two
categories: measurement metrics and derived metrics. Measurement metrics are
those that can be read directly from hardware monitoring tools or software
applications without further calculation, such as the runtime of workloads, the
power consumption and the performance. The derived metrics are computed from the
measurement metrics. Table 3.1 describes each metric in detail.
Table 3.1: Definition of benchmark metrics

Metric                Definition                          Value
Performance*          evaluation of how well the SUT      value as reported by the
                      performs benchmarks                 benchmark tools themselves
Power consumption*    average power consumed by the       average of the values reported
                      SUT during benchmarks               by the power meter
Execution time*       time duration of benchmarks         wall clock time reported by
                                                          the data logger
Power efficiency‡     evaluation of how efficiently       performance divided by power
                      the power is used
Energy consumption‡   cumulative power consumption        power multiplied by execution
                      over execution time                 time

* Measurement metrics, whose values are directly obtained from devices or from
benchmark tools.
‡ Derived metrics, which are computed from the measurement metrics.
Depending on the purpose of the performance benchmarks, the Performance metric
may take various forms. For instance, the throughput of a system is a good
performance metric for system I/O benchmarks, but it would be less desirable for
computational benchmarks, which may be better represented by, for example, the
number of unit operations per unit time. Our benchmarks will focus on the SUT's
computational performance. A typical performance metric for this is
FLoating-point OPerations per Second, namely FLOPS or Flops. It is widely
accepted as a performance metric for ranking supercomputers on the TOP500 list
[32]. Besides Flops, the performance metric of integer operations per second,
usually measured in Million Instructions per Second (MIPS), will also be
examined in our benchmarks.
The power consumption metric in our benchmarks refers to the average power over
the execution time of a benchmark, as opposed to the instantaneous power, which
is defined as the power consumed by the SUT at a specific point in time.
The power efficiency metric is defined as performance per Watt, as described in
Eq. 3.1:

\[ e_{\text{power}} = \frac{\text{Performance}}{\text{Power}} \tag{3.1} \]
where performance and power are averaged values for static analysis. In [33],
Hsu et al. discussed several possible types of power-efficiency metrics and
proposed GFlops/W as an appropriate one, where GFlops is short for GigaFlops.
Since it is well accepted as the power-efficiency metric of the Green500 list,
we will also use it as one of our power-efficiency metrics. Besides this, we
also take MIPS/W as a complement to GFlops/W. The energy consumption metric is
calculated by multiplying the average power consumption by the execution time.
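To make the relationship between the measured and derived metrics concrete, the
following minimal Python sketch (our own illustration; the function and variable
names are not from any benchmark tool) computes the two derived metrics from a
run's measurements:

    def derived_metrics(performance, avg_power_w, runtime_s):
        """Derive power efficiency and energy from the measured metrics.

        `performance` may be in any unit (GFlops, MIPS, ...); the
        efficiency is then expressed in that unit per Watt.
        """
        power_efficiency = performance / avg_power_w    # Eq. 3.1
        energy_kj = avg_power_w * runtime_s / 1000.0    # E = P * t, in kJ
        return power_efficiency, energy_kj

    # E.g. the Fhourstones host run reported later in Table 4.4
    # (8013 KPOS/s at 95.83 W for 209.5 s) yields ~83.6 KPOS/s/W
    # and ~20.1 kJ.
    print(derived_metrics(8013, 95.83, 209.5))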
Among the five metrics, power, power efficiency and energy are our major
concerns, as they are directly related to the green aspect of computing. The
purpose of research on energy-aware computing is to reduce the energy
consumption of applications without sacrificing their performance, or with a
reasonable sacrifice as long as the users' or applications' requirements are
still met. Thus the energy consumption metric turns out to be the appropriate
metric for applications with a definite execution time; for applications with
unbounded execution time (e.g. hosting a web server), however, it is less
suitable than the power-efficiency metric.
3.3 Hardware & Software environment
The power meter and the SUT are the two basic hardware entities in our test-bed.
Depending on the use scenario, the SUT could be a single server (work node) or a
cluster of servers. In order to make our work more generic and applicable to
production cloud systems, we fully utilize the hardware provided by the
Distributed ASCI Supercomputer 4 (DAS-4 [4]), which is designed as a six-cluster
wide-area distributed system providing a common computational infrastructure for
researchers within ASCI in the Netherlands. The features of the six clusters are
described in Table 3.2¹. The two clusters hosted by VU University Amsterdam (VU)
and the University of Amsterdam (UvA) have almost the same features, except that
VU's cluster has been equipped with GPUs.
Table 3.2: Heterogeneous design of DAS-4 clusters

Cluster  Nodes  Type            Speed   Memory  Storage  Node HDDs          Network   Accelerators
VU       74     dual quad-core  2.4GHz  24GB    30TB     2*1TB              IB & GbE  16*GTX480 + 2*C2050
UvA      16     dual quad-core  2.4GHz  24GB    30TB     1TB                IB & GbE  future upgrade
LU       16     dual quad-core  2.4GHz  48GB    50TB     5*2TB + 0.5TB SSD  IB & GbE  future upgrade
TUD      32     dual quad-core  2.4GHz  24GB    18TB     2*1TB              IB & GbE  8*GTX480
UvA-MN   36     dual quad-core  2.4GHz  24GB    30TB     2*1TB              IB & GbE  8*GTX480 + 7*C2050 + 2*GTX480
ASTRON   24     dual quad-core  2.4GHz  24GB    24TB     1*1TB              IB & GbE  8*GTX580 + 1*C2050 + 1*HD6970
In order to get better granularity in our benchmarks, our SUT is configured to
be a single standard DAS-4 work node with a dual quad-core CPU (Intel E5620),
24 GB of memory and roughly 1 TB of storage. Table 3.3 lists the part of the
Intel E5620's specifications² that is vital to our benchmarks. The operating
system running on the SUT is a fresh CentOS 5.6 (kernel version
2.6.18-238.9.1.el5) with Dynamic Voltage and Frequency Scaling (DVFS) enabled.
The power meter used in our experiments is a 32A PDU gateway from
Schleifenbauer. It provides power data through public APIs in PHP, Perl and
SNMP, with a precision of 1 V in voltage and 0.01 A in current. The
instantaneous power consumption is calculated by multiplying the voltage, the
current and the power factor together.
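As a concrete illustration of this calculation, the sketch below computes the
instantaneous power from the three quantities the PDU reports; the function and
argument names are ours, not part of the Schleifenbauer API:

    def instant_power(voltage_v, current_a, power_factor):
        """Instantaneous power P = V * I * pf, in Watts.

        With the PDU's precision of 1 V and 0.01 A, a reading of e.g.
        230 V and 0.42 A at a power factor of 0.95 yields ~91.8 W,
        in the range of the ~90 W idle power we report in Section 4.1.1.
        """
        return voltage_v * current_a * power_factor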
¹ Cited from http://www.cs.vu.nl/das4/clusters.shtml
² For the full list of specifications of the Intel E5620, please refer to
http://ark.intel.com/products/47925/Intel-Xeon-Processor-E5620-(12M-Cache-2_40-GHz-5_86-GTs-Intel-QPI)
Table 3.3: Specifications of Intel E5620

Essentials                       Advanced features
# of Cores            4          Intel Turbo Boost Technology
# of Threads          8          Intel Hyper-Threading Technology
Clock Speed           2.4GHz     Intel Virtualization Technology (VT-x)
Max Turbo Frequency   2.66GHz    Idle State
Max TDP               80W        Enhanced Intel SpeedStep Technology
3.4 Data logger and Monitoring
Our long-term goal is to deploy a monitoring system that collects and presents,
in an easy and user-friendly way, the status of both the entire cluster
(overview) and each single work node. We therefore chose Ganglia [34] and
deployed it on the front-end of our private cloud system. The data logger was
implemented as an extension to it.
Ganglia is a scalable distributed monitoring system for HPC systems. It was born
from the UC Berkeley Millennium Project [35], and is widely deployed on many
other HPC systems. As shown in Figure 3.2, it consists of two types of daemons,
namely the Ganglia Meta Daemon (gmetad) and the Ganglia Monitor Daemon (gmond),
which run on the front-end and on the work nodes respectively, a Round-Robin
Database (RRD) to store the data, and a web front-end to visualize them.
• gmond resides on each work node that is being monitored; it collects the local
system's status metrics, such as CPU and memory usage, and then sends them to
the gmetads.
• gmetad is a daemon running on the front-end of a cluster. It periodically
polls the gmonds and stores their metrics in a storage engine such as RRD.
• The RRD, on one hand, is adopted by gmetad as a database to store all the
metrics gmetad collects; on the other hand, the metrics stored in the RRD are
retrieved and visualized on the web front-end.
Figure 3.2: The Ganglia monitoring system

Though Ganglia has plenty of built-in monitoring metrics, a power metric is not
included, due to its dependency on the APIs provided by the manufacturer of the
power meter. In our experiments, we integrated the power metric into Ganglia
using the Perl APIs provided by Schleifenbauer¹.
For stress tests, which normally run for more than 10 minutes and during which
the power consumption of the work node does not vary much, we collect the power
data every 5 seconds, in order to avoid adding too much overhead on the work
node; for instant workload tests, where both the resource usage and the power
consumption of the work node vary dramatically with time, the power data is
collected every second, to obtain accurate and reliable data.
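The data logger itself can then be a simple polling loop. The sketch below is
our own simplification in Python (in our setup the PDU is queried through the
Schleifenbauer Perl APIs; read_power() here is a hypothetical stand-in),
publishing each sample to Ganglia through the standard gmetric command:

    import subprocess
    import time

    def read_power():
        """Hypothetical stand-in for the PDU query (Perl/SNMP in our setup)."""
        raise NotImplementedError

    def log_power(interval_s):
        """Poll the PDU every `interval_s` seconds (5 s for stress tests,
        1 s for instant workload tests) and feed the sample to Ganglia."""
        while True:
            watts = read_power()
            subprocess.run(["gmetric", "--name", "power",
                            "--value", str(watts),
                            "--type", "float", "--units", "Watts"],
                           check=True)
            time.sleep(interval_s)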
3.5 Workload generator
Our workload generators are carefully selected to profile the power consumption
of the SUT in generic and extreme use cases. The generic use case corresponds to
applications that keep running but do not fully utilize the resources, e.g. a
database server or a web server. The extreme case corresponds to applications
that stress resource usage to its limits.
Within a SUT, we broadly identified the CPU, memory, hard disk drive (HDD) and
GPUs (if present) as the major components that consume most of the power of a
work node in high-performance computing environments. Since no GPU is present on
our test machines, we focus our work on the energy profiles of the CPU, memory
and HDD for the moment.

¹ Available at http://sdc.sourceforge.net/index.htm
Our goal in generating the workloads is to separately profile the impact of each
component on the total power consumption. We therefore chose Stress [36], the
Intel-optimized LINPACK benchmark [37] and the Fhourstones benchmark as our
three major workload generators. Besides them, we also wrote our own scripts to
generate other specific workload patterns as complements.
3.5.1 Linpack
The Linpack benchmark [38], a tool to evaluate a system's floating-point
computing power by letting the SUT solve a dense N by N system of linear
equations, is both CPU- and memory-intensive, but performs few operations on the
HDD. Its computational complexity depends heavily on the number of equations,
i.e. N, which is practically restricted solely by the memory available to the
system. Normally, a larger N results in better performance as reported by
Linpack; we therefore set N to its maximum possible value in our experiments to
get the best performance. During each run, Linpack internally records the CPU
time, instead of the wall clock time, as the execution time, and at the end
reports its result in millions of floating-point operations per second (MFLOPS),
or sometimes in GFLOPS. The way Linpack calculates the execution time may be
reasonable for traditional supercomputers and clusters; it may, however, provide
misleading information in virtualized environments, where applications run on a
host's physical CPUs through virtual CPUs (vCPUs). This is especially obvious in
over-committed environments, where a VM has more vCPUs than there are physical
CPUs available on the host. We will explain this in more detail in
Section 4.2.1.
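The difference between the two clocks is easy to demonstrate. The minimal Python
sketch below (our own illustration, not part of Linpack) measures the same busy
loop with both clocks; on an over-committed VM the wall clock time can be far
larger than the CPU time that Linpack effectively reports:

    import time

    def busy_loop(iterations):
        x = 0.0
        for i in range(iterations):
            x += i * 0.5  # arbitrary floating-point work
        return x

    wall_start, cpu_start = time.time(), time.process_time()
    busy_loop(10_000_000)
    wall = time.time() - wall_start        # what our data logger measures
    cpu = time.process_time() - cpu_start  # what Linpack effectively reports
    print(f"wall clock: {wall:.2f} s, CPU time: {cpu:.2f} s")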
3.5.2 Dhrystone and Fhourstones
Another set of benchmarks, similar to Linpack, are the integer operation
benchmarks, which we performed with Dhrystone and Fhourstones.

As described in [39], the Dhrystone benchmark is a synthetic integer benchmark
tool which is carefully designed to statistically mimic the processor usage of a
common set of programs. Reinhold P. Weicker released its first version in 1984,
after carefully characterizing a broad range of software in terms of various
common constructs such as procedure calls, pointer indirections, assignments,
etc. The benchmark result is reported as a number of Dhrystones per second,
which is basically the number of iterations of the main code loop per second.
Named as a pun on Dhrystone but unlike it, Fhourstones is a problem-oriented
integer benchmark which aims to efficiently solve positions in the game of
Connect-4¹, as played on a vertical 7x6 board. The benchmark result is expressed
as a number of fhourstones, where one fhourstone corresponds to a thousand
positions searched per second.
Note that neither Dhrystone nor Fhourstones uses the intuitive and
straightforward MIPS as the unit for reporting its results. This is to hide the
details of the underlying instruction set and make the results comparable even
between machines with different instruction sets (e.g. RISC vs. CISC). However,
since we are benchmarking on the same hardware, the way the results are reported
is of less importance to us. Moreover, it is thus also meaningless to compare
Dhrystone and Fhourstones results with each other. Therefore we will continue to
use their own units, instead of MIPS, as the performance metrics of the SUT's
integer operation capability².
3.5.3 Stress
Stress [36] is a simple stress tool which is designed to spawn one or more
processes, named workers, for a pre-specified amount of time. Each worker is
dedicated to stretching either the CPU, memory or HDD usage of a single work
node. Basically, Stress works as follows (a minimal sketch of these worker loops
is given after the list):
• a CPU-stress worker persistently carries out sqrt() operations on random
variables;
¹ Connect-4 is a two-player game which is normally played on a 7x6, 8x7, 9x7, or
10x7 board. See http://en.wikipedia.org/wiki/Connect_Four for more details.
² Another popular MIPS-normalized representation of the Dhrystone benchmark's
result is the DMIPS (Dhrystone MIPS). It is calculated by normalizing the result
against the number of Dhrystones per second (1757) of a 1 MIPS machine (the
VAX 11/780), i.e. dividing the Dhrystone score by 1757.
• a memory-stress worker repeatedly malloc()s a certain amount of memory, as
specified by the user, and then fills it with random data before free()ing it;
• an HDD-stress worker frequently writes random data to the disk.
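The following minimal Python sketch is our own approximation of these three
worker loops (Stress itself is implemented in C; the names and sizes here are
illustrative):

    import math
    import os
    import random

    def cpu_worker():
        """Spin on sqrt() of random values, like a CPU-stress worker."""
        while True:
            math.sqrt(random.random())

    def memory_worker(size_bytes):
        """Repeatedly allocate a buffer, fill it with random data, free it."""
        while True:
            buf = bytearray(os.urandom(size_bytes))  # malloc() + fill
            del buf                                  # free()

    def hdd_worker(chunk_bytes=1 << 20):
        """Keep writing random 1MB chunks to disk, then unlink the file."""
        while True:
            with open("stress.tmp", "wb") as f:
                for _ in range(1024):
                    f.write(os.urandom(chunk_bytes))
            os.unlink("stress.tmp")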
3.5.4 Customized scripts
Besides the two stress workload generators described above, we also wrote our
own scripts, which perform divisions on random numbers to mimic generic,
non-stressful workloads. The time interval between operations is automatically
adjusted so that the workload, i.e. the CPU usage, increases gradually and then,
after it reaches 100%, decreases in a similar way.
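A minimal Python version of such a ramping script could look as follows (our
reconstruction; the original scripts are not reproduced in this thesis):

    import random
    import time

    def ramp_load(step_s=10.0, steps=20, cycle_s=0.1):
        """Ramp the CPU load up to ~100% and back down by adjusting the
        busy/idle ratio within a fixed duty cycle."""
        targets = [i / steps for i in range(steps + 1)]
        for target in targets + targets[::-1]:
            step_end = time.time() + step_s
            while time.time() < step_end:
                busy_until = time.time() + cycle_s * target
                while time.time() < busy_until:
                    _ = random.random() / (random.random() + 1e-9)  # divisions
                time.sleep(cycle_s * (1.0 - target))  # idle for the rest

    if __name__ == "__main__":
        ramp_load()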
3.6 Virtualization
Our virtualization is done with the Kernel-based Virtual Machine (KVM) [22][40],
which is a full virtualization solution for Linux on supported x86 hardware. A
guest VM running with KVM lives, in principle, as a regular Linux process on its
host.
In our experiments, all of the VMs are configured with the same amount of
virtual memory but with different numbers of virtual CPUs (vCPUs). For each VM
we allocate 20GB of memory out of the 24GB of available physical memory, in
order to get the best performance from Linpack. The number of vCPUs of a VM
varies over 1, 2, 4, 8 and 16, where the last case is an over-committed VM,
since it has more vCPUs than available physical cores.
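For reference, the resulting set of VM configurations can be written down
compactly (our own Python notation, not an OpenNebula template):

    # Each VM: 20GB of virtual memory out of 24GB physical, varying vCPU count.
    VM_CONFIGS = [
        {"name": f"vm-{n}vcpu", "vcpus": n, "memory_mb": 20 * 1024}
        for n in (1, 2, 4, 8, 16)  # 16 vCPUs over-commits the 8 physical cores
    ]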
Chapter 4
Profiling VMs' power consumption
In the previous chapter, we elaborated on the test environment, which is used in
this chapter to profile the power behavior of VMs and hosts. As presented in
Section 3.2, there are three power metrics to express the energy profile of a
system:
• Power (W), which is the consumed wattage as reported by the power meter;
• Power efficiency (GFlops/W), which is the system performance expressed in
GFlops divided by the power;
• Energy (kJ), which is the power integrated over the execution time.
The ultimate goal of green computing is to reduce the total energy consumed
while running applications; however, not all metrics can be measured for all
applications. For instance, it is usually not feasible to obtain the energy
information for applications that run indefinitely on a resource. In such cases,
an energy-aware scheduler will have to base its decisions on power, and possibly
power efficiency, alone. For applications with a limited execution time, a
scheduler can use the energy metric in its optimization process.
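Expressed as a decision rule (a minimal sketch with our own hypothetical names):

    def optimization_metric(execution_time_known):
        """Which metric an energy-aware scheduler should optimize for.

        Jobs with a definite execution time can be optimized for total
        energy; indefinitely running services (e.g. a hosted web server)
        fall back to power and power efficiency.
        """
        return "energy" if execution_time_known else "power efficiency"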
All of the tests we performed can be classified into two categories, component
tests and overall tests, as summarized in Tables 4.1 and 4.2 respectively. Each
benchmark runs on both the SUT and virtual machines with different numbers of
vCPUs, as explained in Section 3.6. We call a VM a Guest VM when it is
configured with the same number of vCPUs as the number of available physical
cores, i.e. 8 in our case.
Table 4.1: Summary of component benchmarks

Component  Test type      Workload  Metric
CPU        CPU usage      Script    Power consumption
           Freq. scaling  Linpack   Performance, power efficiency, energy consumption
           Different VMs  Linpack   Performance, power efficiency, energy consumption
Memory     # of workers   Stress    Power consumption
           Memory usage   Stress    Power consumption
HDD        Timeline       Stress    CPU usage, power consumption
Table 4.2: Summary of overall benchmarks

Test type                 Workload                 Metric
Floating-point operation  Linpack                  Performance, power efficiency, energy consumption
Integer operation         Dhrystone & Fhourstones  Performance, power efficiency, energy consumption
Hyper-Threading           Linpack                  Performance, power efficiency, energy consumption
In the remainder of this chapter, we first profile the contribution of each
major hardware component, i.e. the CPU, memory and HDD, to the total power
consumption of the SUT in Section 4.1. We then continue with the overall tests,
to explore the 'green-ness' of both floating-point and integer operations on the
SUT. Finally, we study the impact of Hyper-Threading and finish with a
discussion of all benchmark results.
4.1 Component benchmarks
In this section, we separately examine the impact of each individual component
of the SUT (i.e. CPU, memory or HDD) on the total power consumption. As this
knowledge will be used to build our power model in Chapter 5, we will
particularly focus on their variations.
4.1.1 CPU
The process of power profiling for the CPU is divided into two parts: the CPU
usage test and the CPU frequency scaling test. The CPU usage test characterizes
the variation of the total power consumption with respect to the CPU usage,
while the frequency scaling test examines the energy efficiency of the SUT at
different CPU frequencies.
CPU usage test
The CPU usage test measures the total power consumption of the SUT with respect
to its CPU usage. For our test machine, a symmetric multiprocessor (SMP) system
with 8 cores, we vary the CPU usage in the following two ways:
• Case I: vary the workload on all available cores and take their average value
as the CPU usage;
• Case II: change the number of cores being used and stretch each used core to
its maximum usage immediately when it starts up.
Dynamic CPU frequency scaling on our SUT is handled by the Enhanced Intel
SpeedStep technology; the Linux kernel switches to the highest frequency
immediately when the load reaches a threshold. With frequency scaling enabled,
we observed a clear turning point for Case I when the CPU usage reached ∼65%, as
shown in Figure 4.1(a).
For the case with a fixed CPU frequency, we chose Turbo mode, in which the CPU
frequency is fixed at its maximum value, i.e. 2.66GHz for the Intel E5620 in our
case. Turbo mode uses Intel's TurboBoost technology [41], which enables the
processor to run above its base operating frequency (2.40GHz in our case) via
dynamic control of the CPU's clock rate. In this case the total power
consumption is nearly linear in the CPU load, except for a sharp increase when
the CPU load goes from idle to ∼5%, as shown in Figure 4.1(a).

Figure 4.1: Power consumption versus CPU usage. (a) Case I: gradual increase of
the CPU loads on all available cores. (b) Case II: gradual increase of the
number of cores, where each core is at its maximum usage.
Figure 4.1(b) shows the results for Case II. In this case, the CPU frequency of
each used core jumps to Turbo mode immediately when it starts up, because it is
stretched to 100% usage. The remaining idle cores stay at the lowest frequency
(i.e. 1.60GHz in our case). According to the results shown in Figure 4.1(b), the
power consumption also grows linearly with the number of threads, up to the
point where we start to have more than 8 threads, because both the number of
physical cores and the number of vCPUs of the Guest VM are 8. We observe a
negligible difference in power consumption between the host and the Guest VM,
since both of them stretch the usage of each used core to its limit.
CPU frequency scaling test
In this series of tests we disabled dynamic CPU frequency scaling and manually
varied the CPU frequency among the frequencies available on our SUT. The
benchmark results are shown in Figure 4.2. The maximum available frequency
supported by our SUT, namely 2.66GHz, corresponds to Turbo mode.

Figure 4.2(a) presents the power consumption and performance of our experiments
on the SUT and the guest VM. We observe that the guest VM has nearly the same
power consumption as the host, but worse performance. As explained in
Section 3.5, the complexity of the Linpack benchmark is determined by its
problem size (i.e. N) and limited by the amount of memory available on the
machine; the results of these two sets of benchmarks are therefore comparable,
because they have the same problem size (N = 45000) and a memory usage of
∼16GB. Since the CPU is fully utilized, we conclude that the performance
degradation in the Guest VM comes from the virtualization alone. Other studies
[42][28] have shown a similar pattern for the Linpack benchmark on KVM VMs: the
processing efficiency of KVM for floating-point operations is lower than the
host's, as KVM checks for every executing instruction whether it is an
interrupt, a page fault, I/O or a common instruction, in order to decide whether
to exit from guest mode or to stay in it.
(a) Power & Performance
(b) Power efficiency (c) Total energy consumption
Figure 4.2: CPU frequecy scaling benchmark
Another important result of this series of benchmarks concerns the total energy
consumed in each experiment. As shown in Figures 4.2(a) and 4.2(b), performance
and power efficiency improve almost linearly as the frequency scales up, while
the total energy consumption decreases (see Figure 4.2(c)) in a similar manner.
Notice that with the CPU set to Turbo mode, the SUT consumes more power than in
all other cases; however, it takes less time to complete the Linpack benchmark,
which ultimately results in less energy used. We therefore come to the general
conclusion that it is 'greener' to set the CPU to Turbo mode for CPU-bound
workloads.
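The arithmetic behind this conclusion is simply \(E = P \cdot t\). With purely
illustrative numbers (not measurements from our SUT), a Turbo-mode run drawing
250 W for 950 s consumes less energy than the same job at a low frequency
drawing 200 W for 1500 s:

\[
250\,\mathrm{W} \times 950\,\mathrm{s} = 237.5\,\mathrm{kJ}
\;<\;
200\,\mathrm{W} \times 1500\,\mathrm{s} = 300\,\mathrm{kJ}.
\]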
We also measured the idle power consumption of the SUT at different CPU
frequencies. Through our experiments we have seen that the idle power
consumption remains constant, regardless of the CPU frequency; the value for our
SUT is ∼90W across the whole frequency range. However, as we also observed in
the CPU usage test in Figure 4.1(a), there is a sharp increase in power
consumption when the CPU usage increases up to ∼5% at a higher frequency. It is
therefore advisable to scale down the CPU frequency when the work node is idle
or lightly loaded, in order to be more energy-efficient.
4.1.2 Memory
We performed two types of tests to quantify the impact of the memory on the
energy profile of the SUT and the guest VM:
• Worker tests, where we vary the number of workers spawned by the Stress
benchmark (see Section 3.5);
• Memory usage tests, where we gradually increase the size of the memory
allocated by the malloc() call of each Stress worker, from 1GB to 18GB in total.
In the worker test, we observe again in Figure 4.3(a) that the increase in power
consumption levels off when the number of threads equals the number of physical
cores on the host. In the memory usage test, reported in Figure 4.3(b), less
power is consumed by the VM than by the host for the same test, and the total
power consumption remains nearly constant, regardless of the memory usage. From
Figure 4.3(b) we also see that the variation of the total power consumption with
respect to the memory usage is less than ∼5W.
Figure 4.3: Memory stress tests: (a) with various numbers of workers; (b) with
different memory usage.
Furthermore, we have tried to separate the CPU's and the memory's contributions
to the total power consumed by the host, using the Stress tool described in
Section 3.5. During the memory usage tests the CPU is fully occupied by system
threads performing malloc(), reading/writing and then free() operations. By
combining the measurements reported by the CPU usage test and the memory usage
test for the host, we can estimate the power consumed by the memory, as shown in
Figure 4.4. In the figure we see a separation of no more than 5W over the whole
range of numbers of threads.

Figure 4.4: CPU and memory stress tests on the host
A recent DARPA-commissioned study on the challenges of ExaFLOP computing reports
in [43] that the power needed by memory systems remains constant regardless of
the workload, but that this power is proportional to the number of memory chips.
Our benchmark results presented above verify that the variation in the power
consumption of the memory system is negligible compared to the total power
consumption. Therefore, we will treat it as a constant throughout our benchmarks
and incorporate it into the idle power consumption in future research.
4.1.3 HDD
For our HDD stress tests, 8 workers are spawned spinning on write()/unlink()
operations, with each worker writing chunks of 1MB of random data to a temporary
file until it reaches 1GB, and then unlink()ing it. In these tests we observed
little memory usage but high (system) CPU usage. Figure 4.5 shows the SUT's CPU
usage and total power consumption with respect to time.
Figure 4.5: HDD stress tests on the host
In the timeline plot in Figure 4.5, we observe a strong relation between the
total power consumption and the CPU usage, i.e. the total power consumption
scales up when high CPU usage is observed. However, both the CPU usage and the
power consumption change quickly and dramatically, which makes it difficult to
quantify the impact of the HDD on the total power consumption. Therefore, when
building the power models in Chapter 5, we will put aside the impact of the HDD
and focus on the CPU and memory, by restricting our analysis to the scope of
CPU- and memory-intensive applications.
4.2 Overall benchmarks
The overall benchmarks examine the overall performance of VMs, including the
performance of both floating-point and integer operations. At the end, we also
examine how the Hyper-Threading (HT) technology affects the SUT's overall
performance.
4.2.1 Floating-point operation performance
In our experiments, VMs are configured with different numbers of vCPUs, as
explained in Section 3.6. We profiled the performance, the power consumption,
the power efficiency and the total energy of the different VMs by varying the
number of threads used in the Linpack benchmarks. Within this series of tests,
there are thus two variables:
• the number of threads running the Linpack benchmark;
• the number of vCPUs of a VM.
The actual number of physical cores involved in a benchmark is determined by the
minimum of these two values, bounded by the number of available physical cores
(i.e. 8 in our case).
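In other words, writing \(n_t\) for the number of Linpack threads and \(n_v\)
for the number of vCPUs of the VM under test:

\[
n_{\mathrm{cores}} = \min(n_t,\; n_v,\; 8).
\]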
Figure 4.6 shows our results. We see that all measured parameters increase until
they reach a plateau when the number of threads equals the number of vCPUs, for
all non-overcommitted cases. Besides that, the performance increases linearly
with the number of involved physical cores (see Figure 4.6(a)); so does the
power consumption (Figure 4.6(b)). However, this is not the case for the power
efficiency (Figure 4.6(c)) and the energy (Figure 4.6(d)). We will explain this
phenomenon in detail with our power model in Chapter 5.
Our outputs also indicate that virtualization results in a fixed amount (∼30%)
of overall performance degradation with respect to the Linpack benchmark.

Figure 4.6: Linpack tests on different VMs. (a) Performance. (b) Power
consumption. (c) Power efficiency. (d) Energy.

Another interesting result in our experiments comes from the over-committed VM
with 16 vCPUs. When 16 threads are used to run the Linpack benchmark, it
performs less satisfactorily than the same case on the VM with 8 vCPUs. Even
though it consumes less power, its execution time (see Table 4.3) is around 13
times longer, which further leads to much more energy being consumed in the
test, as shown in Figure 4.6(d). This is in line with what is known about the
performance degradation when over-committing symmetric multiprocessing guests
with KVM, which is caused by dropped requests and unusable response times [44].
Table 4.3: Execution time of the Linpack benchmark on different VMs

# of threads     1     2     4     8      16
Host           6279  3236  1764   951     964
VM 16 vCPUs    9102  4601  2387  1342  109683
VM 8 vCPUs     8992  4529  2346  1307    1321
VM 4 vCPUs     8982  4523  2340  2356    2365
VM 2 vCPUs     8961  4516  4544  4543    4543
VM 1 vCPU      9146  8992  8965  8994    8975
4.2.2 Integer operation benchmark
In this section we examine the integer operation performance of our SUT, as a
complement to its floating-point operation performance evaluated in the previous
section.
Dhrystone
Specifically, Dhrystone v2.1 from the UnixBench benchmark suite [45] is used in
this series of benchmarks. Even though Dhrystone v2.1 was originally designed as
a single-threaded application, we varied the number of Dhrystone instances
running concurrently on either the host or the Guest VM.
Figure 4.7: Dhrystone benchmark on the host and the Guest VM with different
numbers of threads: (a) on the host; (b) on the Guest VM.
Figure 4.7 presents the power consumption and performance of Dhrystone running
on the host and on the Guest VM respectively. As shown in Figure 4.7(a), the
power usage is unstable, which makes it less meaningful to go on and calculate
the power efficiency in the usual way.
Fhourstones
The Fhourstones benchmark is also a single-threaded application with a small
code size. Thus, in this section, we perform the benchmark only on the host and
on the Guest VM. The benchmark results are presented in Table 4.4.

Even though we still observed ∼7% performance degradation in the Guest VM, this
is much less than in the floating-point operation (i.e. Linpack) benchmarks
described in Section 4.2.1. Virtualization is also ∼7% less energy-efficient,
according to the statistics in Table 4.4.
Table 4.4: Performance of the Fhourstones benchmark on the host and guest VM

          Performance  Execution  Power   Power efficiency  Energy
          (KPOS/s)     time (s)   (W)     (KPOS/s/W)        (kJ)
Host      8013         209.5      95.83   83.62             20.08
Guest VM  7480         224        96.42   77.58             21.60
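As a consistency check, the derived columns follow directly from the measured
ones; for the host row:

\[
\frac{8013\ \mathrm{KPOS/s}}{95.83\ \mathrm{W}} \approx 83.62\ \mathrm{KPOS/s/W},
\qquad
95.83\ \mathrm{W} \times 209.5\ \mathrm{s} \approx 20.08\ \mathrm{kJ}.
\]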
4.2.3 Impact of Hyper-Threading
By enabling multiple threads to run on each core simultaneously, Hyper-Threading
(HT) technology improves the overall performance of the CPU and uses it more
efficiently, especially for threaded applications. In the previous benchmarks,
HT was disabled by default on our SUT. In this section, we enable HT and explore
its impact on the overall performance of the SUT.
We examined its impact for both non- and over-committed VMs and focused
on their floating-point operation performance. The same VMs are used in this
series of power benchmarks in order to make the results comparable with our previous findings.
Non-overcommitted case
In this experiment the guest VM is used to perform the Linpack benchmark while HT is enabled on the host. Figure 4.8 presents the results of the Linpack benchmark running on the guest VM and on the host.
With HT enabled on the host, the host's Linpack performance differs significantly, as shown in Figure 4.8(a). Intel suggests in [46] that HT is better disabled for compute-efficient applications, because there is little to be gained from HT technology if the processor's execution resources are already well used. What is worse, spawning a second process on the same physical core forces physical resources such as the cache to be shared. If that happens, more cache misses may occur, further degrading performance. This issue has also been discussed in [47], which reaches the same conclusion. Another possible explanation is that the host OS is not aware of the HT technology on the underlying hardware. In that case, the thread scheduler of the host OS may treat the doubled virtual cores equally and schedule, for example, 8 application threads onto 4 physical cores, which halves the performance.
Figure 4.8: Impact of Hyper-Threading for Linpack tests on the non-overcommitted VM (with 8 vCPUs) and the host: (a) performance, (b) power consumption, (c) power efficiency, (d) energy
However, the guest VM is surprisingly unaffected by HT technology according to the results presented in Figure 4.8, where the two data series of the guest VM almost overlap for all four metrics.
Over-committed case
Figure 4.9 presents the results both with and without HT on the host. With HT enabled, the over-committed VM (with 16 vCPUs) shows significant increases in performance, power consumption and power efficiency compared with the case where HT is disabled. Moreover, HT technology enables much more efficient scheduling of vCPUs, which results in a great improvement in total energy consumption, as shown in Figure 4.9(d).
Moreover, comparing the cases where Linpack runs fewer than 16 threads, we also observed a slight improvement in performance and power efficiency when HT is enabled on the host, even though almost the same power is consumed (see Figure 4.9(b)).
Figure 4.9: Impact of Hyper-Threading for Linpack tests on the overcommitted VM (with 16 vCPUs): (a) performance, (b) power consumption, (c) power efficiency, (d) energy
When running Linpack with more than 16 threads (e.g. 24 threads), significant performance degradation was observed when HT is disabled on the host, but not when HT is enabled. However, when over-committing the host with more than 16 vCPUs, we experienced dramatic performance degradation regardless of whether HT is on or off. We therefore conclude that, on an 8-core machine with HT, any number of threads can be handled, but no more than 16 vCPUs should be committed.
Integer operation performance
In this series of experiments, we performed Dhrystone benchmarks on the host, a guest VM with 8 vCPUs and an over-committed VM with 16 vCPUs, and varied the number of parallel Dhrystone instances from 1 up to 16. Figure 4.10 presents the results with HT on and off for the Dhrystone benchmarks.

Figure 4.10: Impact of HT technology for Dhrystone benchmarks
We observed that performance increases linearly up to 8 Dhrystone instances and then reaches a plateau. HT technology has no significant impact on performance. Moreover, no significant performance degradation was experienced in the virtualized environments.
4.3 Summary
Our experiments showed that the idle power consumption of the SUT remained
flat with respect to the CPU frequency. However we observed a steep rise when
the CPU load went from idle to 5% (see Figure 4.1(a)). It is therefore our primary
recommendation to maintain CPU frequency scaling in all systems. The effect
of scaling is significant until the CPU load reaches ∼65%. Furthermore, we consider variations of clock speed to be local optimizations, which are not of great interest when aiming, as we do, for system-level optimization.
The performance of floating-point operations (as shown by the Linpack benchmarks) and integer operations (as shown by the Dhrystone benchmarks) is linear in the amount of CPU resources used by applications. Thus we arrive at a performance model that can be represented as:

P = cp·Ucpu    (4.1)

where Ucpu is the CPU usage and cp is the performance parameter.
We also observed that the total power consumption of the SUT is linear in the CPU load (see Figure 4.1). The contribution of the memory to the total power consumption is ∼5W for all applications, which is negligible since it accounts for only ∼5.5% of the idle power consumption (90W) or ∼3.5% of the maximum power consumption (140W). As for the HDD, we keep it in the power model for the moment, since it is difficult to quantify the HDD's impact on the total power consumption, as presented in Section 4.1.3. Therefore, by integrating the power consumption of memory into c0 in Equation 6.1 and calling it Pidle, we believe that the power model proposed by Bohra et al. (see Equation 6.1) can be modified to:

Ptotal = Pidle + c1·Ucpu + c3·Uhdd    (4.2)
where Pidle is the idle power consumption of the host, Ucpu and Uhdd are the usage of the CPU and HDD respectively, and c1 and c3 are the power parameters for the CPU and HDD. A next step is to add the contribution of hardware accelerators, such as GPUs, to this simplified formula.
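To make Equation 4.2 concrete, the minimal Python sketch below evaluates it. The value of c3 is a hypothetical placeholder; in practice both c1 and c3 have to be fitted per node from measurements such as ours.

def node_power(u_cpu, u_hdd, p_idle=90.0, c1=7.5, c3=5.0):
    """Equation 4.2: total node power as a linear function of CPU and HDD usage.

    u_cpu is the CPU usage summed over all cores (0..Ncpu);
    u_hdd is the HDD utilization (0..1). c1 and c3 are illustrative values.
    """
    return p_idle + c1 * u_cpu + c3 * u_hdd

# A fully loaded 8-core node with light disk I/O:
print(node_power(u_cpu=8.0, u_hdd=0.2))  # 90 + 7.5*8 + 5.0*0.2 = 151.0 W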
Though HT technology is of little help for systems with fewer running threads than physical cores, since there are enough cores to host the running threads, it is of great help in virtualized environments, especially for over-committed VMs. Therefore we recommend keeping HT enabled so that we can take advantage of over-committing. When over-committing VMs on lightly-loaded hosts, fewer work nodes are needed to host all applications, and energy can be saved by powering off the unneeded idle machines. However, it is not wise to overload a work node, since this will significantly degrade performance. We will discuss this in detail in the next chapter.
Chapter 5
Towards energy-efficient scheduling
After having thoroughly profiled the power characteristics of both VMs and hard-
ware components in the previous chapter, we are able to provide our novel power
models for green VM scheduling in this chapter.
5.1 The power model
A power model is a mathematical description of power behavior. We specifically focus on power models for CPU- and/or memory-intensive applications, in which case the CPU and memory are the only two subcomponents whose resource usage may vary.
5.1.1 Prerequisites & Assumptions
We start the development of our power models from several prerequisites, which come from the profiling results in Chapter 4, and from some basic assumptions. The following essential conclusions, which we call prerequisites later on, come from the work in Chapter 4:
Prerequisite 1. The performance of a work node (in terms of GFlops) is linear in the CPU usage, as expressed in Equation 5.1:

P = cp·Ucpu    (5.1)

where Ucpu is the CPU usage and cp is the per-core performance parameter. For a multi-core system with Ncpu cores, we calculate Ucpu as the sum of the usage of all cores, i.e. 0 < Ucpu ≤ Ncpu. To calculate the value of cp, suppose the maximum performance Pmax is achieved when Ucpu = Ncpu; then cp = Pmax/Ncpu.
Prerequisite 2. The variation in the power consumed by memory is negligible; therefore the total power consumption of a work node (host) running CPU- and/or memory-intensive applications is linear in its CPU usage, as shown in Equation 5.2:

P = Pidle + ce·Ucpu    (5.2)

where Pidle and Ucpu are the idle power consumption and the CPU usage respectively, and ce is the per-core power parameter. Since the maximum power consumption Pmax is reached when Ucpu = Ncpu, we have ce = (Pmax − Pidle)/Ncpu.
Prerequisite 3. In virtualized environments (i.e. VMs), a VM has nearly identical power characteristics to its host, but achieves less performance. Therefore for VMs

c′e = ce and c′p < cp

where c′e and c′p are the per-core power and performance parameters of VMs respectively. Note that the value of c′p depends solely on the host and has no significant relation to the VM's configuration, because c′p is also a per-core performance parameter. For example, a VM with 2 vCPUs has the same c′p as another VM with 4 vCPUs if they run on the same host.
Prerequisite 4. Over-committing on a work node will not cause any additional
performance degradation for applications running on it if the work node is not
overloaded, especially when HT technology is enabled on the work node.
To connect with our work in Chapter 4, Table 5.1 provides the sample values of Pidle, ce and cp for our test machine.

Table 5.1: Sample values of Pidle, ce and cp for our test machine

        Pidle   Pmax (power)   Pmax (host perf.)   P′max (VM perf.)   cp          ce
Value   90W     150W           75GFlops            52GFlops           9.4GFlops   7.5W
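The per-core parameters follow directly from these measured extremes. A small sketch reproducing the table's values for our 8-core SUT (a sanity check, not part of the scheduler):

N_CPU = 8                             # physical cores of the SUT
P_IDLE, P_MAX_POWER = 90.0, 150.0     # idle and maximum power (W)
PERF_MAX_HOST = 75.0                  # maximum host performance (GFlops)

c_p = PERF_MAX_HOST / N_CPU           # per-core performance: 75/8 ≈ 9.4 GFlops
c_e = (P_MAX_POWER - P_IDLE) / N_CPU  # per-core power: (150-90)/8 = 7.5 W
print(c_p, c_e)                       # 9.375 7.5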
Besides the prerequisites that are obtained from the power profiling, we also
have several practical assumptions to establish our power models.
Assumption 1. A VM is shut down immediately when it is idle, i.e. when no applications are running within it. A work node is powered off (i.e. put into a sleep state) immediately when no VMs are running on it. We also assume that the cost of 'waking up' a work node is negligible.
Assumption 2. We assume that the idle power consumption Pidle and the performance and power parameters (cp and ce respectively) mentioned in Prerequisites 1 and 2 are hardware-dependent only. In other words, each work node has its own values of Pidle, cp and ce, but all applications running within a single work node share the same cp and ce.
When establishing the power models below, we do not distinguish between VMs and other applications running on the host, because under KVM virtualization a VM lives as a normal application. The difference is that a VM's performance is defined by how efficiently, compared with the host, an application can run within the VM. While applications run within a VM, the total computations (e.g. in terms of GFlops) are determined by the computations of the applications plus the overhead caused by the VM. In the previous chapter we observed significant but steady performance degradation for applications running in VMs. Therefore, to simplify matters, we calculate the computations of a VM (together with all applications running within it) as follows.
Definition 1. The computations of a VM, together with all applications running within it, are defined as the total computations of all applications running on the VM scaled by cp/c′p, where c′p and cp are the performance parameters of the VM and of its host respectively, and c′p < cp. For example, if the total computations of all applications running on a VM are G, then the computations of the VM are

G′ = (cp/c′p)·G
since the performance of the VM is discounted by c′p/cp compared to the host. Therefore, if an application with (original) computations G runs on a VM, the VM's computations are equivalent to those of an application with computations G·cp/c′p running on the host, because their execution times are identical. Conversely, if a VM has computations G, all applications running within it have total computations G·c′p/cp. In this way we can treat the VM (together with all applications running on it) as a normal application running on the host. With this definition, we can uniformly establish the models for a single work node and for a cloud system, regardless of whether the applications are VMs or not.
5.1.2 Power model for a single work node
Theorem 1. For an application (e.g. a VM) with a fixed amount of computations G (e.g. in terms of Giga floating-point operations) running exclusively on a single work node with idle power consumption Pidle, power parameter ce and performance parameter cp, the total energy consumption of the work node during the application's runtime T can be expressed in the form of Equation 5.3, regardless of the dynamic CPU usage during the runtime:

Enode = Pidle·T + (ce/cp)·G    (5.3)
Proof. Suppose we start the application at t = 0 and the CPU usage at time t is Ucpu(t). According to Prerequisite 1, the total amount of computations G is

G = ∫_0^T cp·Ucpu(t) dt  ⇒  ∫_0^T Ucpu(t) dt = G/cp    (5.4)

where T is the runtime of the application. According to Prerequisite 2, the total energy consumption of the work node during the application's lifetime is

Enode = ∫_0^T (Pidle + ce·Ucpu(t)) dt = Pidle·T + ce·∫_0^T Ucpu(t) dt

Substituting Equation 5.4, we finally get

Enode = Pidle·T + (ce/cp)·G
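As a sanity check with the sample values of Table 5.1 (an illustrative calculation, not a measurement): an application of G = 4500 GFlop that keeps all 8 cores busy finishes in T = G/(cp·Ncpu) = 4500/75 = 60 s, giving Enode = 90·60 + (7.5/9.4)·4500 ≈ 9.0 kJ. If the same application only ever uses four cores, T doubles while the second term stays fixed, giving Enode ≈ 90·120 + 3590 ≈ 14.4 kJ; the difference is exactly the Pidle·T penalty that the theorem isolates.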
When multiple applications (e.g. multiple VMs) run on the same work node, the Pidle component spans the lifetimes of all applications, and the power model evolves into the following one.

Corollary 1. When N applications (e.g. N VMs) run on a single work node, where each application has Gi computations during its runtime (t_0^i, t_1^i] (0 < i ≤ N), the total energy consumption of the work node has the form expressed in Equation 5.5:

Enode = Pidle·T + (ce/cp)·Σ_{i=1}^{N} Gi    (5.5)

where T = |⋃_{0<i≤N} (t_0^i, t_1^i]| is the joint lifetime of all applications running within this work node.
Proof. Suppose that at time t application i has CPU usage U_cpu^i(t) if t ∈ (t_0^i, t_1^i] (and U_cpu^i(t) = 0 otherwise); then the total CPU usage at time t is

Ucpu(t) = Σ_{i=1}^{N} U_cpu^i(t)

Similarly, with Prerequisite 1 we have

Gi = ∫_{t_0^i}^{t_1^i} cp·U_cpu^i(t) dt

Therefore, with Prerequisites 1 and 2 and Assumptions 1 and 2,

Enode = ∫_{⋃_{0<i≤N}(t_0^i, t_1^i]} (Pidle + ce·Ucpu(t)) dt
      = ∫_{⋃_{0<i≤N}(t_0^i, t_1^i]} (Pidle + ce·Σ_{i=1}^{N} U_cpu^i(t)) dt
      = Pidle·T + (ce/cp)·Σ_{i=1}^{N} Gi

where T = |⋃_{0<i≤N}(t_0^i, t_1^i]|.
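The joint lifetime T is just the measure of a union of half-open intervals, which a simple sweep over the sorted intervals computes. Below is a minimal Python sketch of Corollary 1; the parameter values come from Table 5.1, and the two workloads are hypothetical.

def joint_lifetime(intervals):
    """Length |U (t0, t1]| of a union of (t0, t1] intervals."""
    total, cur_start, cur_end = 0.0, None, None
    for t0, t1 in sorted(intervals):
        if cur_end is None or t0 > cur_end:   # disjoint: close the current run
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = t0, t1
        else:                                 # overlapping: extend the current run
            cur_end = max(cur_end, t1)
    if cur_end is not None:
        total += cur_end - cur_start
    return total

def node_energy(p_idle, c_e, c_p, apps):
    """Corollary 1: E = Pidle*T + (ce/cp)*sum(Gi); apps = [(G_i, (t0_i, t1_i)), ...]."""
    T = joint_lifetime([t for _, t in apps])
    return p_idle * T + (c_e / c_p) * sum(g for g, _ in apps)

# Two overlapping VMs on the sample machine of Table 5.1:
print(node_energy(90.0, 7.5, 9.4, [(4500.0, (0.0, 60.0)), (2250.0, (30.0, 90.0))]))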
Discussion
From the power model for a single work node above, we notice that the total energy consumption of a work node decomposes into two components: a static and an application-dependent energy consumption.

The first corresponds to the idle power consumption of the work node across the joint lifetime of all applications (i.e. Pidle·T). It is the minimum energy that a work node has to consume when running applications. Though it has no direct relation with the work node's performance and power efficiency, the joint lifetime T is implicitly determined by the work node's performance.

The second component, the application-dependent energy consumption, is determined by the total computations of all applications, regardless of their dynamic CPU usage during their runtime. This also means that overloading the work node neither benefits nor harms this component of the energy consumption; however, overloading may degrade the performance of each application, which results in a longer T in the static component mentioned above. Overloading should therefore be avoided when making scheduling decisions.
Therefore the energy efficiency (i.e. ’green-ness’) of a work node is jointly
defined by the collection of (Pidle, ce, cp). It is greener to run applications on work
nodes with high cp and low Pidle and ce/cp.
5.1.3 Power model for a cloud system
In the analysis above, we notice that the energy-efficiency characteristics of a work node i are directly identified (with Assumption 2) by

wn_i = (Pidle_i, c_i, cp_i)    (5.6)

where c_i = ce_i/cp_i. cp_i is included as the third element because it has a direct impact on the execution time of an application. Similarly, an application i can be identified as

app_i = (G_i, T_i)    (5.7)

where T_i = (t_0^i, t_1^i] represents the lifetime of the application, i.e. the application starts at t = t_0^i and finishes at t = t_1^i; it is influenced by the host's cp and the amount of CPU resources the application occupies. Therefore a cloud with M work nodes and a collection of N applications can be represented as

C = {wn_i | 0 < i ≤ M} and APPS = {app_i | 0 < i ≤ N}

respectively. The placement of applications on the cloud then becomes a many-to-one mapping between APPS and C:

f : app_i(G_i, T_i) → wn_i(Pidle_i, c_i)    (5.8)

With these symbols, the total energy consumption of a cloud with M work nodes and N applications running on them is calculated as

Ecloud = Σ_{i=1}^{M} Enode_i
       = Σ_{i=1}^{M} ( Pidle_i · |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| + c_i · Σ_{j ∈ {j | f(app_j) = wn_i}} G_j )    (5.9)

where |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| represents the joint lifetime of all applications running on work node wn_i.
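Equation 5.9 translates directly into code. The sketch below reuses the joint_lifetime helper from the previous section and represents the placement f as a dictionary from application index to work node index; all names and numbers are illustrative.

def cloud_energy(nodes, apps, placement):
    """Equation 5.9: nodes = [(Pidle, c), ...], apps = [(G, (t0, t1)), ...],
    placement[j] = index of the work node hosting application j."""
    total = 0.0
    for i, (p_idle, c) in enumerate(nodes):
        hosted = [apps[j] for j, wn in placement.items() if wn == i]
        if not hosted:
            continue  # idle nodes are powered off (Assumption 1) and cost nothing
        T = joint_lifetime([t for _, t in hosted])
        total += p_idle * T + c * sum(g for g, _ in hosted)
    return total

# Two nodes, three applications; node 0 hosts apps 0 and 2, node 1 hosts app 1.
nodes = [(90.0, 7.5 / 9.4), (70.0, 6.0 / 8.0)]
apps = [(4500.0, (0.0, 60.0)), (3000.0, (0.0, 45.0)), (1500.0, (30.0, 50.0))]
print(cloud_energy(nodes, apps, {0: 0, 1: 1, 2: 0}))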
5.1.4 The opportunity revisited
With the power model for a cloud system in Equation 5.9, the energy optimization for a cloud with M work nodes becomes

min(Ecloud) = min_f Σ_{i=1}^{M} ( Pidle_i · |⋃_{k ∈ {j | f(app_j) = wn_i}} T_k| + c_i · Σ_{j ∈ {j | f(app_j) = wn_i}} G_j )    (5.10)

Notice that the power models elaborated above rest on one practical assumption: a work node is powered off immediately when it is idle (i.e. when no applications or VMs run on it). A green work node is then characterized by a small Pidle and c (i.e. ce/cp), and a large cp.
In Section 1.2 we briefly presented several opportunities for energy-aware computing. Combining them with the mathematical descriptions above, we rephrase them into three basic conceptual directions for placing and migrating VMs across heterogeneous hardware within a data center in an energy-efficient way:

a) Schedule VMs to work nodes with higher energy efficiency, i.e. with smaller ci.

b) Minimize the number of active nodes (i.e. M) by grouping multiple VMs onto a smaller number of work nodes through live migration and/or by exploring the possibility of over-committing lightly-loaded work nodes (see the comparison sketched below).

c) Do not overload a work node: overloading comes with a loss in performance, with which applications may run longer and cost more energy.
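To illustrate direction b), the snippet below compares a spread placement against a consolidated one using the cloud_energy sketch from Section 5.1.3; all numbers are hypothetical, and we assume consolidation does not overload the node, per Prerequisite 4.

# Two identical nodes (Pidle = 90 W, c = 7.5/9.4) and two concurrent applications.
nodes = [(90.0, 7.5 / 9.4), (90.0, 7.5 / 9.4)]
apps = [(3000.0, (0.0, 40.0)), (3000.0, (0.0, 40.0))]

spread = cloud_energy(nodes, apps, {0: 0, 1: 1})        # one app per node
consolidated = cloud_energy(nodes, apps, {0: 0, 1: 0})  # both on node 0, node 1 sleeps
print(spread, consolidated)  # consolidation saves one node's idle energy, 90*40 J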
5.2 Energy-aware scheduler
There are two sub-schedulers which complement each other to achieve the goal of energy-aware scheduling:
Placement scheduler deals with incoming provision requests and distributes them to either active work nodes or newly activated work nodes from the resource pool, according to the energy-aware placement scheduling algorithm.
Migration scheduler makes system-level optimizations according to the energy-aware migration scheduling algorithm to discover energy-saving opportunities through live migration.
A typical working scenario of the energy-aware scheduler is depicted in Figure 5.1. The scenario consists of a service queue of VM provision requests, the placement and migration schedulers, a set of active work nodes, and a resource pool of idle work nodes. The idle work nodes in the resource pool are powered off but ready for provision.
Figure 5.1: Working scenario of energy-aware scheduler
When a sequence of VM provision requests arrives at the front-end of the cloud system, the requests are put into a service queue, and the placement scheduler is triggered to decide the mapping of VMs to work nodes for the purpose of green computing. The details of the placement scheduler are explained in Section 5.2.2. During placement scheduling, new VMs are placed without interfering with existing VMs, and no live migration is involved. The opportunity offered by live migration is explored by the migration scheduler (see Section 5.2.3), where a global optimization is issued on all active VMs across the entire cloud.
5.2.1 Symbolic description
In the working scenario of the scheduler, there are two basic types of roles: a) VMs being scheduled and b) work nodes (or workers) that the VMs are scheduled to. Their static capabilities and dynamic status can be mathematically described as collections of resources, as presented in Table 5.2.
Table 5.2: Mathematical description of a Work Node (WN) and a VM

          Work Node (WN)           VM
Static    (Ncpu, Pidle, c, cp)     (Nvcpu)
Dynamic   (nvcpu, Ucpu)            (nvcpu, Ucpu)

For a work node: Ncpu is the number of physical CPUs; Pidle is the idle power consumption; c = ce/cp is the energy-efficiency parameter, with ce the power parameter defined in Equation 5.2 and cp the performance parameter; nvcpu is the number of active vCPUs; Ucpu is the current physical CPU usage, 0 ≤ Ucpu ≤ Ncpu.
For a VM: Nvcpu is the number of vCPUs requested; nvcpu is the number of vCPUs actually used by the VM, nvcpu ≤ Nvcpu; Ucpu is the physical CPU usage of this VM, Ucpu ≤ nvcpu ≤ Nvcpu.
With the placement mapping of VMs defined in Formula 5.8, we can calculate the number of active vCPUs on a work node wn as

wn.nvcpu = Σ_{vm ∈ {vm | f(vm) = wn}} vm.nvcpu
All work nodes of the cloud system can be divided into two sets:

C = Cactive ∪ Cidle and Cactive ∩ Cidle = ∅

where Cactive and Cidle are, respectively, the set of active work nodes and the set of idle work nodes in the sleep state (i.e. powered off but ready for provision). They are described as

Cactive = {wn_i | wn_i.Ucpu ≠ 0, wn_i ∈ C} and Cidle = {wn_i | wn_i.Ucpu = 0, wn_i ∈ C}
With Hyper-Threading enabled on a host, the number of (virtual) cores exposed to the hypervisor is doubled, but the maximum performance is limited by the number of physical cores, as profiled in Chapter 4. Though the extra virtual cores are of little help for CPU-intensive applications (and of little harm either), lightly-loaded applications benefit a lot from them through over-committing. Furthermore, the performance differences of over-committed VMs, compared with non-overcommitted ones, are barely noticeable.

Therefore, when designing the scheduling algorithm, we assume HT is enabled on all work nodes, and the maximum number of vCPUs that can still be committed is limited by [Ncpu − Ucpu].
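To make the bookkeeping of Table 5.2 concrete, here is a possible Python rendering of the two roles; the field names follow the table, but this is only a sketch under our assumptions, not the implementation used in this thesis.

from dataclasses import dataclass, field
from typing import List

@dataclass
class VM:
    n_vcpu_req: int        # Nvcpu: number of vCPUs requested
    n_vcpu: int = 0        # nvcpu: vCPUs actually in use, n_vcpu <= n_vcpu_req
    u_cpu: float = 0.0     # Ucpu: physical CPU usage of this VM, u_cpu <= n_vcpu

@dataclass
class WorkNode:
    n_cpu: int             # Ncpu: number of physical cores
    p_idle: float          # Pidle: idle power consumption (W)
    c: float               # c = ce/cp: energy-efficiency parameter
    c_p: float             # cp: per-core performance parameter
    vms: List[VM] = field(default_factory=list)

    @property
    def u_cpu(self) -> float:   # current physical CPU usage, 0 <= u_cpu <= n_cpu
        return sum(vm.u_cpu for vm in self.vms)

    @property
    def n_vcpu(self) -> int:    # number of active vCPUs hosted on this node
        return sum(vm.n_vcpu for vm in self.vms)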
5.2.2 Placement scheduler
The algorithm for the placement scheduler is given as Algorithm 1. The job of this scheduler is to schedule a VM (vm(Nvcpu)) to a suitable work node (wn). We first sort all active work nodes in ascending order of c and all sleeping work nodes in ascending order of Pidle, so that the greenest candidates come first. We then check whether there is an active work node that can host this VM. If so, that work node is returned, since it is the most energy-efficient candidate. Theoretically there is a chance that a sleeping work node with small Pidle and c and large cp exists, such that scheduling the VM to that node instead of an active one would be more energy-efficient; exploiting this is technically unfeasible at the moment, and even if such a node exists, it will be selected with higher priority by the migration scheduler discussed in the next section. If there is no suitable active node to host the VM, we resort to the sleeping work nodes.
If no work node is returned at the end, it may be that all resources are in use. Another possible reason is that the work nodes are partially loaded but none of them meets the VM's requirement. In that case, it is better to trigger the migration scheduler described in the next section to carry out a global optimization.

Algorithm 1 Energy-aware placement scheduling algorithm
Input: vm(Nvcpu), the VM being scheduled
Output: wn, the work node the VM is scheduled to
 1: Sort all wn ∈ Cactive in ascending order of c
 2: Sort all wn ∈ Cidle in ascending order of Pidle
 3: for all wn_i ∈ Cactive do
 4:   if wn_i.Ncpu − wn_i.Ucpu ≥ vm.Nvcpu then
 5:     return wn_i
 6:   end if
 7: end for
 8: for all wn_i ∈ Cidle do
 9:   if wn_i.Ncpu ≥ vm.Nvcpu then
10:     return wn_i
11:   end if
12: end for
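A straightforward Python sketch of Algorithm 1, building on the WorkNode and VM classes sketched in Section 5.2.1; the helper name place and its list arguments are our own illustrative choices.

from typing import Optional

def place(vm: VM, c_active: List[WorkNode], c_idle: List[WorkNode]) -> Optional[WorkNode]:
    """Algorithm 1: energy-aware placement of a single VM."""
    # Greenest active nodes first: ascending energy-efficiency parameter c.
    for wn in sorted(c_active, key=lambda n: n.c):
        if wn.n_cpu - wn.u_cpu >= vm.n_vcpu_req:
            return wn
    # Otherwise wake the sleeping node with the lowest idle power consumption.
    for wn in sorted(c_idle, key=lambda n: n.p_idle):
        if wn.n_cpu >= vm.n_vcpu_req:
            return wn
    return None  # no candidate: the caller may trigger the migration scheduler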
More sophisticated decisions may be made (as future work) to determine the value of Ucpu; one suggestion is to take the one-minute or five-minute average CPU load as its value.
An over-committed work node may become overloaded at some point, because every VM running on it has the right to use as many physical cores as it has requested in terms of vCPUs. Therefore another algorithm is needed to reschedule workloads to other lightly-loaded work nodes by means of Algorithm 1.
Before we go into the details of the algorithm, we first determine in which situation a work node is overloaded in terms of CPU usage. Intuitively, a work node is overloaded if its physical CPU usage exceeds a pre-defined threshold T. But we should also be aware that it is useless to reschedule the workloads of a non-overcommitted work node even if it is overloaded. For example, 8 single-threaded CPU-intensive applications (VMs) running on an 8-core machine will definitely overload the work node, but it is unnecessary to migrate one or more VMs to other work nodes.
Algorithm 2 describes the process of rescheduling workloads.

Algorithm 2 Replacement scheduling algorithm for overloaded work nodes
Input: wn, the work node that is overloaded
Input: T, the threshold that defines overloading
1: Let VM = {vm_i | f(vm_i) = wn} be the collection of all VMs running within wn
2: Sort all vm ∈ VM in ascending order of vm.Ucpu
3: while wn.Ucpu ≥ T do
4:   Let vm = VM[1] be the first element of VM
5:   Apply Algorithm 1 to vm with input parameter vm.Ucpu
6:   wn.Ucpu ← wn.Ucpu − vm.Ucpu
7:   VM ← VM \ {vm}
8: end while

The VM with the least CPU usage has the highest priority to be migrated to other work nodes,
even though this may require multiple migrations compared to the other way
around. The reason is that this VM is more likely to fit into other work nodes
because it requires fewer vCPUs, especially when we consider its actual CPU usage instead.
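A sketch of Algorithm 2 in the same style, reusing the place function above. Sizing the migrated VM by its rounded actual CPU usage is an assumption on our part, and the live migration itself is abstracted away.

def reschedule_overloaded(wn: WorkNode, threshold: float,
                          c_active: List[WorkNode], c_idle: List[WorkNode]) -> None:
    """Algorithm 2: migrate the lightest VMs away until wn drops below the threshold."""
    wn.vms.sort(key=lambda vm: vm.u_cpu)            # lightest VM first
    while wn.u_cpu >= threshold and wn.vms:
        vm = wn.vms[0]
        # Re-place the VM, sized by its actual CPU usage rather than Nvcpu.
        probe = VM(n_vcpu_req=max(1, round(vm.u_cpu)))
        target = place(probe, [n for n in c_active if n is not wn], c_idle)
        if target is None:
            break                                   # nowhere to migrate; give up
        target.vms.append(wn.vms.pop(0))            # live migration abstracted away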
5.2.3 Migration scheduler
The migration scheduler performs a global optimization of the mapping of all VMs onto the entire cloud system. Our algorithm (described in Algorithm 3) is built on top of two practical assumptions:

• If a work node is activated in this round, it will stay up and running for a long time. This is reasonable because, if it is selected now, it is likely also a preferable candidate for other VMs.

• If a work node is activated, it will become fully or nearly fully loaded, since it is a greener work node and will likely be selected to host further VMs whenever a new provision request arrives that it can meet.
Based on the two assumptions above, we define the energy efficiency of a work node as a combination of Pidle and c, in that order. Since the idle power consumptions Pidle of different work nodes may differ only subtly, we classify the nodes into power levels and treat nodes within the same power level equally; this is where the sorting according to c comes into effect. The green sorting is described in Algorithm 4. cp is reserved for maximizing performance, which is not our major concern here.
Algorithm 3 Energy-aware migration scheduling algorithm
1: Sort all wn ∈ C according to Algorithm 4
2: Sort all VMs being provisioned in descending order of their Nvcpu
3: for all vm_i ∈ VMs do
4:   for all wn_j ∈ C do
5:     if wn_j.Ncpu − wn_j.Ucpu ≥ vm_i.Nvcpu then
6:       Schedule vm_i on wn_j
7:       break      ▷ proceed with the next VM
8:     end if
9:   end for
10: end for

Algorithm 4 Green sorting of work nodes
1: Sort all wn ∈ C in ascending order of Pidle
2: Group them according to their power levels
3: for all power levels do
4:   Sort all work nodes in this level in ascending order of c
5: end for
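The two algorithms combine into a compact Python sketch; the 10 W power-level width and the simplified capacity accounting are illustrative assumptions.

def green_sort(nodes: List[WorkNode], level_width: float = 10.0) -> List[WorkNode]:
    """Algorithm 4: group nodes into Pidle power levels, then sort each level by c."""
    # level_width (in Watts) is an illustrative choice of what counts as one level.
    return sorted(nodes, key=lambda n: (int(n.p_idle // level_width), n.c))

def migrate_all(vms: List[VM], nodes: List[WorkNode]) -> None:
    """Algorithm 3: globally re-place all VMs, largest first, on green-sorted nodes."""
    ordered = green_sort(nodes)
    for vm in sorted(vms, key=lambda v: v.n_vcpu_req, reverse=True):
        for wn in ordered:
            if wn.n_cpu - wn.u_cpu >= vm.n_vcpu_req:
                wn.vms.append(vm)   # schedule vm on wn (capacity accounting simplified)
                break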
Even though the migration scheduler may optimize the energy consumption of a cloud system from a global perspective, it involves many migrations across the cloud, which all come with costs in both energy and performance. These penalties should be thoroughly profiled before deploying this scheduler. For the same reason, it is also recommended that migration scheduling not be triggered frequently.
Chapter 6
Related Work
Previous work related to our project can be classified into two categories. The first is energy and performance profiling, which corresponds to our work in Chapter 4; the other is frameworks for green clouds, especially those focusing on energy savings through sophisticated VM scheduling. The rest of this chapter elaborates on related work in these two areas.
6.1 Performance and energy profiling
A lot of research has been done on profiling the performance of scientific applications on either public cloud systems such as Amazon EC2 [48, 49, 50] or private clouds [51, 52, 53, 54]. Although these works focus solely on performance, their approaches to identifying performance metrics are valuable to our work. We borrowed some of these ideas but emphasized the 'green' aspect of computing without losing the focus on performance.
The most famous and active work on energy profiling of HPC systems is done by green500.org, which maintains the Green500 List [29]. The Green500 List ranks supercomputers from the TOP500 List [32] according to their energy efficiency, expressed in terms of performance per Watt. It is still an active project, and the ranking is updated twice a year. However, this work mainly focuses on ranking each supercomputer as a single unit, not on improving the energy efficiency of supercomputers. Moreover, the energy efficiency of virtualized environments is outside its scope, which makes it less relevant to the work presented in this thesis.
Hackenberg et al. used SPEC MPI benchmarks to quantify the variations in energy consumption of HPC systems in [55]. Linpack was used to obtain the peak power consumption of their test systems. They identified that the idle power consumption is 26% of the peak power consumption, and observed that 70%–83% of the peak power was consumed while running the SPEC MPI benchmarks. However, they focused solely on profiling the power consumption of HPC systems, while both performance and power consumption are our major concerns.
6.2 Energy-aware clouds and power models
Dynamic Voltage/Frequency Scaling (DVFS) [56] provides handles to adjust server power states. Together with turning servers on and off, or putting them to sleep, it is among the most basic power management techniques that can be applied to servers operating in a cloud [57]. Still, all these methods cannot solve the power consumption optimization problem in the presence of virtual machines, unless one combines them with forced migration of VMs to concentrate them on fewer servers. Such migration can clearly be undesirable when application performance must be guaranteed.
One therefore needs to look at the behavior of the individual VMs, and con-
sequently at the availability of correct power models and energy profiles. These
can be obtained by actively using power benchmarks or by closely monitoring the
energy profile of individual system components such as CPU, cache, disk, and
memory at run time. We have been looking at previous work to determine if and
which power model we could use in GreenClouds.
Stoess et al. [58] were among the first to present methods for determining the energy usage of VMs. They relied on the availability of models for each hardware component to create a framework for power optimization and for the development of energy-aware OSes.
Kansal et al. [59] proposed Joulemeter, a power meter for virtual machines. Joulemeter also makes use of power models of individual hardware resources; at runtime, software components monitor the resource usage of VMs and convert it to energy usage using the available model.
The power modeling technique vMeter, proposed by Bohra et al. [60], is the most relevant for us. They observed a correlation between the total system power consumption and component utilization, and created a four-dimensional linear weighted power model for the total power consumed Ptotal:

Ptotal = c0 + c1·PCPU + c2·Pcache + c3·PDRAM + c4·Pdisk    (6.1)

where PCPU, Pcache, PDRAM and Pdisk are specific performance parameters for the CPU, cache, DRAM and disk, and c0, c1, c2, c3, c4 are weights. The weights are calculated per workflow. They refined the power model by separating the contribution of each active domain in a node, either a VM or dom0:

Ptotal = Pbaseline + Σ_{k=1}^{N} Pdomain(k)    (6.2)
where Pdomain(k) is the power consumed by an active domain k, and N is the
number of active domains (including dom0). We reuse this model and provide
our results of empirical studies on it in Chapter 4.
The work done by Liu et al. [61] on the GreenCloud architecture, by Dhiman et al. [62] on vGreen, and by Srikantaiah et al. [63] on energy-aware consolidation is also relevant for us. The GreenCloud architecture utilizes live migration of VMs based on power information about the physical nodes. With this technique, Liu and his colleagues show a significant energy reduction for applications running in clouds, specifically for online gaming. They define an integrated approach similar to the one we are setting out to follow ourselves. vGreen also consists of a multi-tiered software system, where policies are used to schedule across the available machines.
Younge discusses a novel green framework for cloud data centers in [64]. He studied the power behavior of virtual machines and incorporated DVFS-enabled scheduling within the framework. Moreover, the size of the VM image is also taken into consideration as one more technique to reduce the energy consumption of data centers.
Van et al. proposed a utility-based VM provisioning and placement policy in [65] to maximize the profit of a data center by trading off application performance against energy consumption. They took the response time of web applications, hosted within a cluster of Apache servers on multiple homogeneous VMs, as the performance metric. Instead of Van's profit-based VM scheduling policy, Garg et al. proposed a carbon/energy-based scheduling policy in [66]. We took a similar approach but focus on the 'green' aspect of computing. Besides that, we also explored the diversity of VMs.
Our efforts differ from the above as we aim to explore the benefits of using
very heterogeneous hardware in creating, managing, and when needed migrating,
application-tailored VMs.
Chapter 7
Conclusions
The research presented here aimed to explore the green opportunities offered by virtualization technologies in clouds through system-level optimizations, focusing on energy-saving opportunities through energy-aware scheduling of virtual machines.
As preliminary research toward deploying a system-level energy-savvy scheduling policy on HPC clouds, we characterized the power consumption of a single work node of our DAS-4 clusters and quantified the relation between the node's total power consumption and the resource usage of its components, i.e. CPU, memory and HDD. To make our work close to reality, we mimicked both lightly-loaded and stressful workloads with the LINPACK benchmark, the Stress tool and our customized scripts.
We identified that the CPU, as the core component of a SUT, makes the major contribution to the total power consumption and dominates its variable part when the resource usage varies. However, the situation may change when accelerators such as GPUs and FPGAs are present, which would be an interesting subject for future work. Moreover, the CPU's power consumption is almost linear in its usage. Memory, on the other hand, makes a constant and least significant contribution to the total power consumption, which can be treated as part of the SUT's idle power consumption.
Moreover, we studied the power characteristics of different VMs. VMs consume nearly the same amount of power as the host when both run the same CPU-intensive workloads; however, VMs normally perform ∼30% worse, especially when over-committed.
Finally, we provided our power model for a cloud system based on the power profiling, and proposed two novel energy-aware scheduling policies: one for instant placement of VMs and one for global optimization through live migration.

We believe the power characteristics of both the host and the VMs presented in this thesis are not restricted to specific hardware; they generally apply to other HPC nodes with similar hardware components, even though the values of the parameters in the power model may differ slightly. Furthermore, by integrating the power characterization module into the clusters' monitoring system, our measurement environment could be extended to an on-line analysis system that supports system-level optimization with on-the-fly VM scheduling and dynamic resource adaptation.
References
[1] A. Berl, E. Gelenbe, M. Di Girolamo, G. Giuliani, H. De Meer, M. Q. Dang, and K. Pentikousis, "Energy-efficient cloud computing," The Computer Journal, vol. 53, no. 7, pp. 1045–1051, 2010.

[2] C. L. Belady, "In the data center, power and cooling costs more than the IT equipment it supports," Electronics Cooling Magazine, February 2007.

[3] ENERGY STAR program, "Report to congress on server and data center energy efficiency," U.S. Environmental Protection Agency, August 2007, in response to Public Law 109-431. [Online]. Available: http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf

[4] DAS-4 website. [Online]. Available: http://www.cs.vu.nl/das4/

[5] J. Geelan, "Twenty-one experts define cloud computing," Cloud Computing Journal, pp. 1–5, 2009. [Online]. Available: http://cloudcomputing.sys-con.com/node/612375

[6] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, H. Lee, D. Patterson, A. Rabkin, I. Stoica, et al., "Above the clouds: A Berkeley view of cloud computing (TR UCB/EECS-2009-28)," 2009. [Online]. Available: http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf

[7] R. Buyya, C. S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, "Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility," Future Generation Computer Systems, vol. 25, no. 6, pp. 599–616, 2009.

[8] M. A. Vouk, "Cloud computing — issues, research and implementations," ITI 2008 30th International Conference on Information Technology Interfaces, vol. 16, no. 4, pp. 31–40, 2008.

[9] I. Foster, Y. Zhao, I. Raicu, and S. Lu, "Cloud computing and grid computing 360-degree compared," in Grid Computing Environments Workshop, 2008. GCE '08, Nov. 2008, pp. 1–10.

[10] P. Mell and T. Grance, "The NIST definition of cloud computing (draft)," National Institute of Standards and Technology, January 2011. [Online]. Available: http://csrc.nist.gov/publications/drafts/800-145/Draft-SP-800-145_cloud-definition.pdf

[11] K. Stanoevska-Slabeva and T. Wozniak, "Cloud basics – an introduction to cloud computing," in Grid and Cloud Computing. Springer Berlin Heidelberg, 2010, pp. 47–61.

[12] A. Lenk, M. Klems, J. Nimis, S. Tai, and T. Sandholm, "What's inside the cloud? An architectural map of the cloud landscape," in Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing, ser. CLOUD '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 23–31.

[13] E. D. Leon, "The five layers within cloud computing," Cloud Computing Journal, 2009. [Online]. Available: http://cloudcomputing.sys-con.com/node/1200642

[14] OpenNebula: the open source toolkit for cloud computing. [Online]. Available: http://opennebula.org/

[15] Eucalyptus homepage. [Online]. Available: http://www.eucalyptus.com/

[16] Nimbus project. [Online]. Available: http://www.nimbusproject.org/

[17] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, "The Eucalyptus open-source cloud-computing system," in Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, ser. CCGRID '09. Washington, DC, USA: IEEE Computer Society, 2009, pp. 124–131.

[18] K. Keahey and T. Freeman, "Science clouds: Early experiences in cloud computing for scientific applications," Cloud Computing and Its Applications 2008 (CCA-08), October 2008.

[19] K. Keahey, T. Freeman, J. Lauret, and D. Olson, "Virtual workspaces for scientific applications," Journal of Physics: Conference Series, vol. 78, no. 1, p. 012038, 2007.

[20] P. Sempolinski and D. Thain, "A comparison and critique of Eucalyptus, OpenNebula and Nimbus," Cloud Computing Technology and Science, IEEE International Conference on, vol. 0, pp. 417–426, 2010.

[21] R. P. Goldberg, "Architectural principles for virtual computer systems," Harvard University, Tech. Rep., February 1973. [Online]. Available: http://www.dtic.mil/cgi-bin/GetTRDoc?AD=AD772809&Location=U2&doc=GetTRDoc.pdf

[22] KVM homepage. [Online]. Available: http://www.linux-kvm.org/

[23] Xen project homepage. [Online]. Available: http://www.xen.org/

[24] T. Deshane, Z. Shepherd, J. Matthews, M. Ben-Yehuda, A. Shah, and B. Rao, "Quantitative comparison of Xen and KVM," in Xen Summit. Berkeley, CA, USA: USENIX Association, June 2008.

[25] A. Heissler, "Performance analysis of Xen virtual machines in real-world scenarios," University of Applied Sciences Technikum Wien, Tech. Rep. arXiv:1009.5878, Sep. 2010.

[26] X. Xu, F. Zhou, J. Wan, and Y. Jiang, "Quantifying performance properties of virtual machine," in Information Science and Engineering, 2008. ISISE '08. International Symposium on, vol. 1, Dec. 2008, pp. 24–28.

[27] A. J. Younge, R. Henschel, J. T. Brown, G. von Laszewski, J. Qiu, and G. C. Fox, "Analysis of virtualization technologies for high performance computing environments," in Proceedings of The Fourth IEEE International Conference on Cloud Computing, ser. CLOUD '11, April 2011.

[28] M. Fenn, M. A. Murphy, and S. Goasguen, "A study of a KVM-based cluster for grid computing," in Proceedings of the 47th Annual Southeast Regional Conference, ser. ACM-SE 47. New York, NY, USA: ACM, 2009, pp. 34:1–34:6.

[29] The Green500 List website. [Online]. Available: http://www.green500.org/

[30] S. Sharma, C.-H. Hsu, and W.-C. Feng, "Making a case for a Green500 list," in IEEE International Parallel and Distributed Processing Symposium (IPDPS 2006) / Workshop on High Performance – Power Aware Computing, 2006.

[31] R. Ge, X. Feng, H. Pyla, K. Cameron, and W. Feng, "Power measurement tutorial for the Green500 list," June 2007. [Online]. Available: http://www.green500.org/docs/pubs/tutorial.pdf

[32] The TOP500 List website. [Online]. Available: http://www.top500.org/

[33] C.-H. Hsu, W.-C. Feng, and J. S. Archuleta, "Towards efficient supercomputing: A quest for the right metric," Parallel and Distributed Processing Symposium, International, vol. 12, p. 230a, 2005.

[34] Ganglia homepage. [Online]. Available: http://ganglia.sourceforge.net/

[35] University of California, Berkeley Millennium Project website. [Online]. Available: https://www.millennium.berkeley.edu/

[36] Homepage of the Stress tool. [Online]. Available: http://weather.ou.edu/~apw/projects/stress/

[37] Intel optimized LINPACK benchmark. [Online]. Available: http://software.intel.com/en-us/articles/intel-math-kernel-library-linpack-download/

[38] Homepage of the LINPACK benchmark. [Online]. Available: http://www.netlib.org/linpack/

[39] A. R. Weiss, "ECL Dhrystone white paper," The EEMBC Certification Laboratories, LLC (ECL), November 2002. [Online]. Available: http://www.johnloomis.org/NiosII/dhrystone/ECLDhrystoneWhitePaper.pdf

[40] A. Kivity, "KVM: The Linux virtual machine monitor," in Proceedings of the Linux Symposium, Ottawa, Ontario, 2007.

[41] Intel Turbo Boost Technology 2.0. [Online]. Available: http://www.intel.com/technology/turboboost/

[42] J. Che, Q. He, Q. Gao, and D. Huang, "Performance measuring and comparing of virtual machine monitors," in Embedded and Ubiquitous Computing, 2008. EUC '08. IEEE/IFIP International Conference on, vol. 2, Dec. 2008, pp. 381–386.

[43] P. Kogge, K. Bergman, S. Borkar, et al., "ExaScale computing study: Technology challenges in achieving exascale systems," September 2008.

[44] Fedora Documentation Project, Fedora 13 Virtualization Guide. Fultus Corporation, 2010, pp. 180–182.

[45] UnixBench benchmark suite. [Online]. Available: http://code.google.com/p/byte-unixbench/

[46] G. Drysdale, A. C. Valles, and M. Gillespie, "Performance insights to Intel Hyper-Threading technology." [Online]. Available: http://software.intel.com/en-us/articles/performance-insights-to-intel-hyper-threading-technology/

[47] O. Celebioglu, A. Saify, T. Leng, J. Hsieh, V. Mashayekhi, and R. Rooholamini, "The performance impact of computational efficiency on HPC clusters with Hyper-Threading technology," Parallel and Distributed Processing Symposium, International, vol. 15, p. 250b, 2004.

[48] K. R. Jackson, L. Ramakrishnan, K. Muriki, S. Canon, S. Cholia, J. Shalf, H. J. Wasserman, and N. J. Wright, "Performance analysis of high performance computing applications on the Amazon Web Services cloud," Cloud Computing Technology and Science, IEEE International Conference on, vol. 0, pp. 159–168, 2010.

[49] S. Ostermann, A. Iosup, N. Yigitbasi, R. Prodan, T. Fahringer, and D. Epema, "A performance analysis of EC2 cloud computing services for scientific computing," in Cloud Computing, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Springer Berlin Heidelberg, 2010, vol. 34, pp. 115–131.

[50] A. Iosup, S. Ostermann, M. N. Yigitbasi, R. Prodan, T. Fahringer, and D. H. Epema, "Performance analysis of cloud computing services for many-tasks scientific computing," IEEE Transactions on Parallel and Distributed Systems, vol. 22, pp. 931–945, 2011.

[51] J. Ekanayake and G. Fox, "High performance parallel computing with clouds and cloud technologies," in Cloud Computing, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering. Springer Berlin Heidelberg, 2010, vol. 34, pp. 20–38.

[52] C. Baun and M. Kunze, "Performance measurement of a private cloud in the OpenCirrus testbed," in Euro-Par 2009 Parallel Processing Workshops, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010, vol. 6043, pp. 434–443.

[53] G. V. Mc Evoy, B. Schulze, and E. L. M. Garcia, "Performance and deployment evaluation of a parallel application on a private cloud," Concurrency and Computation: Practice and Experience, 2011.

[54] J. Tao, K. Furlinger, and H. Marten, "Performance evaluation of OpenMP applications on virtualized multicore machines," in OpenMP in the Petascale Era, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2011, vol. 6665, pp. 138–150.

[55] D. Hackenberg, R. Schone, D. Molka, M. Muller, and A. Knupfer, "Quantifying power consumption variations of HPC systems using SPEC MPI benchmarks," Computer Science – Research and Development, vol. 25, pp. 155–163, 2010. doi: 10.1007/s00450-010-0118-0.

[56] G. Magklis, G. Semeraro, D. Albonesi, S. Dropsho, S. Dwarkadas, and M. Scott, "Dynamic frequency and voltage scaling for a multiple-clock-domain microprocessor," Micro, IEEE, vol. 23, no. 6, pp. 62–68, Nov.–Dec. 2003.

[57] G. von Laszewski, L. Wang, A. Younge, and X. He, "Power-aware scheduling of virtual machines in DVFS-enabled clusters," in Cluster Computing and Workshops, 2009. CLUSTER '09. IEEE International Conference on, Aug.–Sept. 2009, pp. 1–10.

[58] J. Stoess, C. Lang, and F. Bellosa, "Energy management for hypervisor-based virtual machines," in USENIX Annual Technical Conference, 2007, p. 114.

[59] A. Kansal, F. Zhao, J. Liu, N. Kothari, and A. A. Bhattacharya, "Virtual machine power metering and provisioning," in SoCC '10, 2010, pp. 39–50.

[60] A. Bohra and V. Chaudhary, "VMeter: Power modelling for virtualized clouds," in Parallel Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on, April 2010, pp. 1–8.

[61] L. Liu, H. Wang, X. Liu, X. Jin, W. B. He, Q. B. Wang, and Y. Chen, "GreenCloud: a new architecture for green data center," in Proceedings of the 6th International Conference on Autonomic Computing and Communications, Industry Session, ser. ICAC-INDST '09. New York, NY, USA: ACM, 2009, pp. 29–38.

[62] G. Dhiman, G. Marchetti, and T. Rosing, "vGreen: a system for energy efficient computing in virtualized environments," in Proceedings of the 14th ACM/IEEE International Symposium on Low Power Electronics and Design, ser. ISLPED '09. New York, NY, USA: ACM, 2009, pp. 243–248.

[63] S. Srikantaiah, A. Kansal, and F. Zhao, "Energy aware consolidation for cloud computing," in Proceedings of the 2008 Conference on Power Aware Computing and Systems, ser. HotPower '08. Berkeley, CA, USA: USENIX Association, 2008, pp. 10–10.

[64] A. J. Younge, "Towards a green framework for cloud data centers," Master's thesis, Rochester Institute of Technology, May 2010. [Online]. Available: http://cyberaide.googlecode.com/svn-history/r5110/trunk/papers/thesis-younge/ajy-thesis.pdf

[65] H. N. Van, F. D. Tran, and J.-M. Menaud, "Performance and power management for cloud infrastructures," Cloud Computing, IEEE International Conference on, vol. 0, pp. 329–336, 2010.

[66] S. K. Garg, C. S. Yeo, A. Anandasivam, and R. Buyya, "Environment-conscious scheduling of HPC applications on distributed cloud-oriented data centers," Journal of Parallel and Distributed Computing, vol. 71, no. 6, pp. 732–749, 2011, special issue on Cloud Computing.
List of Figures
2.1 Typical stack of the cloud model
2.2 Typical procedure of VM provisions in IaaS cloud
2.3 Architecture comparison: native hypervisor vs. hosted hypervisor
3.1 Power measurement setup
3.2 The Ganglia monitoring system
4.1 Power consumption versus CPU usage
4.2 CPU frequency scaling benchmark
4.3 Memory stress tests
4.4 CPU and Memory Stress tests on the host
4.5 HDD stress tests on the host
4.6 Linpack tests on different VMs
4.7 Dhrystone benchmark on the host and the Guest VM with different number of threads
4.8 Impact of Hyper-Threading for Linpack tests on non-overcommitted VM (with 8 vCPUs) and the host
4.9 Impact of Hyper-Threading for Linpack tests on overcommitted VM (with 16 vCPUs)
4.10 Impact of HT technology for Dhrystone benchmarks
5.1 Working scenario of energy-aware scheduler
List of Tables
2.1 Summary of cloud layers and their major players/technologies
2.2 The cloud managers compared
2.3 A comparison between Xen and KVM
3.1 Definition of benchmark metrics
3.2 Heterogeneous design of DAS-4 clusters
3.3 Specifications of Intel E5620
4.1 Summary of component benchmarks
4.2 Summary of overall benchmarks
4.3 Execution time of Linpack benchmark on different VMs
4.4 Performance of the Fhourstones benchmark on the host and guest VM
5.1 Sample values of Pidle, ce and cp for our test machine
5.2 Mathematical description of a Work Node (WN) and a VM