Upload
buidieu
View
214
Download
2
Embed Size (px)
Citation preview
N . 1 0 - 1 5 t h J u l y 2 0 1 5
- P a g e 1 -
Internet of Things (IoT) has become a popular expression.
It indicates the network of physical objects or “things”, each
characterized by a number of capabilities that include communication, sensing, actuation and
computation, designed to interact with the
environment and with other things. How different is
this from a natural ecosystem?
A natural ecosystem is an
ensemble that includes all of
the living things (plants,
animals and organisms) in a
given area, interacting with
each other, and with the
environment. In this perspective,
the IoT presents itself as a hybrid
ecosystem, i.e. the extension of
this ensemble to the presence of
mobile, autonomous, communicating devices. There is little
doubt that in the near future we will assist at the transformation
of our natural ecosystems due to the increasing presence of
artificial objects that will coexist and interact with the living
organisms in our present ecosystems. How is the device
powering issue relevant to this perspective? In a parallel with natural
ecosystems, we can envision a scenario where cooperation-competition
mechanisms will arise in order to secure the finite amount of energy
available. How to deal with this? Let us think about it… (L.G.)
ICT-ENERGYL E T T E R S
EU Projects
ENTRA – Pag. 2
LANDAUER – Pag. 3
PROTEUS – Pag. 4
ICT Energy events
Doctoral Symposium – Pag. 3
Micro Energy Day 2015 – Pag. 5
Summer School 2015 – Pag. 6
Conferences
ICT 2015 Innovate, Connect,
Transform – Pag. 7
Scientific Papers
Pag. 9
This newsletter is edited with the collaboration of the ICT-ENERGY editorial board composed by: - L. Gammaitoni, NiPS Laboratory,
University of Perugia (IT)
- L. Worschech, University of Wuerzburg
(Ger)
- J. Ahopelto, VTT (FI)
- C. Sotomayor Torres, ICN and ICREA
Barcellona (SP)
- F. Marchesoni, University of Camerino
(IT)
- G. Abadal Berini, Universitat Autònoma
de Barcelona (SP)
- D. Paul, University of Glasgow (UK)
- G. Fagas, Tyndall National Institute
University College Cork (IR)
- S. Lombardi, NiPS Laboratory,
University of Perugia (IT)
- J.P. Gallagher, Roskilde Univ. (DK)
- V. Heuveline, Ruprecht-Karls-
universitaet Heidelberg (Ger)
- A.C. Kestelman, Barcelona
Supercomputing Center (SP)
- D. Atienza, EPFL (CH)
- D. Williams, Hitachi Europe limited
(UK)
- K. Eder, University of Bristol (UK)
- K. G. Larsen, Aalborg Universitet (DK)
- F. Cottone, NiPS Laboratory,
University of Perugia (IT)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 -
ENTRA project workshop on Energy-Aware Software Development
Malaga, 6-7, May 2015
The EU ENTRA project (http://entraproject.eu/) aims to promote energy-aware software development
using program analysis and modelling of energy consumption in computers. ENTRA is one of the FET MINECC
cluster of projects investigating how to achieve energy efficient ICT, and the ENTRA coordinator Roskilde
University, Denmark, is a partner in the ICT-Energy Coordination Action (http://www.ict-energy.eu/), as is
ENTRA partner the University of Bristol, UK.
A workshop in the historic city of Malaga on 6-7 May, sponsored by the ENTRA project, combined
presentations of ENTRA project results with invited talks from outside speakers. There are many ICT
applications where reduction of energy consumption is a critical design goal. The invited speakers covered
some of these, including wearable bio-medical devices (Hossein Mamaghanian from EPFL, Switzerland), a
topic targeted by the EU PHIDIAS project. Other applications presented by invited speakers were data
centres (José Manuel Moya from Universidad Politécnica de Madrid) and cloud computing in e-health (José
Ayala from Universidad Complutense de Madrid). There were also presentations from Sudipta Chattopadhyay
(Linköping University, Sweden) describing methods for statically identifying energy hotspots in programs
such as smartphone apps that drain the phone's battery, and from Maurizio Morisio (Politecnico di Torino,
Italy) who presented an agenda for energy-efficient software in green computing.
The overall challenge of energy-aware software development requires tools to bridge a large gap: the gap is
between hardware, where energy is consumed, and high-level programming languages and software
abstractions used in modern software engineering, which in many ways hide resource usage from the
programmer. The ENTRA project researchers presented recent progress in building such tools. IMDEA
Software researchers described recent advances in the CiaoPP resource analysis tools developed at IMDEA
Software. Handling energy analysis of multi-threaded programs was dealt with by researchers from Roskilde
University, Denmark and Bristol University, UK. Energy models derived from hardware, at different levels of
detail and abstraction, were presented by Bristol University, and an approach for modelling energy directly
from high-level source code was presented by Roskilde University. IMDEA Software researchers described
techniques for energy-optimal scheduling of tasks on multi-core computers using genetic algorithms, and
Roskilde University researchers presented a framework for probabilistic analysis of resource usage.
The ENTRA project cases studies are carried out using hardware produced by the industrial partner in the
ENTRA project, XMOS Ltd, UK. This hardware is used for embedded systems especially in automotive and
digital media applications. XMOS staff explained the energy-saving features of its latest chip and the
associated compiling techniques.
In summary, the combination of ENTRA research results along with invited talks stimulated discussion both
at the technical level and around the wider implications and applications of energy-efficient ICT and
energy-aware software engineering. (J.P. Gallagher)
The full workshop programme can be seen at
http://entraproject.eu/programme-entra-workshop-malaga-6-7-may-2015
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 3 -
Doctoral Symposium in Bristol on Sept. 16th 2015
The Barcelona Supercomputing Center is organizing a
Doctoral Symposium to be held in Bristol, on Sept. 16th,
2015.
The Doctoral Symposium is a part of the ICT-Energy
Clustering/Networking workshop 2015 (www.ict-
energy.eu). The event, organized in by the University of
Bristol, will be held on 14-16 September.
This symposium is a great opportunity for PhD students to present their thesis work to a broad audience in
the ICT community from both industry and academia. The forum may also help students to establish
contacts for entering into research related to minimization of energy from various layers. In addition,
representatives from industry and academia get a glance of state-of-the-art in the energy minimization
field.
Topics of interest include, but are not limited to:
- Power- and thermal-aware algorithms, software and hardware
- Sensing and monitoring - Power and thermal behavior and control
- Data centers optimization
- Smart grid and microgrids
- Low-power electronics and systems
- Power-efficient multi/many-core chip design
If you want to know more, please download the Symposium brochure here.
LANDAUER Project consortium meeting in Paris
A consortium meeting of the European project LANDAUER has been held in Paris on June 11-12.
During the meeting the five laboratories that
compose the consortium presented recent
progresses and discussed the steps toward the
end of the project in 2015.
For latest news and information, please visit
the LANDAUER project website
(www.landauer-project.eu).
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 4 -
The PROTEUS project launched
PROTEUS is a project funded under
the H2020 framework program for
research of the European
Commission.
It investigates the smart integration
of chemical sensors based on
carbon nanotube, MEMS based
physical sensors together with a
cognitive engine providing on the fly reconfigurability. Produced deviced will be tested in the context of
water monitoring.
PROTEUS mix competences from integrated smart systems area, Internet of Things, cloud based computing,
long range wireless sensors in the field of water utilities.
NIPS Laboratory group is part of the 9 partners of the PROTEUS consortium, which met in IFSTTAR on 10th
and 11th of February 2015 in Marne la Vallée, close to Paris, to kick-off the project.
This meeting was the opportunity to present the sense city testbed to the consortium. This testbed will
allow lab scale validation of PROTEUS solutions before the deployment in te field.
NiPS laboratory research group at Physics Department at University of Perugia, will carry out the
important research work of design, study, prototyping ant validation of novel hybrid energy harvesting
systems that will add full self-powering capability to wireless sensor network for water quality monitoring.
PROTEUS will run during 36 months, until January 2018.
http://www.proteus-sensor.eu/
(F. Cottone)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 5 -
Micro Energy Day 2015
Discover smart solutions for mobile electronic energy needs.
Institutions from across Europe celebrated the Micro-Energy Day, during the European Sustainable Energy
Week, to raise awareness about energy use in mobile electronic devices. The Micro-Energy Day, proposed
since 2010 by NiPS Laboratory (www.nipslab.org) at University of Perugia to disseminate a new conceptual
approach to renewable energies at microscales, has grown, involving a larger number of partners all over
Europe and reaching students, families as well as politicians and media.
The 2015 edition of the Micro Energy Day, organized by the partners in the EU funded project ICT-Energy
(www.ict-energy.eu) coordinated by NiPS Lab, included several demonstrations in different European
countries.
In Perugia (Italy) a dissemination activity, held on 20th of
June, made use of several demonstrative devices: from
crystal radios, which can operate without power sources
apart from the collected signals, to small radios operated
by solar cells or thermoelectric generators, to wireless
energy transmitters. Together with the NiPS group, the
local amateur radio community and representatives of the
Perugia Electronic Engineering Department discussed with
the local audience and with the amateur community all
around the world the relevance of the scientific research
on microenergies.
Barcelona Supercomputing Center (Spain) organized a Live Twitter Chat in the ICT-Energy Twitter channel
on the 18th of June, using the hashtag #LessEnergyICT, while discussing with local visitors to the
MareNostrum supercomputer energy issues and the solutions proposed to solve them. The chat was focused
on the energy expenditures in Information and Comunication Technology (from small computing devices to
the exascale supercomputers).
In Cork (Ireland) school children from primary school took part
in a number of fun and interactive demonstrations designed to
encourage us all to become more energy aware, organized by
researchers of the Tyndall Institute. Very rare to see kids asking
so many questions at once!
Moreover, on 18-19 in Heidelberg (Germany), there were
specific lessons introducing the microenergy approach and the
theme „renewable microenergies physics“ to secondary level
pupils at the Ernst-Sigle-Gymnasium in Kornwestheim.
According to the International Energy Agency, consumer
electronics and computer equipment represent 15% of global
residential electricity consumption – and this figure is set to
triple if energy efficiency is not improved. Micro-Energy Day
seeks to highlight this issue and showcase projects under the ICT
Energy umbrella. You’re invited to be part of it!
Further information is available in the Micro-Energy Day website
(microenergyday.eu/) and ICT Energy website (www.ict-
energy.eu).
(S. Lombardi)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 6 -
NiPS Summer School 2015
ICT-Energy
Energy consumption in future ICT devices
Fiuggi (Italy), 7-12 July 2015
The sixth edition of NiPS Summer School, organized by the Noise in Physical System Laboratory at University
of Perugia as part of the European FET Proactive Coordination Action ICT-Energy (www.ict-energy.eu), was
held in Fiuggi (www.nipslab.org/summerschool).
The 40 participants (graduate students, post-docs and
young researchers) attended 4 thematic groups of
lectures, aimed at teaching the bases of the science of
efficient ICT:
1) Basic on the physics of energy transformations at
micro and nanoscales
2) Introduction to energy harvesting and distributed
autonomous mobile devices
3) Software and energy aware computing
4) High performance computing and systems
Participants to the school had the opportunity to widely discuss with teachers, coming from the ICT-Energy
project Consortium, and colleagues, and to illustrate their research results during a evening Poster Session.
For more information and the slides of the lectures, please visit www.nipslab.org/summerschool2015.
Lectures and social events at NiPS2015 Summer School on ICT-Energy.
Organized by: Supported by:
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 7 -
ICT 2015 Innovate, Connect, Transform
The EC event dedicated to ICT – Lisbon 22-23 October, 2015
European Commission from time to time promotes a
large event dedicated to the world of ICT. The next
one (entitled with little fantasy ICT 2015) will happen
next October 22-23 in Lisbon, Portugal. This is a good
opportunity to do networking activities with other
participants, to interesting debates in the associated
conference, to hear the latest news on the European
Commission's new policies and initiatives with regard
to R&I in ICT and find information about funding
opportunities from the European Union.
The ICT-Energy community has submitted a proposal
for organising a “Networking Session”. Networking
sessions are informal discussion groups organised by
stakeholders participating in the ICT 2015 event. Their
goal is to enable a broad and multi-disciplinary
dialogue, help participants find partners for a
collaborative relationship enlarge and enhance
stakeholder communities and better connect research
to innovation for a Digital Europe.
ICT-Energy Networking Session proposal is entitled: ICT-Energy: decreasing energy consumption toward
zero-power devices.
The objective of the proposed session is twofold:
1) to bring together people from the world of research with people form the world of industry,
interested in lowering energy consumption in ICT devices. They often speak two different languages
and have different perspectives. In this session we will hear from both community in short, focused
communications.
2) to raise awareness in the ICT people (at large) in the importance of decreasing energy dissipation in
the future of ICT devices. This strategic bottleneck is often overlooked when people think at the
future of ICT. We need to put under focus the connection between the need for powering mobile
devices (in the internet-of-things scenario) with the need for decreasing heat production in future
microprocessors. Both these aspects rely on the science of energy transformation at micro and nano
scale. By addressing this very issue we will open novel perspectives for exascale computers and
micro wireless autonomous sensors at the same time.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 8 -
If approved, the ICT-Energy Networking Session will have
the following format:
- Opening with a 2 min video on the role on energy
on the functioning of ICT devices.
- Then a speaker will briefly (5 min) introduce the
two sides of this topic: excess heat production in
microprocessors and low battery life for mobile
micro devices.
- Podium confrontation: one representative from industry and one from research will illustrate in a 5
min speech the 3 main “wishful thoughts” (industry rep) and the 3 main “things we know” (research
rep).
- Follows a question-and-answer session, much like in a political debate, with public, moderated by a
professional journalist/science communicator.
- During the session there will be a corner installation with video boot “speak-up your view”. Here
everybody has a 3 min speech opportunity for recording a message on this topic. The messages will
be subsequently made available through the web site of the (FET Proactive) ICT-Energy coordination
action (www.ict-energy.eu).
Who should attend the ICT-Energy Networking Session?
The main target is represented by ICT people coming both from industry and from research, generically
interested (or curious) about the problem of energy consumption in ICT devices and thus in their powering
problem.
Specifically we plan to involve people from the following areas:
- low-power microprocessor design
- microprocessor heat dissipation and cooling
- energy harvesting system design
- wireless sensor networks
- mobile communications
- energy aware software
- energy related material science
- green ICT world
ICT 2015 website: https://ec.europa.eu/digital-agenda/en/ict2015-innovate-connect-transform-lisbon-
20-22-october-2015
For further information and updates on the ICT-Energy Networking Session stay tuned to the ICT-Energy
project website (www.ict-energy.eu).
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 9 -
Towards super computers: EU project Exa2Green improves
energy efficiency in high performance computing
Engineering Mathematics and Computing Lab (EMCL),
Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Germany
Steinbeis-Europa-Zentrum, Steinbeis Innovation gGmbh, Germany
Scientific Computing Group, Department of Informatics, Hamburg University, Germany
High Performance and Computing Architectures (HPCA), University Jaume I, Castellon, Spain
Swiss National Supercomputing Centre (CSCS), ETH Zurich, Switzerland
IBM Research – Zurich, Foundations of Cognitive Solutions, Switzerland
Institute for Meteorology and Climate Research, Karlsruhe Institute of Technology, Germany
Abstract—The FP7-funded Exa2Green project is paving the
road to Exascale computing by improving energy efficiency in high
performance computing (HPC). Slowly approaching the end of the
project, the team can show quite remarkable results.
I. INTRODUCTION
EXASCALE computers, so-called supercomputers of our
future, are machines that will be capable of performing at least
1018
floating point operations per second. The work rate at
Exascale is inconceivably immense, offering the prospect of
transformative progress in energy, national security, the
environment, our economy, and fundamental scientific questions.
However, the path to Exascale involves numerous complex
challenges, with a key one being power consumption. The FP7-
funded Exa2Green project (www.exa2green-project.eu) has
taken up this challenge and is developing a radically new energy-
aware computing paradigm and programming methodology for
Exascale computing. The 3-year research project started in
November 2012.
Comprising an interdisciplinary research team of High
Performance Computing (HPC) experts from Germany,
Switzerland and Spain, the Exa2Green team focuses on three
main activities: first, developing tools for measuring the
performance and energy consumption of computations; second,
analysing existing, widely used computational kernels and
developing new energy-efficient algorithms; and finally,
optimising a compute-intensive climate model to achieve a
considerable reduction of energy consumption in climate
simulations. For this third element, Exa2Green is using the
COSMO-ART weather forecast model as an example of an
intensive simulation, whose energy profile is currently far from
optimal. In the following, we highlight some aspects of our
research in the different fields of work.
II. DESIGN OF TOOLS FOR POWER- AND ENERGY
ANALYSIS ON HPC SYSTEMS
University of Hamburg focused on designing tools for power and
energy analysis on HPC systems. A first goal was achieved with
the design of an integrated power-performance analysis
framework to trace and analyze the power and energy
consumption made by parallel scientific applications. This
framework comprises PMLib, a well-established package for
investigating power usage in HPC applications [4]. Its
implementation provides a general interface to utilize a wide
range of wattmeters, including i) external devices, such as
commercial
PDUs, WattsUp? Pro .Net, ZES Zimmer LMG450, etc., ii)
internal wattmeters, directly attached to the power lines leaving
from PSU, iii) commercial DAS from National Instruments (NI)
and, iv) integrated power measurement interfaces such as Intel
RAPL, NVIDIA NVML, IPMI, etc. The PMLib client side
provides a C library with a set of routines in order to measure the
application code. The traces obtained can be integrated into
existing profiling and tracing frameworks such as
Extrae+Paraver or VampirTrace+Vampir. On the top of this
framework, we have incorporated two useful plug-ins: a module
to detect power-related states and a tool to automatically detect
power bottlenecks [5, 6].
Fig. 1: Graphical representation of the power-performance analysis framework.
Additionally, we have desgined ArduPower a small, low-cost
and accurate DC wattmeter, specifically designed for HPC
infrastructures. ArduPower comprises of two basic hardware
components: a shield of 16 current sensors and an Arduino Mega
2560 processing board. This new wattmeter has been conceived
as a low-cost DAS to measure the instantaneous DC power
consumption of the internal components of computing systems,
wherever the power lines are accessible. In general, it offers a
spatially fine-grained power measurement by providing 16
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 0 -
channels to monitor power consumption of several parts, e.g.,
mainboard, HDD and GPU cards etc. simultaneously, with a
sampling rate varying from 480 to 5,880 Sa/s. The total
production cost of the wattmeter is approximately 100 €, so it
can be easily accommodated in moderate-/large-scale HPC
platforms in order to investigate power consumption of scientific
applications.
Fig 2.: ArduPower wattmeter comprising an Arduino Mega 2560 board and a power sensing shield with 16 current sensros.
III. SETUP FOR INNOVATIVE ENREGY EFFICIENT
ALGORITHMIC KERNELS
IBM Research – Zurich investigated power consumption of
several elementary computational kernels, the so-called
computational dwarfs. The list of analysed kernels ranges from
dense to sparse linear algebra like sparse matrix-vector
multiplication (SpMV) and conjugate gradient (CG) method, as
well as to spectral methods such as the fast Fourier transform [4,
5, 6]. All the kernels have been analyzed on modern HPC
architectures, such as the IBM Blue Gene/Q, the IBM POWER7
and the new IBM POWER8 system.
As an example, in Figure 1 we report measurements and
subsequent classification analysis for the SpMV, one of the most
widely used kernels to solve PDEs and modern engineering
problems. On top of these analysis, IBM Research - Zurich
developed accurate models for the characterization and time-
power-energy prediction of the same kernels, tested on multicore
architectures.
Fig. 1: SpMV performance analysis and classification, for different sparse matrices on IBM POWER8. Each color corresponds to a different class of
matrices. For more details see [4].
As an example, in Figure 2 we show a model of the power
consumption of the most critical sub-kernels of the CG
algorithm. These models represent not only a fundamental utility
tool for cluster and cloud systems (e.g., to predict costs and
optimize placement of jobs on the nodes), but also consist in a
solid base for the subsequent optimization of the kernels.
Fig. 2: Net power cost prediction of the main sub-kernels used during each
iteration of the CG algorithm on IBM POWER7. For more details see [5] Left: 1 thread per core. Right: 4 threads per core.
IV. DEVELOPMENT OF ENERGY-AWARE NUMERICAL
LINEAR ALGEBRA LIBRARIES
The researchers from the HPCA group (University Jaume I,
Spain) focused on developing techniques for energy conservation
in dense and sparse linear algebra problems. One of the primary
goals of this work consisted in assessing the performance, power
consumption and energy efficiency of the CG method on a large
variety of architectures that form the landscape of high
performance computing, such as general-purpose multicore
processors, graphics processing units (GPUs), and low-power
digital signal processors (DSPs) [8]. From this study, we
observed that the performance-per-power unit of the many-core
GPUs from NVIDIA can be matched by several low-power
devices, such as those Intel (Atom), ARM (Cortex-A9) and
Texas Instruments (C66); and even outperformed by more recent
architectures from ARM (Cortex-A15). In addition, while GPUs
deliver high energy efficiency with remarkable performance, the
less power-demanding embedded processors exhibit a much
lower dissipation and/or smaller memories. This reduces the
appeal of these inexpensive architectures for general-purpose
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 1 -
computing, but makes them suitable options for certain
applications. Finally, despite the fact that conventional general-
purpose processors attain neither the throughput of GPUs nor the
performance-per-watt of the low-power systems, they offer an
interesting balance between these two extremes.
Additionally, this effort targeted the efficient exploitation of
task-parallelism, via the OmpSs framework, during the execution
of the multilevel preconditioned CG sparse linear solver
underlying ILUPACK (Incomplete LU Package). For this
purpose, specialized implementations were developed for Non-
Uniform Memory Access (NUMA) platforms and many-core
hardware co-processors based on the Intel Xeon Phi and GPUs
[9]. One key advantage of the new parallel data-flow solver is
that the numerical method/software is decoupled from the
runtime. This feature was exploited to investigate the effect of
different options of the runtime, in particular, schedules that may
favor data locality as well as the integration of energy saving
mechanisms into the runtime scheduler.
Finally, as part of a third line of work, this WP ported the
SuperMatrix runtime task scheduler for dense linear algebra
algorithms to the many-threaded Intel Xeon Phi, paying special
attention to the performance and energy profile of the solution.
From the performance perspective, this work focused on the
implications of balancing task- and data-parallelism. From the
energy-aware point of view, a mechanism was introduced in the
runtime in order to reduce power consumption during the idle
periods inherent to task parallel executions, achieving significant
energy savings [10].
V. ADVANCING HARDWARE-AWARE ALGORITHMS
Recent work of the partner Heidelberg University addresses the
improvement of the energy consumption of linear systems
solvers, in particular for multigrid methods. Multigrid solvers
belong to the most efficient numerical methods for solving
symmetric positive definite linear systems. The computational
complexity is optimal with O(n) for sparse systems with n
unknowns. In multigrid methods, typically a sequence of
problems with decreasing size appears. To improve the energy
consumption, we developed a hardware-aware implementation
that manages the number of active compute nodes with respect to
the size of the linear equation system. It allows to decrease the
number of active compute nodes until the lowest grid level is
reached. Once the system is solved on the lowest grid level, the
number of active compute nodes is increased again,
simultaneously to the prolongation of the solution. The energy
improvement then stems from the fact, that for each multigrid
level, the optimal hardware configuration is chosen. The
efficiency of a multigrid method strongly depends on the
smoothing method employed. The role of the smoother is to
remove high frequency error contributions from the solution.
Classical relaxation schemes such as Jacobi or Gauss-Seidel
iteration and their damped variants are often used as smoothers.
In the context of high-performance computing (HPC), scalability
of all building blocks of the multigrid solver is crucial for good
parallel performance. At the same time, the smoothing properties
must be maintained to sustain the efficiency.The developed
routines and solver libraries are dedicated to reduce the energy
consumption of the solving process. This is achieved by
leveraging the power-saving techniques provided by the
hardware. In addition to the dynamic adjustment of the compute
node activity within a mutligrid cycle, the smoothers can be
executed on the devices in GPU-accelerated systems. The goal is
to exploit the superior ratio of compute performance per
consumed energy which GPUs can yield compared to CPUs [7].
But often, this demands for adapted algorithms tailored to the
hardware. One aspect is the introduction of asynchronism in the
algorithms. If poor performance is encountered in parallel
algorithms, the reason is often the gap between the parallelism
provided by modern hardware, and the parallelism inside the
algorithms. A higher degree of asynchronism may result in
additional work, but due to the more efficient hardware use and
the decreased communication, the runtime performance as well
as the energy consumption can benefit from this approach [8,9].
In order to fully exploit the opportunities for parallelization
and asynchronous techniques in distributed systems, we
seamlessly integrated our developments in the general purpose
parallel finite element package HiFlow3 [10]. This makes the
outcomes usable for a broad scientific and industrial community
and puts strong emphasis on the aspect of energy efficiency in
the area of high performance computing.
VI. SHOWCASE FOR ENERGY-OPTIMISED AEROSOL
CHEMISTRY PACKAGES
Swiss National Supercomputing Centre (CSCS) and
Karlsruhe Institute of Technology (KIT) focus on a regional
scale model system consisting of the COSMO weather forecast
model of the German weather service (DWD) heavily used in
Europe, online coupled with the chemical transport model ART
(Aerosols and Reactive Trace gases), developed at KIT. The
coupled chemistry-meteorology COSMO-ART model enables
the simulation of atmospheric chemistry for studying air quality
and the calculation of the interactions of gases and aerosols with
the state of the atmosphere on the regional to continental scale.
Nevertheless, since a large number of additional tracers and
processes have to be considered, the air quality prediction model
COSMO-ART is much more demanding than COSMO alone and
thus severely limited in terms of applicability and expensive in
terms of energy consumption.
A baseline benchmark of the application was initially
produced on the current, state-of-the-art target platform, a dual
socket Ivybridge cluster at CSCS, to provide a reference
baseline. The assessment of the energy footprint and
performance profile of COSMO-ART was realised on various
HPC platforms (Intel Xeon Ivy Bridge, Xeon Sandy Bridge,
Xeon Westmere) [15]. So far, two different frameworks were
deployed on the testbed HPC systems to analyse the
performance, power dissipation and energy consumption of the
baseline execution, namely the E3METER metering framework
and the high resolution power-performance tracing framework
designed by the University of Hamburg. Profiling has given
detailed insights into power and energy requirements and has
been particularly useful to detect power bottlenecks, basically
constituted by inefficient waiting methods. On the other hand,
the usage of energy-friendly MPI techniques has demonstrated
that it is possible to reduce the energy-to-solution ETS while
maintaining (or even decreasing) the time-to-solution (TTS).
One computationally highly intensive part within the air
quality model COSMO-ART is the numerical simulation of the
atmospheric gas-phase chemistry and more especially the
solution of the chemical kinetics. Therefore, the TTS and ETS
improvements of the port of the Kinetic PreProcessor (KPP) to
multi-threaded CPUs and accelerators (KPPA) were investigated
with a standalone test harness for atmospheric chemistry
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 2 -
simulations. Besides, a comparative study on Rosenbrock
integrators was conducted in order to investigate the potential of
algorithmic changes demonstrated in the PRACE 2IP WP8
within the solution of the chemical reaction kinetics, and identify
an optimal integrator for chemical kinetics.
Other approaches to optimise the energy footprint and
performance of the atmospheric modeling system are currently
considered, such as:
the use of evolving COSMO optimisations within the
PASC initiative,
the use of single-precision arithmetic,
the use of an optimised library for algebraic kernels in
the atmospheric chemical kinetics solver,
the use of transactional memory,
the use of CPU DVFS governors for memory-bounded
code sections,
the use of parallel I/O libraries and asynchronous I/O,
the use of an optimal MVAPICH2 execution mode
(busy-wait: polling or idle-wait: blocking) or
alternatively, the use of an optimal OpenMPI execution
mode (busy-wait: aggressive or idle-wait: degraded), the use of an optimal MPI/OpenMP hybrid parallelism
on several HPC platforms (e.g. Piz Dora at CSCS and
POWER8 at IBM),
the feasibility study on running on low power platforms,
such as ARM processors.
Current achievements and prospective changes indicate that
we are getting close to the ultimate target of a 5-fold reduction in
E2S, with respect to the baseline code. Such an implementation
would allow the community to investigate critical questions at
higher resolutions and over longer periods, at reduced cost to the
environment.
ACKNOWLEDGMENTS
Exa2Green is part of the 'FET (Future and Emerging
Technologies) Proactive Initiative: Minimising Energy
Consumption of Computing to the Limit'. We acknowledge the
financial support of the European Commission under grant
agreement no. 318793.
TRADEMARKS
IBM, the IBM logo, ibm.com, IBM POWER, IBM Blue
Gene, are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other
countries, or both. Other product and service names might be
trademarks of IBM or other companies.
REFERENCES
[1] P. Alonso, R. M. Badia, M. Barreda, M. F. Dolz, J. Labarta, R. Mayo and
E.S. Quintana-Ortí and R.Reyes. Tools for power and energy analysis of parallel scientific applications. 41st International Conference on Parallel
Processing–ICPP, pages 420–429, 2012.
[2] hin , , t n, M. F. Dolz, G. Fabregat, R. Mayo
and E.S. Quintana-Ortí. An Integrated Framework for Power-Performance
Analysis of Parallel
Scientific Workloads. 3rd Int. Conf. on Smart Grids, Green Communications and
IT Energy-aware Technologies, pages 114-119, 2013.
[3] , t n, M. F. Dolz, R. Mayo, E. S. Quintana-Ortí. Automatic detection of power bottlenecks in parallel scientific
applicationss. Fourth International Conference on Energy-Aware High
Performance Computing, Computer Science Research and Development, Vol.29(3-4), pages 114-119, 2013.
[4] A. C. I. Malossi, Y. Ineichen, C. Bekas, A. Curioni, and E. S. Quintana-
O tı' Performance and energy-aware characterization of the sparse matrix-vector multiplication on multithreaded architectures. In Proc. of 43rd ICPP,
Minneapolis, MN, USA, Sep. 2014.
[5] A. C. I. Malossi, Y. Ineichen, C. Bekas, A. Curioni, and E. S. Quintana-O tı' Systematic derivation of time and power models for linear algebra
kernels on multicore architectures. Sustainable Computing: Informatics and
Systems (SUSCOM), 2015 [6] A. C. I. Malossi, Y. Ineichen, C. Bekas, A. Curioni. Fast Exponential
Computation on SIMD Architectures. In Proc. of HIPEAC-WAPCO,
Amsterdam NL, 2015 [7] M. Wlotzka, V. Heuveline: Block-asynchronous and Jacobi smoothers for a
multigrid solver on GPU-accelerated HPC systems, submitted to Auto-
Tuning for Multicore and GPU at IEEE MCSoC-15, Turin, Italy, Sep. 23-25, 2015
[8] H. Anzt, S. Tomov, J. Dongarra, V. Heuveline: A block-asynchronous
relaxation method for graphics processing units, J. Parallel and Distrib. Comput., vol. 73 (2013), pp. 1613-1626
[9] M. Wlotzka, V. Heuveline: Energy-aware mixed precision iterative
refinement for linear systems on GPU-accelerated multi-node HPC clusters, to be published in Proceedings of the 26th GI/ITG PARS Workshop,
Potsdam, Germany, May 7-8, 2015
[10] [4] H. Anzt, F. Wilhelm, J. P. Weiß, C. Subramanian, M. Schmidtobreick, M. Schick, S. Ronnas, S. Ritterbusch, A. Nestler, D. Lukarski, E. Ketelaer,
V. Heuveline, A. Helfrich-Schkarbanenko, T. Hahn, T. Gengenbach, M.
Baumann, W. Augustin and M. Wlotzka: HiFlow3: A Hardware-Aware Parallel Finite Element Package, in Holger Brunst, Matthias S. Müller,
Wolfgang E. Nagel and Michael M. Resch, eds., Tools for High
Performance.
[11] "Unveiling the performance-energy trade-off in iterative linear system
solvers for multithreaded processors". J. I. Aliaga, H. Anzt, M. Castillo, J.
Fernández, G. León, J. Pérez, E. S. Quintana. Concurrency and Computation: Practice & Experience, Vol. 27(4), pp. 885-904, 2
[12] "Exposing data dependencies to leverage task-parallelism with OmpSs in
the preconditioned CG method". J. I. Aliaga, R. M. Badia, M. Barreda, M. Bollhöfer, E. S. Quintana, in 26th Int. Symposium on Computer
Architecture and High Performance Computing-SBAC-PAD 2014, Paris
(France), October 2014. [13] “Balancing Task- and Data-Level Parallelism to Improve Performance and
Energy Consumption of Matrix Computations on the Intel Xeon Phi“
Manuel F. Dolz, Francisco D. Igual, Thomas Ludwig, Luis Piñuel, Enrique S. Quintana-Ortí, Computer and Electrical Engineering (CAEE) journal,
2015. To appear.
[14] Performance Computing 2011, pp. 139-151, Springer, 2012 [15] J. Charles, W. Sawyer, M. Dolz, S. Catalán: Evaluating the performance
and energy efficiency of the COSMO-ART model system, Computer
Science - Research and Development, July 2014
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 3 -
Power and Energy Characterization of Finite Element
Matrices on IBM Power Martin Wlotzka
1, A. Cristiano I. Malossi
2, Vincent Heuveline
1, Costas Bekas
2
1Interdisciplinary Center for Scientific Computing, Heidelberg University, Germany
2IBM Research – Zurich, Foundations of Cognitive Solutions, Switzerland
Abstract— In this work we investigate the performance and
energy characteristics of different finite element discretizations
arising from the numerical solution of PDEs. We measure and
compare performance of sparse matrix-vector multiplications and
we analyze the behavior obtained for different mesh properties as
the degree of the FE basis polynomial and the level of h-refinement.
This analysis is carried out on the new IBM Power 8 platform.
I. INTRODUCTION
Numerical simulations typically involve the solution of linear
systems of equations, where the linear systems generally arise
from the space (and time) discretization of the underlying
continuous model equations. This is valid also in the case of
nonlinear models, where Newton or other method yields a
sequence of linear problems. The resulting matrices are typically
sparse, in particular when they are derived from finite difference,
finite element or finite volume discretizations of partial
differential equations (PDEs). The properties of the system
matrices greatly affect the power and performance behaviour of
the solution process. In particular, the sparse matrix-vector
multiplication (SpMV) plays a key role since it is the dominating
computational part in each iteration of the widely used Krylov
subspace linear solvers such as conjugate gradient (CG) or
general minimal residual (GMRES) method [6], and it may also
appear in multigrid smoothers.
In [5] the authors investigated the fundamental principles of the
power and performance characteristics of the SpMV. A
classification was established based on an artificial training set of
matrices covering a large variety of sparse matrix structures.
From this classification, a prediction model for power and
performance was derived and successfully validated using real
matrices from different application fields.
In this work, we extent those results focusing on the specific type
of matrices resulting from finite element discretizations of PDEs.
In particular, we aim to analyze and understand the effect of
mesh refinement and polynomial degree, among others, on the
power and energy consumption of the SpMV.
The layout of this work is the following: in Section II, we
illustrate with a prototypical example the origin of the sparsity of
finite element matrices. In Section III, we describe our
experimental setup for the measurements on the Power 8
platform, and in Section IV we present the results and finally
conclusions are summarized in Section V.
II. FINITE ELEMENT MATRICES
In modern engineering simulations, the FEM (see, e.g., [4]) is
widely used to numerically approximate the solution of PDEs. In
this work, we use the Laplace equation as a prototype to measure
the basic features of the FEM, which are driven by the sparse
matrix properties. The Laplace equation in the domain reads
(1)
where is the solution vector. This equation can model simple
engineering problems, such as the temperature distribution on
solid surfaces and volumes. Equation (1) is generally closed by a
proper set of boundary conditions. However, in our analysis we
omit these conditions, as they impact only a few rows, and they
do not affect significantly the resulting matrix structure.
The existence and uniqueness of classical or weak solutions
under appropriate boundary conditions is guaranteed by theory
[1,2,3] and not analyzed here. The FEM seeks to find an
approximate solution as a linear combination of finite element
basis functions with coefficients such that
. (2)
The finite element basis functions are piece-wise polynomials on
a discrete computational mesh modeling the original domain
. For Lagrange elements, the basis functions are defined
through the values at regular points on the mesh cells, like
vertices or midpoints of edges.
Fig. 1: 1-D example with two piece-wise linear Lagrange-type finite element basis functions, so-called “h t fun tions” Each hat functions has the value 1 at
exactly one vertex, and the value 0 outside the adjacent mesh cells. Figure 1 shows a one-dimensional (1-D) example of linear basis
functions. Note that the basis functions have only a small
support, i.e., the region where the function is not zero, and
couplings occur only over adjacent cells. The matrix pattern then
follows from the variational formulation of (1), i.e.,
. (3)
Equation (3) represents a linear system of equations for the
coefficient vector which can be stated as
(4)
and the right hand side vector incorporates any boundary
condition terms which we omitted so far. The resulting matrix
is “stiffn ss m t ix” fo histo i sons, wh n this typ
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 4 -
of discretization was used for elasticity problems. is sparse
since the only non-zero entries occur where the basis
functions and share an overlapping support. The sparsity
pattern depends on the structure of the underlying computational
mesh, the finite element type, the number of finite element
variables, the mesh refinement, and the polynomial degree of the
elements.
We create a collection of 384 sparse matrix structures
resulting from different finite element spaces. The precise
definitions are given in Table 1. We use both quadrilateral (quad)
and triangular (tri) meshes for a 2-D square domain, and
hexahedrons and tetrahedrons for a 3-D cube domain. We start
from the coarsest possible meshes 1 and perform successive
uniform mesh refinement (also known as “h- fin m nt”) For
each individual mesh, we use finite elements with a polynomial
degree of to the highest possible degree . The number
of finite element variables is in the 2-D case, and
in the 3-D case. This includes the possibility to
approximate scalar functions and velocity fields. Table 1: Definition of the test matrices used in this work
cell type h-refinement pmax d
quad / tri up to 6 times 7 1 quad / tri up to 7 times 6 1 quad / tri up to 8 times 5 1 quad / tri up to 9 times 2 1 quad / tri up to 10 times 1 1 quad / tri up to 7 times 7 2 quad / tri up to 8 times 3 2 quad / tri up to 9 times 1 2 hex / tet up to 2 times 11 1 hex / tet up to 4 times 7 1 hex / tet up to 5 times 3 1 hex / tet up to 6 times 1 1
hex up to 1 time 10 3 hex up to 3 times 6 3 hex up to 4 times 4 3
III. EXPERIMENTAL SETTING
Our tests are run on one core of the IBM Power 8 [7], using
double-precision real arithmetic. This architecture supports up to
8 hardware threads per core and is equipped with multiple cache
levels. The driver routine, including the CSR-based
implementation of SpMV, is processed using the IBM XL C++
compiler, with the options: -O3 -qstrict -qsmp=noauto:omp. For
simplicity our analysis is restricted to exploit OpenMP thread
parallelism on a single core, using 1, 2, and 4 threads. This does
not represent a limitation for the matrices considered in this
work, as they fit in the memory of the local node. In order to
instrument the code and collect power measurements, we use the
AMESTER library [8]. To collect enough power samples, each
SpMV is repeated many times: the time cost of the auxiliary
repetition loop is negligible and does not affect measurements.
1 The coarsest possible meshes consist of one quadrilateral or two triangles, in the 2-D case, and one hexahedron or five tetrahedrons, in the 3-D case.
IV. RESULTS AND CONCLUSION
Figures 2 and 3 show plots of the measurement data where the
colors indicate the level of h-refinement and the polynomial
degree of the finite element basis functions, respectively. In
general, most of the data points lie very close to each other. This
observation of similar power/energy characteristics is in good
agreement with the fact that our finite element matrices resulted
from a common construction basis. However, we observed a
correlation of the characteristics with the polynomial degree.
(a) – time vs. energy, 1 thread per core
(b) – time vs. power, 1 threads per core
(c) – time vs. energy, 4 threads per core
(d) – time vs. power, 4 threads per core
Figure 2: Performance of the finite element matrices. Colors correspond to the
level of h-refinement. Circles indicate matrices that fit into the cache, rectangles
indicate bigger matrices that are stored in the DDR. In particular the configuration with 4 threads per core benefits
from higher degrees, as it can be observed from the clear color
gradient in Figures 3(c) and 3(d). This is due to the size of the
local dense matrix blocks which become larger for higher order
polynomials, since a larger number of basis functions shares an
overlapping support.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 5 -
(a) – time vs. energy, 1 thread per core
(b) – time vs. power, 1 thread per core
(c) – time vs. energy, 4 threads per core
(d) – time vs. power, 4 threads per core
Figure 3: Performance of the finite element matrices. Colors correspond to the
polynomial degree of the finite element basis functions. Circles indicate matrices that fit into the cache, rectangles bigger matrices that are stored in the DDR. The values presented in Figures 2 and 3 are normalized to the
number of non-zero elements in each matrix. This has been done
to compare the actual cost of each floating-point multiply-add
operation (see [5] for more details). In practical problems,
however, what really matters is the total time- and energy-to-
solution. To this end, we extracted two sets of different matrices
with equal number of rows (which is often regarded the problem
size of an application). For these matrices we solved (4) using
the CG method, and we compare the number of iterations to
reach an absolute tolerance of . Table 2: Time / energy to solution and number of CG iterations for matrices with
equal number of rows.
nrows h p time / SpMV [s] energy / SpMV
[J] CG iter.
16,641 5 4 8.038e-4 1.8437e-3 338
16,641 6 2 3.425e-4 8.3990e-4 210
16,641 7 1 1.885e-4 4.19E-004 158
263,169 7 4 1.321e-2 3.3492e-2 1501
263,169 8 2 6.184e-3 1.8733e-2 818
263,169 9 1 3.326e-3 9.0704e-3 589
Table 2 summarizes the time and energy for one SpMV and the
number of CG iterations. The normalized data do not reflect
other important properties of the matrix like the condition
number, which can greatly affect the behaviour of a solution
scheme. The comparison of the matrices with equal problem size
clearly shows an advantage of lower order finite elements in
terms of time and energy per sparse matrix-vector multiplication.
This is expected since the lower order discretizations yield less
non-zero matrix entries. The number of CG iterations to reach
the desired accuracy also increases greatly when using higher
order finite elements. This is due to the worsened condition
number of the corresponding matrices. Both effects add up to a
clear advantage of lower order discretizations, which should
therefore be used whenever possible.
V. CONCLUSIONS
The experiments and results analyzed in this work suggest that
lower order discretization schemes should be preferred, when
possible, as they require less operations per row, and thus have
shorter execution time, which leads to lower energy consumtion.
This conclusion is valid for standard CSR-implementations with
no SIMD instructions: other more elaborate and architecture
optimized implementations can lead to different results and
should be investigated in future works.
ACKNOWLEDGMENTS
This work was supported by the European Comission as part
of the project Exa2Green under grant agreement no. 318793. We
acknowledge the financial support of the Future and Emerging
Technologies (FET) programme within the ICT theme of the
Seventh Framework Programme for Research (FP7/2007–2013).
TRADEMARKS
IBM, the IBM logo, ibm.com, IBM POWER, IBM Blue
Gene, are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other
countries, or both. Other product and service names might be
trademarks of IBM or other companies.
REFERENCES
[16] L.C. Evans: Partial differential equations, AMS, 1998, Reprint 2008 [17] R. Adams, J.J.F. Fournier: Sobolev Spaces, 2nd ed., Academic Press, 2003
[18] A. Quarteroni, A. Valli: Numerical Approximation of Partial Differential
Equations, Springer, 2008 [19] A. Ern, J.-L. Guermond: Theory and Practice of Finite Elements, Springer,
2004
[20] A.C.I. Malossi, Y. Ineichen, C. Bekas, A. Curioni, E.S. Quintana-Orti: Performance and Energy-Aware Characterization of the Sparse Matrix-
Vector Multiplication on Multithreaded Architectures, in Proceedings of the
43rd Annual International Conference on Parallel Processing, Minneapolis, Minnesota, USA, 2014.
[21] Y. Saad: Iterative Methods for Sparse Linear Systems, 2000
[22] A.B. Caldeira, B. Grabowski, V. Haug, M.-E. Kahle, A. Laidlaw, C.D.
Maciel, M. Sanchez, S.Y. Sung: IBM Power System S822 Technical
Overview and Introduction, IBM, Tech. Rep. REDP-5102-00, August 2014
[23] C. Lefurgy, X. Wang, M. Ware: Server-level power control, in Proc. of the 4th IEEE onf on Autonomi omputing (I A ’07), 2007
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 6 -
Dynamics of a C-battery Scale Energy Harvester
V. Nico1, E. Boco
1, R. Frizzell
2, J. Punch
1
1Connect, Stokes Institute, University of Limerick
2Efficient Energy Transfer Dept., Bell Labs, Alcatel Lucent, Dublin, Ireland
Abstract— A “C-battery” scale two Degree-of-Freedom (2-DoF)
nonlinear electromagnetic energy harvester is presented in this
paper. The device employs velocity amplification, achieved through
sequential collisions between two free moving masses, to enhance
output power. The nonlinearities are due to the multiple masses and
the use of magnetic springs between the primary mass and the
housing and between the primary and secondary masses. A
theoretical model and its experimental validation are presented in
this work.
I. INTRODUCTION
In recent years, the development of small and low power
electronics has led to the deployment of Wireless Sensor
Networks (WSNs), which are largely used in military and civil
applications. There is often an abundance of unused ambient
energy in the vicinity of WSNs that can be converted to usable
electrical power through the use of energy harvesting techniques.
Among all the different forms, kinetic energy, available from
ambient vibrations, is one of the most common forms.
Conventional vibration energy harvesters (VEHs) are based on
simple linear spring-mass resonator designs for which the
resonant frequency has to be tuned to the main frequency of the
ambient vibration. The spectra of real vibrations are broad-band,
however, with content predominantly at low frequencies [1].
In the literature, different methods have been suggested to
overcome the problem of narrow bandwidth harvesters. Some
groups have proposed multi-frequency systems [2, 3] or methods
for varying the resonance frequency of piezoelectric beams [4].
Cottone et al. [5] instead showed that the use of nonlinear
bistable systems improves the performance of an energy
harvester. Another approach is to employ multi-DoF
configurations and it has been shown that these systems
outperform single-DoF equivalents [6, 7, 8, 9, 10]. These
approaches focus on increasing the bandwidth of the harvester at
low frequencies, but new methods for improving both the
bandwidth and the power of the harvested energy are required.
Cottone et al. [11] introduced the concept of a velocity amplified
electromagnetic generator (VAEG), which featured a series of
impacting masses. In the following studies [12, 13, 14, 15], it
was shown that within a given volume constraint, the 2-DoF
system performs best and that the nonlinearities, introduced by
the impacts, increased both the power and the bandwidth.
The work presented in this paper introduces the theoretical
and experimental analysis of an improved design of a 2-DoF
VAEG, th t is pp oxim t y th siz of “ -b tt y” In th
following sections, the experimental setup and the results of an
experimental characterisation under sinusoidal excitation are
presented. The nonlinear effects of the magnetic springs and the
consequent shift of the resonance peak are outlined. In the final
section, a theoretical model that can represent the dynamics of
the system is developed.
II. SYSTEM OF INTEREST
The system examined in this paper is shown in Figure 1: its
overall size is comparable to a C-battery: 25.5mm diameter by
57.45mm long, 29.340 cm3 total volume (Figure 2).
Figure 1 - Section of the VAEG prototype
Figure 2 - Illustration of the scale of the harvester with reference to a C-battery
The primary mass (Mass 1) is made from Teflon and acts as
the housing for seven coils, while the secondary mass (Mass 2)
comprises a Halbach stack of 5 NeFeB magnets: 3 axial and 2
radial. The detailed description of the transducer mechanism is in
o o t : “Power Optimization of a C-Battery Scale
Vibrational Energy Harvester”. Opposing magnets are affixed to
the base and cap and inside Mass 1 in order to reduce the energy
losses between Mass 1 and the housing and also between the two
masses.
The magnetic springs introduce a hardening nonlinearity in
the system that enhances the bandwidth of the harvester. The
spacing of the magnets can be varied by changing the height of
the cap and this significantly affects the dynamics of the system.
When the stack of magnets moves, an electrical current is
in u into th oi o ing to F y’s w n th in u
voltage is given by Vemf = Blv, where B is the magnetic field
experienced by the coil of total length l and v is the relative
velocity between the coils and magnets. Since the coils are
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 7 -
within the primary mass and the magnets form the secondary
mass, improving the velocity of the secondary mass increases the
output voltage and the output power (Pe ~ v2). The velocity of the
secondary mass can be increased by exploiting the velocity
amplification technique: following an impact, the final velocity
of Mass 2, v2f, is equal to [12]:
where e is the restitution coefficient and the subscript i and f
refer to the initial state before the impact and the final state after
the impact, respectively.
III. EXPERIMENTAL RESULTS
The harvester was tested under sinusoidal excitation at two
different level of acceleration: arms = 0.2 g and arms = 0.4 g (g =
9.81 m/s2). Since the height of the cap can be varied, for this test
a value of h = 50.10 mm was chosen.
The output voltage from the harvester was measured across a
load resistor (RL) and, since the frequency response of the device
varies for the two different excitations, the optimal load
resistance was set before every test.
The output power generated by the device for the two levels
of excitation is shown in Figure 3 over a range of frequencies.
Each discrete data point was measured under sinusoidal
excitation at the corresponding frequency, starting from rest. The
power was calculated using P = V2
rms/RL where Vrms is the rms
voltage measured across the optimal resistive load RL.
Figure 3 - Output power as function of acceleration level and frequency
The response of the system at the lower acceleration (arms =
0.2 g) is reasonably symmetric about the peak value and narrow.
By increasing the acceleration, the system starts to behave
nonlinearly: the peak shifts (from 13.5 Hz at arms = 0.2 g to 15.25
Hz at arms = 0.4 g) and bends to the right due to the hardening
effects of the magnetic springs. This hardening effect is more
predominant at higher accelerations because masses within the
harvester can reach positions closer to the magnets in the cap and
the base, and so experience stronger magnetic repulsive forces
compared to lower acceleration tests. The higher the repulsive
force, the higher the effective elastic constant of the magnetic
springs, which leads to the observed hardening effect. Due to the
increased nonlinearities of the system, the 3 dB bandwidth of the
harvester increases from 2 Hz (arms = 0.2 g) to 4 Hz (arms= 0.4 g).
IV. THEORETICAL MODEL
In order to predict the power output of the harvester, a model
that takes into account the magnetic repulsive force of the
magnets has been developed. A finite element simulation has
been carried to derive the magnetic force between the magnets of
the internal and external magnetic springs. Figure 4 shows the
trend of the simulated force as function of distance and the trend
of the fit, obtained according to Eqn.2:
where
x is the distance between magnets; and
p1, qi are the parameters of the fit.
Figure 4 - Simulated repulsive magnetic force as a function of distance
The dynamics of the system can be represented with three
coupled differential equations: two of which describe the
mechanical motion of the masses, and the third describes the
voltage generated in the coil.
The seven coils have been considered as a single coil with
resistance Rc and inductance Lc equal to the total resistance and
inductance of the seven coils. The Halbach stack of magnets has
been considered as a single magnet with magnetic field B equal
to the average field experienced by the coils.
The three equations that describe the system are:
where:
d1 and d2 are the measured viscous damping
coefficient acting on m1 and m2,
α = Bl/R is the conversion factor,
αVL is the magnetic force due to the transduction
mechanism,
Fint, Fint1, Fmag, Fmag1 are the magnetic forces as
described in Figure 5, and
z1, z2 are the displacement shown in Figure 5.
(1)
(2)
(3)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 8 -
Figure 5 - Schematic model for the 2-DoF harvester: Mass 1 is in white, while
Mass 2 is in gray
The three differential equations have been solved numerically
using the Runge-Kutta algorithm. The trend of the output power
for different excitation frequencies and amplitude is presented in
Figure 6.
Figure 6 - Simulated output power as function of frequency
As in the experimental data presented in Figure 3, the
response of the system changes with increasing acceleration: the
peak at the lower excitation is quite symmetric, while it shifts
and bends to the right for higher amplitudes, due to the nonlinear
hardening effects of the magnetic springs. The simulations shows
also the peak of the secondary mass at f = 15.5 Hz. The power is
higher (~8x) for the theoretical prediction compared to the
measurements, and this is thought to be due to the fact that the
seven coils and the Halbach stack are considered as just one coil
and one magnet with magnetic field equal to the average field
felt by the coils.
V. CONCLUSIONS
The theoretical and experimental characterisation of a
nonlinear 2-DoF v o ity mp ifi VEH t “ -b tt y” s is
presented in this paper.
The device was tested experimentally under sinusoidal
excitation and demonstrated a maximum power output of 1.03
mW at 15.25 Hz for arms = 0.4 g. The theoretical simulation of
the magnetic force and the dynamic model of the system,
described in Section IV, present good agreement with the
frequency response of the system. The model can be used as a
design tool for choosing the correct configuration of magnetic
springs and geometrical parameters in order to maximise the
output power.
Since the harvester performs well at low frequencies, the next
step of this work will be the realisation of a small wearable
device that can be used to harvest human motion, for powering
low-consumption devices.
ACKNOWLEDGMENT
The authors acknowledge the financial support of Science
Foundation Ireland under Grant No.10/CE/I1853 and the Irish
Research Council (IRC) for funding under their Enterprise
Partnership Scheme (EPS). Bell Labs Ireland would also like to
thank the Industrial Development Agency (IDA) Ireland for their
continued support.
REFERENCES
[24] Roundy S et al, “A study of low level vibrations as a power source or
wireless sensor no s” Computer Communications, 26(11),pp. 1131– 144, July,
2003.
[25] Ferrari M et al, “Pi zo t i multifrequency energy converter for power
harvesting in autonomous mi osyst m” Sensor and Actuators A, 142(1), pp.
329-335, March, 2008 [26] Castagnetti D, “Exp im nt modal analysis of fractal-inspired multi-
frequency structures for piezoelectric energy onv t s” Smart Materials and
Structures, 21(9), p. 094009, August, 2012.
[27] Todorov G et al, “Tuning techniques for kinetic mems energy harvest s” Proceedings of the Telecommunications Energy Conference INTELC),
Amsterdam, Netherland, 9-13 October 2011, pp. 1–6.
[28] Cottone et al, “Non in energy h v sting” Physical Review Letter,
02(8), February, 2009.
[29] Petropoulos T et al., “ ms coupled resonators for power generation and
s nsing” Proceedings of Micromechanics Europe, Leuven, Belgium, 5-7 September 2004.
[30] Vullers RJM et al, “L g power amplification of mems harvester by secondary spring and mass ss mb y” Proceedings of the PowerMEMS 2012,
Atlanta, USA, 2-5 December 2012.
[31] Wu H et al, “D v opm nt of a broadband nonlinear two-degree-of-
freedom piezoelectric energy h v st ” Journal of Intelligent Material Systems and Structures, 25(14), pp. 1875–1889, September, 2014
[32] Galchev TV et al, “H v sting traffic-induced vibrations for structural health monitoring of b i g s” Journal of Micromechanics and Microengineering,
21(10), October, 2011.
[33] Ashraf K et al, “Imp ov energy harvesting from low frequency vibrations
by resonance amplification at multiple f qu n i s” Sensors and Actuators A:
Physical, 195, pp. 123–132, June, 2013
[34] Cottone F et al, “En gy harvesting apparatus having improved ffi i n y”
USA Patent 0074162, 2009.
[35] Cottone F et al, “Enh n vibrational energy harvester based on velocity
mp ifi tion” Journal of Intelligent Material Systems and structures, 25(4), pp.
443–451, August, 2013.
[36] O’Donoghu D et al, “A multiple-degree-of-freedom velocity-amplified vibrational energy harvester: Part A: experimental n ysis”. Proceedings of the
ASME 2014 Conference on Smart Materials, Adaptive Structures and Intelligent
Systems, 2, Newport, Rhode Island, USA, 8-10 September 2014.
[37] Nico V et al, “A multiple-degree-of-freedom velocity-amplified vibrational
energy harvester: Part B mo ing” Proceedings of the ASME 2014 Conference on Smart Materials, Adaptive Structures and Intelligent Systems, 2, Newport,
Rhode Island, USA, 8-10 September 2014.
[38] Boco E et al., “Optimiz tion of coil parameters for a nonlinear two degree-
of-freedom (2dof) velocity-amplified electromagnetic vibrational energy h v st ” Proceedings of the SMARTGREENS 2015 4th International
Conference on Smart Cities and Green ICT Systems, Lisbon, Portugal, 20-22
May 2015.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 1 9 -
Power Optimization of a C-battery Scale Electromagnetic
Vibrational Energy Harvester
E.Boco1, V.Nico
1, R.Frizzell
2, J.Punch
1
1Connect, Stokes Institute, University of Limerick, Limerick, Ireland
2Efficient Energy Transfer Dept, Bell-Labs Alcatel-Lucent
Abstract— In this paper, a two Degree-of-Freedom (2DoF)
velocity amplified electromagnetic energy harvester is investigated.
Dimensions are very close to those of a “C-battery” (26.2mm
diameter and 50mm length, giving a volume of 27.8cm3), making the
harvester suitable to be integrated in electronic devices. The
harvester consists of two masses; one of which is a Halbach array of
magnets that oscillate inside the second mass, which is a set of seven
coils. Impacts between the masses cause velocity amplification of the
secondary (smaller) mass, which improves the electromagnetic
conversion. A finite element simulation is used to justify the choice
of the Halbach configuration for the magnets. Three configurations
of coils are tested, with different numbers of turns, to verify the
optimization. Tests under harmonic excitation result in a high
volumetric Figure of Merit (FoMV).
I. INTRODUCTION
IN the past few years, the idea of “smart cities” h s become
more prevalent in the European vision of future sustainable
development of society “ m t ity” means an urban space
where services are provided to the citizen following real time
data analysis on various factors, such as air pollution, traffic
conditions, temperature, light conditions and so on [1-3]. What
has to be taken into account in such a vision is that in, order to
have a constant and widespread monitoring in such a large
environment, a great number of sensors, transmitters and
receivers must be employed. This requires a large number of
independently-powered electronic devices to be embedded in
buildings, roads, and infrastructures.
Batteries are often regarded as a solution to power sensor
nodes, but they present significant issues: batteries are polluting
and their disposal has to be carefully planned. Moreover,
substituting them at the end of their lifetime can be difficult due
to the number of devices involved in a sensor network, or
potentially impossible if they are embedded in a structure [4].
For this reason, in the last few years, energy harvesting has been
proposed as a better solution to power low consumption devices.
In this paper, vibrational energy is investigated because it is
always present in environment and can be easily converted.
Electromagnetic conversion is considered, due to its high energy
density at small scales. Moreover, its dependency on the relative
velocity between magnet and coil, allows velocity amplification
technique to be exploited in the device [5-6], which can also
provide nonlinearity to broaden the harvesting band [7-8]. This
paper first examines the benefits of a Halbach stack of magnets,
before presenting an approximate method to optimize the
transducer design. In the final section a comparison with
experimental data is reported that demonstrates the benefits of
the optimization strategy for 2DoF systems through a high
volumetric Figure of Merit (FoMV). The suitability of the
optimization technique is also discussed in relation to nonlinear
devices.
II. HARVESTER DESIGN
The harvester (Fig. 1) consists of a primary (larger) mass,
made up of seven coils, oscillating outside a smaller mass, which
consists of the magnet stack. Both the primary and the secondary
masses oscillate between a set of magnetic springs, which
mediate the impacts between the two masses. These impacts
cause velocity amplification to occur, which enhances the power
harvested and introduces nonlinearities to the system. The cap
height, H, can range between 50.10 mm and 57.45 mm.
Figure 7 Harvester section
III. OPTIMIZATION MODEL
As a first approximation, a vibrational energy harvester can be
regarded as a spring-mass-damper system [9], in which the
damping is both due to the mechanical losses and to the energy
conversion (Fig. 2).
Following the linear approximation, the power output of a
damped harmonic oscillator is known to be:
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 0 -
Figure 8. Linear approximation for a vibrational energy harvester. The
damping is due both to mechanical losses and the conversion method.
Where P is the power dissipated (W), m is the oscillator mass, ω
and ωN are respectively the excitation frequency and the natural
frequency, and
Where:
If the oscillator is driven at resonance, the expression of the
power output simplifies to:
It is possible to find the amount of power dissipated by the
electrical damping:
Finding the derivative and equating to zero, allows the
maximization condition to be found:
Following this guideline, a model was formulated to calculate
the optimal number of turns and the best wire diameter that fit
within the geometrical design constraints.
In order to choose the highest magnetic flux density magnet
configuration, a COMSOL Multiphysics simulation was
performed, comparing a classical stack of magnets with a
Halbach configuration stack [10-11], as shown in Fig. 3.
In a classical stack, the magnetic flux lines spread out in the
volume; a Halbach configuration features diametrically and
axially polarized magnets which closes the flux line along the
stack. Moreover, the neighboring diametrical-axial-diametrical
magnet periods close the flux in the opposite direction, so that
the intensity difference between two periods is double the
intensity difference of a single periods.
Figure 3 COMSOL modeling: (a) classical stack (b) Halbach stack. Black
arrows denote the magnetic field direction, magnetic flux line are shown, and the colour scale refers to magnetic flux density norm.
Once the Halbach configuration was chosen as the most
suitable for the trasducer, a 2DoF harvester was constructed,
where the smaller mass was a Halback stack and the larger mass
was a series of coils (see Fig. 1). For the optimization method it
was necessary to measure the mean magnetic field strength along
the length of the stack and the mechanical damping of the system
using drop tests.
These parameters were found respectively to be
B = 0.017T
These parameters were found respectively to be
By varying the coil length and the wire diameter, it was
possible to determine the number of turns which maximized the
power output and resulted in a coil that fits the geometric
constraints of the harvester design.
A set of 7 coils with 2800 turns each, wound with a 50 µm
diameter wire was the optimal design.
.
IV. EXPERIMENTAL VERIFICATION
Since the system presented in Section 1 is highly nonlinear, due
to the presence of the magnetic springs and to the impacts, an
experimental verification of the power optimization was
required.
Three sets of coils were built in order to compare their power
outputs, that had respectively 1500, 2800 (to fit the calculation
results) and 4500 turns for each coil.
Experimental tests under harmonic excitation at a = 0.4g
(Figure 4) were conducted and showed a good agreement with
the theoretical prediction. The tests confirmed the power output
from the N=2800 coils being the highest. Interesting effects given by the presence of nonlinearity can
be seen for stronger magnetic springs, changing the height of the
cap (Fig.5): this demonstrates that the optimization condition is
not stable for this system.
A deeper study on how the change in the cap height affects the
dynamics adding non-linearity can be found in [26].
(a) (b)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 1 -
(a) H = 57.45mm
(b) H = 52.40mm
(c) H = 50.10mm
Figure 5 Output power as a function of driving frequency and number of turns
coil. (a) Test with high position of the cap. (b) Test with the cap at 52.40mm
from the base. (c) Test with the lowest position of the cap. The optimization condition is changing..
V. FIGURE OF MERIT
The Volumetric Figure of Merit (FoMV), introduced by
Mitcheson et al.[12] is one of the most commonly used methods
to evaluate the performance of a vibrational energy harvester. It
represents the ratio between the actual power output from the
device and the energy present in a cubic harmonic oscillator
made of gold, with the same volume and mass as the device of
interest, and oscillating at the same frequency:
Figure 6, compares the highest figure of merit obtained from the
device analysed in this paper in this paper with recently
published results [13].
Figure 6 FoMV performance comparison with recent literature, (a) [13], (b)
[14], (c) [15], (d) [16], (e) [17], (f) [9], (g) [18], (h) [19], (i) [20], (j) [21], (k)
[23], (l) [24], (m) [25], (n) [22]. In red, our harvester FoMV
VI. CONCLUSIONS
In this paper, the power optimization of a vibrational energy
harvester employing velocity amplification is presented. The
harmonic oscillator approximation was used as a guideline for
the coil design and its results were compared to experimental
data.
Although some discrepancies due to the system nonlinearuty
were demonstrated, the theoretical power optimization is mostly
in agreement with the experimental data. Finally, the comparison
between the presented harvester performance and devices from
the recent literature demonstrated the effectiveness of velocity
amplification in improving the electromagnetic conversion.
Future work will investigate the parameters affecting nonlinear
electromagnetic optimization, and will study the optimal
mechanical configuration in order to maximize the performance
of 2DoF harvesters.
ACKNOWLEDGMENTS
The authors acknowledge the financial support of Science
Foundation Ireland under Grant No.10/CE/I1853 and the Irish
Research Council (IRC) for funding under their Enterprise
Partnership Scheme (EPS). This work was financially supported
by the Industrial Development Agency (IDA) Ireland.
Figure 4 Voltage and power output exciting the harvester at a = 0.4g. The
voltage output increases with the number of turns, and the power output is
maximum for N = 2800
(a)
(b)
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 2 -
REFERENCES
[39] K nouskos, Ho n , “ imu tion of m t G i ity with oftw Ag nts”, in “Computer Modeling and Simulation, 2009. EMS '09. Third
UKSim European Symposium on”, 25-27 Nov 2009, pages 424-429,
Athens. [40] Valasquez- , “ king m t iti s m t - Using Artificial
Int ig n T niqu s fo m t obi ity”, K ynot p h in
SmartGreens 2015, Lisbon. [41] Kamme st tt ,L ng , kopik,Kupzog,K stn , “P ti Risk Ass ssm nt
Using umu tiv m t G i o ”, P o ing of 4th International
Conference on Smart Grids and Green IT Systems, pages 31-42, June 2015, Lisbon .
[42] Kollias, Nikolaidis, In-Wall Thermoelectric Harvesting for Wireless Sensor
Networks”, P o ings of 4th International Conference on Smart Grids and Green IT Systems (SMARTGREENS 2014), pages 213-221, June
2015, Lisbon .
[43] Cottone, F izz , Goy , K y, Pun h, “Enh n vib tion n gy h v st b s on v o ity mp ifi tion” in Jou n of Int ig nt t i
Systems and Structure, Vol. 25, Sage Journal, 2013.
[44] V Ni o, D O’Donoghu , R F izz , G K y, n J Pun h, “A mu tip degree-of-freedom velocity amplified vibrationalenergy harvester: Part B
modelling”, in P o ings of A E, 2(46155), 2014 [45] D O’Donoghu , V Ni o, R F izz , G K y, n J Pun h, “A mu tip
degree-of-freedom velocity-amplified vibrational energy harvester: Part a
xp im nt n ysis”, P o ings of A E, 2(46155), pt 2014
[46] o o, Ni o, O’Donoghu , F izz , K y, Pun h, “Optimization of Coil Parameters for a Nonlinear Two Degree-of-Freedom (2DOF) Velocity-
amplified Electromagnetic Vibrational Energy Harvester”, Proceeding of
4th International Conference on Smart Grids and Green IT Systems, pages 31-42, June 2015, Lisbon .
[47] S.P.Beeby, R.N.Torah, M.J.Tudor, P.Glynne-Jon s, T O’Donn ,
T R sh , n Roy “A mi o t om gn ti g n to fo vib tion n gy h v sting”, Jou n of i om hanics and microengineering,
17(7), July 2007.
[48] X T ng, T Lin, n L Zuo, 2013 “D sign n optimiz tion of tubu in t om gn ti vib tion n gy h v st ” T ns tions on
Mechatronics, IEEE/ASME,19(2), Mar, pp. 615–622.
[49] Zhu, Beeby, Tudor, and H is “Vib tion n gy harvesting using the h b h y” In m t t i s and Structures, IOP Publishing, 2012.
[50] P.D.Mitcheson, E.M.Yeatman, G.K.Rao, A.S.Holmes, and T.C.Green.
“En gy h v sting f om hum n n m hin motion for wireless electronic vi s”, in Proceedings of the IEEE, 96(9).
[51] Ash f, Khi , D nnis, n h u in, 2013 “Imp ov n gy h v sting from low frequency vibrations by resonance amplification at multiple
f qu n i s” In nso s and Actuators A: Physical, ELSEVIER.
[52] Ashraf, Khi , D nnis, n h u in, “A wi b n f qu n y up-converting bounded vibration energy harvester for a low frequency
nvi onm nt”, in m t t i s and Structures, IOP Publishing, 2013.
[53] G h v, u gh, P t son, n N j fi, “H v sting t ffi -induced vib tions fo st u tu h th monito ing of b i g s”, in Jou n of
Micromechanics and Microengineering, IOP Publishing, 2011.
[54] G h v, Kim, n N j fi, “A p m t i f qu n y in s pow generator for scavenging low frequency ambient vibrations”, in P o i
Chemistry, Volume 1, Issue1, September 2009, Pages 14391442,
ELSEVIER, 2009. [55] R n u , Fio ini, h ijk, n Hoof , “H v sting n gy f om th motion of
human limbs: the design and analysis of an impact-based piezoelectric
g n to ” in m t Materials and Structures, IOP Publishing, 2009. [56] Ay , Zhu, Tu o , n by, 2009 “Autonomous tun b energy
h v st ”, in Pow E , 2009
[57] y, isungsitthisunti, Xu, Rho s, Jung, n P ou is, 2009 “ omp t low frequency meandered piezoelectric n gy h v st ”, in Pow E ,
2009.
[58] Ferro Solutions, VEH-460. [59] ini, n p oni, “An ffi i nt t om gn ti pow h v sting
device for low-f qu n y pp i tions”, in nso s n tu to s A:
Physical, ELSEVIER, 2011. [60] Ching, Wong, Li, Leong, an W n, “A s mi om hin mu ti-modal
son ting pow t ns u fo wi ss s nsing syst ms” in nso s n
actuators A: Physical, ELSEVIER, 2002. [61] Zhu, by, Tu o , n H is, “Vib tion n gy h v sting using th
h b h y”, in m t t ials and Structures, IOP Publishing 2012.
[62] Ku k ni, Koukh nko, To h, Tu o , by, O’Donn , n Roy, 2008
“D sign, f b i tion n t st of int g t mi o-scale vibration-based
t om gn ti g n to ” in nso s n A tu to s A: Physi , ELSEVIER.
[63] Y ng, n L , 2010 “Non-resonant electromagnetic wideband energy
h v sting m h nism fo ow f qu n y vib tions” In i osyst T hno
(2010) 16:961966, Springer-Verlag 2010. [64] Nico, Boco, Punch, Frizzell, Dynamics of a c-battery scale energy
harvester, 10th issue ICT-Energy Letters,2015.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 3 -
Virtual Machine Power Modelling in Multi-tenant
Ecosystems: Challenges and Pitfalls
Leila Sharifi1, Felix Freitag
1, Luís Veiga
2
1 Computer Architecture Dept., Universitat Politecnica de Catalunya, Barcelona, Spain
2 Técnico Lisboa/INESC-ID Lisboa, Lisboa, Portugal
Abstract— Since energy has emerging as the first class
computing resource, we need to characterize this resource in
different granularity. On the other hand, the computing paradigm
is shifting to the multi-tenant ecosystems. Therefore, characterizing
the power consumption on Virtual Machines(VMs), running in data
center hosts is necessary to attain energy efficient cloud ecosystems.
In this paper, we study the challenges should be addressed in VM
power modeling in cloud service provisioning.
I. INTRODUCTION
Energy and associated environmental costs (cooling,
carbon footprint, etc.) of IT services constitute a remarkable
portion of service dynamic cost. Indeed, estimating energy
consumption at each level of service provisioning stack, i.e. from
hardware to , operating system/Virtual Machine(VM) and
application is the cornerstone of research toward energy
efficiency all through the stack.
Although there is a growing body of work centered on the
energy aware resource management, allocation and scheduling
[7,10,11] they mainly considered the whole system energy
measurement, estimation, improvement and optimization. There
is only limited work focusing on the energy issues per individual
job [1,5,8,18]. However, they only aim at reducing total energy
consumption in the infrastructure without taking into account the
energy-related behavior of each individual , its performance and
price, i.e., how expensive and efficient is the energy employed
for the observed job performance or progress.
Nonetheless, energy-based job pricing confronts some more
challenges further to the system wide energy efficiency issues. In
the system wide energy efficiency, the energy consumption of the
resources are measurable simply by plugging the energy meter
devices or exploiting the embedded sensors of the contemporary
devices. Nonetheless, it is nontrivial to measure the energy
consumed per VM, since we cannot embed a physical sensor in a
VM or plug a metering device to it. Therefore, estimation is still
the only option in this case. Estimation results in a more
complicated model since it has to deal with uncertainty and error.
In this work, we study the challenges of VM power modeling
in a multi-tenant ecosystem. In the next section, we outline the
background terms and hypothesis that we use in this work.
Section 3 surveys the sate of the art in VM power modeling and
introduces the challenges in this area. The work is concluded in
Section 4.
II. BACKGROUND
In this section we introduce the background and hypothesis
required to drive the discussion in the rest of this paper.
Energy Proportionality
The vision of energy proportional system implies the power
model of an ideal system in which no power is used by idle
systems (Ps=0), and dynamic power dissipation is linearly
proportional to the system load.
LDR indicates the maximum difference of the actual power
consumption, P(U), and linear power model over the linear
power model as in (1).
(1)
IPR is the indicator of idle to peak power consumption as
illustrated in (2)
(2)
To measure how far a system power model is from the ideal
(energy proportional) one, Proportionality Gap(PG) (19) is
defined as the normalized difference of the real power value and
the ideal power value, which is indicated as PMax × U, under a
certain utilization level as shown in (3). Therefore, having
proportionality gap values for a given device, we can reconstruct
the power model of the device.
(3)
Given the state of the art hardware, designing hardware
which is fully energy proportional remains an open challenge,
power model of a non-energy proportional system is illustrated
in Figure 1. However, even in the absence of redesigned
hardware, we can approximate the behavior of energy
proportional systems by leveraging combined power saving
mechanisms [16] and engaging heterogeneous commodity
devices combined with powerful server machines in lieu of
homogeneous server hardware platform [19].
( ) ( )s d
s d
P U P PLDR max
P P
idle
Max
PIPR
P
( ) ( )( ) Max
Max
P U P UPG U
P
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 4 -
Figure 9 - Energy Proportionality.
II.1. Server power modeling
State of the art platforms are not capable of fine-grained
power measurement. Therefore, to manage dynamic power
proportionality, a power model is required. Currently, Running
Average Power Limit (RAPL) counters, available in the recent
Intel CPUs, is the closest to the hardware based monitoring.
RAPL allows monitoring of whole CPU package, cores, and
DRAM. Since these counters are not available in all CPUs, to
cope with the heterogeneous infrastructure, a group of works rely
on performance counters for synthesising a power model.
Mapping the counter and power values is usually done through
linear regression. Nonetheless, linear power model is not
sufficient in many cases due to non-proportional power
dissipation characteristics of CPUs.
Moreover, linear models rely on the non-correlated covered
features, which is not a valid assumption in the state of the art
systems. A quadratic solution fits better the power modeling of
multi-core systems [2].
However, Hyperthreading and turbo-boost may still impede
the model from accurate estimation, due to hidden states they
make. A hyperthread aware power modeling mechanism is
introduce in [20]. The introduced model differentiates between
the cases where either single or both hardware threads of a core
are in use.
The most recent work in this line is BitWatts [5], which
introduces a counter based power model for each individual
frequency.
Nonetheless, there is a trade off between accuracy and the
overhead. Targeting the community of commodity devices
forming a collaborative system, in edge layer, integrated with
data center to form P2P assisted clouds, we should particularly
tune the trade off level due to the lower energy consumption in
such devices, which acquires adaptive models.
III. ESTIMATING VM ENERGY
In multi-tenant platforms, the efficiency of VM
consolidation, power dependent cost modeling, and power
provisioning are highly dependent on accurate power models.
Such models are particularly needed because it is not possible to
attach a power meter to a virtual machine. In this section we
review the state of the art VM power models and demonstrate
their shortcomings.
III.1. Related Work
In general, VMs can be monitored as black-box systems for
coarse-grained scheduling decisions. However, for fine-grained
scheduling decisions—e.g., with heterogeneous hardware—
finer-grained estimation at sub-system level is required and
might even need to step inside the VM.
So far, fine-grained power estimation of VMs required
profiling each application separately. To exemplify, WattApp
[12], which relies on application throughput instead of
performance counters as a basis for the power model. PMapper
[17] maps resources using a centralized step-wise decision
algorithm in lieu of application power estimation.
To generalize power estimation, JouleMeter [9] assumes that
each VM only hosts a single application and thus treat VMs as
black boxes. In a multi-VM system, they try to compute the
resource usage of each VM in isolation and feed the resulting
values in a power model. Bertran et al. [1] propose an approach
employes a sampling phase to gather data related to
performance-monitoring counters (PMCs) and compute energy
models from these samples. With the gathered energy models, it
is possible to predict the power consumption of a process, and
therefore apply it to estimate the power consumption of the entire
VM. Another example is VMeter [3], which estimates the
consumption of all active VMs on a system. A linear model is
us to omput th V s’ pow onsumption with the help of
available statistics (processor utilization and I/O accesses) from
each physical node. The total power consumption is
subs qu nt y omput by summing th V s’ onsumption with
the power consumed by the infrastructure.
Janacek et al. [6] exploit a linear power model to compute the
server consumption with postmortem analysis. The computed
power consumption is then mapped to VMs depending on their
load. This technique is not effective when runtime information is
required. In general, VMs can be monitored as black-box
systems for coarse-grained scheduling decisions. However, for
fine-grained scheduling decisions—e.g., with heterogeneous
hardware— finer-grained estimation at sub-system level is
required and might even need to step inside the VM.
So far, fine-grained power estimation of VMs required
profiling each application separately.To exemplify, WattApp
[12], which relies on application throughput instead of
performance counters as a basis for the power model. PMapper
[17] maps resources using a centralized step-wise decision
algorithm in lieu of application power estimation.
IV. CHALLENGES
However, collocation of applications has its own challenges.
Workload intensity is often highly dynamic. The power profile of
the data center hardware is inherently heterogeneous; this makes
the optimal VM power modeling problem more complicated.
The nonlinearity and in some cases unpredictability of the energy
efficiency profile aggravates the complexity of energy efficient
collocation management, due to inaccurate VM power
characterization.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 5 -
IV.1. Static Power and Overhead Modeling
Energy consumption of the host per job embraces the static
power consumption, independent of the resource utilization, and
the dynamic power, which is degraded not only proportional to
the VM's allocated resources but also on account of the overhead
caused in the hypervisor, and the interference due to collocation.
Estimating this overhead is complicated since the pattern of the
hypervisor overhead is tightly coupled with the number of VMs,
the type of resources each VM asks for, and the number of times
the switching occurs between VMs and hypervisor. Thus, for a
more accurate estimation, further to individual VM's energy, VM
interference energy overhead should also be estimated. Some
estimation methods have been proposed in the state of the art:
e.g. [3,4,14]. In [15] the authors argue that, in virtualized
environments, energy monitoring has to be integrated within the
VM as well as the hypervisor.
Work in [13] introduces an interference coefficient, defined
to model the energy interference. The major contribution of this
work is to estimate the energy interference according to the
previous knowledge of standalone application running on the
same machine. They model interference as a separate implicit
task. Moreover, an energy efficient collocation management
policy is introduced in this work that is modeled as an
optimization problem solvable by data mining techniques. All
the VMs running on the same machine are known as a collection.
The energy consumption of a collection is the sum of idle energy
consumed for the longest VM run, dynamic energy consumed by
each VM if they were run in isolated environment, and the
energy depleted due to interference between each VM pair. The
interference energy can be positive or negative depending on the
intersection of resources between each VM pair. Interference
energy is estimated as the coefficient of the summation of idle
and isolated run for each VM. On the other hand, performance is
measured as the delay, which is measured by modeling the
system as a M/M/1 queue and calculating the imaginary
interference tasks response time as the delay due to interference.
IV.2. Non-energy Proportional Host Effect
Besides the hypervisor and interference overhead in multi-
tenant systems, the non-energy proportional hardware adds more
complexity to the VM power modeling agenda. In non-energy
proportional hardware platform, since the hardware power model
is non-linear, two identical VMs, sharing the same hardware,
may end up with different dynamic power usage estimation
during the runtime, which may lead to unfair energy based
service charging, and planning.
Figure 2, visualizes such a case. In this scenario, there are
two identical VMs, i.e. VM1 and VM2, collocated on a host with
the power model demonstrated in the Figure.
If we only run VM1, the dynamic power estimated for this
VM will be P1, whereas running the second identical VM on the
same machine predicted as P2 < P1. Therefore, in case of
collocation, there should be a strategy to divide the dynamic
power fairly among the running VMs.
Figure 10 - VM power modeling issues in non-energy proportional
systems.
To address the fairness issue introduced in the previous
section we propose the weighted division VM power model. In
this model as illustrated in [4], a particular VM's power
consumption, PVM(i) is calculated according to the relative
utilization, i.e. u(i)/U, contributed by that particular VM. In this
equation, u(i) represents the utilization incurred by VM i, and U
denotes the overall machine utilization.
(4)
V. CONCLUSION
In this paper, we explained the state of the art Virtual
Machine(VM) power modeling techniques and their
shortcomings. We demonstrated that current VM power models,
fail to capture the effect of non-energy proportional hosts in a
multi-tenat cloud ecosystems. We argued that a fair VM power
model needs not only to be able to characterize per VM resource
usage and translate it to power, but also it requires to be aware of
the overall host utilization for fair power division.
Moreover, interference and other vitalization overhead
require an accurate model which can be mappable to the power
consumption. Therefore, future line of work in VM power
modeling should address overhead modeling in multi-tenant
ecosystems baring in mind the properties of non-energy
proportional hosts.
REFERENCES
[65] Bertran, R.; Gonzàlez, M.; Martorell, X.; Navarro, N. & Ayguadé, E. “Counter-based power modeling methods: Top-down vs. bottom-up” The
Computer Journal, Br Computer Soc, 2013, 56, 198-213.
[66] Bircher, W. L. & John, L. K. “Complete system power estimation: A trickle-down approach based on performance events Performance Analysis
of Systems & Software”, 2007. ISPASS 2007. IEEE International
Symposium on, 2007, 158-168.
[67] Bohra, A. E. & Chaudhary, V. VMeter: “Power modelling for virtualized
clouds Parallel & Distributed Processing”, Workshops and Phd Forum
(IPDPSW), 2010 IEEE International Symposium on, 2010, 1-8.
[68] Chen, Q.; Grosso, P.; van der Veldt, K.; de Laat, C.; Hofman, R. & Bal, H.
“Profiling energy consumption of VMs for green cloud computing”,
Dependable, Autonomic and Secure Computing (DASC), 2011 IEEE Ninth International Conference on, 2011, 768-775.
[69] Colmant, M.; Kurpicz, M.; Felber, P.; Huertas, L.; Rouvoy, R. & Sobe, A. “Process-level power estimation in VM-based systems”, European
Conference on Computer Systems (EuroSys), 2015, 14.
( )( ) i
VM
u P UP i
U
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 6 -
[70] Janacek, S.; Schroder, K.; Schomaker, G.; Nebel, W.; Ruschen, M. &
Pistoor, G. “Modeling and approaching a cost transparent, specific data
center power consumption”, Energy Aware Computing, 2012 International Conference on, 2012, 1-6.
[71] Jin, X.; Zhang, F.; Wang, L.; Hu, S.; Zhou, B. & Liu, Z. “A Joint
Optimization of Operational Cost and Performance Interference in Cloud Data Centers”, arXiv preprint arXiv:1404.2842, 2014.
[72] Kaewpuang, R.; Chaisiri, S.; Niyato, D.; Lee, B. & Wang, P. “Cooperative
Virtual Machine Management in Smart Grid Environment”, IEEE, 2013.
[73] Kansal, A.; Zhao, F.; Liu, J.; Kothari, N. & Bhattacharya, A. A. ,”Virtual
machine power metering and provisioning”, Proceedings of the 1st ACM
symposium on Cloud computing, 2010, 39-50.
[74] Khosravi, A.; Garg, S. K. & Buyya, R. “Energy and carbon-efficient
placement of virtual machines in distributed cloud data centers”, Euro-Par
2013 Parallel Processing, Springer, 2013, 317-328.
[75] Kim, N.; Cho, J. & Seo, E. “Energy-credit scheduler: An energy-aware
virtual machine scheduler for cloud systems”, Future Generation Computer
Systems , 2012.
[76] Koller, R.; Verma, A. & Neogi, A. “WattApp: an application aware power
meter for shared data centers”, Proceedings of the 7th international
conference on Autonomic computing, 2010, 31-40.
[77] Pore, M.; Abbasi, Z.; Gupta, S. K. & Varsamopoulos, G. “Energy aware
Colocation of Workload in Data centers”, High Performance Computing
(HiPC), 2012 19th International Conference on, 2012, 1-6.
[78] Smith, J. W.; Khajeh-Hosseini, A.; Ward, J. S. & Sommerville, I.
“Clou onito : P ofi ing Pow Us g ”, Cloud Computing (CLOUD),
2012 IEEE 5th International Conference on, 2012, 947-948.
[79] Stoess, J.; Lang, C. & Bellosa, F. “Energy Management for Hypervisor-
Based Virtual Machines”. USENIX annual technical conference, 2007, 1-
14.
[80] Tolia, N.; Wang, Z.; Marwah, M.; Bash, C.; Ranganathan, P. & Zhu, X.
“Delivering Energy Proportionality with Non Energy-Proportional
Systems-Optimizing the Ensemble”. HotPower, 2008, 8, 2-2.
[81] A. Verma P. Ahuja, A. N. “pMapper: power and migration cost aware
application placement in virtualized systems”, 9th ACM/IFIP/USENIX
International Conference on Middleware , 2008, 243-264.
[82] Wang, C.; Urgaonkar, B.; Kesidis, G.; Shanbhag, U. V. & Wang, Q. “A
case for virtualizing the electric utility in cloud data centers”, Proceedings
of the 6th USENIX conference on Hot Topics in Cloud Computing, 2014, 22-22.
[83] Wong, D. & Annavaram, M. “Knightshift: Scaling the energy
proportionality wall through server-level heterogeneity”, Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on,
2012, 119-13.
[84] Zhai, Y.; Zhang, X.; Eranian, S.; Tang, L. & Mars, J. “Happy: Hyperthread-aware power profiling dynamically”, 2014 USENIX Annual Technical
Conference (USENIX ATC 14)(Philadelphia, PA, 2014, 211-217.
N . 1 0 - 1 5 t h J u l y 2 0 1 5 I C T - E N E R G Y L E T T E R S
- P a g e 2 7 -
SAVE THE DATE
D A T E E V E N T W E B S I T E
Aug 31 – Sep 1, 2015 - Dresden EnA-HPC 2015 http://www.ena-hpc.org
September 7-9, 2015 - Copenhagen EnviroInfo & ICT4S 2015 http://enviroinfo2015.org/
September 14-16, 2015 - Bristol ICT-Energy Workshop www.ict-energy.eu
October 20-22, 2015 - Lisbon ICT 2015
Innovate, Connect, Transform
https://ec.europa.eu/digital-agenda/en/ict2015-
innovate-connect-transform-lisbon-20-22-
october-2015
Nov 29 – Dec 4, 2015 - Boston 2015 MRS Fall Meeting & Exhibit http://www.mrs.org/fall2015/
December 11-13, 2015 - Sydney IEEE GreenCom 2015 http://www.swinflow.org/confs/greencom2015/
December 14-16, 2015 – Las Vegas IGSC’15 http://igsc.eecs.wsu.edu/
January 18-20, 2016 - Prague HiPEAC 2016 Conference https://www.hipeac.net/2016/prague/
January 20, 2016 - Prague
ICT-Energy Workshop
Minimizing Energy Consumption of
Computing to the Limit
https://www.hipeac.net/events/activities/7326/ict-
energy/
July, 2016 – Copenhagen ICT-Energy International Conference www.ict-energy.eu
ICT-ENERGY LETTERS is realized with the contribution of LANDAUER
project and ICT-Energy Coordination Action, funded under the Future and
Emerging Technologies (FET) programme within the ICT theme of the Seventh
Framework Programme for Research of the European Commission.
I C T - E N E R G Y L E T T E R S
c/o NiPS Laboratory, Dipartimento di Fisica e Geologia
Università degli Studi di Perugia
Via A. Pascoli, 1 – I-06123 – Perugia (Italy)
www.ict-energyletters.eu