23
The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira *† , Marco Couto *† , J´ acome Cunha , Jo˜ ao Paulo Fernandes § , and Jo˜ ao Saraiva *† * HASLab/INESC TEC, Portugal Universidade do Minho, Portugal NOVA LINCS, DI, FCT, Universidade NOVA de Lisboa, Portugal § RELEASE, Universidade da Beira Interior, Portugal {ruipereira,marcocouto,jas}@di.uminho.pt, [email protected], [email protected] Abstract—This paper presents a detailed study of the energy consumption of the different Java Collection Framework (JFC) implementations. For each method of an implementation in this framework, we present its energy consumption when handling different amounts of data. Knowing the greenest methods for each implementation, we present an energy optimization approach for Java programs: based on calls to JFC methods in the source code of a program, we select the greenest implementation. Finally, we present preliminary results of optimizing a set of Java programs where we obtained 6.2% energy savings. I. I NTRODUCTION The increasing energy costs related to ICT in organiza- tions [1], and society’s environmental concerns, are changing the way both computer manufacturers and software engi- neers develop their products. While in the previous century, improving execution time was the main goal when devel- oping hardware/software, and thus programming languages and their compilers were designed to produce fast systems, nowadays energy consumption is becoming the bottleneck of such systems. As a consequence, powerful libraries offered by programming languages and their compiler optimizations have to consider this new reality. In this paper we conduct a detailed study in terms of energy consumption of the widely used Java Collections Framework (JCF) library 1 . We consider three different groups of data structures, namely Sets, Lists and Maps, and for each of these groups, we study the energy consumption of each of its different implementations and methods. We exercise and monitor the energy consumed by each of the API methods when handling low, medium and big data sets. A first result of our study is a quantification of the energy spent by each method of each implementation, for each of the data structures we consider. This energy-awareness can not only be used to steer software developers in writing greener software, but also in optimizing legacy code. In fact, we have used/validated this quantification by semi-automatically optimizing the energy consumption of a set of similar software systems. As a second result, we statically compute which imple- mentations and methods are used in the source code of such projects, and then look up the energy consumption data to 1 https://docs.oracle.com/javase/7/docs/technotes/guides/collections/index. html find which equivalent implementation has the lowest energy consumption for those specific methods. Finally, we manually transform the source code to use the “greenest” implemen- tation. Our preliminary results show that energy consumption decreased in all the optimized software systems that we tested, with an average energy saving of 6.2%. With our work we are answering the following research questions: RQ 1 - Can we define an energy consumption quan- tification of Java data structures and their methods? RQ 2 - Can we use such quantification to decrease the energy consumption of software systems? This paper is organized as follows: Section II contains our analysis of the energy consumption of the different Java Collection Framework implementations. In Section III we describe our methodology to optimize Java programs and its application to five Java programs. Section IV presents the validity threats for our analysis. Next, we present related and future work (Sections V and VI, respectively), and in Section VII we present the conclusions of our work. II. TOWARDS A RANKING OF JAVA I MPLEMENTATIONS METHODS One of our goals is to compare the energy consumption of different Java implementations of the same abstract data struc- tures. To do this, we designed an experiment that simulates different kinds of uses of such structures. In this section we present the design, execution, and results of that simulation. A. Design Our experiment design is inspired by the one used in [2], since our analysis also considers a simple scenario of storing, retrieving, and deleting String values in the various collections. a) JCF Data structures: The most classical way to separate Java data structures is into groups which implement the interfaces Set 2 , List 3 , or Map 4 , respectively. This separation indeed makes sense as each interface has its own distinct properties and purposes (for example, there is no ordering notion under Sets). 2 https://docs.oracle.com/javase/7/docs/api/java/util/Set.html 3 https://docs.oracle.com/javase/7/docs/api/java/util/List.html 4 https://docs.oracle.com/javase/7/docs/api/java/util/Map.html arXiv:1602.00984v1 [cs.SE] 2 Feb 2016

The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

  • Upload
    others

  • View
    29

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

The Influence of the Java Collection Framework onOverall Energy Consumption

Rui Pereira∗†, Marco Couto∗†, Jacome Cunha‡, Joao Paulo Fernandes§, and Joao Saraiva∗†∗ HASLab/INESC TEC, Portugal† Universidade do Minho, Portugal

‡ NOVA LINCS, DI, FCT, Universidade NOVA de Lisboa, Portugal§ RELEASE, Universidade da Beira Interior, Portugal

ruipereira,marcocouto,[email protected], [email protected], [email protected]

Abstract—This paper presents a detailed study of the energyconsumption of the different Java Collection Framework (JFC)implementations. For each method of an implementation in thisframework, we present its energy consumption when handlingdifferent amounts of data. Knowing the greenest methods for eachimplementation, we present an energy optimization approach forJava programs: based on calls to JFC methods in the source codeof a program, we select the greenest implementation. Finally, wepresent preliminary results of optimizing a set of Java programswhere we obtained 6.2% energy savings.

I. INTRODUCTION

The increasing energy costs related to ICT in organiza-tions [1], and society’s environmental concerns, are changingthe way both computer manufacturers and software engi-neers develop their products. While in the previous century,improving execution time was the main goal when devel-oping hardware/software, and thus programming languagesand their compilers were designed to produce fast systems,nowadays energy consumption is becoming the bottleneck ofsuch systems. As a consequence, powerful libraries offered byprogramming languages and their compiler optimizations haveto consider this new reality.

In this paper we conduct a detailed study in terms of energyconsumption of the widely used Java Collections Framework(JCF) library 1. We consider three different groups of datastructures, namely Sets, Lists and Maps, and for each ofthese groups, we study the energy consumption of each ofits different implementations and methods. We exercise andmonitor the energy consumed by each of the API methodswhen handling low, medium and big data sets.

A first result of our study is a quantification of the energyspent by each method of each implementation, for each of thedata structures we consider. This energy-awareness can notonly be used to steer software developers in writing greenersoftware, but also in optimizing legacy code. In fact, wehave used/validated this quantification by semi-automaticallyoptimizing the energy consumption of a set of similar softwaresystems.

As a second result, we statically compute which imple-mentations and methods are used in the source code of suchprojects, and then look up the energy consumption data to

1https://docs.oracle.com/javase/7/docs/technotes/guides/collections/index.html

find which equivalent implementation has the lowest energyconsumption for those specific methods. Finally, we manuallytransform the source code to use the “greenest” implemen-tation. Our preliminary results show that energy consumptiondecreased in all the optimized software systems that we tested,with an average energy saving of 6.2%.

With our work we are answering the following researchquestions:

• RQ 1 - Can we define an energy consumption quan-tification of Java data structures and their methods?

• RQ 2 - Can we use such quantification to decreasethe energy consumption of software systems?

This paper is organized as follows: Section II containsour analysis of the energy consumption of the different JavaCollection Framework implementations. In Section III wedescribe our methodology to optimize Java programs and itsapplication to five Java programs. Section IV presents thevalidity threats for our analysis. Next, we present relatedand future work (Sections V and VI, respectively), and inSection VII we present the conclusions of our work.

II. TOWARDS A RANKING OF JAVA IMPLEMENTATION’SMETHODS

One of our goals is to compare the energy consumption ofdifferent Java implementations of the same abstract data struc-tures. To do this, we designed an experiment that simulatesdifferent kinds of uses of such structures. In this section wepresent the design, execution, and results of that simulation.

A. Design

Our experiment design is inspired by the one used in [2],since our analysis also considers a simple scenario of storing,retrieving, and deleting String values in the various collections.

a) JCF Data structures: The most classical way toseparate Java data structures is into groups which implementthe interfaces Set2, List3, or Map4, respectively.

This separation indeed makes sense as each interface hasits own distinct properties and purposes (for example, there isno ordering notion under Sets).

2https://docs.oracle.com/javase/7/docs/api/java/util/Set.html3https://docs.oracle.com/javase/7/docs/api/java/util/List.html4https://docs.oracle.com/javase/7/docs/api/java/util/Map.html

arX

iv:1

602.

0098

4v1

[cs

.SE

] 2

Feb

201

6

Page 2: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

In our study, a few implementations were not evaluatedas they are quite particular in their usage and could not bepopulated with strings. In particular, JobStateReasons (Set)only accepts JobStateReason objects, IdentityHashMap (Map)accepts strings but compares its elements with the identityfunction, and not with the equals method.

Given these considerations, we evaluated the followingimplementations:

Sets ConcurrentSkipListSet, CopyOnWriteArraySet,HashSet, LinkedHashSet, TreeSet

Lists ArrayList, AttributeList, CopyOnWriteArrayList,LinkedList, RoleList, RoleUnresolvedList, Stack,Vector

Maps ConcurrentHashMap, ConcurrentSkipListMap,HashMap, Hashtable, IdentityHashMap,LinkedHashMap, Properties, SimpleBindings,TreeMap, UIDefaults, WeakHashMap

b) Methods: To choose the methods to measure foreach abstraction, we looked at the generic API list for thecorresponding interface.

From this list, we chose the methods which performed in-sertion, removal, or searching operations on the data structures,along with a method to iterate and consult all the values inthe structure. In some methods (e.g. containsAll or addAll),a second data structure is needed.

Sets add, addAll, clear, contains, containsAll, iter-ateAll, iterator, remove, removeAll, retainAll,toArray

Lists add, addAll, add (at an index), addAll (at anindex), clear, contains, containsAll, get, indexOf,iterator, lastIndexOf, listIterator, listIterator (at anindex), remove, removeAll, remove (at an index),retainAll, set, sublist, and toArray

Maps clear, containsKey, containsValue, entrySet, get,iterateAll, keySet, put, putAll, remove, and values

c) Benchmark: To evaluate the different implemen-tations on each of the described methods, we started bycreating and populating objects with different sizes for eachimplementation. 5

We considered initial objects with 25.000, 250.000, and1.000.000 elements, providing our analysis with multiple or-ders of magnitude of measurement. This will allow us to betterunderstand how the energy consumption scales in regards topopulation size.

When a second data structure is required, we have adoptedfor it a size6 of 10% the popsize of the tested structure,containing half existing values from the tested structure andhalf new values, all shuffled.

Table I briefly summarizes how each method is tested forthe Setcollection. The tests for the other collections are similar,and their full description can be found in this paper’s appendix.

5We will refer to the population size of an object as popsize.6We will refer to the size of each such structure as secondaryCol.

TABLE I. TEST DESCRIPTION OF SET METHODSMethod Description of the test for the methodadd add popsize/10 elements. half existing, half newaddAll addAll of secondaryCol 5 timesclear clear 5 timescontains contains popsize/10 elements. half existing, half newcontainsAll containsAll of secondaryCol 5 timesiterateAll iterate and consult popsize valuesiterator iterator popsize timesremove remove popsize/10 elements. half existing, half newremoveAll removeAll of secondaryCol 5 timesretainAll retainAll of secondaryCol 5 timestoArray toArray 5 times

B. Execution

To analyze the energy consumption, we first imple-mented our data structure analysis design as an energybenchmark framework. This is one of our contributions,and can be found at https://github.com/greensoftwarelab/Collections-Energy-Benchmark. This implementation is basedon a publicly available micro-benchmark7 which evaluatesthe runtime performance of different implementations of theCollections API, and has been used in a previous study toobtain energy measurements [2].

To allow us to record precise energy consumption measure-ments from the CPU, we used Intel’s Runtime Average PowerLimit (RAPL) [3]. RAPL is an interface which allows accessto energy and power readings via a model-specific register. Itsprecision and reliability has been extensively studied [4], [5].More specifically, we used jRAPL [6] which is a frameworkfor profiling Java programs using RAPL. Using these toolspermitted us to obtain energy measurements on a method level,allowing us a fine grained measurement.

We ran this study on a server with the following speci-fications: Linux 3.13.0-74-generic operating system, 8GB ofRAM, and Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz. Thissystem has no other software installed or running other thannecessary to run this study, and the operating system daemons.Both the Java compiler and interpreter were versions 1.8.0 66.

Prior to executing a test, we ran an initial “warm-up” wherewe instantiated, populated (with the designated popsize), andperformed simple actions on the data structures. Each test wasexecuted 10 times, and the average values for both the timeand energetic consumption were extracted (of the specific test,and not the initial “warm-up” as to only measure the testedmethods) after removing the lowest and highest 20% as tolimit outliers.

C. Results

This section presents the results we gathered from theexperiment. We highly recommend and assume the images arebeing viewed in color. Figures 1, 3, and 5 represent the datafor our analyzed Sets, Lists, and Maps respectively. Eachrow in the tables represents the measured methods, and foreach analyzed implementation, we have two columns repre-senting the consumption in Joules(J) and execution time inmilliseconds(ms). Each row has a color highlight (under the Jcolumns) varying between a Red to Yellow to Green. The mostenergetically inefficient implementation for that row’s method(the one with the highest consumed Joules) is highlighted

7https://dzone.com/articles/java-collection-performance

Page 3: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

Red. The implementation with the lowest consumed Joules(most energetically efficient) is highlighted Green. The restare highlighted depending on their consumption values whencompared to the most inefficient and efficient implementation,and colored accordingly in the color scale.

Figures 2, and 4 are a graphical representation of the datafor our analyzed Sets, and Lists. The Y-Axis represents theconsumption in Joules, and the X-Axis represents the variousmeasured methods. Each column represents a specific analyzedimplementation.

The CopyOnWriteArraySet implementation was discardedduring the experiment execution as the tests did not finish ina reasonable amount of time. We also omitted the removeAllmethod data from Figure 4 as to visually improve the readabil-ity of the graph. For the full representation of the data/graphsof the other two population sizes, please consult the appendix.From our data, we can draw interesting observations:

• Looking at the Setresults for population of 25k data(shown in Fig 1) we can see that LinkedHashSetincludes most of the energetically efficient methods.Nevertheless, one can easily notice that it is alsothe most inefficient with the addAll and containsAllmethods.

• Figure 3 presents the Listresults for population of 25k.Both RoleUnresolvedList and AttributeList contain themost efficient methods. Interesting to point out thatboth of these extend ArrayList, which contains lessefficient methods, and very different consumptionvalues in comparison with these two. We can alsoclearly see that LinkedList is by far the most inefficientListimplementation.

• In Figure 5, we can see that Hashtable, Linked-HashMap, and Properties contain the most efficientmethods, and with no red methods. Interesting to noteis that while the Properties data structure is generallyused to store project configuration data or settings, itproduced the very good results for our scenario ofstoring string values.

• The concurrent data structure implementations (Con-currentSkipListSet, CopyOnWriteArrayList, Concur-rentHashMap, ConcurrentSkipListMap, and the re-moved CopyOnWriteArraySet) perform very poorly.As such, these should probably be avoided if a re-quirement is a low consuming application.

• One can see cases where a decrease in execution timetranslates into a decrease in the energy consumedas suggested by [7]. For instance in Figure 5, whencomparing Hashtable and TreeSet for the get method,we see that Hashtable has both a lower executiontime and energy consumption. As observed by [8], [9],cases where an increase in execution time brings abouta decrease in the energy consumed can also be seen,for example in Figure 5 when comparing HashMapand Hashtable for the keySet method. As such, wecannot draw any conclusion of the correlation betweenexecution time and energy being consumed.

• Different conclusions can be drawn for the 250kand 1m population sizes (which can be seen in the

appendix). This also shows that the energy consump-tion of different data structure implementations scaledifferently in regards to size. What may be the mostefficient implementation for one population size, maynot be the best for another.

III. OPTIMIZING ENERGY CONSUMPTION OF JAVAPROGRAMS

The results presented in the previous section may allowsoftware developers to develop more energy efficient software.In this section we present a methodology to optimize, atcompilation time, existing Java programs. This methodologyconsists of the following steps:

1) Computing which implementation/methods are usedin the programs

2) Looking up the appropriate energy tables for the usedimplementation/methods

3) Choosing the most efficient implementation based ontotal energy

In the next subsection, we describe in detail how weapplied this approach and how it was used to optimize a setof equivalent Java programs.

A. Data Acquisition

First, we obtained several Java projects from an object-oriented course for undergraduate computer science students.For this course, students were asked to build a journalism sup-port platform, where users (Collaborators, Journalists, Readers,and Editors) can write articles (chronicles and reports), andgive likes and comments. Along with these different platformimplementations, we obtained seven test cases which simulatedusing the system (registering, logging in, writing articles,commenting, etc.). The size of users, articles, and commentsvaried between approximately 2.000 and 10.000 each for eachdifferent test case and each entity. These projects had anaverage of 36 classes, 104 methods, and 2.000 lines of code.

Next we discuss the optimization of five of those projects,where we semi-automatically detected the use of any JCF im-plementation (both efficient and inefficient implementations),and which were the used methods for each implementation.

B. Choosing an energy efficient alternative

To try to optimize these projects based on the data struc-tures and their used methods, we looked at our data for the 25kpopulation. We chose this one, as it is the one which is closestto the population used in the test cases (which was between2.000 and 10.000 for each different entity).

For each detected data structure implementation, we se-lected the used methods, and chose our optimized data struc-ture based on the implementation which consumed the leastamount of energy for this specific case.

Figures 6 and 7 show the data used to make our decisionfor Project 1, where Hashtable was used in place of TreeMap(as Hashtable was the most efficient implementation in thisscenario with 6.8J), and ArrayList was used in place ofLinkedList (as ArrayList was the most efficient implementationwith 2.4J).

Page 4: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

Fig. 1. Set results for population of 25k Fig. 2. Set results graph for population of 25k

Fig. 3. List results for population of 25k

Fig. 4. List results graph for population of 25k without removeAll

Fig. 5. Map results for population of 25k

Page 5: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

Table II details the 5 Projects, their originally used datastructure implementations, new implementation, and usedmethods for the implementations.

TABLE II. ORIGINAL AND OPTIMIZED DATA STRUCTURES, AND USEDMETHODS FOR EACH PROJECT

Data StructuresProjects Original Optimized Methods

1 TreeMap Hashtable containsKey, get, put, valuesLinkedList ArrayList add, listIterator

2 HashMap Hashtable containsKey, get, put, values3 LinkedList ArrayList add, addAll, iterator, listIterator, remove

4 LinkedList AttributeList add (at an index), iteratorHashMap Hashtable containsKey, get, put

5 HashMap Hashtable containsKey, get, putTreeSet LinkedHastSet add, iterator

C. Pre-energy measurement setup

Now that we have chosen our energy efficient alternative,we need to change the projects to reflect this. The sourcecode was manually altered to use the chosen implementations.Finally, we verified that the program maintained the originalconsistency and state by verifying if the outputs and operationsproduced by these two versions did not change.

D. Energy measurements

To measure the original, and optimized projects, we fol-lowed the same methodology detailed in Section: II-B Execu-tion. We executed the seven test cases in the same server, andusing jRAPL obtained the energy consumption measurements.Each test was also executed 10 times, and the average values(after removing the 20% highest and lowest values) werecalculated.

E. Results

Table III presents, for each project, the energy consumptionin Joules (J), and execution time in milliseconds (ms) forboth the original and optimized implementations. The lastcolumn shows the improvement gained after having performedthe optimized implementations for both the consumption andexecution time.

TABLE III. RESULTS OF PRE AND POST OPTIMIZATIONData Structures

Projects Original Optimized ImprovementJ ms J ms J ms

1 23.744583 1549 22.7071302 1523 4.37% 1.68%2 24.6787895 1823 23.525123 1741 4.67% 4.50%3 25.0243507 1720 22.259355 1508 11.05% 12.33%4 17.1994425 1258 16.2014997 1217 5.80% 3.26%5 19.314512 1372 18.3067573 1245 5.22% 9.26%

As we can see, all five programs improve their energy ef-ficiency. Our optimization improves their energy consumptionbetween 4.37% and 11.05%, with an average of 6.2%.

IV. THREATS

The goal of our experiments was to define the energyconsumption profile of JCF implementations and validate suchresults. As in any experiment, there are a few threats toits validity. We start by presenting the validity threats forthe first experiment, that is, the evaluation of the energyconsumption of several Java data structure methods. We dividethese threats in four categories as defined in [10], namely:conclusion validity, internal validity, construct validity, andexternal validity.

A. JCF Implementations Profile

We start by discussing the threats to validity of the firstexperiment.

a) Conclusion Validity: We used the energy consump-tion measurements to establish a simplistic order between thedifferent implementations. To do so, we have based ourselveson an existing benchmark (although developed to measuredifferent things). To perform the actual measurements, we usedRAPL which is known to be a quite reliable tool [4], [5]. Thus,we believe the finding are quite reliable.

b) Internal Validity: The energy consumption mea-surements we have for the different implementations/methodscould have been influenced by other factors other than justtheir source code execution. To mitigate this issue, for everytest we added a “warm-up” run, and we ran every test 10 times,taking the average values for these runs so we could minimizeparticular states of the machine and other software in it (e.g.operating system daemons). Moreover, we ran the tests in aLinux server with no other software running except for theoperating system and its services in order to isolate the energyconsumption values for the code we were running as much aspossible.

c) Construct Validity: We have designed a set of teststo evaluate the energy consumption of the methods of the dif-ferent JCF implementations. As software engineers ourselves,we have done the best we can and know to make them as realand interesting as possible. However, these experiments couldhave been done in many other different ways. In particular,we have only used strings to perform our evaluation. We havealso fixed the size of the collections in 25K, 250K, and 1M.Nevertheless, we believe that since all the tests are the samefor all the implementations (of a particular interface), differenttests would probably produce the same relationship betweenthe consumption of the different implementations and theirmethods. Still, we make all our material publicly available forbetter analysis of our peers.

d) External Validity: The experiment we performed caneasily be extended to include other collections. The methodcan also be easily adapted to other programming languages.However, until such execution are done, nothing can be saidabout such results.

B. Validating the Measurements

Next we present the threats to validity, again divided infour categories, for the experiment we performed to evaluatethe impact of the finding of the first study when changing theimplementations in a complete program.

a) Conclusion Validity: Our validation assumed thateach method is on the same level of importance or weight,and does not distinguish between possible gain of optimizingfor one method or another (for instance, there might be moregain in optimizing for a commonly used add method over aretainAll method). Thus, the method of choosing the best alter-native implementation would need fine tuning. Nevertheless,it is consistent that changing an implementation by anotherinfluences the energy consumption of the code in the sameline with the results found for the implementations/methods inthe first experiment.

Page 6: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

Fig. 6. Choosing optimized Map for Project 1

Fig. 7. Choosing optimized List for Project 1

b) Internal Validity: The energy consumptions mea-sures we have for the different projects (before and afterchanging the used implementations) could have influencefrom other factors. However, the most important thing is therelationship between the value before and after changing theimplementations. Nevertheless, we have executed each project10 times and calculated the average so particular states of themachine could be mitigated as much as possible in the finalresults.

c) Construct Validity: We used 5 different (project) im-plementation of a single problem developed by students in thesecond semester of an undergraduation in Computer Science.This gave us different solutions for the same problem that canbe directly compared as they all passed a set of functional testsdefined in the corresponding course. However, different kindsof projects could have different results. Nevertheless, there isno basis to suspect that these projects are best or worst thanany other kind. Thus, we expect to continue to see gains/losseswhen changing implementations in any other kinds of softwareprojects according to our findings.

d) External Validity: The used source code has noparticular characteristics which could influence our findings.The main characteristic is possibly the fact that it was de-veloped by novice programmers. Nevertheless, we could seethe impact of changing data structure implementations in boththe good and bad (project) implementations. Thus, we believethat these results can be further generalized for other projects.Nevertheless, we intend to further study this issue and performa wider evaluation.

V. RELATED WORK

Although energy consumption analysis is an area exploredfor the last two decades, only more recently has it started tofocus on software improvement more than hardware improve-ment. In fact, designing energy-aware programming languagesis an active area [11], and software developers claim for toolsand techniques to support them in the goal of developingenergy-aware development [12]. Even in software testing,researchers want to know how to reduce energy consumptionand where do they need to focus to do it [13].

Studies have shown that there are a lot of software develop-ment related factors that can significantly influence the energy

consumption of a software system. Different design patterns,using Model-View-Controller, information hiding, implemen-tation of persistence layers, code obfuscations, refactorings,and the usage of different data structures [14], [15], [16], [2],[17], [18], [19], [20] can all influence energy consumption,and all are software related implementation decisions.

Some other research works are even focused on detectingexcessive/anomalous energy consumptions in software, notby comparing the overall energy consumption of differentimplementations of the same software system, but by usingtools and techniques specialized in determining the consump-tion per blocks of code, such as methods [21], source codeinstructions [15] or even bytecode instructions [22]. Thoseworks are based on an energy consumption model: a predictionmodel which can relate such blocks of code with the amountof energy that they are expected to consume. The conceptof energy models have been widely used, particularly in themobile area [22], [23], [24], [25], [26], [27].

In a more focused and concrete way, some research worksalso analyzed the efficiency of data structures [2]. Manotas etal. [2] built a framework which was capable of determining thegain or loss, in a global point of view, of switching from oneJava collection to another. Nevertheless, we have studied thebehavior of a broader set of data structure implementations,divided between the appropriate groups, different populationsizes, and a larger number of operations per structure. Morerecently, a study of the energy profiles of java collectionclasses has been performed [28]. As this paper is currentlynot published nor public, unfortunately we cannot properlycompare the two works. Nevertheless, the energy quantificationis not the only and main contribution in our paper.

VI. FUTURE WORK

There are several directions for future work. We willcontinue to evolve our data and perform further tests andanalyses. More specifically, we will extend our tests to evaluateother population sizes, more data structures, interface specificmethods, and other types of input other than Strings.

To choose the most efficient alternative implementation,we are defining a new algorithm which uses the number ofoccurrences of the methods, and different weights for differentmethods.

Page 7: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

We are also planning to extend this work into an automaticanalysis and refactoring tool plugin. This tool would woulddetect if an energy inefficient data structure is being used,suggest an alternative energy efficient data structure, and evenautomatically refactor the source-code to optimize consump-tion. The chosen alternative would be based on which methodsare being used, either automatically chosen or chosen by

VII. CONCLUSION

This paper presented a detailed study of the energy con-sumption of the Sets, Listsand Mapsdata structures includedin the Java collections framework. We presented a quantifica-tion of the energy spent by each API method of each of thosedata structures (RQ 1 answer).

Moreover, we introduced a very simple methodology tooptimize Java programs. Based on their JCF data structuresand methods, and our energy quantifications, a transformationto decrease the energy consumption is suggested. We havepresented our first experimental results that show a decreaseof 6.2% in energy consumption (RQ 2 answer).

REFERENCES

[1] R. R. Harmon and N. Auseklis, “Sustainable it services: Assessing theimpact of green computing practices,” in Management of Engineering& Technology, 2009. PICMET 2009. Portland International Conferenceon. IEEE, 2009, pp. 1707–1717.

[2] I. Manotas, L. Pollock, and J. Clause, “Seeds: A software engineer’senergy-optimization decision support framework,” in Proc. of the 36thInternational Conference on Software Engineering. ACM, 2014, pp.503–514.

[3] H. David, E. Gorbatov, U. R. Hanebutte, R. Khanna, and C. Le, “Rapl:memory power estimation and capping,” in Low-Power Electronicsand Design (ISLPED), 2010 ACM/IEEE International Symposium on.IEEE, 2010, pp. 189–194.

[4] M. Hahnel, B. Dobel, M. Volp, and H. Hartig, “Measuring energyconsumption for short code paths using RAPL,” SIGMETRICSPerformance Evaluation Review, vol. 40, no. 3, pp. 13–17, 2012.[Online]. Available: http://doi.acm.org/10.1145/2425248.2425252

[5] E. Rotem, A. Naveh, A. Ananthakrishnan, E. Weissmann,and D. Rajwan, “Power-management architecture of the intelmicroarchitecture code-named sandy bridge,” IEEE Micro,vol. 32, no. 2, pp. 20–27, 2012. [Online]. Available:http://doi.ieeecomputersociety.org/10.1109/MM.2012.12

[6] K. Liu, G. Pinto, and Y. D. Liu, “Data-oriented characterization ofapplication-level energy optimization,” in Fundamental Approaches toSoftware Engineering. Springer, 2015, pp. 316–331.

[7] T. Yuki and S. Rajopadhye, “Folklore confirmed: Compiling for speed=compiling for energy,” in Languages and Compilers for Parallel Com-puting. Springer, 2014, pp. 169–184.

[8] A. E. Trefethen and J. Thiyagalingam, “Energy-aware software: Chal-lenges, opportunities and strategies,” Journal of Computational Science,vol. 4, no. 6, pp. 444–449, 2013.

[9] G. Pinto, F. Castor, and Y. D. Liu, “Understanding energy behaviorsof thread management constructs,” in Proceedings of the 2014 ACMInternational Conference on Object Oriented Programming SystemsLanguages & Applications. ACM, 2014, pp. 345–360.

[10] T. D. Cook, D. T. Campbell, and A. Day, Quasi-experimentation:Design & analysis issues for field settings. Houghton Mifflin Boston,1979, vol. 351.

[11] M. Cohen, H. S. Zhu, E. E. Senem, and Y. D. Liu, “Energy types,” inACM SIGPLAN Notices, vol. 47, no. 10. ACM, 2012, pp. 831–850.

[12] G. Pinto, F. Castor, and Y. D. Liu, “Mining questions about softwareenergy consumption,” in Proceedings of the 11th Working Conferenceon Mining Software Repositories. ACM, 2014, pp. 22–31.

[13] D. Li, Y. Jin, C. Sahin, J. Clause, and W. G. Halfond, “Integratedenergy-directed test suite optimization,” in Proc. of the 2014 Int. Symp.on Software Testing and Analysis. ACM, 2014, pp. 339–350.

[14] C. Brandolese, W. Fornaciari, F. Salice, and D. Sciuto, “The impact ofsource code transformations on software power and energy consump-tion,” Journal of Circuits, Systems, and Computers, vol. 11, no. 05, pp.477–502, 2002.

[15] D. Li, S. Hao, W. G. Halfond, and R. Govindan, “Calculating sourceline level energy information for android applications,” in Proceedingsof the 2013 International Symposium on Software Testing and Analysis.ACM, 2013, pp. 78–89.

[16] M. Linares-Vasquez, G. Bavota, C. Bernal-Cardenas, R. Oliveto,M. Di Penta, and D. Poshyvanyk, “Mining energy-greedy api usagepatterns in android apps: an empirical study,” in Proceedings of the11th Working Conference on Mining Software Repositories. ACM,2014, pp. 2–11.

[17] C. Sahin, F. Cayci, I. L. M. Gutierrez, J. Clause, F. Kiamilev, L. Pollock,and K. Winbladh, “Initial explorations on design pattern energy usage,”in Green and Sustainable Software (GREENS), 2012 First InternationalWorkshop on. IEEE, 2012, pp. 55–61.

[18] C. Sahin, L. Pollock, and J. Clause, “How do code refactorings affectenergy usage?” in Proceedings of the 8th ACM/IEEE InternationalSymposium on Empirical Software Engineering and Measurement.ACM, 2014, p. 36.

[19] C. Sahin, P. Tornquist, R. McKenna, Z. Pearson, and J. Clause, “Howdoes code obfuscation impact energy usage?” in Software Maintenance(ICSM), 2013 29th IEEE International Conference on. IEEE, 2014.

[20] A. Vetro’, L. Ardito, G. Procaccianti, and M. Morisio, “Definition,implementation and validation of energy code smells: an exploratorystudy on an embedded system,” in Proceedings of ENERGY 2013 : TheThird International Conference on Smart Grids, Green Communicationsand IT Energy-aware Technologies, 2013, pp. 34–39.

[21] M. Couto, T. Carcao, J. Cunha, J. Fernandes, and J. Saraiva, “De-tecting anomalous energy consumption in android applications,” inProgramming Languages, ser. Lecture Notes in Computer Science,F. Quintao Pereira, Ed., vol. 8771. Springer International Publishing,2014, pp. 77–91.

[22] S. Hao, D. Li, W. G. J. Halfond, and R. Govindan, “Estimating androidapplications’ cpu energy usage via bytecode profiling,” in Proceedingsof the First International Workshop on Green and Sustainable Software(GREENS), May 2012, pp. 1–7.

[23] ——, “Estimating mobile application energy consumption using pro-gram analysis,” in Proceedings of the 35th International Conference onSoftware Engineering (ICSE), May 2013.

[24] S. Nakajima, “Model-based power consumption analysis of smartphoneapplications,” in Proceedings of the 6th International Workshop onModel Based Architecting and Construction of Embedded Systemsco-located with ACM/IEEE 16th International Conference on ModelDriven Engineering Languages and Systems (MoDELS 2013), Miami,Florida, USA, September 29th, 2013., 2013.

[25] ——, “Model checking of energy consumption behavior,” in ComplexSystems Design & Management Asia, Designing Smart Cities: Proceed-ings of the First Asia - Pacific Conference on Complex Systems Design& Management, CSD&M Asia 2014, Singapore, December 10-12, 2014,2014, pp. 3–14.

[26] A. Pathak, Y. C. Hu, M. Zhang, P. Bahl, and Y.-M. Wang,“Fine-grained power modeling for smartphones using system calltracing,” in Proceedings of the Sixth Conference on Computer Systems,ser. EuroSys ’11. New York, NY, USA: ACM, 2011, pp. 153–168.[Online]. Available: http://doi.acm.org/10.1145/1966445.1966460

[27] L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R. P. Dick, Z. M. Mao,and L. Yang, “Accurate online power estimation and automatic batterybehavior based power model generation for smartphones,” in Pro-ceedings of the Eighth IEEE/ACM/IFIP International Conference onHardware/Software Codesign and System Synthesis, ser. CODES/ISSS’10. New York, NY, USA: ACM, 2010, pp. 105–114.

[28] S. Hasan, Z. King, M. Hafiz, M. Sayagh, B. Adams, and A. Hindle,“Energy profiles of java collections classes,” in Proceedings of the 38thInternational Conference on Software Engineering (ICSE), Austin, TX,US, May 2016, to appear.

Page 8: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

APPENDIX

Pereira, Couto, Cunha, Fernandes, and Saraiva

Page 9: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

2

Page 10: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

Appendices

A Set data for 25k population

3

Page 11: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

B Set data for 250k population

4

Page 12: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

5

Page 13: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

C Set data for 1m population

6

Page 14: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

7

Page 15: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

D List data for 25k population

8

Page 16: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

E List data for 250k population

9

Page 17: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

10

Page 18: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

F List data for 1m population

11

Page 19: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

12

Page 20: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

13

Page 21: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

G Map data for 25k population

14

Page 22: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

H Map data for 250k population

15

Page 23: The Influence of the Java Collection Framework on Overall ...The Influence of the Java Collection Framework on Overall Energy Consumption Rui Pereira y, Marco Couto , J´acome Cunha

I Map data for 1m population

16