11
Research Article Power Optimization of Multimode Mobile Embedded Systems with Workload-Delay Dependency Hoeseok Yang 1 and Soonhoi Ha 2 1 Department of ECE, Ajou University, Suwon 16499, Republic of Korea 2 Department of CSE, Seoul National University, Seoul 08826, Republic of Korea Correspondence should be addressed to Soonhoi Ha; [email protected] Received 24 March 2016; Accepted 14 June 2016 Academic Editor: Yuh-Shyan Chen Copyright © 2016 H. Yang and S. Ha. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. is paper proposes to take the relationship between delay and workload into account in the power optimization of microprocessors in mobile embedded systems. Since the components outside a device continuously change their values or properties, the workload to be handled by the systems becomes dynamic and variable. is variable workload is formulated as a staircase function of the delay taken at the previous iteration in this paper and applied to the power optimization of DVFS (dynamic voltage-frequency scaling). In doing so, a graph representation of all possible workload/mode changes during the lifetime of a device, Workload Transition Graph (WTG), is proposed. en, the power optimization problem is transformed into finding a cycle (closed walk) in WTG which minimizes the average power consumption over it. Out of the obtained optimal cycle of WTG, one can derive the optimal power management policy of the target device. It is shown that the proposed policy is valid for both continuous and discrete DVFS models. e effectiveness of the proposed power optimization policy is demonstrated with the simulation results of synthetic and real-life examples. 1. Introduction Today’s mobile embedded systems oſten interact with physi- cal processes or external environments, referred to as Cyber- Physical Systems (CPSs). Such systems are usually mod- eled with interactions between the physical world and the devices [1]. For instance, handheld or stationary embedded systems need to continuously interact with environments in the example of smart building [2]. e system performs a computational task and responds through an actuator to the physical side, while the resulting change at the physical side, in turn, makes a variation on the input (sensor) of the device. In order not to make this control loop unstable, it is common that the embedded system has a real-time constraint within which all the computation should be completed. In a class of applications, the computational workload of the embedded systems depends on the variation of the sampled input value, while the computation delay, in turn, affects the input variation of the next iteration. Usually, if it invests more time at one iteration for processing informa- tion, it would have more work to do at the next iteration. One example of such delay-workload dependency can be found in an object tracking which is frequently used in drone, surveillance camera, or augmented reality [3–5]. e image obtained from the camera is processed by the object tracker to follow an object. As the object may continuously change its position meanwhile, the object tracker should reactively take an image from the adjusted position/angle to make the next decision. e more time the object spends in the tracker, the more distance the object will move by. Such workload-delay relations can be popularly found in modern mobile embedded systems, which rely on computer vision algorithms to capture what happens in the external world. In those applications, it is typical that the current internal state is maintained to figure out the difference caused by what happened in the external world. e examples of such internal states range from a simple snapshot of a sensor read- ing to a complicated model of the scene obtained from cam- era. No matter what the model is, it is generally true that the longer execution delay between two consecutive invocations of the algorithm results in the larger workload in the succes- sive iteration as the degree of the heterogeneity gets bigger. Hindawi Publishing Corporation Mobile Information Systems Volume 2016, Article ID 2010837, 10 pages http://dx.doi.org/10.1155/2016/2010837

Research Article Power Optimization of Multimode Mobile

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Research Article Power Optimization of Multimode Mobile

Research ArticlePower Optimization of Multimode Mobile EmbeddedSystems with Workload-Delay Dependency

Hoeseok Yang1 and Soonhoi Ha2

1Department of ECE Ajou University Suwon 16499 Republic of Korea2Department of CSE Seoul National University Seoul 08826 Republic of Korea

Correspondence should be addressed to Soonhoi Ha shasnuackr

Received 24 March 2016 Accepted 14 June 2016

Academic Editor Yuh-Shyan Chen

Copyright copy 2016 H Yang and S Ha This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

This paper proposes to take the relationship between delay andworkload into account in the power optimization ofmicroprocessorsin mobile embedded systems Since the components outside a device continuously change their values or properties the workloadto be handled by the systems becomes dynamic and variableThis variable workload is formulated as a staircase function of the delaytaken at the previous iteration in this paper and applied to the power optimization of DVFS (dynamic voltage-frequency scaling)In doing so a graph representation of all possible workloadmode changes during the lifetime of a device Workload TransitionGraph (WTG) is proposed Then the power optimization problem is transformed into finding a cycle (closed walk) in WTGwhich minimizes the average power consumption over it Out of the obtained optimal cycle of WTG one can derive the optimalpower management policy of the target device It is shown that the proposed policy is valid for both continuous and discrete DVFSmodels The effectiveness of the proposed power optimization policy is demonstrated with the simulation results of synthetic andreal-life examples

1 Introduction

Todayrsquos mobile embedded systems often interact with physi-cal processes or external environments referred to as Cyber-Physical Systems (CPSs) Such systems are usually mod-eled with interactions between the physical world and thedevices [1] For instance handheld or stationary embeddedsystems need to continuously interact with environments inthe example of smart building [2] The system performs acomputational task and responds through an actuator to thephysical side while the resulting change at the physical sidein turn makes a variation on the input (sensor) of the deviceIn order not to make this control loop unstable it is commonthat the embedded system has a real-time constraint withinwhich all the computation should be completed

In a class of applications the computational workloadof the embedded systems depends on the variation of thesampled input value while the computation delay in turnaffects the input variation of the next iteration Usually if itinvests more time at one iteration for processing informa-tion it would have more work to do at the next iteration

One example of such delay-workload dependency can befound in an object tracking which is frequently used in dronesurveillance camera or augmented reality [3ndash5] The imageobtained from the camera is processed by the object trackerto follow an object As the object may continuously change itsposition meanwhile the object tracker should reactively takean image from the adjusted positionangle to make the nextdecision The more time the object spends in the tracker themore distance the object will move by

Such workload-delay relations can be popularly found inmodern mobile embedded systems which rely on computervision algorithms to capture what happens in the externalworld In those applications it is typical that the currentinternal state is maintained to figure out the difference causedbywhat happened in the external worldThe examples of suchinternal states range from a simple snapshot of a sensor read-ing to a complicated model of the scene obtained from cam-era No matter what the model is it is generally true that thelonger execution delay between two consecutive invocationsof the algorithm results in the larger workload in the succes-sive iteration as the degree of the heterogeneity gets bigger

Hindawi Publishing CorporationMobile Information SystemsVolume 2016 Article ID 2010837 10 pageshttpdxdoiorg10115520162010837

2 Mobile Information Systems

The workload-delay dependency can also be foundin many different types of applications Real-time patternmatching over event streams [6] for instance exhibits similarbehavior the queries can be handled either by small amount(shorter delay less workload) or in an aggregated manner(longer delay more workload) Similarly haptic rendering inHuman-Computer Interface (HCI) uses adaptive samplingtechniques to deal with the stringent real-time constraint[7] and the rendering algorithm can be warm-started toexploit the temporal coherence [8] In essence applicationswhich exploit temporal coherence have possible workload-delay dependencies That is any iterative algorithms that canbe warm-started can lead to one

Nowadays mostmodernmicroprocessors used inmobileembedded systems support dynamic voltage-frequency scal-ing (DVFS) [9] for power-efficient operations Generallydelay and energy in the systems with DVFS are in a tradeoffrelationship for a given workload That is given a certainamount of work to be handled a faster solution (with a higherfrequency) is less energy efficient Considering this controlknob with the aforementioned delay-workload dependencythe power optimization problem gets very challenging Con-ventionally it has just been understood that ldquoworking as slowas possiblerdquo within the real-time constraint is the best disci-pline in terms ofminimizing the power dissipation Howeverwith the existence of the workload-delay dependency it isno longer valid since a slower execution may cause a biggerworkload at the next iteration On the other hand ldquoas fast aspossiblerdquo is not optimal either as the power consumption is astrong function of the operating speed [10]

Theworkload-delay dependency has been firstlymodeledand applied to the DVFS optimization in [11] It is assumedthat the workload is a continuous andmonotonically increas-ing function of the delay under which a simple yet effectivepower management technique has been proposed Specifi-cally it has been shown that staying in a certain DVFS modeis better than alternating between different DVFS modesdynamically Later the optimization is generalized to variouspower models and formally proven to be optimal [12]

This work differs from our previous work [12] in thatwe take different optimization approach tailored for discreteworkload levels We observed that the continuity assumptiondoes not always hold true in reality Rather there are anumber of applications that have discrete levels of workloadFor instance recall that many image or signal processingalgorithms handle input data in the unit of macroblock orframe In such application domains the workload tends togrow in a discrete manner In this paper the workload ismodeled as a staircase function of the delay taken in theprevious iteration Since the solution obtained by the previouswork [11 12] is no longer optimal or nonexisting at all inthe staircase model a new power management technique isproposedThe contributions of this paper can be summarizedas follows

(i) The workload-delay dependency is modeled in astaircase function generalizing the previous modeland validated with a real-life example

(ii) A novel data structure Workload Transition Graph(WTG) is proposed to represent all possible opera-tion workloadmode changes of a device

(iii) Based on WTG a power management policy isderived and shown to be optimal

2 Related Work

Bogdan and Marculescu [13] observed that workloads fromphysical processes tend to be nonstationary but exhibit somesystematic relationship in space and time They proposeda workload characterization approach based on statisti-cal physics and showed how the workload-awareness canimprove the design of electronic systems Zhang et al [14]studied the relationship between the control stability andworkload in inverted pendulum control While enlargedinvocation periods may lower the degree of stability moreinverted pendulums can be controlled by a system as thelengthened invocation periods lower the utilization of thealgorithmThis can be seen as trading off the control stabilityfor resource efficiency In other words they proposed tosacrifice the stability to accommodate more workload ina system The proposed technique also deals with variableworkload in electronic systems but differs from the above-mentioned works in that the effect of execution delay onworkload is systematically considered

Recently Pant et al [15] proposed a codesign of computa-tion delay and control stability based on anytime algorithmAnytime algorithm is a kind of algorithms that can be stoppedat any point in time but still provides a decent solutionTypically the quality of the solution is increasing functionof the computation delay In their work it is the duty ofthe control algorithm to adaptively change the real-timedeadline constraint and error bound (quality of control) Onthe contrary the relationship between execution delay andworkload is formally described in the form of workload-delay function thus no explicit runtime monitoringcontrolis required in the proposed technique

A design guideline for flexible delay constraints in dis-tributed embedded systems was proposed by Goswami etal [16] where some of the samples are allowed to violatethe given delay deadline They presented the applicability ofthe proposed approach using the FlexRay dynamic segmentas a communication medium This work is similar to theproposed approach in the sense that they do not stick to agiven fixed real-time deadline While they could avoid theresource overprovisioning by trading off the hard real-timeconstraints the workload dependency to the delay has notbeen considered Moreover from a real-time standpoint theproposed work is more rigorous as it allows no real-timeconstraint violations

3 Problem Definition

This section presents the system model assumed in thispaper which is followed by the formulation of the poweroptimization problem

Mobile Information Systems 3

31 System Model

311 Dynamic Voltage-Frequency Scaling In this paper weassume that a system has multiple operation modes due toDVFS feature where the operating frequency and voltagecan be modulated For simplicity we first assume that thereare infinitely many operation modes available among whichone is chosen at each iteration It will be shown that theproposed technique can be applied to a discrete DVFS aswell in Section 5 The operation mode at the 119894th iteration isrepresentedwith the speed scaling factor 119896

119894ranging from 119896min

to 119896max = 1 (119896min le 119896 le 119896max = 1) Then the operatingfrequency of the 119894th iteration 119891

119894 is

119891119894= 119896119894sdot 119891max (1)

where119891max is themaximum frequency of themicroprocessor

312 Workload The workload is defined to be a number ofclock cycles elapsed to complete the given computation Wedenote the number of cycles elapsed to handle the workloadof the 119894th iteration at the full speed of themicroprocessor (119896

119894=

1) as 119905ref 119894 That is

119908119894= 119905ref119894 = 119896119894 sdot 119905119894 (2)

Note that the elapsed time 119905119894increases as the speed is scaled

down (119896119894lt 1) Then the delay 119905

119894is automatically determined

when a speed scaling factor 119896119894is chosen for the given

workload (119905119894= 119908119894119896119894)

313 Real-Time Constraint The delay 119905119894cannot be unbound-

edly long as the system is associated with real-time constraint119879 For all iterations the elapsed time should be no more than119879

119905119894le 119879 forall119894 (3)

314 Delay-Workload Dependency As stated earlier theworkload is dependent upon the previous execution delayUsually the workload is not a continuous function of thedelay variation Rather the changes happen in a discretemanner Therefore the workload at the 119894th iteration is amonotonically increasing staircase function of the delay ofthe previous iteration 119905

119894minus1 119908119894= 119882(119905

119894minus1) If the given system

has 119899 workload levels the workload function 119882 can beformulated as follows

119882(119905) =

wl1

if 0 lt 119905 le th1

wl2

if th1lt 119905 le th

2

wl119899minus1

if th119899minus2

lt 119905 le th119899minus1

wl119899

if th119899minus1

lt 119905

(4)

in which the workload levels are wl1lt wl2lt sdot sdot sdot lt wl

119899and

the delay thresholds (workload changing moments) are th1lt

th2lt sdot sdot sdot lt th

119899minus1

315 Execution Trace At the 119894th iteration the speed scalingfactor 119896

119894uniquely defines an execution mode as the delay is

fixed accordingly by (2)The initial workload is assumed to begiven as119908

1 Then an execution trace tr of length 119871 is defined

to be a sequence of the speed scaling factors of 119871 iterations

tr fl (1198961 1198962 119896

119871) (5)

316 Average Power Consumption The dynamic power con-sumption of CMOS circuits is 119888V2

119889119889119891 where 119888 V

119889119889 and 119891 are

capacitance operating voltage and frequency respectivelyAs the operating frequency is proportional to (V

119889119889minus Vth)2V119889119889

[10] the power consumption is an increasing function of 119891It is worth noting that the proposed model is not dependentupon any specific DVFS model We denote the energyconsumption of a unit workload at the full speed (119896 = 1) as119862119890and assume that energy dissipation grows linearly to the

size of workloadThen the reference energy of a workload119908119894

at the full speed is 119862119890sdot 119908119894 Given a DVFS energy model

119878 as a function of the speed scaling factor 119896 the energyconsumption at the 119894th iteration EG

119894is formulated as follows

EG119894= 119862119890sdot 119908119894sdot 119878 (119896119894) (6)

in which 119878(0) = 0 and 119878(1) = 1 Then the average powerconsumption of a trace tr can be formulated as follows

AVG (tr) =sum|tr|119894=1

EG119894

sum|tr|119894=1119905119894

(7)

It is worthwhile to mention that the proposed technique isnot specific to a certain workload-energy model While weadopt linear model for the workload-energy relation for easeof presentation any possibly nonlinear model can be used in(6)

32 Problem Formulation Our objective is to minimize theaverage power consumption of a given system as follows

Given the modeling constant 119862119890 DVFS energy mod-

eling function 119878 workload function119882 and the real-time constraint 119879 determine an execution trace trsuch that the average power consumption formulatedin (7) is minimized

4 Proposed Technique

In this section we describe the proposed operation man-agement policy as an answer to the problem defined in theprevious section In doing so we first derive the condition forfeasible and schedulable systems Then we study when theworkload changes and how it affects the power dissipationBased on that we propose a novel graph representationthat captures all possible workload transitions in the power-optimal operation Finally we derive the power-optimaloperation policy with the given workload function119882

41 Feasibility In this subsection we examine under whichcondition a given system is feasible First the system should

4 Mobile Information Systems

W

wlp

wlq

wlr

thr thq

w = t

t

Currentworkload

Invalid Valid

le 1wlpthqgt 1wlpthr

(a)

W

wlp

wlq

wlr

t

Currentworkload

thrminus1thqminus1

w = kmint

InvalidValidminus1 ge kminwlpthq

minus1 lt kminwlpthr

(b)

Figure 1 Examples of workload transitions (a) transitions to lower workload levels and (b) transitions to higher workload levels Thetransitions from wl

119901to wl119902are valid while the ones from wl

119901to wl119903are not

be schedulable within the given real-time constraint 119879 atevery iteration

Theorem 1 (schedulability) Given the workload function 119882and the real-time constraint 119879 the system is not schedulable if119882(119905)119905 gt 1 forall119905 isin (0 119879]

Proof Suppose that the delay at the 119894th iteration is 119905119894 Then

119908119894+1

= 119882(119905119894) gt 119905119894and119908

119894+1= 119896119894+1sdot119905119894+1

Since 119896119894+1

le 1 119905119894lt 119905119894+1

That is the delay is increasing as iteration goes by and willeventually reache the real-time constraint 119905

119895= 119879 At the next

iteration the system becomes unschedulable even with thefull speed as119882(119905

119895= 119879) = 1 sdot 119905

119895+1gt 119879 requires 119905

119895+1gt 119879

Once the workload gets bigger than 119879 the system istrivially not schedulable afterwards even with the full speed119896119894= 1 Thus the workload must not be bigger than 119879 at

any time Moreover once the workload reaches 119879 119896 shouldremain the full speed 119896 = 1 afterwards We can make theupper bound of workload even tighter if there exists 1199051015840 suchthat 119882(119905) gt 119905 for all 119905 isin (119905

1015840 119879] In this case the workload

larger than 119882(1199051015840) is not allowable as it makes the execution

delay longer and longer eventually violating the deadlineGiven the workload function119882 and the initial workload

1199081 one can calculate the lower bound of theworkload as well

If a value exists which satisfies 119882(119905) = and 119882(119905) lt 1199081

the workload will never become smaller than119882(119905) In otherwords even with the full speed the execution delay nevergoes below

Then the valid workload levels and the execution delayrange during the lifetime of a given system can be formulatedas below

Definition 2 (valid ranges) Given the workload function119882and the initial workload 119908

1 the minimum and maximum

workload levels of a system are defined to be

wlmin =

119882(max (119905)) st = 119882 (119905) lt 1199081

if exist

wl1

otherwise(8)

wlmax

=

119882(min (1199051015840)) st 119882(119905) gt 119905 forall119905 isin (1199051015840 119879] if exist 1199051015840

119882(119879) otherwise

(9)

Then the valid range of the execution delay is formulated as[119905min 119905max] according to wlmin and wlmax with

119905min = min (119905) st 119882(119905) = wlmin

119905max = min (119879max (1199051015840)) st 119882(1199051015840) = wlmax

(10)

42 Workload Transitions In this subsection we examinewhen a workload transition between valid workload levelspossibly occurs and how it affects the system

As presented in (4) workload is a function of the delaytaken at the previous iteration If the delay taken at aniteration is 119905 and th

119897minus1lt t le th

119897Then if the systemworks fast

enough to result in a shorter delay 1199051015840 le th119897minus1

the nextworkload will get smaller than wl

119897 Similarly in case that the

delay gets longer (1199051015840 gt th119897) the system will need to handle a

larger workload than wl119896at the successive iteration

However such workload transitions can occur onlywithin limited ranges Figure 1 depicts valid and invalidtransitions from one workload level Figure 1(a) shows twotransitions from a workload level wl

119901to lower ones wl

119902and

wl119903(119903 lt 119902 lt 119901) To make the next workload level wl

119902 the

delay should be in the range of (th119902minus1 th119902] Given the current

workload wl119901 the speed scaling factor should be larger than

or equal to wl119901th119902from (2) If wl

119901th119902

le 119896max = 1this workload transition can possibly occur In contrast ifwl119901th119903gt 1 for another workload level wl

119903 the transition

from wl119901to wl119903never happens because the delay never goes

below th119903even with the full processing speed

The same principle is also applied to the transitionfrom a workload level to higher ones If the delay canbe lengthened properly with a speed scaling factor withinthe range [119896min 119896max] the transitions are valid Figure 1(b)illustrates that the transition from wl

119901to wl119902is valid while

Mobile Information Systems 5

the one to wl119903is not One can tell if a transition can happen

or not with the following definition

Definition 3 (valid transition) A workload transition fromwl119900to wl119899is said to be valid if wl

119900th119899le 119896max = 1 and

wl119900th119899minus1

ge 119896min

43WorkloadTransitionGraph Theessential difficulty of thepresented power optimization problem lies in the fact thattwo conflicting forces should be handled at the same timeIn order to minimize the power on one hand it tries to scaledown the speed (thus lengthen the delay) as much as possibleas described in (2) and (6) On the other hand the lengtheneddelay is not desirable as it imposes a bigger workload in thesuccessive iteration as shown in (4)

Therefore no one simple intuition can be exploited tosolve the problem Rather we need to compare differentmodes in a comprehensive way In order to be able to exploreall possible execution modes and quantify their effects weneed to devise a data structure that includes elementaryinformation on how workload transitions change the systemstatus and power dissipation In line with that purpose wepropose a graph representation of the workload evolutionWorkload Transition Graph (WTG) which captures all pos-sible transition scenarios of the workload changes during thelifetime of a system

A valid workload transition from one workload level toanother can be caused by any delay within the correspondingrange A transition fromwl

119901towl119902in Figure 1(a) for instance

can be caused by any delay within the range of (th119902minus1 th119902] In

other words when handling a workload of wl119901 any scaling

factor within the range of [wl119901th119902wl119901th119902minus1) can cause the

transition In the power-optimal execution trace howeveronly one specific scaling factor is always chosen for a certaintransition even though it happens many times We show thisin the next theorem

Theorem 4 (optimal scaling factor) In the power-optimaltrace 119905119903 = (119896

1 1198962 119896

|119905119903|) if the workload level handled by

119896119894is 119908119897119900and the next workload level is 119908119897

119899

119896119894= min(

119908119897119900

119905ℎ119899

119896min) (11)

Proof We prove this by contradiction Let us suppose thatthere exists a power-optimal trace tr where 119896

119894(which

results in the transition from wl119900to wl119899) is not equal to

min(wl119900th119899 119896min) Then we make a new execution trace tr1015840

from tr by replacing 119896119894with 1198961015840

119894= min(wl

119900th119899 119896min) Note

that all other execution modes in tr1015840 are the same as in trThat is

tr (119895) = tr1015840 (119895) forall119895 = 119894 (12)

By the definition of the scaling factor

119896119894ge 119896min

119896119894gewl119900

th119899

(13)

since the transition is from wl119900to wl119899 Thus

tr (119894) = 119896119894 gt tr1015840 (119894) = 1198961015840119894

(14)

and accordingly we obtain

119878 (119896119894) gt 119878 (119896

1015840

119894) (15)

Then from (6) and (7) the average power in tr1015840 is lower thanthat of tr and this contradicts the proposition

From Theorem 4 we know that only one speed scalingfactor is associated with all transitions of a certain type inthe power-optimal operation scenario So we define a scalingfactor of a valid transition as follows

Definition 5 (scaling factor of a transition) The scaling factorof a valid workload transition from wl

119900to wl119899is

sf (wl119900wl119899) = max(

wl119900

min (th119899 119879)

119896min) (16)

Now we define WTG

Definition 6 (Workload Transition Graph) WTG is definedto be a graph (119881 119864) where119881 and119864 are the sets of vertices andedges respectively Each valid workload level forms a vertex

119881 = V119894= wl119894| wlmin le wl

119894le wlmax (17)

while a valid transition from a workload level to anotherforms an edge between vertices That is there exists an edgefrom V

119894to V119895corresponding to a valid transition from wl

119894to

wl119895

119864 = (V119894 V119895) | exist valid transition from wl

119894to wl119895 (18)

We denote the source and destination vertices of an edge 119890 isin119864 as src(119890) and dst(119890) respectively

With these definitions a power-optimal execution tracetr = (119896

1 1198962 119896

|tr|) can be represented as a walk (asequence of vertices where any pair of consecutive verticesare connected through an edge) of length |tr| + 1 in WTG Ina different form the execution trace is a sequence of edgesin WTG of length |tr|(119890

1 1198902 119890

|tr|) where the workloadtransition from src(119890

119894) to dst(119890

119894) is caused by the execution

mode sf(src(119890119894) dst(119890

119894)) = 119896

119894

Algorithm 1 shows howWTG is generated out of the givenworkload function119882 and the initial workload 119908

1 After the

initialization all valid workload levels are added as verticesin lines (5)-(6) Then for each permutation of two workloadlevels (lines (7)-(8)) it is checked if the in-between transitionis valid or not in line (9) If valid it is added as an edge in line(10)

Let us take Figure 2 as an example of workload function119882 Once given the initial workload one can easily get thevalid workload levels according to (8) and (9) When 119908

1=

wl2 for instance wlmin = wl

2and wlmax = wl

6 Figure 3(b)

illustrates the corresponding WTG of the workload function

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 2: Research Article Power Optimization of Multimode Mobile

2 Mobile Information Systems

The workload-delay dependency can also be foundin many different types of applications Real-time patternmatching over event streams [6] for instance exhibits similarbehavior the queries can be handled either by small amount(shorter delay less workload) or in an aggregated manner(longer delay more workload) Similarly haptic rendering inHuman-Computer Interface (HCI) uses adaptive samplingtechniques to deal with the stringent real-time constraint[7] and the rendering algorithm can be warm-started toexploit the temporal coherence [8] In essence applicationswhich exploit temporal coherence have possible workload-delay dependencies That is any iterative algorithms that canbe warm-started can lead to one

Nowadays mostmodernmicroprocessors used inmobileembedded systems support dynamic voltage-frequency scal-ing (DVFS) [9] for power-efficient operations Generallydelay and energy in the systems with DVFS are in a tradeoffrelationship for a given workload That is given a certainamount of work to be handled a faster solution (with a higherfrequency) is less energy efficient Considering this controlknob with the aforementioned delay-workload dependencythe power optimization problem gets very challenging Con-ventionally it has just been understood that ldquoworking as slowas possiblerdquo within the real-time constraint is the best disci-pline in terms ofminimizing the power dissipation Howeverwith the existence of the workload-delay dependency it isno longer valid since a slower execution may cause a biggerworkload at the next iteration On the other hand ldquoas fast aspossiblerdquo is not optimal either as the power consumption is astrong function of the operating speed [10]

Theworkload-delay dependency has been firstlymodeledand applied to the DVFS optimization in [11] It is assumedthat the workload is a continuous andmonotonically increas-ing function of the delay under which a simple yet effectivepower management technique has been proposed Specifi-cally it has been shown that staying in a certain DVFS modeis better than alternating between different DVFS modesdynamically Later the optimization is generalized to variouspower models and formally proven to be optimal [12]

This work differs from our previous work [12] in thatwe take different optimization approach tailored for discreteworkload levels We observed that the continuity assumptiondoes not always hold true in reality Rather there are anumber of applications that have discrete levels of workloadFor instance recall that many image or signal processingalgorithms handle input data in the unit of macroblock orframe In such application domains the workload tends togrow in a discrete manner In this paper the workload ismodeled as a staircase function of the delay taken in theprevious iteration Since the solution obtained by the previouswork [11 12] is no longer optimal or nonexisting at all inthe staircase model a new power management technique isproposedThe contributions of this paper can be summarizedas follows

(i) The workload-delay dependency is modeled in astaircase function generalizing the previous modeland validated with a real-life example

(ii) A novel data structure Workload Transition Graph(WTG) is proposed to represent all possible opera-tion workloadmode changes of a device

(iii) Based on WTG a power management policy isderived and shown to be optimal

2 Related Work

Bogdan and Marculescu [13] observed that workloads fromphysical processes tend to be nonstationary but exhibit somesystematic relationship in space and time They proposeda workload characterization approach based on statisti-cal physics and showed how the workload-awareness canimprove the design of electronic systems Zhang et al [14]studied the relationship between the control stability andworkload in inverted pendulum control While enlargedinvocation periods may lower the degree of stability moreinverted pendulums can be controlled by a system as thelengthened invocation periods lower the utilization of thealgorithmThis can be seen as trading off the control stabilityfor resource efficiency In other words they proposed tosacrifice the stability to accommodate more workload ina system The proposed technique also deals with variableworkload in electronic systems but differs from the above-mentioned works in that the effect of execution delay onworkload is systematically considered

Recently Pant et al [15] proposed a codesign of computa-tion delay and control stability based on anytime algorithmAnytime algorithm is a kind of algorithms that can be stoppedat any point in time but still provides a decent solutionTypically the quality of the solution is increasing functionof the computation delay In their work it is the duty ofthe control algorithm to adaptively change the real-timedeadline constraint and error bound (quality of control) Onthe contrary the relationship between execution delay andworkload is formally described in the form of workload-delay function thus no explicit runtime monitoringcontrolis required in the proposed technique

A design guideline for flexible delay constraints in dis-tributed embedded systems was proposed by Goswami etal [16] where some of the samples are allowed to violatethe given delay deadline They presented the applicability ofthe proposed approach using the FlexRay dynamic segmentas a communication medium This work is similar to theproposed approach in the sense that they do not stick to agiven fixed real-time deadline While they could avoid theresource overprovisioning by trading off the hard real-timeconstraints the workload dependency to the delay has notbeen considered Moreover from a real-time standpoint theproposed work is more rigorous as it allows no real-timeconstraint violations

3 Problem Definition

This section presents the system model assumed in thispaper which is followed by the formulation of the poweroptimization problem

Mobile Information Systems 3

31 System Model

311 Dynamic Voltage-Frequency Scaling In this paper weassume that a system has multiple operation modes due toDVFS feature where the operating frequency and voltagecan be modulated For simplicity we first assume that thereare infinitely many operation modes available among whichone is chosen at each iteration It will be shown that theproposed technique can be applied to a discrete DVFS aswell in Section 5 The operation mode at the 119894th iteration isrepresentedwith the speed scaling factor 119896

119894ranging from 119896min

to 119896max = 1 (119896min le 119896 le 119896max = 1) Then the operatingfrequency of the 119894th iteration 119891

119894 is

119891119894= 119896119894sdot 119891max (1)

where119891max is themaximum frequency of themicroprocessor

312 Workload The workload is defined to be a number ofclock cycles elapsed to complete the given computation Wedenote the number of cycles elapsed to handle the workloadof the 119894th iteration at the full speed of themicroprocessor (119896

119894=

1) as 119905ref 119894 That is

119908119894= 119905ref119894 = 119896119894 sdot 119905119894 (2)

Note that the elapsed time 119905119894increases as the speed is scaled

down (119896119894lt 1) Then the delay 119905

119894is automatically determined

when a speed scaling factor 119896119894is chosen for the given

workload (119905119894= 119908119894119896119894)

313 Real-Time Constraint The delay 119905119894cannot be unbound-

edly long as the system is associated with real-time constraint119879 For all iterations the elapsed time should be no more than119879

119905119894le 119879 forall119894 (3)

314 Delay-Workload Dependency As stated earlier theworkload is dependent upon the previous execution delayUsually the workload is not a continuous function of thedelay variation Rather the changes happen in a discretemanner Therefore the workload at the 119894th iteration is amonotonically increasing staircase function of the delay ofthe previous iteration 119905

119894minus1 119908119894= 119882(119905

119894minus1) If the given system

has 119899 workload levels the workload function 119882 can beformulated as follows

119882(119905) =

wl1

if 0 lt 119905 le th1

wl2

if th1lt 119905 le th

2

wl119899minus1

if th119899minus2

lt 119905 le th119899minus1

wl119899

if th119899minus1

lt 119905

(4)

in which the workload levels are wl1lt wl2lt sdot sdot sdot lt wl

119899and

the delay thresholds (workload changing moments) are th1lt

th2lt sdot sdot sdot lt th

119899minus1

315 Execution Trace At the 119894th iteration the speed scalingfactor 119896

119894uniquely defines an execution mode as the delay is

fixed accordingly by (2)The initial workload is assumed to begiven as119908

1 Then an execution trace tr of length 119871 is defined

to be a sequence of the speed scaling factors of 119871 iterations

tr fl (1198961 1198962 119896

119871) (5)

316 Average Power Consumption The dynamic power con-sumption of CMOS circuits is 119888V2

119889119889119891 where 119888 V

119889119889 and 119891 are

capacitance operating voltage and frequency respectivelyAs the operating frequency is proportional to (V

119889119889minus Vth)2V119889119889

[10] the power consumption is an increasing function of 119891It is worth noting that the proposed model is not dependentupon any specific DVFS model We denote the energyconsumption of a unit workload at the full speed (119896 = 1) as119862119890and assume that energy dissipation grows linearly to the

size of workloadThen the reference energy of a workload119908119894

at the full speed is 119862119890sdot 119908119894 Given a DVFS energy model

119878 as a function of the speed scaling factor 119896 the energyconsumption at the 119894th iteration EG

119894is formulated as follows

EG119894= 119862119890sdot 119908119894sdot 119878 (119896119894) (6)

in which 119878(0) = 0 and 119878(1) = 1 Then the average powerconsumption of a trace tr can be formulated as follows

AVG (tr) =sum|tr|119894=1

EG119894

sum|tr|119894=1119905119894

(7)

It is worthwhile to mention that the proposed technique isnot specific to a certain workload-energy model While weadopt linear model for the workload-energy relation for easeof presentation any possibly nonlinear model can be used in(6)

32 Problem Formulation Our objective is to minimize theaverage power consumption of a given system as follows

Given the modeling constant 119862119890 DVFS energy mod-

eling function 119878 workload function119882 and the real-time constraint 119879 determine an execution trace trsuch that the average power consumption formulatedin (7) is minimized

4 Proposed Technique

In this section we describe the proposed operation man-agement policy as an answer to the problem defined in theprevious section In doing so we first derive the condition forfeasible and schedulable systems Then we study when theworkload changes and how it affects the power dissipationBased on that we propose a novel graph representationthat captures all possible workload transitions in the power-optimal operation Finally we derive the power-optimaloperation policy with the given workload function119882

41 Feasibility In this subsection we examine under whichcondition a given system is feasible First the system should

4 Mobile Information Systems

W

wlp

wlq

wlr

thr thq

w = t

t

Currentworkload

Invalid Valid

le 1wlpthqgt 1wlpthr

(a)

W

wlp

wlq

wlr

t

Currentworkload

thrminus1thqminus1

w = kmint

InvalidValidminus1 ge kminwlpthq

minus1 lt kminwlpthr

(b)

Figure 1 Examples of workload transitions (a) transitions to lower workload levels and (b) transitions to higher workload levels Thetransitions from wl

119901to wl119902are valid while the ones from wl

119901to wl119903are not

be schedulable within the given real-time constraint 119879 atevery iteration

Theorem 1 (schedulability) Given the workload function 119882and the real-time constraint 119879 the system is not schedulable if119882(119905)119905 gt 1 forall119905 isin (0 119879]

Proof Suppose that the delay at the 119894th iteration is 119905119894 Then

119908119894+1

= 119882(119905119894) gt 119905119894and119908

119894+1= 119896119894+1sdot119905119894+1

Since 119896119894+1

le 1 119905119894lt 119905119894+1

That is the delay is increasing as iteration goes by and willeventually reache the real-time constraint 119905

119895= 119879 At the next

iteration the system becomes unschedulable even with thefull speed as119882(119905

119895= 119879) = 1 sdot 119905

119895+1gt 119879 requires 119905

119895+1gt 119879

Once the workload gets bigger than 119879 the system istrivially not schedulable afterwards even with the full speed119896119894= 1 Thus the workload must not be bigger than 119879 at

any time Moreover once the workload reaches 119879 119896 shouldremain the full speed 119896 = 1 afterwards We can make theupper bound of workload even tighter if there exists 1199051015840 suchthat 119882(119905) gt 119905 for all 119905 isin (119905

1015840 119879] In this case the workload

larger than 119882(1199051015840) is not allowable as it makes the execution

delay longer and longer eventually violating the deadlineGiven the workload function119882 and the initial workload

1199081 one can calculate the lower bound of theworkload as well

If a value exists which satisfies 119882(119905) = and 119882(119905) lt 1199081

the workload will never become smaller than119882(119905) In otherwords even with the full speed the execution delay nevergoes below

Then the valid workload levels and the execution delayrange during the lifetime of a given system can be formulatedas below

Definition 2 (valid ranges) Given the workload function119882and the initial workload 119908

1 the minimum and maximum

workload levels of a system are defined to be

wlmin =

119882(max (119905)) st = 119882 (119905) lt 1199081

if exist

wl1

otherwise(8)

wlmax

=

119882(min (1199051015840)) st 119882(119905) gt 119905 forall119905 isin (1199051015840 119879] if exist 1199051015840

119882(119879) otherwise

(9)

Then the valid range of the execution delay is formulated as[119905min 119905max] according to wlmin and wlmax with

119905min = min (119905) st 119882(119905) = wlmin

119905max = min (119879max (1199051015840)) st 119882(1199051015840) = wlmax

(10)

42 Workload Transitions In this subsection we examinewhen a workload transition between valid workload levelspossibly occurs and how it affects the system

As presented in (4) workload is a function of the delaytaken at the previous iteration If the delay taken at aniteration is 119905 and th

119897minus1lt t le th

119897Then if the systemworks fast

enough to result in a shorter delay 1199051015840 le th119897minus1

the nextworkload will get smaller than wl

119897 Similarly in case that the

delay gets longer (1199051015840 gt th119897) the system will need to handle a

larger workload than wl119896at the successive iteration

However such workload transitions can occur onlywithin limited ranges Figure 1 depicts valid and invalidtransitions from one workload level Figure 1(a) shows twotransitions from a workload level wl

119901to lower ones wl

119902and

wl119903(119903 lt 119902 lt 119901) To make the next workload level wl

119902 the

delay should be in the range of (th119902minus1 th119902] Given the current

workload wl119901 the speed scaling factor should be larger than

or equal to wl119901th119902from (2) If wl

119901th119902

le 119896max = 1this workload transition can possibly occur In contrast ifwl119901th119903gt 1 for another workload level wl

119903 the transition

from wl119901to wl119903never happens because the delay never goes

below th119903even with the full processing speed

The same principle is also applied to the transitionfrom a workload level to higher ones If the delay canbe lengthened properly with a speed scaling factor withinthe range [119896min 119896max] the transitions are valid Figure 1(b)illustrates that the transition from wl

119901to wl119902is valid while

Mobile Information Systems 5

the one to wl119903is not One can tell if a transition can happen

or not with the following definition

Definition 3 (valid transition) A workload transition fromwl119900to wl119899is said to be valid if wl

119900th119899le 119896max = 1 and

wl119900th119899minus1

ge 119896min

43WorkloadTransitionGraph Theessential difficulty of thepresented power optimization problem lies in the fact thattwo conflicting forces should be handled at the same timeIn order to minimize the power on one hand it tries to scaledown the speed (thus lengthen the delay) as much as possibleas described in (2) and (6) On the other hand the lengtheneddelay is not desirable as it imposes a bigger workload in thesuccessive iteration as shown in (4)

Therefore no one simple intuition can be exploited tosolve the problem Rather we need to compare differentmodes in a comprehensive way In order to be able to exploreall possible execution modes and quantify their effects weneed to devise a data structure that includes elementaryinformation on how workload transitions change the systemstatus and power dissipation In line with that purpose wepropose a graph representation of the workload evolutionWorkload Transition Graph (WTG) which captures all pos-sible transition scenarios of the workload changes during thelifetime of a system

A valid workload transition from one workload level toanother can be caused by any delay within the correspondingrange A transition fromwl

119901towl119902in Figure 1(a) for instance

can be caused by any delay within the range of (th119902minus1 th119902] In

other words when handling a workload of wl119901 any scaling

factor within the range of [wl119901th119902wl119901th119902minus1) can cause the

transition In the power-optimal execution trace howeveronly one specific scaling factor is always chosen for a certaintransition even though it happens many times We show thisin the next theorem

Theorem 4 (optimal scaling factor) In the power-optimaltrace 119905119903 = (119896

1 1198962 119896

|119905119903|) if the workload level handled by

119896119894is 119908119897119900and the next workload level is 119908119897

119899

119896119894= min(

119908119897119900

119905ℎ119899

119896min) (11)

Proof We prove this by contradiction Let us suppose thatthere exists a power-optimal trace tr where 119896

119894(which

results in the transition from wl119900to wl119899) is not equal to

min(wl119900th119899 119896min) Then we make a new execution trace tr1015840

from tr by replacing 119896119894with 1198961015840

119894= min(wl

119900th119899 119896min) Note

that all other execution modes in tr1015840 are the same as in trThat is

tr (119895) = tr1015840 (119895) forall119895 = 119894 (12)

By the definition of the scaling factor

119896119894ge 119896min

119896119894gewl119900

th119899

(13)

since the transition is from wl119900to wl119899 Thus

tr (119894) = 119896119894 gt tr1015840 (119894) = 1198961015840119894

(14)

and accordingly we obtain

119878 (119896119894) gt 119878 (119896

1015840

119894) (15)

Then from (6) and (7) the average power in tr1015840 is lower thanthat of tr and this contradicts the proposition

From Theorem 4 we know that only one speed scalingfactor is associated with all transitions of a certain type inthe power-optimal operation scenario So we define a scalingfactor of a valid transition as follows

Definition 5 (scaling factor of a transition) The scaling factorof a valid workload transition from wl

119900to wl119899is

sf (wl119900wl119899) = max(

wl119900

min (th119899 119879)

119896min) (16)

Now we define WTG

Definition 6 (Workload Transition Graph) WTG is definedto be a graph (119881 119864) where119881 and119864 are the sets of vertices andedges respectively Each valid workload level forms a vertex

119881 = V119894= wl119894| wlmin le wl

119894le wlmax (17)

while a valid transition from a workload level to anotherforms an edge between vertices That is there exists an edgefrom V

119894to V119895corresponding to a valid transition from wl

119894to

wl119895

119864 = (V119894 V119895) | exist valid transition from wl

119894to wl119895 (18)

We denote the source and destination vertices of an edge 119890 isin119864 as src(119890) and dst(119890) respectively

With these definitions a power-optimal execution tracetr = (119896

1 1198962 119896

|tr|) can be represented as a walk (asequence of vertices where any pair of consecutive verticesare connected through an edge) of length |tr| + 1 in WTG Ina different form the execution trace is a sequence of edgesin WTG of length |tr|(119890

1 1198902 119890

|tr|) where the workloadtransition from src(119890

119894) to dst(119890

119894) is caused by the execution

mode sf(src(119890119894) dst(119890

119894)) = 119896

119894

Algorithm 1 shows howWTG is generated out of the givenworkload function119882 and the initial workload 119908

1 After the

initialization all valid workload levels are added as verticesin lines (5)-(6) Then for each permutation of two workloadlevels (lines (7)-(8)) it is checked if the in-between transitionis valid or not in line (9) If valid it is added as an edge in line(10)

Let us take Figure 2 as an example of workload function119882 Once given the initial workload one can easily get thevalid workload levels according to (8) and (9) When 119908

1=

wl2 for instance wlmin = wl

2and wlmax = wl

6 Figure 3(b)

illustrates the corresponding WTG of the workload function

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 3: Research Article Power Optimization of Multimode Mobile

Mobile Information Systems 3

31 System Model

311 Dynamic Voltage-Frequency Scaling In this paper weassume that a system has multiple operation modes due toDVFS feature where the operating frequency and voltagecan be modulated For simplicity we first assume that thereare infinitely many operation modes available among whichone is chosen at each iteration It will be shown that theproposed technique can be applied to a discrete DVFS aswell in Section 5 The operation mode at the 119894th iteration isrepresentedwith the speed scaling factor 119896

119894ranging from 119896min

to 119896max = 1 (119896min le 119896 le 119896max = 1) Then the operatingfrequency of the 119894th iteration 119891

119894 is

119891119894= 119896119894sdot 119891max (1)

where119891max is themaximum frequency of themicroprocessor

312 Workload The workload is defined to be a number ofclock cycles elapsed to complete the given computation Wedenote the number of cycles elapsed to handle the workloadof the 119894th iteration at the full speed of themicroprocessor (119896

119894=

1) as 119905ref 119894 That is

119908119894= 119905ref119894 = 119896119894 sdot 119905119894 (2)

Note that the elapsed time 119905119894increases as the speed is scaled

down (119896119894lt 1) Then the delay 119905

119894is automatically determined

when a speed scaling factor 119896119894is chosen for the given

workload (119905119894= 119908119894119896119894)

313 Real-Time Constraint The delay 119905119894cannot be unbound-

edly long as the system is associated with real-time constraint119879 For all iterations the elapsed time should be no more than119879

119905119894le 119879 forall119894 (3)

314 Delay-Workload Dependency As stated earlier theworkload is dependent upon the previous execution delayUsually the workload is not a continuous function of thedelay variation Rather the changes happen in a discretemanner Therefore the workload at the 119894th iteration is amonotonically increasing staircase function of the delay ofthe previous iteration 119905

119894minus1 119908119894= 119882(119905

119894minus1) If the given system

has 119899 workload levels the workload function 119882 can beformulated as follows

119882(119905) =

wl1

if 0 lt 119905 le th1

wl2

if th1lt 119905 le th

2

wl119899minus1

if th119899minus2

lt 119905 le th119899minus1

wl119899

if th119899minus1

lt 119905

(4)

in which the workload levels are wl1lt wl2lt sdot sdot sdot lt wl

119899and

the delay thresholds (workload changing moments) are th1lt

th2lt sdot sdot sdot lt th

119899minus1

315 Execution Trace At the 119894th iteration the speed scalingfactor 119896

119894uniquely defines an execution mode as the delay is

fixed accordingly by (2)The initial workload is assumed to begiven as119908

1 Then an execution trace tr of length 119871 is defined

to be a sequence of the speed scaling factors of 119871 iterations

tr fl (1198961 1198962 119896

119871) (5)

316 Average Power Consumption The dynamic power con-sumption of CMOS circuits is 119888V2

119889119889119891 where 119888 V

119889119889 and 119891 are

capacitance operating voltage and frequency respectivelyAs the operating frequency is proportional to (V

119889119889minus Vth)2V119889119889

[10] the power consumption is an increasing function of 119891It is worth noting that the proposed model is not dependentupon any specific DVFS model We denote the energyconsumption of a unit workload at the full speed (119896 = 1) as119862119890and assume that energy dissipation grows linearly to the

size of workloadThen the reference energy of a workload119908119894

at the full speed is 119862119890sdot 119908119894 Given a DVFS energy model

119878 as a function of the speed scaling factor 119896 the energyconsumption at the 119894th iteration EG

119894is formulated as follows

EG119894= 119862119890sdot 119908119894sdot 119878 (119896119894) (6)

in which 119878(0) = 0 and 119878(1) = 1 Then the average powerconsumption of a trace tr can be formulated as follows

AVG (tr) =sum|tr|119894=1

EG119894

sum|tr|119894=1119905119894

(7)

It is worthwhile to mention that the proposed technique isnot specific to a certain workload-energy model While weadopt linear model for the workload-energy relation for easeof presentation any possibly nonlinear model can be used in(6)

32 Problem Formulation Our objective is to minimize theaverage power consumption of a given system as follows

Given the modeling constant 119862119890 DVFS energy mod-

eling function 119878 workload function119882 and the real-time constraint 119879 determine an execution trace trsuch that the average power consumption formulatedin (7) is minimized

4 Proposed Technique

In this section we describe the proposed operation man-agement policy as an answer to the problem defined in theprevious section In doing so we first derive the condition forfeasible and schedulable systems Then we study when theworkload changes and how it affects the power dissipationBased on that we propose a novel graph representationthat captures all possible workload transitions in the power-optimal operation Finally we derive the power-optimaloperation policy with the given workload function119882

41 Feasibility In this subsection we examine under whichcondition a given system is feasible First the system should

4 Mobile Information Systems

W

wlp

wlq

wlr

thr thq

w = t

t

Currentworkload

Invalid Valid

le 1wlpthqgt 1wlpthr

(a)

W

wlp

wlq

wlr

t

Currentworkload

thrminus1thqminus1

w = kmint

InvalidValidminus1 ge kminwlpthq

minus1 lt kminwlpthr

(b)

Figure 1 Examples of workload transitions (a) transitions to lower workload levels and (b) transitions to higher workload levels Thetransitions from wl

119901to wl119902are valid while the ones from wl

119901to wl119903are not

be schedulable within the given real-time constraint 119879 atevery iteration

Theorem 1 (schedulability) Given the workload function 119882and the real-time constraint 119879 the system is not schedulable if119882(119905)119905 gt 1 forall119905 isin (0 119879]

Proof Suppose that the delay at the 119894th iteration is 119905119894 Then

119908119894+1

= 119882(119905119894) gt 119905119894and119908

119894+1= 119896119894+1sdot119905119894+1

Since 119896119894+1

le 1 119905119894lt 119905119894+1

That is the delay is increasing as iteration goes by and willeventually reache the real-time constraint 119905

119895= 119879 At the next

iteration the system becomes unschedulable even with thefull speed as119882(119905

119895= 119879) = 1 sdot 119905

119895+1gt 119879 requires 119905

119895+1gt 119879

Once the workload gets bigger than 119879 the system istrivially not schedulable afterwards even with the full speed119896119894= 1 Thus the workload must not be bigger than 119879 at

any time Moreover once the workload reaches 119879 119896 shouldremain the full speed 119896 = 1 afterwards We can make theupper bound of workload even tighter if there exists 1199051015840 suchthat 119882(119905) gt 119905 for all 119905 isin (119905

1015840 119879] In this case the workload

larger than 119882(1199051015840) is not allowable as it makes the execution

delay longer and longer eventually violating the deadlineGiven the workload function119882 and the initial workload

1199081 one can calculate the lower bound of theworkload as well

If a value exists which satisfies 119882(119905) = and 119882(119905) lt 1199081

the workload will never become smaller than119882(119905) In otherwords even with the full speed the execution delay nevergoes below

Then the valid workload levels and the execution delayrange during the lifetime of a given system can be formulatedas below

Definition 2 (valid ranges) Given the workload function119882and the initial workload 119908

1 the minimum and maximum

workload levels of a system are defined to be

wlmin =

119882(max (119905)) st = 119882 (119905) lt 1199081

if exist

wl1

otherwise(8)

wlmax

=

119882(min (1199051015840)) st 119882(119905) gt 119905 forall119905 isin (1199051015840 119879] if exist 1199051015840

119882(119879) otherwise

(9)

Then the valid range of the execution delay is formulated as[119905min 119905max] according to wlmin and wlmax with

119905min = min (119905) st 119882(119905) = wlmin

119905max = min (119879max (1199051015840)) st 119882(1199051015840) = wlmax

(10)

42 Workload Transitions In this subsection we examinewhen a workload transition between valid workload levelspossibly occurs and how it affects the system

As presented in (4) workload is a function of the delaytaken at the previous iteration If the delay taken at aniteration is 119905 and th

119897minus1lt t le th

119897Then if the systemworks fast

enough to result in a shorter delay 1199051015840 le th119897minus1

the nextworkload will get smaller than wl

119897 Similarly in case that the

delay gets longer (1199051015840 gt th119897) the system will need to handle a

larger workload than wl119896at the successive iteration

However such workload transitions can occur onlywithin limited ranges Figure 1 depicts valid and invalidtransitions from one workload level Figure 1(a) shows twotransitions from a workload level wl

119901to lower ones wl

119902and

wl119903(119903 lt 119902 lt 119901) To make the next workload level wl

119902 the

delay should be in the range of (th119902minus1 th119902] Given the current

workload wl119901 the speed scaling factor should be larger than

or equal to wl119901th119902from (2) If wl

119901th119902

le 119896max = 1this workload transition can possibly occur In contrast ifwl119901th119903gt 1 for another workload level wl

119903 the transition

from wl119901to wl119903never happens because the delay never goes

below th119903even with the full processing speed

The same principle is also applied to the transitionfrom a workload level to higher ones If the delay canbe lengthened properly with a speed scaling factor withinthe range [119896min 119896max] the transitions are valid Figure 1(b)illustrates that the transition from wl

119901to wl119902is valid while

Mobile Information Systems 5

the one to wl119903is not One can tell if a transition can happen

or not with the following definition

Definition 3 (valid transition) A workload transition fromwl119900to wl119899is said to be valid if wl

119900th119899le 119896max = 1 and

wl119900th119899minus1

ge 119896min

43WorkloadTransitionGraph Theessential difficulty of thepresented power optimization problem lies in the fact thattwo conflicting forces should be handled at the same timeIn order to minimize the power on one hand it tries to scaledown the speed (thus lengthen the delay) as much as possibleas described in (2) and (6) On the other hand the lengtheneddelay is not desirable as it imposes a bigger workload in thesuccessive iteration as shown in (4)

Therefore no one simple intuition can be exploited tosolve the problem Rather we need to compare differentmodes in a comprehensive way In order to be able to exploreall possible execution modes and quantify their effects weneed to devise a data structure that includes elementaryinformation on how workload transitions change the systemstatus and power dissipation In line with that purpose wepropose a graph representation of the workload evolutionWorkload Transition Graph (WTG) which captures all pos-sible transition scenarios of the workload changes during thelifetime of a system

A valid workload transition from one workload level toanother can be caused by any delay within the correspondingrange A transition fromwl

119901towl119902in Figure 1(a) for instance

can be caused by any delay within the range of (th119902minus1 th119902] In

other words when handling a workload of wl119901 any scaling

factor within the range of [wl119901th119902wl119901th119902minus1) can cause the

transition In the power-optimal execution trace howeveronly one specific scaling factor is always chosen for a certaintransition even though it happens many times We show thisin the next theorem

Theorem 4 (optimal scaling factor) In the power-optimaltrace 119905119903 = (119896

1 1198962 119896

|119905119903|) if the workload level handled by

119896119894is 119908119897119900and the next workload level is 119908119897

119899

119896119894= min(

119908119897119900

119905ℎ119899

119896min) (11)

Proof We prove this by contradiction Let us suppose thatthere exists a power-optimal trace tr where 119896

119894(which

results in the transition from wl119900to wl119899) is not equal to

min(wl119900th119899 119896min) Then we make a new execution trace tr1015840

from tr by replacing 119896119894with 1198961015840

119894= min(wl

119900th119899 119896min) Note

that all other execution modes in tr1015840 are the same as in trThat is

tr (119895) = tr1015840 (119895) forall119895 = 119894 (12)

By the definition of the scaling factor

119896119894ge 119896min

119896119894gewl119900

th119899

(13)

since the transition is from wl119900to wl119899 Thus

tr (119894) = 119896119894 gt tr1015840 (119894) = 1198961015840119894

(14)

and accordingly we obtain

119878 (119896119894) gt 119878 (119896

1015840

119894) (15)

Then from (6) and (7) the average power in tr1015840 is lower thanthat of tr and this contradicts the proposition

From Theorem 4 we know that only one speed scalingfactor is associated with all transitions of a certain type inthe power-optimal operation scenario So we define a scalingfactor of a valid transition as follows

Definition 5 (scaling factor of a transition) The scaling factorof a valid workload transition from wl

119900to wl119899is

sf (wl119900wl119899) = max(

wl119900

min (th119899 119879)

119896min) (16)

Now we define WTG

Definition 6 (Workload Transition Graph) WTG is definedto be a graph (119881 119864) where119881 and119864 are the sets of vertices andedges respectively Each valid workload level forms a vertex

119881 = V119894= wl119894| wlmin le wl

119894le wlmax (17)

while a valid transition from a workload level to anotherforms an edge between vertices That is there exists an edgefrom V

119894to V119895corresponding to a valid transition from wl

119894to

wl119895

119864 = (V119894 V119895) | exist valid transition from wl

119894to wl119895 (18)

We denote the source and destination vertices of an edge 119890 isin119864 as src(119890) and dst(119890) respectively

With these definitions a power-optimal execution tracetr = (119896

1 1198962 119896

|tr|) can be represented as a walk (asequence of vertices where any pair of consecutive verticesare connected through an edge) of length |tr| + 1 in WTG Ina different form the execution trace is a sequence of edgesin WTG of length |tr|(119890

1 1198902 119890

|tr|) where the workloadtransition from src(119890

119894) to dst(119890

119894) is caused by the execution

mode sf(src(119890119894) dst(119890

119894)) = 119896

119894

Algorithm 1 shows howWTG is generated out of the givenworkload function119882 and the initial workload 119908

1 After the

initialization all valid workload levels are added as verticesin lines (5)-(6) Then for each permutation of two workloadlevels (lines (7)-(8)) it is checked if the in-between transitionis valid or not in line (9) If valid it is added as an edge in line(10)

Let us take Figure 2 as an example of workload function119882 Once given the initial workload one can easily get thevalid workload levels according to (8) and (9) When 119908

1=

wl2 for instance wlmin = wl

2and wlmax = wl

6 Figure 3(b)

illustrates the corresponding WTG of the workload function

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 4: Research Article Power Optimization of Multimode Mobile

4 Mobile Information Systems

W

wlp

wlq

wlr

thr thq

w = t

t

Currentworkload

Invalid Valid

le 1wlpthqgt 1wlpthr

(a)

W

wlp

wlq

wlr

t

Currentworkload

thrminus1thqminus1

w = kmint

InvalidValidminus1 ge kminwlpthq

minus1 lt kminwlpthr

(b)

Figure 1 Examples of workload transitions (a) transitions to lower workload levels and (b) transitions to higher workload levels Thetransitions from wl

119901to wl119902are valid while the ones from wl

119901to wl119903are not

be schedulable within the given real-time constraint 119879 atevery iteration

Theorem 1 (schedulability) Given the workload function 119882and the real-time constraint 119879 the system is not schedulable if119882(119905)119905 gt 1 forall119905 isin (0 119879]

Proof Suppose that the delay at the 119894th iteration is 119905119894 Then

119908119894+1

= 119882(119905119894) gt 119905119894and119908

119894+1= 119896119894+1sdot119905119894+1

Since 119896119894+1

le 1 119905119894lt 119905119894+1

That is the delay is increasing as iteration goes by and willeventually reache the real-time constraint 119905

119895= 119879 At the next

iteration the system becomes unschedulable even with thefull speed as119882(119905

119895= 119879) = 1 sdot 119905

119895+1gt 119879 requires 119905

119895+1gt 119879

Once the workload gets bigger than 119879 the system istrivially not schedulable afterwards even with the full speed119896119894= 1 Thus the workload must not be bigger than 119879 at

any time Moreover once the workload reaches 119879 119896 shouldremain the full speed 119896 = 1 afterwards We can make theupper bound of workload even tighter if there exists 1199051015840 suchthat 119882(119905) gt 119905 for all 119905 isin (119905

1015840 119879] In this case the workload

larger than 119882(1199051015840) is not allowable as it makes the execution

delay longer and longer eventually violating the deadlineGiven the workload function119882 and the initial workload

1199081 one can calculate the lower bound of theworkload as well

If a value exists which satisfies 119882(119905) = and 119882(119905) lt 1199081

the workload will never become smaller than119882(119905) In otherwords even with the full speed the execution delay nevergoes below

Then the valid workload levels and the execution delayrange during the lifetime of a given system can be formulatedas below

Definition 2 (valid ranges) Given the workload function119882and the initial workload 119908

1 the minimum and maximum

workload levels of a system are defined to be

wlmin =

119882(max (119905)) st = 119882 (119905) lt 1199081

if exist

wl1

otherwise(8)

wlmax

=

119882(min (1199051015840)) st 119882(119905) gt 119905 forall119905 isin (1199051015840 119879] if exist 1199051015840

119882(119879) otherwise

(9)

Then the valid range of the execution delay is formulated as[119905min 119905max] according to wlmin and wlmax with

119905min = min (119905) st 119882(119905) = wlmin

119905max = min (119879max (1199051015840)) st 119882(1199051015840) = wlmax

(10)

42 Workload Transitions In this subsection we examinewhen a workload transition between valid workload levelspossibly occurs and how it affects the system

As presented in (4) workload is a function of the delaytaken at the previous iteration If the delay taken at aniteration is 119905 and th

119897minus1lt t le th

119897Then if the systemworks fast

enough to result in a shorter delay 1199051015840 le th119897minus1

the nextworkload will get smaller than wl

119897 Similarly in case that the

delay gets longer (1199051015840 gt th119897) the system will need to handle a

larger workload than wl119896at the successive iteration

However such workload transitions can occur onlywithin limited ranges Figure 1 depicts valid and invalidtransitions from one workload level Figure 1(a) shows twotransitions from a workload level wl

119901to lower ones wl

119902and

wl119903(119903 lt 119902 lt 119901) To make the next workload level wl

119902 the

delay should be in the range of (th119902minus1 th119902] Given the current

workload wl119901 the speed scaling factor should be larger than

or equal to wl119901th119902from (2) If wl

119901th119902

le 119896max = 1this workload transition can possibly occur In contrast ifwl119901th119903gt 1 for another workload level wl

119903 the transition

from wl119901to wl119903never happens because the delay never goes

below th119903even with the full processing speed

The same principle is also applied to the transitionfrom a workload level to higher ones If the delay canbe lengthened properly with a speed scaling factor withinthe range [119896min 119896max] the transitions are valid Figure 1(b)illustrates that the transition from wl

119901to wl119902is valid while

Mobile Information Systems 5

the one to wl119903is not One can tell if a transition can happen

or not with the following definition

Definition 3 (valid transition) A workload transition fromwl119900to wl119899is said to be valid if wl

119900th119899le 119896max = 1 and

wl119900th119899minus1

ge 119896min

43WorkloadTransitionGraph Theessential difficulty of thepresented power optimization problem lies in the fact thattwo conflicting forces should be handled at the same timeIn order to minimize the power on one hand it tries to scaledown the speed (thus lengthen the delay) as much as possibleas described in (2) and (6) On the other hand the lengtheneddelay is not desirable as it imposes a bigger workload in thesuccessive iteration as shown in (4)

Therefore no one simple intuition can be exploited tosolve the problem Rather we need to compare differentmodes in a comprehensive way In order to be able to exploreall possible execution modes and quantify their effects weneed to devise a data structure that includes elementaryinformation on how workload transitions change the systemstatus and power dissipation In line with that purpose wepropose a graph representation of the workload evolutionWorkload Transition Graph (WTG) which captures all pos-sible transition scenarios of the workload changes during thelifetime of a system

A valid workload transition from one workload level toanother can be caused by any delay within the correspondingrange A transition fromwl

119901towl119902in Figure 1(a) for instance

can be caused by any delay within the range of (th119902minus1 th119902] In

other words when handling a workload of wl119901 any scaling

factor within the range of [wl119901th119902wl119901th119902minus1) can cause the

transition In the power-optimal execution trace howeveronly one specific scaling factor is always chosen for a certaintransition even though it happens many times We show thisin the next theorem

Theorem 4 (optimal scaling factor) In the power-optimaltrace 119905119903 = (119896

1 1198962 119896

|119905119903|) if the workload level handled by

119896119894is 119908119897119900and the next workload level is 119908119897

119899

119896119894= min(

119908119897119900

119905ℎ119899

119896min) (11)

Proof We prove this by contradiction Let us suppose thatthere exists a power-optimal trace tr where 119896

119894(which

results in the transition from wl119900to wl119899) is not equal to

min(wl119900th119899 119896min) Then we make a new execution trace tr1015840

from tr by replacing 119896119894with 1198961015840

119894= min(wl

119900th119899 119896min) Note

that all other execution modes in tr1015840 are the same as in trThat is

tr (119895) = tr1015840 (119895) forall119895 = 119894 (12)

By the definition of the scaling factor

119896119894ge 119896min

119896119894gewl119900

th119899

(13)

since the transition is from wl119900to wl119899 Thus

tr (119894) = 119896119894 gt tr1015840 (119894) = 1198961015840119894

(14)

and accordingly we obtain

119878 (119896119894) gt 119878 (119896

1015840

119894) (15)

Then from (6) and (7) the average power in tr1015840 is lower thanthat of tr and this contradicts the proposition

From Theorem 4 we know that only one speed scalingfactor is associated with all transitions of a certain type inthe power-optimal operation scenario So we define a scalingfactor of a valid transition as follows

Definition 5 (scaling factor of a transition) The scaling factorof a valid workload transition from wl

119900to wl119899is

sf (wl119900wl119899) = max(

wl119900

min (th119899 119879)

119896min) (16)

Now we define WTG

Definition 6 (Workload Transition Graph) WTG is definedto be a graph (119881 119864) where119881 and119864 are the sets of vertices andedges respectively Each valid workload level forms a vertex

119881 = V119894= wl119894| wlmin le wl

119894le wlmax (17)

while a valid transition from a workload level to anotherforms an edge between vertices That is there exists an edgefrom V

119894to V119895corresponding to a valid transition from wl

119894to

wl119895

119864 = (V119894 V119895) | exist valid transition from wl

119894to wl119895 (18)

We denote the source and destination vertices of an edge 119890 isin119864 as src(119890) and dst(119890) respectively

With these definitions a power-optimal execution tracetr = (119896

1 1198962 119896

|tr|) can be represented as a walk (asequence of vertices where any pair of consecutive verticesare connected through an edge) of length |tr| + 1 in WTG Ina different form the execution trace is a sequence of edgesin WTG of length |tr|(119890

1 1198902 119890

|tr|) where the workloadtransition from src(119890

119894) to dst(119890

119894) is caused by the execution

mode sf(src(119890119894) dst(119890

119894)) = 119896

119894

Algorithm 1 shows howWTG is generated out of the givenworkload function119882 and the initial workload 119908

1 After the

initialization all valid workload levels are added as verticesin lines (5)-(6) Then for each permutation of two workloadlevels (lines (7)-(8)) it is checked if the in-between transitionis valid or not in line (9) If valid it is added as an edge in line(10)

Let us take Figure 2 as an example of workload function119882 Once given the initial workload one can easily get thevalid workload levels according to (8) and (9) When 119908

1=

wl2 for instance wlmin = wl

2and wlmax = wl

6 Figure 3(b)

illustrates the corresponding WTG of the workload function

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article Power Optimization of Multimode Mobile

Mobile Information Systems 5

the one to wl119903is not One can tell if a transition can happen

or not with the following definition

Definition 3 (valid transition) A workload transition fromwl119900to wl119899is said to be valid if wl

119900th119899le 119896max = 1 and

wl119900th119899minus1

ge 119896min

43WorkloadTransitionGraph Theessential difficulty of thepresented power optimization problem lies in the fact thattwo conflicting forces should be handled at the same timeIn order to minimize the power on one hand it tries to scaledown the speed (thus lengthen the delay) as much as possibleas described in (2) and (6) On the other hand the lengtheneddelay is not desirable as it imposes a bigger workload in thesuccessive iteration as shown in (4)

Therefore no one simple intuition can be exploited tosolve the problem Rather we need to compare differentmodes in a comprehensive way In order to be able to exploreall possible execution modes and quantify their effects weneed to devise a data structure that includes elementaryinformation on how workload transitions change the systemstatus and power dissipation In line with that purpose wepropose a graph representation of the workload evolutionWorkload Transition Graph (WTG) which captures all pos-sible transition scenarios of the workload changes during thelifetime of a system

A valid workload transition from one workload level toanother can be caused by any delay within the correspondingrange A transition fromwl

119901towl119902in Figure 1(a) for instance

can be caused by any delay within the range of (th119902minus1 th119902] In

other words when handling a workload of wl119901 any scaling

factor within the range of [wl119901th119902wl119901th119902minus1) can cause the

transition In the power-optimal execution trace howeveronly one specific scaling factor is always chosen for a certaintransition even though it happens many times We show thisin the next theorem

Theorem 4 (optimal scaling factor) In the power-optimaltrace 119905119903 = (119896

1 1198962 119896

|119905119903|) if the workload level handled by

119896119894is 119908119897119900and the next workload level is 119908119897

119899

119896119894= min(

119908119897119900

119905ℎ119899

119896min) (11)

Proof We prove this by contradiction Let us suppose thatthere exists a power-optimal trace tr where 119896

119894(which

results in the transition from wl119900to wl119899) is not equal to

min(wl119900th119899 119896min) Then we make a new execution trace tr1015840

from tr by replacing 119896119894with 1198961015840

119894= min(wl

119900th119899 119896min) Note

that all other execution modes in tr1015840 are the same as in trThat is

tr (119895) = tr1015840 (119895) forall119895 = 119894 (12)

By the definition of the scaling factor

119896119894ge 119896min

119896119894gewl119900

th119899

(13)

since the transition is from wl119900to wl119899 Thus

tr (119894) = 119896119894 gt tr1015840 (119894) = 1198961015840119894

(14)

and accordingly we obtain

119878 (119896119894) gt 119878 (119896

1015840

119894) (15)

Then from (6) and (7) the average power in tr1015840 is lower thanthat of tr and this contradicts the proposition

From Theorem 4 we know that only one speed scalingfactor is associated with all transitions of a certain type inthe power-optimal operation scenario So we define a scalingfactor of a valid transition as follows

Definition 5 (scaling factor of a transition) The scaling factorof a valid workload transition from wl

119900to wl119899is

sf (wl119900wl119899) = max(

wl119900

min (th119899 119879)

119896min) (16)

Now we define WTG

Definition 6 (Workload Transition Graph) WTG is definedto be a graph (119881 119864) where119881 and119864 are the sets of vertices andedges respectively Each valid workload level forms a vertex

119881 = V119894= wl119894| wlmin le wl

119894le wlmax (17)

while a valid transition from a workload level to anotherforms an edge between vertices That is there exists an edgefrom V

119894to V119895corresponding to a valid transition from wl

119894to

wl119895

119864 = (V119894 V119895) | exist valid transition from wl

119894to wl119895 (18)

We denote the source and destination vertices of an edge 119890 isin119864 as src(119890) and dst(119890) respectively

With these definitions a power-optimal execution tracetr = (119896

1 1198962 119896

|tr|) can be represented as a walk (asequence of vertices where any pair of consecutive verticesare connected through an edge) of length |tr| + 1 in WTG Ina different form the execution trace is a sequence of edgesin WTG of length |tr|(119890

1 1198902 119890

|tr|) where the workloadtransition from src(119890

119894) to dst(119890

119894) is caused by the execution

mode sf(src(119890119894) dst(119890

119894)) = 119896

119894

Algorithm 1 shows howWTG is generated out of the givenworkload function119882 and the initial workload 119908

1 After the

initialization all valid workload levels are added as verticesin lines (5)-(6) Then for each permutation of two workloadlevels (lines (7)-(8)) it is checked if the in-between transitionis valid or not in line (9) If valid it is added as an edge in line(10)

Let us take Figure 2 as an example of workload function119882 Once given the initial workload one can easily get thevalid workload levels according to (8) and (9) When 119908

1=

wl2 for instance wlmin = wl

2and wlmax = wl

6 Figure 3(b)

illustrates the corresponding WTG of the workload function

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 6: Research Article Power Optimization of Multimode Mobile

6 Mobile Information Systems

(1) procedure GenerateWTG (1198821199081)

(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl

1le wlmax do ⊳ Edges

(8) for all wlmin le wl2le wlmax do

(9) if exist valid transition from wl1to wl2then

(10) 119864 larr 119864 cup (wl1wl2)

Algorithm 1 Generation of WTG

y

t

T

wl1

wl2

wl3

wl4

wl5

wl6

th1 th2 th3 th4 th5

y = kmaxt

y = kmint

Feasible delay range of wl4

Figure 2 A workload staircase function example with six workloadlevels

shown in Figure 2 with the initial workload of wl2 It is not a

complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3

The feasible delay range to handle workload wl4is

highlighted in Figure 2 justifying that vertex wl4has three

outgoing edges to wl6 wl5 and itself To be more specific

when the workload is given as wl4 the shortest possible delay

is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl

4and 119910 = 119896max119905) is

between th3and th

4as shown in Figure 2Thismeans that the

lowest possibleworkload in the next iteration iswl4 Similarly

the biggest possible workload can also be calculated as wl6

as the biggest possible delay is between th5and 119879 Note also

that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl

3 It has outgoing edges only

to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed

Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl

1is illustrated in Figure 3(a)

In contrast to Figure 3(b) vertex wl1is included in the graph

The WTG derived from higher initial workload levels wl4

wl5 and wl

6 is shown in Figure 3(c) Note that the WTGs in

Figures 3(a) and 3(c) are not strongly connected Vertex wl2in

Figure 3(b) for example is not reachable from wl4 However

from the definition of valid workload levels all vertices arereachable from the initial workload level This property is

important for deriving the optimal operation policy that willbe presented in the next subsection

44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency

As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition

Definition 7 (corresponding walk) Given the power-optimalexecution trace tr = (119896

1 1198962 119896

|tr|) and its initial workload1199081 the corresponding walk of the trace is denoted as

cwtr1199081

= (1198901 1198902 119890

|tr|) (19)

where src(1198901) = 1199081and sf(src(119890

119894) dst(119890

119894)) = 119896

119894(for simplicity

we also denote it as sf(119890119894) = sf(src(119890

119894) dst(119890

119894)) = 119896

119894for the rest

of the paper)The average power consumption of thewalk canbe formulated as follows

AVG (cwtr1199081

) =

EG (cwtr1199081

)

DL (cwtr1199081

)

(20)

where DL((1198901 1198902 119890

119899)) is the total delay elapsed

for traversing the walk sum119899

119894=1(src(119890

119894)sf(119890

119894)) and

EG((1198901 1198902 119890

119899)) is the total energy consumption for

the walk 119862119890sum119899

119894=1src(119890119894)119878(sf(119890

119894))

A cycle (closed walk) of WTG is a walk whose startingand ending vertices are the sameThat is awalk (119890

1 1198902 119890

119899)

is a cycle if src(1198901) = dst(119890

119899) Hence if the corresponding

walk of a trace inWTG is a cycle the trace is ever repeatableWe argue that in case that the length of the trace issufficiently long (|tr| rarr infin) the average power consumptionis minimized when the cycle which minimizes (20) repeatsover and over again in the trace

Theorem 8 (optimal cycle of WTG) Suppose that a cycle ofWTG

119888119910119900119901119905

= (1198901 1198902 119890

119899) (21)

has the minimum value of (20) among all cycles of the WTGThen if the length of the trace is long enough the average powerconsumption of the optimal trace converges to

AVG (119888119910119900119901119905) =

EG (119888119910119900119901119905)

DL (119888119910119900119901119905)

(22)

Proof An arbitrary walk of a WTG (1198901 1198902 119890

119899) can be

decomposed into a path (a walk with distinct vertices) fromsrc(1198901) to dst(119890

119899) and a set of cycles (see eg Section 103 of

[17]) Figure 4 depicts a walk example of length 8 where theinitial workload level is wl

2and the last vertex that it traverses

is wl6 If a path from wl

2to wl6 (1198901 1198902 1198905) highlighted in the

dashed arrows is removed from the walk the remaining partis a set of cycles

Now consider a power-optimal trace tr of length 119873 thatstarts from the workload level of119908

1 The corresponding walk

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Power Optimization of Multimode Mobile

Mobile Information Systems 7

wl1

wl2 wl3 wl4 wl5 wl6

(a)

wl2

wl3 wl4 wl5 wl6

(b)

wl4 wl5 wl6

(c)

Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl

2or wl3 and

(c) wl4 wl5 or wl

6

wl2 wl3 wl4 wl5 wl6e1

e2

e3

e4 e5

e6e7

e8

Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8

of tr can be decomposed into a path pt starting at 1199081and a

set of cycles CYThen the average power consumption of thetrace tr is

AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)

(23)

Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin

AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)

(24)

Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition

EG (cy)DL (cy)

ge AVG (cyopt) =EG (cyopt)

DL (cyopt) forallcy isin CY (25)

Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)

From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy

(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908

119894

Start

No Yes

Generate WTG

Input w1 and W

i = 1

Calculatecyopt and ptopt

Is wi in cyopt

Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))

Update wi+1i++

Update wi+1i++

Figure 5 Flowchart diagram of the proposed operation policy

(ii) Otherwise take a path ptopt = (1198901 1198902 119890

119899) that has

the minimum AVG(ptopt) value and dst(119890119899) is in the

optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost

The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)

Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article Power Optimization of Multimode Mobile

8 Mobile Information Systems

5 Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique

Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896

119894isin 119870 can be chosen as

a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement

Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl

119900

to wl119899is said to be valid if

th119899minus1

ltwl119900

119896le min (th

119899 119879) exist119896 isin 119870 (26)

Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition

Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl

119900to wl119899is

sf (wl119900wl119899)

= argmin119896isin119870

th119899minus1

ltwl119900

119896le min (th

119899 119879)

(27)

6 Experiments

In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations

61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10

and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time

5 10 15 20 25 30 350Delay (ms)

0

5

10

15

20

25

30

35

Wor

kloa

d (m

s)

Real-time constraint Tkt k = 02

kt k = 04

kt k = 06

kt k = 08

kt k = 10

W(t)

Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm

constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870

We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl

3 that is 119908

1= wl3= 9833

ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl

2 which implies a stable

operation mode forall119894 119896119894= 06

62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896

3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th

1= 02 th

2= 06 th

3= 07

and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example

The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: Research Article Power Optimization of Multimode Mobile

Mobile Information Systems 9

W

t

wl4 = 06

wl3 = 05

wl1 = 03

wl2 = 04

k = 09 k = 08

k = 04

T = 10

th1 = 02

th2 = 06

th3 = 07

(a)

wl1

wl2

wl3 wl408

08

08

08

04

04

09

09

(b)

Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl

2= 04 wl

3= 05 and wl

4= 06 and (b) its WTG

representation where the edges are annotated with scaling factors

an alternating operation mode implied in wl2rarr wl

4rarr

wl3rarr wl2shows the least average power consumption in the

discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes

In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870

7 Conclusion

This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones

Disclosure

A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and

Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]

Competing Interests

The authors declare that they have no competing interests

Acknowledgments

Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University

References

[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008

[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010

[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011

[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000

[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013

[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article Power Optimization of Multimode Mobile

10 Mobile Information Systems

Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008

[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007

[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008

[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001

[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000

[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015

[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016

[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011

[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008

[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015

[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001

[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005

[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973

[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001

[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000

[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Research Article Power Optimization of Multimode Mobile

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014