Research Article
Power Optimization of Multimode Mobile Embedded Systems with Workload-Delay Dependency
Hoeseok Yang$^{1}$ and Soonhoi Ha$^{2}$
$^{1}$Department of ECE, Ajou University, Suwon 16499, Republic of Korea
$^{2}$Department of CSE, Seoul National University, Seoul 08826, Republic of Korea
Correspondence should be addressed to Soonhoi Ha; sha@snu.ac.kr
Received 24 March 2016; Accepted 14 June 2016
Academic Editor: Yuh-Shyan Chen
Copyright © 2016 H. Yang and S. Ha. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper proposes to take the relationship between delay and workload into account in the power optimization of microprocessors in mobile embedded systems. Since the components outside a device continuously change their values or properties, the workload to be handled by the systems becomes dynamic and variable. This variable workload is formulated as a staircase function of the delay taken at the previous iteration and applied to the power optimization of DVFS (dynamic voltage-frequency scaling). In doing so, a graph representation of all possible workload/mode changes during the lifetime of a device, Workload Transition Graph (WTG), is proposed. Then the power optimization problem is transformed into finding a cycle (closed walk) in WTG which minimizes the average power consumption over it. Out of the obtained optimal cycle of WTG, one can derive the optimal power management policy of the target device. It is shown that the proposed policy is valid for both continuous and discrete DVFS models. The effectiveness of the proposed power optimization policy is demonstrated with the simulation results of synthetic and real-life examples.
1. Introduction
Today's mobile embedded systems often interact with physical processes or external environments, referred to as Cyber-Physical Systems (CPSs). Such systems are usually modeled with interactions between the physical world and the devices [1]. For instance, handheld or stationary embedded systems need to continuously interact with environments in the example of smart building [2]. The system performs a computational task and responds through an actuator to the physical side, while the resulting change at the physical side in turn makes a variation on the input (sensor) of the device. In order not to make this control loop unstable, it is common that the embedded system has a real-time constraint within which all the computation should be completed.
In a class of applications, the computational workload of the embedded systems depends on the variation of the sampled input value, while the computation delay in turn affects the input variation of the next iteration. Usually, if the system invests more time at one iteration for processing information, it has more work to do at the next iteration.
One example of such delay-workload dependency can be found in object tracking, which is frequently used in drones, surveillance cameras, or augmented reality [3–5]. The image obtained from the camera is processed by the object tracker to follow an object. As the object may continuously change its position meanwhile, the object tracker should reactively take an image from the adjusted position/angle to make the next decision. The more time the tracker spends on one image, the farther the object will have moved.
Such workload-delay relations can be commonly found in modern mobile embedded systems which rely on computer vision algorithms to capture what happens in the external world. In those applications, it is typical that the current internal state is maintained to figure out the difference caused by what happened in the external world. The examples of such internal states range from a simple snapshot of a sensor reading to a complicated model of the scene obtained from a camera. No matter what the model is, it is generally true that a longer execution delay between two consecutive invocations of the algorithm results in a larger workload in the successive iteration, as the degree of the heterogeneity gets bigger.
Hindawi Publishing Corporation, Mobile Information Systems, Volume 2016, Article ID 2010837, 10 pages, http://dx.doi.org/10.1155/2016/2010837
The workload-delay dependency can also be found in many different types of applications. Real-time pattern matching over event streams [6], for instance, exhibits similar behavior: the queries can be handled either in small amounts (shorter delay, less workload) or in an aggregated manner (longer delay, more workload). Similarly, haptic rendering in Human-Computer Interface (HCI) uses adaptive sampling techniques to deal with the stringent real-time constraint [7], and the rendering algorithm can be warm-started to exploit the temporal coherence [8]. In essence, applications which exploit temporal coherence have possible workload-delay dependencies. That is, any iterative algorithm that can be warm-started can lead to one.
Nowadays, most modern microprocessors used in mobile embedded systems support dynamic voltage-frequency scaling (DVFS) [9] for power-efficient operations. Generally, delay and energy in the systems with DVFS are in a tradeoff relationship for a given workload. That is, given a certain amount of work to be handled, a faster solution (with a higher frequency) is less energy efficient. Considering this control knob with the aforementioned delay-workload dependency, the power optimization problem gets very challenging. Conventionally, it has just been understood that "working as slow as possible" within the real-time constraint is the best discipline in terms of minimizing the power dissipation. However, with the existence of the workload-delay dependency, this is no longer valid, since a slower execution may cause a bigger workload at the next iteration. On the other hand, "as fast as possible" is not optimal either, as the power consumption is a strong function of the operating speed [10].
The workload-delay dependency was first modeled and applied to the DVFS optimization in [11]. There, the workload is assumed to be a continuous and monotonically increasing function of the delay, under which a simple yet effective power management technique was proposed. Specifically, it was shown that staying in a certain DVFS mode is better than alternating between different DVFS modes dynamically. Later, the optimization was generalized to various power models and formally proven to be optimal [12].
This work differs from our previous work [12] in that we take a different optimization approach tailored for discrete workload levels. We observed that the continuity assumption does not always hold true in reality. Rather, there are a number of applications that have discrete levels of workload. For instance, recall that many image or signal processing algorithms handle input data in the unit of macroblock or frame. In such application domains, the workload tends to grow in a discrete manner. In this paper, the workload is modeled as a staircase function of the delay taken in the previous iteration. Since the solution obtained by the previous work [11, 12] is no longer optimal, or does not exist at all, in the staircase model, a new power management technique is proposed. The contributions of this paper can be summarized as follows:
(i) The workload-delay dependency is modeled as a staircase function, generalizing the previous model, and validated with a real-life example.
(ii) A novel data structure, Workload Transition Graph (WTG), is proposed to represent all possible operation workload/mode changes of a device.
(iii) Based on WTG, a power management policy is derived and shown to be optimal.
2. Related Work
Bogdan and Marculescu [13] observed that workloads from physical processes tend to be nonstationary but exhibit some systematic relationship in space and time. They proposed a workload characterization approach based on statistical physics and showed how the workload-awareness can improve the design of electronic systems. Zhang et al. [14] studied the relationship between the control stability and workload in inverted pendulum control. While enlarged invocation periods may lower the degree of stability, more inverted pendulums can be controlled by a system, as the lengthened invocation periods lower the utilization of the algorithm. This can be seen as trading off the control stability for resource efficiency. In other words, they proposed to sacrifice the stability to accommodate more workload in a system. The proposed technique also deals with variable workload in electronic systems but differs from the above-mentioned works in that the effect of execution delay on workload is systematically considered.
Recently, Pant et al. [15] proposed a codesign of computation delay and control stability based on anytime algorithms. An anytime algorithm is an algorithm that can be stopped at any point in time but still provides a decent solution. Typically, the quality of the solution is an increasing function of the computation delay. In their work, it is the duty of the control algorithm to adaptively change the real-time deadline constraint and error bound (quality of control). In the proposed technique, on the contrary, the relationship between execution delay and workload is formally described in the form of a workload-delay function; thus no explicit runtime monitoring/control is required.
A design guideline for flexible delay constraints in distributed embedded systems was proposed by Goswami et al. [16], where some of the samples are allowed to violate the given delay deadline. They presented the applicability of their approach using the FlexRay dynamic segment as a communication medium. That work is similar to the proposed approach in the sense that it does not stick to a given fixed real-time deadline. While they could avoid resource overprovisioning by trading off the hard real-time constraints, the dependency of the workload on the delay was not considered. Moreover, from a real-time standpoint, the proposed work is more rigorous, as it allows no real-time constraint violations.
3. Problem Definition
This section presents the system model assumed in this paper, which is followed by the formulation of the power optimization problem.
3.1. System Model
3.1.1. Dynamic Voltage-Frequency Scaling. In this paper, we assume that a system has multiple operation modes due to the DVFS feature, where the operating frequency and voltage can be modulated. For simplicity, we first assume that there are infinitely many operation modes available, among which one is chosen at each iteration. It will be shown in Section 5 that the proposed technique can be applied to a discrete DVFS as well. The operation mode at the $i$th iteration is represented with the speed scaling factor $k_i$ ranging from $k_{\min}$ to $k_{\max} = 1$ ($k_{\min} \le k_i \le k_{\max} = 1$). Then the operating frequency of the $i$th iteration, $f_i$, is

$$f_i = k_i \cdot f_{\max}, \tag{1}$$

where $f_{\max}$ is the maximum frequency of the microprocessor.
3.1.2. Workload. The workload is defined to be the number of clock cycles elapsed to complete the given computation. We denote the number of cycles elapsed to handle the workload of the $i$th iteration at the full speed of the microprocessor ($k_i = 1$) as $t_{\mathrm{ref},i}$. That is,

$$w_i = t_{\mathrm{ref},i} = k_i \cdot t_i. \tag{2}$$

Note that the elapsed time $t_i$ increases as the speed is scaled down ($k_i < 1$). Then the delay $t_i$ is automatically determined when a speed scaling factor $k_i$ is chosen for the given workload ($t_i = w_i / k_i$).
3.1.3. Real-Time Constraint. The delay $t_i$ cannot be unboundedly long, as the system is associated with a real-time constraint $T$. For all iterations, the elapsed time should be no more than $T$:

$$t_i \le T, \quad \forall i. \tag{3}$$
3.1.4. Delay-Workload Dependency. As stated earlier, the workload is dependent upon the previous execution delay. Usually, the workload is not a continuous function of the delay variation. Rather, the changes happen in a discrete manner. Therefore, the workload at the $i$th iteration is a monotonically increasing staircase function of the delay of the previous iteration $t_{i-1}$: $w_i = W(t_{i-1})$. If the given system has $n$ workload levels, the workload function $W$ can be formulated as follows:

$$W(t) = \begin{cases}
\mathrm{wl}_1 & \text{if } 0 < t \le \mathrm{th}_1, \\
\mathrm{wl}_2 & \text{if } \mathrm{th}_1 < t \le \mathrm{th}_2, \\
\;\vdots & \\
\mathrm{wl}_{n-1} & \text{if } \mathrm{th}_{n-2} < t \le \mathrm{th}_{n-1}, \\
\mathrm{wl}_n & \text{if } \mathrm{th}_{n-1} < t,
\end{cases} \tag{4}$$

in which the workload levels are $\mathrm{wl}_1 < \mathrm{wl}_2 < \cdots < \mathrm{wl}_n$ and the delay thresholds (workload changing moments) are $\mathrm{th}_1 < \mathrm{th}_2 < \cdots < \mathrm{th}_{n-1}$.
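The staircase function in (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the helper name `make_workload_function` and the numeric levels and thresholds are hypothetical values chosen only for the example.

```python
from bisect import bisect_left

def make_workload_function(levels, thresholds):
    """Return W(t) for levels wl_1 < ... < wl_n and thresholds th_1 < ... < th_{n-1}."""
    if len(levels) != len(thresholds) + 1:
        raise ValueError("need exactly one more level than thresholds")
    if sorted(levels) != list(levels) or sorted(thresholds) != list(thresholds):
        raise ValueError("levels and thresholds must be increasing")

    def W(t):
        if t <= 0:
            raise ValueError("delay must be positive")
        # bisect_left counts the thresholds strictly below t, so t == th_j
        # still maps to level j (each step is closed on the right, as in (4)).
        return levels[bisect_left(thresholds, t)]

    return W

# Hypothetical example: three workload levels, two thresholds.
W = make_workload_function([2.0, 4.0, 6.0], [3.0, 5.0])
print(W(3.0), W(3.1), W(9.0))
```

A delay exactly equal to a threshold stays on the lower step, which matches the half-open intervals $(\mathrm{th}_{j-1}, \mathrm{th}_j]$ in (4).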
3.1.5. Execution Trace. At the $i$th iteration, the speed scaling factor $k_i$ uniquely defines an execution mode, as the delay is fixed accordingly by (2). The initial workload is assumed to be given as $w_1$. Then an execution trace $\mathrm{tr}$ of length $L$ is defined to be a sequence of the speed scaling factors of $L$ iterations:

$$\mathrm{tr} := (k_1, k_2, \ldots, k_L). \tag{5}$$
3.1.6. Average Power Consumption. The dynamic power consumption of CMOS circuits is $c\,v_{dd}^{2} f$, where $c$, $v_{dd}$, and $f$ are capacitance, operating voltage, and frequency, respectively. As the operating frequency is proportional to $(v_{dd} - v_{th})^{2} / v_{dd}$ [10], the power consumption is an increasing function of $f$. It is worth noting that the proposed model is not dependent upon any specific DVFS model. We denote the energy consumption of a unit workload at the full speed ($k = 1$) as $C_e$ and assume that energy dissipation grows linearly with the size of the workload. Then the reference energy of a workload $w_i$ at the full speed is $C_e \cdot w_i$. Given a DVFS energy model $S$ as a function of the speed scaling factor $k$, the energy consumption at the $i$th iteration, $\mathrm{EG}_i$, is formulated as follows:

$$\mathrm{EG}_i = C_e \cdot w_i \cdot S(k_i), \tag{6}$$

in which $S(0) = 0$ and $S(1) = 1$. Then the average power consumption of a trace $\mathrm{tr}$ can be formulated as follows:

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\sum_{i=1}^{|\mathrm{tr}|} \mathrm{EG}_i}{\sum_{i=1}^{|\mathrm{tr}|} t_i}. \tag{7}$$

It is worthwhile to mention that the proposed technique is not specific to a certain workload-energy model. While we adopt a linear model for the workload-energy relation for ease of presentation, any possibly nonlinear model can be used in (6).
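To make (2), (4), (6), and (7) concrete, the following sketch evaluates the average power of a given trace. It is only an illustration under stated assumptions: the function name `average_power`, the quadratic energy model $S(k) = k^2$ (which satisfies $S(0) = 0$ and $S(1) = 1$ but is not prescribed by the paper), and the numeric levels, thresholds, and deadline are all hypothetical.

```python
from bisect import bisect_left

def average_power(trace, w1, levels, thresholds, T, c_e=1.0, S=lambda k: k ** 2):
    """Evaluate (7) for a trace of speed scaling factors, starting from workload w1."""
    total_energy = 0.0
    total_delay = 0.0
    w = w1
    for k in trace:
        t = w / k                                # delay from (2): t_i = w_i / k_i
        if t > T:
            raise ValueError("real-time constraint (3) violated")
        total_energy += c_e * w * S(k)           # energy from (6)
        total_delay += t
        w = levels[bisect_left(thresholds, t)]   # next workload from (4)
    return total_energy / total_delay

# Hypothetical system: three levels, two thresholds, deadline T = 8.
levels, thresholds, T = [2.0, 4.0, 6.0], [3.0, 5.5], 8.0
print(average_power([1.0, 1.0, 1.0], 4.0, levels, thresholds, T))
print(average_power([0.8, 0.8, 0.8], 4.0, levels, thresholds, T))
```

With these assumed numbers, running at full speed gives an average power of about 1.0, while $k = 0.8$ keeps the workload at the same level and lowers the average power to about 0.512. A still slower speed, such as $k = 0.4$, would exceed the deadline and is rejected.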
3.2. Problem Formulation. Our objective is to minimize the average power consumption of a given system as follows:

Given the modeling constant $C_e$, DVFS energy modeling function $S$, workload function $W$, and the real-time constraint $T$, determine an execution trace $\mathrm{tr}$ such that the average power consumption formulated in (7) is minimized.
4. Proposed Technique
In this section, we describe the proposed operation management policy as an answer to the problem defined in the previous section. In doing so, we first derive the condition for feasible and schedulable systems. Then we study when the workload changes and how it affects the power dissipation. Based on that, we propose a novel graph representation that captures all possible workload transitions in the power-optimal operation. Finally, we derive the power-optimal operation policy with the given workload function $W$.
4.1. Feasibility. In this subsection, we examine under which condition a given system is feasible. First, the system should
Figure 1: Examples of workload transitions: (a) transitions to lower workload levels (reference line $w = t$; valid when $\mathrm{wl}_p/\mathrm{th}_q \le 1$, invalid when $\mathrm{wl}_p/\mathrm{th}_r > 1$) and (b) transitions to higher workload levels (reference line $w = k_{\min} t$; valid when $\mathrm{wl}_p/\mathrm{th}_{q-1} \ge k_{\min}$, invalid when $\mathrm{wl}_p/\mathrm{th}_{r-1} < k_{\min}$). The transitions from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ are valid, while the ones from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ are not.
be schedulable within the given real-time constraint $T$ at every iteration.
Theorem 1 (schedulability). Given the workload function $W$ and the real-time constraint $T$, the system is not schedulable if $W(t)/t > 1$, $\forall t \in (0, T]$.
Proof. Suppose that the delay at the $i$th iteration is $t_i$. Then $w_{i+1} = W(t_i) > t_i$ and $w_{i+1} = k_{i+1} \cdot t_{i+1}$. Since $k_{i+1} \le 1$, $t_i < t_{i+1}$. That is, the delay increases as the iterations go by and will eventually reach the real-time constraint, $t_j = T$. At the next iteration, the system becomes unschedulable even with the full speed, as $W(t_j = T) = 1 \cdot t_{j+1} > T$ requires $t_{j+1} > T$.
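For a staircase workload function, the condition of Theorem 1 can be checked step by step, because $W$ is constant on each interval $(\mathrm{th}_{j-1}, \mathrm{th}_j]$ and $W(t) > t$ therefore holds on a whole step exactly when it holds at the largest admissible delay of that step. The following sketch assumes this reading; the function name and the numeric values are hypothetical.

```python
def unschedulable_by_theorem1(levels, thresholds, T):
    """True if W(t) > t for every t in (0, T], i.e. Theorem 1 rules the system out."""
    lower_edges = [0.0] + list(thresholds)
    upper_edges = list(thresholds) + [float("inf")]
    for wl, lo, hi in zip(levels, lower_edges, upper_edges):
        if lo >= T:
            break                      # this step lies entirely beyond the deadline
        # W is constant on (lo, hi], so it suffices to compare the level
        # with the right end of the step, capped by the deadline T.
        if wl <= min(hi, T):
            return False
    return True

# Hypothetical examples: the first staircase stays above the line w = t on (0, 5].
print(unschedulable_by_theorem1([2.0, 4.0, 6.0], [1.5, 3.5], 5.0))
print(unschedulable_by_theorem1([2.0, 4.0, 6.0], [3.0, 5.5], 8.0))
```

Note that Theorem 1 is only a sufficient condition for unschedulability, so a `False` result here does not by itself guarantee that every initial workload is schedulable.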
Once the workload gets bigger than $T$, the system is trivially not schedulable afterwards, even with the full speed $k_i = 1$. Thus, the workload must not be bigger than $T$ at any time. Moreover, once the workload reaches $T$, $k$ should remain at the full speed $k = 1$ afterwards. We can make the upper bound of the workload even tighter if there exists $t'$ such that $W(t) > t$ for all $t \in (t', T]$. In this case, a workload larger than $W(t')$ is not allowable, as it makes the execution delay longer and longer, eventually violating the deadline.

Given the workload function $W$ and the initial workload $w_1$, one can calculate the lower bound of the workload as well. If a value $\tilde{t}$ exists which satisfies $W(\tilde{t}) = \tilde{t}$ and $W(\tilde{t}) < w_1$, the workload will never become smaller than $W(\tilde{t})$. In other words, even with the full speed, the execution delay never goes below $\tilde{t}$.
Then the valid workload levels and the execution delay range during the lifetime of a given system can be formulated as below.
Definition 2 (valid ranges). Given the workload function $W$ and the initial workload $w_1$, the minimum and maximum workload levels of a system are defined to be

$$\mathrm{wl}_{\min} = \begin{cases}
W(\max(\tilde{t})) \ \text{s.t.}\ \tilde{t} = W(\tilde{t}) < w_1 & \text{if } \exists\, \tilde{t}, \\
\mathrm{wl}_1 & \text{otherwise},
\end{cases} \tag{8}$$

$$\mathrm{wl}_{\max} = \begin{cases}
W(\min(t')) \ \text{s.t.}\ W(t) > t,\ \forall t \in (t', T] & \text{if } \exists\, t', \\
W(T) & \text{otherwise}.
\end{cases} \tag{9}$$

Then the valid range of the execution delay is formulated as $[t_{\min}, t_{\max}]$ according to $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$, with

$$\begin{aligned}
t_{\min} &= \min(t) \ \text{s.t.}\ W(t) = \mathrm{wl}_{\min}, \\
t_{\max} &= \min(T, \max(t')) \ \text{s.t.}\ W(t') = \mathrm{wl}_{\max}.
\end{aligned} \tag{10}$$
4.2. Workload Transitions. In this subsection, we examine when a workload transition between valid workload levels possibly occurs and how it affects the system.
As presented in (4), workload is a function of the delay taken at the previous iteration. Suppose that the delay taken at an iteration is $t$ with $\mathrm{th}_{l-1} < t \le \mathrm{th}_l$. Then, if the system works fast enough to result in a shorter delay $t' \le \mathrm{th}_{l-1}$, the next workload will get smaller than $\mathrm{wl}_l$. Similarly, in case that the delay gets longer ($t' > \mathrm{th}_l$), the system will need to handle a larger workload than $\mathrm{wl}_l$ at the successive iteration.
However, such workload transitions can occur only within limited ranges. Figure 1 depicts valid and invalid transitions from one workload level. Figure 1(a) shows two transitions from a workload level $\mathrm{wl}_p$ to lower ones, $\mathrm{wl}_q$ and $\mathrm{wl}_r$ ($r < q < p$). To make the next workload level $\mathrm{wl}_q$, the delay should be in the range of $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. Given the current workload $\mathrm{wl}_p$, the speed scaling factor should be larger than or equal to $\mathrm{wl}_p/\mathrm{th}_q$ from (2). If $\mathrm{wl}_p/\mathrm{th}_q \le k_{\max} = 1$, this workload transition can possibly occur. In contrast, if $\mathrm{wl}_p/\mathrm{th}_r > 1$ for another workload level $\mathrm{wl}_r$, the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ never happens, because the delay never goes below $\mathrm{th}_r$ even with the full processing speed.
The same principle is also applied to the transition from a workload level to higher ones. If the delay can be lengthened properly with a speed scaling factor within the range $[k_{\min}, k_{\max}]$, the transitions are valid. Figure 1(b) illustrates that the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ is valid, while the one to $\mathrm{wl}_r$ is not. One can tell whether a transition can happen or not with the following definition.
Definition 3 (valid transition). A workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if $\mathrm{wl}_o/\mathrm{th}_n \le k_{\max} = 1$ and $\mathrm{wl}_o/\mathrm{th}_{n-1} \ge k_{\min}$.
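Definition 3 translates directly into a predicate. The sketch below is an illustration under a few assumptions that the definition leaves implicit: levels are addressed by 0-based index, the lower threshold of the lowest level is taken as $0$, the upper threshold of the highest level is taken as the deadline $T$, and every delay window is capped by $T$. The function name and the numbers are hypothetical.

```python
def is_valid_transition(o, n, levels, thresholds, T, k_min, k_max=1.0):
    """Definition 3 for a transition from level index o to level index n (0-based)."""
    # Delay window (lower, upper] that yields level n at the next iteration.
    # Assumptions of this sketch: th_0 = 0, and the window is capped by T.
    lower = thresholds[n - 1] if n > 0 else 0.0
    upper = min(thresholds[n], T) if n < len(thresholds) else T
    if lower >= upper:
        return False                             # window lies beyond the deadline
    fast_enough = levels[o] <= k_max * upper     # wl_o / th_n <= k_max
    slow_enough = levels[o] >= k_min * lower     # wl_o / th_{n-1} >= k_min
    return fast_enough and slow_enough

# Hypothetical system: three levels, two thresholds, T = 8, k_min = 0.5.
levels, thresholds = [2.0, 4.0, 6.0], [3.0, 5.5]
print(is_valid_transition(1, 0, levels, thresholds, 8.0, 0.5))  # wl_2 -> wl_1
print(is_valid_transition(1, 2, levels, thresholds, 8.0, 0.5))  # wl_2 -> wl_3
```

The two conditions are written multiplicatively so that the assumed threshold $\mathrm{th}_0 = 0$ does not cause a division by zero.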
4.3. Workload Transition Graph. The essential difficulty of the presented power optimization problem lies in the fact that two conflicting forces should be handled at the same time. In order to minimize the power, on one hand, one tries to scale down the speed (thus lengthening the delay) as much as possible, as described in (2) and (6). On the other hand, the lengthened delay is not desirable, as it imposes a bigger workload in the successive iteration, as shown in (4).
Therefore, no single simple intuition can be exploited to solve the problem. Rather, we need to compare different modes in a comprehensive way. In order to be able to explore all possible execution modes and quantify their effects, we need to devise a data structure that includes elementary information on how workload transitions change the system status and power dissipation. In line with that purpose, we propose a graph representation of the workload evolution, Workload Transition Graph (WTG), which captures all possible transition scenarios of the workload changes during the lifetime of a system.
A valid workload transition from one workload level to another can be caused by any delay within the corresponding range. A transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ in Figure 1(a), for instance, can be caused by any delay within the range of $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. In other words, when handling a workload of $\mathrm{wl}_p$, any scaling factor within the range of $[\mathrm{wl}_p/\mathrm{th}_q, \mathrm{wl}_p/\mathrm{th}_{q-1})$ can cause the transition. In the power-optimal execution trace, however, only one specific scaling factor is always chosen for a certain transition, even though it happens many times. We show this in the next theorem.
Theorem 4 (optimal scaling factor). In the power-optimal trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$, if the workload level handled by $k_i$ is $\mathrm{wl}_o$ and the next workload level is $\mathrm{wl}_n$,

$$k_i = \max\left(\frac{\mathrm{wl}_o}{\mathrm{th}_n}, k_{\min}\right). \tag{11}$$

Proof. We prove this by contradiction. Let us suppose that there exists a power-optimal trace $\mathrm{tr}$ where $k_i$ (which results in the transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$) is not equal to $\max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Then we make a new execution trace $\mathrm{tr}'$ from $\mathrm{tr}$ by replacing $k_i$ with $k'_i = \max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Note that all other execution modes in $\mathrm{tr}'$ are the same as in $\mathrm{tr}$. That is,

$$\mathrm{tr}(j) = \mathrm{tr}'(j), \quad \forall j \ne i. \tag{12}$$

By the definition of the scaling factor,

$$k_i \ge k_{\min}, \qquad k_i \ge \frac{\mathrm{wl}_o}{\mathrm{th}_n}, \tag{13}$$

since the transition is from $\mathrm{wl}_o$ to $\mathrm{wl}_n$. Thus,

$$\mathrm{tr}(i) = k_i > \mathrm{tr}'(i) = k'_i, \tag{14}$$

and accordingly we obtain

$$S(k_i) > S(k'_i). \tag{15}$$

Then, from (6) and (7), the average power in $\mathrm{tr}'$ is lower than that of $\mathrm{tr}$, which contradicts the assumption that $\mathrm{tr}$ is power-optimal.
From Theorem 4, we know that only one speed scaling factor is associated with all transitions of a certain type in the power-optimal operation scenario. So we define a scaling factor of a valid transition as follows.
Definition 5 (scaling factor of a transition). The scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \max\left(\frac{\mathrm{wl}_o}{\min(\mathrm{th}_n, T)}, k_{\min}\right). \tag{16}$$
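Equation (16) picks the slowest speed that still lands the delay inside the target window. A minimal sketch, assuming that the missing upper threshold of the highest level is represented by infinity so that $\min(\mathrm{th}_n, T)$ falls back to $T$; the function name and numbers are hypothetical.

```python
def scaling_factor(wl_o, th_n, T, k_min):
    """Equation (16): slowest admissible speed for a valid transition wl_o -> wl_n."""
    return max(wl_o / min(th_n, T), k_min)

# Hypothetical values: current workload 4, target threshold 5.5, T = 8, k_min = 0.5.
k = scaling_factor(4.0, 5.5, 8.0, 0.5)
print(k, 4.0 / k)            # chosen speed and the resulting delay from (2)

# Highest level: no upper threshold, so the deadline T bounds the delay.
print(scaling_factor(4.0, float("inf"), 8.0, 0.5))
```

In the first call the delay ends up at the threshold itself, which is the longest delay that still produces the target workload level.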
Now we define WTG.
Definition 6 (Workload Transition Graph). WTG is defined to be a graph $(V, E)$, where $V$ and $E$ are the sets of vertices and edges, respectively. Each valid workload level forms a vertex,

$$V = \{v_i = \mathrm{wl}_i \mid \mathrm{wl}_{\min} \le \mathrm{wl}_i \le \mathrm{wl}_{\max}\}, \tag{17}$$

while a valid transition from a workload level to another forms an edge between vertices. That is, there exists an edge from $v_i$ to $v_j$ corresponding to a valid transition from $\mathrm{wl}_i$ to $\mathrm{wl}_j$:

$$E = \{(v_i, v_j) \mid \exists \text{ valid transition from } \mathrm{wl}_i \text{ to } \mathrm{wl}_j\}. \tag{18}$$

We denote the source and destination vertices of an edge $e \in E$ as $\mathrm{src}(e)$ and $\mathrm{dst}(e)$, respectively.
With these definitions, a power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ can be represented as a walk (a sequence of vertices where any pair of consecutive vertices are connected through an edge) of length $|\mathrm{tr}| + 1$ in WTG. In a different form, the execution trace is a sequence of edges in WTG of length $|\mathrm{tr}|$, $(e_1, e_2, \ldots, e_{|\mathrm{tr}|})$, where the workload transition from $\mathrm{src}(e_i)$ to $\mathrm{dst}(e_i)$ is caused by the execution mode $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$.
Algorithm 1 shows how WTG is generated out of the given workload function $W$ and the initial workload $w_1$. After the initialization, all valid workload levels are added as vertices in lines (5)-(6). Then, for each ordered pair of workload levels (lines (7)-(8)), it is checked in line (9) whether the in-between transition is valid or not. If valid, it is added as an edge in line (10).
Let us take Figure 2 as an example of workload function $W$. Once given the initial workload, one can easily get the valid workload levels according to (8) and (9). When $w_1 = \mathrm{wl}_2$, for instance, $\mathrm{wl}_{\min} = \mathrm{wl}_2$ and $\mathrm{wl}_{\max} = \mathrm{wl}_6$. Figure 3(b) illustrates the corresponding WTG of the workload function
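(1) procedure GenerateWTG($W$, $w_1$)
(2)   $\mathrm{wl}_{\min} \leftarrow$ equation (8)   ⊳ Initializations
(3)   $\mathrm{wl}_{\max} \leftarrow$ equation (9)
(4)   $V \leftarrow \emptyset$, $E \leftarrow \emptyset$
(5)   for all $\mathrm{wl}_{\min} \le \mathrm{wl} \le \mathrm{wl}_{\max}$ do   ⊳ Vertices
(6)     $V \leftarrow V \cup \{\mathrm{wl}\}$
(7)   for all $\mathrm{wl}_{\min} \le \mathrm{wl}_1 \le \mathrm{wl}_{\max}$ do   ⊳ Edges
(8)     for all $\mathrm{wl}_{\min} \le \mathrm{wl}_2 \le \mathrm{wl}_{\max}$ do
(9)       if $\exists$ valid transition from $\mathrm{wl}_1$ to $\mathrm{wl}_2$ then
(10)        $E \leftarrow E \cup \{(\mathrm{wl}_1, \mathrm{wl}_2)\}$

Algorithm 1: Generation of WTG.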
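Algorithm 1 can be sketched in Python as follows. This is an illustrative rendering rather than the authors' code: the valid range from (8) and (9) is not computed here but passed in as hypothetical index bounds `lo` and `hi`, vertices are 0-based level indices, the top level's missing threshold is replaced by $T$, and each edge is annotated with the scaling factor of (16) for convenience.

```python
def generate_wtg(levels, thresholds, T, k_min, lo=0, hi=None, k_max=1.0):
    """Build the WTG as (vertices, edges); edges map (o, n) to the scaling factor (16)."""
    hi = len(levels) - 1 if hi is None else hi
    vertices = list(range(lo, hi + 1))                # lines (5)-(6)
    edges = {}
    for o in vertices:                                # line (7)
        for n in vertices:                            # line (8)
            # Delay window (lower, upper] leading to level n; assumed th_0 = 0
            # and an upper bound of T for the highest level.
            lower = thresholds[n - 1] if n > 0 else 0.0
            upper = min(thresholds[n], T) if n < len(thresholds) else T
            valid = (lower < upper
                     and levels[o] <= k_max * upper
                     and levels[o] >= k_min * lower)  # line (9), Definition 3
            if valid:
                edges[(o, n)] = max(levels[o] / upper, k_min)   # line (10)
    return vertices, edges

# Hypothetical system: three levels, two thresholds, T = 8, k_min = 0.5.
vertices, edges = generate_wtg([2.0, 4.0, 6.0], [3.0, 5.5], 8.0, 0.5)
for (o, n), k in sorted(edges.items()):
    print(f"wl_{o + 1} -> wl_{n + 1}: k = {k:.3f}")
```

For these assumed numbers, the graph contains a self-loop on every level plus the upward edges between neighboring levels, and no downward edge, since no level can be processed fast enough to fall into a lower delay window.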
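Figure 2: A workload staircase function example with six workload levels ($\mathrm{wl}_1$ to $\mathrm{wl}_6$ on the vertical axis $y$; thresholds $\mathrm{th}_1$ to $\mathrm{th}_5$ and the deadline $T$ on the horizontal axis $t$; the lines $y = k_{\max} t$ and $y = k_{\min} t$ bound the feasible delay range of each level, highlighted for $\mathrm{wl}_4$).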
shown in Figure 2 with the initial workload of $\mathrm{wl}_2$. It is not a complete graph, as some pairs of vertices cannot be connected directly because the transition between them is not valid according to Definition 3.
The feasible delay range to handle workload $\mathrm{wl}_4$ is highlighted in Figure 2, justifying that vertex $\mathrm{wl}_4$ has three outgoing edges to $\mathrm{wl}_6$, $\mathrm{wl}_5$, and itself. To be more specific, when the workload is given as $\mathrm{wl}_4$, the shortest possible delay is the case when the speed is chosen as $k_{\max}$. Then the delay ($x$ coordinate of the cross point of $y = \mathrm{wl}_4$ and $y = k_{\max} t$) is between $\mathrm{th}_3$ and $\mathrm{th}_4$, as shown in Figure 2. This means that the lowest possible workload in the next iteration is $\mathrm{wl}_4$. Similarly, the biggest possible workload can also be calculated as $\mathrm{wl}_6$, as the biggest possible delay is between $\mathrm{th}_5$ and $T$. Note also that some of the vertices may not have an edge directed to itself (self-loop), such as vertex $\mathrm{wl}_3$. It has outgoing edges only to higher workload levels. This means that the computation burden of that state is so big that it only results in higher workload levels at the next iteration, even with the full speed.
Different initial workload levels may result in different WTGs, as shown in Figures 3(a)–3(c). The WTG derived from the initial workload level of $\mathrm{wl}_1$ is illustrated in Figure 3(a). In contrast to Figure 3(b), vertex $\mathrm{wl}_1$ is included in the graph. The WTG derived from higher initial workload levels, $\mathrm{wl}_4$, $\mathrm{wl}_5$, and $\mathrm{wl}_6$, is shown in Figure 3(c). Note that the WTGs in Figures 3(a) and 3(c) are not strongly connected. Vertex $\mathrm{wl}_2$ in Figure 3(b), for example, is not reachable from $\mathrm{wl}_4$. However, from the definition of valid workload levels, all vertices are reachable from the initial workload level. This property is important for deriving the optimal operation policy that will be presented in the next subsection.
4.4. Proposed Operation Policy. In this subsection, we present the proposed operation policy that balances the energy-delay tradeoff caused by the delay-workload dependency.
As stated earlier, a power-optimal execution trace $\mathrm{tr}$ can be represented as a walk of WTG. Then we have the following definition.
Definition 7 (corresponding walk). Given the power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as

$$\mathrm{cw}^{\mathrm{tr}}_{w_1} = (e_1, e_2, \ldots, e_{|\mathrm{tr}|}), \tag{19}$$

where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also denote it as $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper). The average power consumption of the walk can be formulated as follows:

$$\mathrm{AVG}(\mathrm{cw}^{\mathrm{tr}}_{w_1}) = \frac{\mathrm{EG}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}{\mathrm{DL}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}, \tag{20}$$

where $\mathrm{DL}((e_1, e_2, \ldots, e_n))$ is the total delay elapsed for traversing the walk, $\sum_{i=1}^{n} \mathrm{src}(e_i)/\mathrm{sf}(e_i)$, and $\mathrm{EG}((e_1, e_2, \ldots, e_n))$ is the total energy consumption for the walk, $C_e \sum_{i=1}^{n} \mathrm{src}(e_i)\, S(\mathrm{sf}(e_i))$.
A cycle (closed walk) of WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in WTG is a cycle, the trace can be repeated indefinitely. We argue that, in case that the length of the trace is sufficiently long ($|\mathrm{tr}| \to \infty$), the average power consumption is minimized when the cycle which minimizes (20) repeats over and over again in the trace.
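The quantities $\mathrm{EG}$, $\mathrm{DL}$, and their ratio in (20) are straightforward to evaluate once a walk is written as a list of (source workload, scaling factor) pairs, one per edge. The sketch below assumes a quadratic energy model $S(k) = k^2$, which is not prescribed by the paper; the function name and the numbers are hypothetical.

```python
def walk_average_power(walk, c_e=1.0, S=lambda k: k ** 2):
    """Equation (20) for a walk given as (workload, scaling_factor) pairs, one per edge."""
    energy = sum(c_e * w * S(k) for w, k in walk)   # EG: total energy of the walk
    delay = sum(w / k for w, k in walk)             # DL: total delay of the walk
    return energy / delay

# A self-loop on workload 6 at speed 0.75, and a hypothetical two-edge walk.
print(walk_average_power([(6.0, 0.75)]))
print(walk_average_power([(4.0, 0.5), (6.0, 1.0)]))
```

Because (20) is a ratio of two sums, the value for a multi-edge walk is a delay-weighted average of the per-edge powers rather than their plain mean.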
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of WTG,

$$\mathrm{cy}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n), \tag{21}$$

has the minimum value of (20) among all cycles of the WTG. Then, if the length of the trace is long enough, the average power consumption of the optimal trace converges to

$$\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}. \tag{22}$$
Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts a walk example of length 8, where the initial workload level is $\mathrm{wl}_2$ and the last vertex that it traverses is $\mathrm{wl}_6$. If a path from $\mathrm{wl}_2$ to $\mathrm{wl}_6$, $(e_1, e_2, e_5)$, highlighted in the dashed arrows, is removed from the walk, the remaining part is a set of cycles.
Now consider a power-optimal trace $\mathrm{tr}$ of length $N$ that starts from the workload level of $w_1$. The corresponding walk
Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $\mathrm{wl}_1$ (vertices $\mathrm{wl}_1$ to $\mathrm{wl}_6$), (b) $\mathrm{wl}_2$ or $\mathrm{wl}_3$ (vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$), and (c) $\mathrm{wl}_4$, $\mathrm{wl}_5$, or $\mathrm{wl}_6$ (vertices $\mathrm{wl}_4$ to $\mathrm{wl}_6$).

Figure 4: A walk example of the WTG shown in Figure 3(b) of length 8 (edges $e_1$ to $e_8$ over vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$).
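of $\mathrm{tr}$ can be decomposed into a path $\mathrm{pt}$ starting at $w_1$ and a set of cycles $\mathrm{CY}$. Then the average power consumption of the trace $\mathrm{tr}$ is

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\mathrm{EG}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{23}$$

Since $\mathrm{pt}$ contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(\mathrm{pt})$ and $\mathrm{DL}(\mathrm{pt})$ can be upper bounded by a certain value. Then, if $N$ is sufficiently large, $\mathrm{EG}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})$ and $\mathrm{DL}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})$. That is, if $|\mathrm{tr}| \to \infty$,

$$\mathrm{AVG}(\mathrm{tr}) \simeq \frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{24}$$

Note also that the average power consumption of every cycle in $\mathrm{CY}$ is not smaller than $\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})/\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})$ by definition:

$$\frac{\mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{cy})} \ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}, \quad \forall \mathrm{cy} \in \mathrm{CY}. \tag{25}$$

Therefore, the average power consumption of the power-optimal trace $\mathrm{tr}$, $\mathrm{AVG}(\mathrm{tr})$, will get arbitrarily close to $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$.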
From Theorem 8, it is understood that changing the DVFS mode of the given system following the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy:
(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908
119894
Start
No Yes
Generate WTG
Input w1 and W
i = 1
Calculatecyopt and ptopt
Is wi in cyopt
Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))
Update wi+1i++
Update wi+1i++
Figure 5 Flowchart diagram of the proposed operation policy
(ii) Otherwise take a path ptopt = (1198901 1198902 119890
119899) that has
the minimum AVG(ptopt) value and dst(119890119899) is in the
optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost
The optimal cycle $\mathrm{cy}_{\mathrm{opt}}$ of the WTG can be found by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $\mathrm{pt}_{\mathrm{opt}}$, can also be found by simple enumeration. It is worth mentioning that we cannot simply use a minimum-weight cycle search algorithm, as the weight of a cycle is not a plain summation of the edge weights but a complex function of them, as presented in (20).
Figure 5 shows the proposed operation policy in a flowchart diagram. Given the initial workload and the workload function, we generate the WTG, from which $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are derived in the next steps. This can be done in a tractable time and is a one-time effort taken offline. If the initial workload vertex is not included in $\mathrm{cy}_{\mathrm{opt}}$, the system follows the trace represented by $\mathrm{pt}_{\mathrm{opt}}$ until it reaches the optimal cycle of the WTG. Then it simply repeats the trace implied by $\mathrm{cy}_{\mathrm{opt}}$ from that iteration on.
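The cycle search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it enumerates elementary cycles with a plain depth-first search rather than Tarjan's algorithm [18], and it assumes that each WTG edge already carries its energy and delay, so that the average power of a cycle is the ratio of the two sums, as in (24). The names `elementary_cycles`, `optimal_cycle`, and `EDGES`, as well as all numbers, are hypothetical.

```python
def elementary_cycles(edges):
    """Yield every elementary cycle as a list of edge indices.

    edges: list of (src, dst, energy, delay) tuples with comparable vertex ids.
    Each cycle is reported once, rooted at its smallest vertex.
    """
    outgoing = {}
    for idx, (src, _dst, _energy, _delay) in enumerate(edges):
        outgoing.setdefault(src, []).append(idx)
    vertices = sorted({v for edge in edges for v in edge[:2]})

    def extend(start, vertex, path, visited):
        for idx in outgoing.get(vertex, []):
            dst = edges[idx][1]
            if dst == start:
                yield path + [idx]            # closed the cycle
            elif dst > start and dst not in visited:
                yield from extend(start, dst, path + [idx], visited | {dst})

    for start in vertices:
        yield from extend(start, start, [], {start})


def optimal_cycle(edges):
    """Return (cycle, average power) minimizing total energy / total delay."""
    best, best_avg = None, float("inf")
    for cycle in elementary_cycles(edges):
        energy = sum(edges[i][2] for i in cycle)
        delay = sum(edges[i][3] for i in cycle)
        avg = energy / delay
        if avg < best_avg:
            best, best_avg = cycle, avg
    return best, best_avg


# Hypothetical WTG: two self-loops and one two-edge cycle.
EDGES = [
    (1, 1, 2.0, 1.0),   # self-loop at vertex 1, average power 2.0
    (2, 2, 3.0, 1.0),   # self-loop at vertex 2, average power 3.0
    (1, 2, 1.0, 1.0),
    (2, 1, 1.0, 1.0),   # together with the previous edge: average power 1.0
]

print(optimal_cycle(EDGES))
```

In this toy graph the alternating cycle beats both self-loops. Because the criterion is a ratio of sums rather than a sum of edge weights, the cycles have to be compared explicitly. The enumeration is exponential in the worst case, which is acceptable only because it runs once, offline, on a graph with few vertices.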
5 Extension to Discrete DVFS
Whilst we assume a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have a finite number of DVFS modes with a set of predefined operating voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.
Let us suppose that we now have a system with a discrete and finite DVFS model where only $k_i \in K$ can be chosen as a speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: a transition is valid if there is a $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if
$$\exists k \in K: \quad \mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T). \tag{26}$$
Likewise, the scaling factor of a valid transition in Definition 5 is redefined to be the minimum $k \in K$ that keeps the elapsed delay within the range which results in the same transition. One has the following definition.
Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is
$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \operatorname*{arg\,min}_{k \in K,\ \mathrm{th}_{n-1} < \mathrm{wl}_o / k \le \min(\mathrm{th}_n, T)} k. \tag{27}$$
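Definitions 9 and 10 translate almost directly into code. The sketch below is illustrative only: the function name `build_discrete_wtg` is hypothetical, the conventions $\mathrm{th}_0 = 0$ and $\mathrm{th}_n = \infty$ for the first and last levels are assumptions, and all workload levels are considered rather than only those reachable from a given initial workload. The parameter values are those of the synthetic example in Section 6.2.

```python
import math


def build_discrete_wtg(levels, thresholds, T, K):
    """Return {(o, n): k} for every valid transition wl_o -> wl_n (1-based).

    k is the smallest feasible scaling factor, following Definition 10.
    """
    # Pad the thresholds so that th_0 = 0 and th_n = +inf (assumed conventions).
    th = [0.0] + list(thresholds) + [math.inf]
    wtg = {}
    for o, wl_o in enumerate(levels, start=1):
        for n in range(1, len(levels) + 1):
            lower, upper = th[n - 1], min(th[n], T)
            feasible = [k for k in K if lower < wl_o / k <= upper]
            if feasible:                      # Definition 9: some k in K works
                wtg[(o, n)] = min(feasible)   # Definition 10: slowest such k
    return wtg


LEVELS = [0.3, 0.4, 0.5, 0.6]      # wl_1 .. wl_4
THRESHOLDS = [0.2, 0.6, 0.7]       # th_1 .. th_3
WTG = build_discrete_wtg(LEVELS, THRESHOLDS, T=1.0, K=[0.4, 0.8, 0.9])

for (o, n), k in sorted(WTG.items()):
    print(f"wl{o} -> wl{n}: k = {k}")
```

For these parameters the result should contain eight edges, four of them with $k = 0.8$, two with $k = 0.4$, and two with $k = 0.9$, which agrees with the number of edge annotations in Figure 7(b).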
6 Experiments
In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1 A Case Study: Object Tracking. It is first shown that the proposed workload-delay dependency is evidently observed in an object tracking application. A commonly used object tracking method [19] is profiled using the publicly available implementation [20]. We choose the Exynos 5422 [21], which has 2 GB of main memory and runs the Linux operating system, as the target mobile embedded computing platform. The actual power dissipation of the processor is measured individually for five different DVFS modes; that is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with a side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner every 5 ms. The real-time constraint $T$ is set to 25 ms. The staircase-shaped workload function is illustrated in Figure 6 with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.
Figure 6: Workload-delay dependency function $W$ of the Lucas-Kanade object tracking algorithm. (Axes: delay (ms) versus workload (ms); the plot shows $W(t)$, the real-time constraint $T$, and the guiding lines $k \cdot t$ for $k = 0.2, 0.4, 0.6, 0.8, 1.0$.)
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one that meets the real-time constraint. The other comparison is made against a stable trace with the maximum speed as the other extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in average power consumptions of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms both with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode: $\forall i,\ k_i = 0.6$.
6.2 Stable versus Alternating Operation Modes. There are two kinds of cycles in a WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. Other than these self-loops, the WTG may have non-self-loop cycles as well. From the perspective of the operation policy, a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. To illustrate this, we present a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, the alternating operation mode implied in $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot simply operate at an ideal design point. When the theoretically optimal design point cannot be captured by commodity hardware, the proposed technique is particularly useful: it can effectively explore the design space and find the best solution among the alternating modes.
Figure 7: (a) A synthetic example $W(t)$ with four workload levels $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$ and (b) its WTG representation, where the edges are annotated with scaling factors.
In principle, a stable operation mode corresponds to the case where the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. In particular, when only a limited number of DVFS modes are available in a microprocessor, the $k$ which crosses the current workload level may not exist in $K$.
7 Conclusion
This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation called WTG is proposed for exploring all possible workload/mode changes. Then it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode (a self-loop in the WTG) is often the best discipline. However, alternating modes, where the DVFS modes change following a predefined pattern, sometimes outperform the stable ones.
Disclosure
A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title of "Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff" [11].
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, "Cyber physical systems: design challenges," Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, "Cyber-physical energy systems: focus on smart buildings," in Proceedings of the 47th Design Automation Conference (DAC '10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, "Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems," IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, "Vision based GPS-denied object tracking and following for unmanned aerial vehicles," in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR '13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, "Time-critical distributed contact for 6-DoF haptic rendering of adaptively sampled reduced deformable models," in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, "Six-DoF haptic rendering of contact between geometrically complex reduced deformable models," IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, "Power: a first class design constraint for future architectures," in High Performance Computing: HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, "Modeling and power optimization of cyber-physical systems with energy-workload tradeoff," in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED '15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, "A formal approach to power optimization in CPSs with delay-workload dependence awareness," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, "Cyberphysical systems: workload modeling and design optimization," IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, "Task scheduling for control oriented requirements for cyber-physical systems," in Proceedings of the Real-Time Systems Symposium (RTSS '08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, "Co-design of anytime computation and robust control," in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS '15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, "Co-design of cyber-physical systems via controllers with flexible delay constraints," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, "Combinatorial optimization: polyhedra and efficiency," Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, "Enumeration of the elementary circuits of a directed graph," SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm," vol. 5, pp. 1–10, 2001.
[20] G. Bradski, "The OpenCV library," Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung Exynos Octa 5422, 2015, http://www.samsung.com.
The workload-delay dependency can also be found in many different types of applications. Real-time pattern matching over event streams [6], for instance, exhibits similar behavior: the queries can be handled either in small amounts (shorter delay, less workload) or in an aggregated manner (longer delay, more workload). Similarly, haptic rendering in Human-Computer Interface (HCI) uses adaptive sampling techniques to deal with the stringent real-time constraint [7], and the rendering algorithm can be warm-started to exploit the temporal coherence [8]. In essence, applications which exploit temporal coherence have possible workload-delay dependencies. That is, any iterative algorithm that can be warm-started can lead to one.
Nowadays, most modern microprocessors used in mobile embedded systems support dynamic voltage-frequency scaling (DVFS) [9] for power-efficient operation. Generally, delay and energy in systems with DVFS are in a tradeoff relationship for a given workload. That is, given a certain amount of work to be handled, a faster solution (with a higher frequency) is less energy efficient. Considering this control knob together with the aforementioned delay-workload dependency, the power optimization problem becomes very challenging. Conventionally, it has been understood that "working as slow as possible" within the real-time constraint is the best discipline in terms of minimizing the power dissipation. However, with the existence of the workload-delay dependency, this is no longer valid, since a slower execution may cause a bigger workload at the next iteration. On the other hand, "as fast as possible" is not optimal either, as the power consumption is a strong function of the operating speed [10].
The workload-delay dependency was first modeled and applied to DVFS optimization in [11]. It is assumed there that the workload is a continuous and monotonically increasing function of the delay, under which a simple yet effective power management technique has been proposed. Specifically, it has been shown that staying in a certain DVFS mode is better than alternating between different DVFS modes dynamically. Later, the optimization was generalized to various power models and formally proven to be optimal [12].
This work differs from our previous work [12] in that we take a different optimization approach tailored for discrete workload levels. We observed that the continuity assumption does not always hold true in reality. Rather, there are a number of applications that have discrete levels of workload. For instance, recall that many image or signal processing algorithms handle input data in units of macroblocks or frames. In such application domains, the workload tends to grow in a discrete manner. In this paper, the workload is modeled as a staircase function of the delay taken in the previous iteration. Since the solution obtained by the previous work [11, 12] is no longer optimal, or does not exist at all, in the staircase model, a new power management technique is proposed. The contributions of this paper can be summarized as follows:
(i) The workload-delay dependency is modeled as a staircase function, generalizing the previous model, and validated with a real-life example.
(ii) A novel data structure, Workload Transition Graph (WTG), is proposed to represent all possible operation workload/mode changes of a device.
(iii) Based on the WTG, a power management policy is derived and shown to be optimal.
2 Related Work
Bogdan and Marculescu [13] observed that workloads from physical processes tend to be nonstationary but exhibit some systematic relationship in space and time. They proposed a workload characterization approach based on statistical physics and showed how workload-awareness can improve the design of electronic systems. Zhang et al. [14] studied the relationship between control stability and workload in inverted pendulum control. While enlarged invocation periods may lower the degree of stability, more inverted pendulums can be controlled by a system, as the lengthened invocation periods lower the utilization of the algorithm. This can be seen as trading off the control stability for resource efficiency. In other words, they proposed to sacrifice the stability to accommodate more workload in a system. The proposed technique also deals with variable workload in electronic systems but differs from the above-mentioned works in that the effect of execution delay on workload is systematically considered.
Recently, Pant et al. [15] proposed a codesign of computation delay and control stability based on anytime algorithms. An anytime algorithm is an algorithm that can be stopped at any point in time but still provides a decent solution. Typically, the quality of the solution is an increasing function of the computation delay. In their work, it is the duty of the control algorithm to adaptively change the real-time deadline constraint and error bound (quality of control). In the proposed technique, on the contrary, the relationship between execution delay and workload is formally described in the form of a workload-delay function; thus, no explicit runtime monitoring/control is required.
A design guideline for flexible delay constraints in distributed embedded systems was proposed by Goswami et al. [16], where some of the samples are allowed to violate the given delay deadline. They presented the applicability of their approach using the FlexRay dynamic segment as a communication medium. This work is similar to the proposed approach in the sense that it does not stick to a given fixed real-time deadline. While they could avoid resource overprovisioning by trading off the hard real-time constraints, the dependency of the workload on the delay has not been considered. Moreover, from a real-time standpoint, the proposed work is more rigorous, as it allows no real-time constraint violations.
3 Problem Definition
This section presents the system model assumed in this paper, followed by the formulation of the power optimization problem.
3.1 System Model
3.1.1 Dynamic Voltage-Frequency Scaling. In this paper, we assume that a system has multiple operation modes due to the DVFS feature, where the operating frequency and voltage can be modulated. For simplicity, we first assume that there are infinitely many operation modes available, among which one is chosen at each iteration. It will be shown in Section 5 that the proposed technique can be applied to a discrete DVFS as well. The operation mode at the $i$th iteration is represented with the speed scaling factor $k_i$, ranging from $k_{\min}$ to $k_{\max} = 1$ ($k_{\min} \le k \le k_{\max} = 1$). Then the operating frequency of the $i$th iteration, $f_i$, is
$$f_i = k_i \cdot f_{\max}, \tag{1}$$
where $f_{\max}$ is the maximum frequency of the microprocessor.
3.1.2 Workload. The workload is defined to be the number of clock cycles elapsed to complete the given computation. We denote the number of cycles elapsed to handle the workload of the $i$th iteration at the full speed of the microprocessor ($k_i = 1$) as $t_{\mathrm{ref},i}$. That is,
$$w_i = t_{\mathrm{ref},i} = k_i \cdot t_i. \tag{2}$$
Note that the elapsed time $t_i$ increases as the speed is scaled down ($k_i < 1$). Then the delay $t_i$ is automatically determined when a speed scaling factor $k_i$ is chosen for the given workload ($t_i = w_i / k_i$).
3.1.3 Real-Time Constraint. The delay $t_i$ cannot be unboundedly long, as the system is associated with the real-time constraint $T$. For all iterations, the elapsed time should be no more than $T$:
$$t_i \le T, \quad \forall i. \tag{3}$$
3.1.4 Delay-Workload Dependency. As stated earlier, the workload is dependent upon the previous execution delay. Usually, the workload is not a continuous function of the delay variation. Rather, the changes happen in a discrete manner. Therefore, the workload at the $i$th iteration is a monotonically increasing staircase function of the delay of the previous iteration $t_{i-1}$: $w_i = W(t_{i-1})$. If the given system has $n$ workload levels, the workload function $W$ can be formulated as follows:
$$W(t) = \begin{cases} \mathrm{wl}_1 & \text{if } 0 < t \le \mathrm{th}_1, \\ \mathrm{wl}_2 & \text{if } \mathrm{th}_1 < t \le \mathrm{th}_2, \\ \ \vdots & \\ \mathrm{wl}_{n-1} & \text{if } \mathrm{th}_{n-2} < t \le \mathrm{th}_{n-1}, \\ \mathrm{wl}_n & \text{if } \mathrm{th}_{n-1} < t, \end{cases} \tag{4}$$
in which the workload levels are $\mathrm{wl}_1 < \mathrm{wl}_2 < \cdots < \mathrm{wl}_n$ and the delay thresholds (workload changing moments) are $\mathrm{th}_1 < \mathrm{th}_2 < \cdots < \mathrm{th}_{n-1}$.
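The staircase function (4) amounts to a lookup in a sorted list of thresholds. The following sketch shows one way to express it; the helper name `make_workload_function` and the example values are hypothetical and serve only to illustrate the interval convention, which is open on the left and closed on the right.

```python
from bisect import bisect_left


def make_workload_function(levels, thresholds):
    """Return W(t) for levels wl_1 < ... < wl_n and thresholds th_1 < ... < th_{n-1}."""
    if len(levels) != len(thresholds) + 1:
        raise ValueError("need exactly one more level than thresholds")
    if sorted(levels) != list(levels) or sorted(thresholds) != list(thresholds):
        raise ValueError("levels and thresholds must be increasing")

    def W(t):
        if t <= 0:
            raise ValueError("delay must be positive")
        # bisect_left counts the thresholds strictly below t, so a delay
        # exactly equal to th_j still maps to wl_j.
        return levels[bisect_left(thresholds, t)]

    return W


# Hypothetical example values, chosen only for illustration.
W = make_workload_function([3.0, 4.0, 5.0], [2.0, 6.0])
print(W(1.5), W(2.0), W(2.1), W(10.0))   # 3.0 3.0 4.0 5.0
```

A delay that lands exactly on a threshold stays on the lower step, which matches the "$\le \mathrm{th}_j$" side of each case in (4).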
3.1.5 Execution Trace. At the $i$th iteration, the speed scaling factor $k_i$ uniquely defines an execution mode, as the delay is fixed accordingly by (2). The initial workload is assumed to be given as $w_1$. Then an execution trace tr of length $L$ is defined to be a sequence of the speed scaling factors of $L$ iterations:
$$\mathrm{tr} := (k_1, k_2, \ldots, k_L). \tag{5}$$
3.1.6 Average Power Consumption. The dynamic power consumption of CMOS circuits is $c V_{dd}^2 f$, where $c$, $V_{dd}$, and $f$ are the capacitance, operating voltage, and frequency, respectively. As the operating frequency is proportional to $(V_{dd} - V_{th})^2 / V_{dd}$ [10], the power consumption is an increasing function of $f$. It is worth noting that the proposed model is not dependent upon any specific DVFS model. We denote the energy consumption of a unit workload at the full speed ($k = 1$) as $C_e$ and assume that the energy dissipation grows linearly with the size of the workload. Then the reference energy of a workload $w_i$ at the full speed is $C_e \cdot w_i$. Given a DVFS energy model $S$ as a function of the speed scaling factor $k$, the energy consumption at the $i$th iteration, $\mathrm{EG}_i$, is formulated as follows:
$$\mathrm{EG}_i = C_e \cdot w_i \cdot S(k_i), \tag{6}$$
in which $S(0) = 0$ and $S(1) = 1$. Then the average power consumption of a trace tr can be formulated as follows:
$$\mathrm{AVG}(\mathrm{tr}) = \frac{\sum_{i=1}^{|\mathrm{tr}|} \mathrm{EG}_i}{\sum_{i=1}^{|\mathrm{tr}|} t_i}. \tag{7}$$
It is worth mentioning that the proposed technique is not specific to a certain workload-energy model. While we adopt a linear model for the workload-energy relation for ease of presentation, any possibly nonlinear model can be used in (6).
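To make (2), (3), (6), and (7) concrete, the sketch below replays an execution trace and returns its average power. It is an illustration under assumptions: the helper names `staircase` and `average_power`, the energy model $S(k) = k^2$, the constant $C_e = 1$, and all numeric values are hypothetical and are not taken from the experiments.

```python
def staircase(levels, thresholds):
    """Staircase workload function W(t) as in (4)."""
    def W(t):
        for level, th in zip(levels, thresholds):
            if t <= th:
                return level
        return levels[-1]
    return W


def average_power(trace, w1, W, S, T, C_e=1.0):
    """Average power (7) of a trace of scaling factors, starting from workload w1."""
    energy = 0.0
    delay = 0.0
    w = w1
    for k in trace:
        t = w / k                      # delay from (2): t_i = w_i / k_i
        if t > T:
            raise ValueError("real-time constraint (3) violated")
        energy += C_e * w * S(k)       # energy of this iteration, as in (6)
        delay += t
        w = W(t)                       # workload of the next iteration
    return energy / delay


W = staircase([0.3, 0.4, 0.5, 0.6], [0.2, 0.6, 0.7])


def S(k):
    return k ** 2                      # hypothetical model with S(0) = 0, S(1) = 1


# A stable trace: workload 0.4 at k = 0.8 gives delay 0.5, which maps back to 0.4.
stable = average_power([0.8] * 4, w1=0.4, W=W, S=S, T=1.0)
print(round(stable, 6))                # 0.512
```

For a stable trace, the workload cancels out of the ratio, so the average power reduces to $C_e \cdot k \cdot S(k)$, here $0.8^3 = 0.512$.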
3.2 Problem Formulation. Our objective is to minimize the average power consumption of a given system as follows:
Given the modeling constant $C_e$, the DVFS energy modeling function $S$, the workload function $W$, and the real-time constraint $T$, determine an execution trace tr such that the average power consumption formulated in (7) is minimized.
4 Proposed Technique
In this section, we describe the proposed operation management policy as an answer to the problem defined in the previous section. In doing so, we first derive the condition for feasible and schedulable systems. Then we study when the workload changes and how it affects the power dissipation. Based on that, we propose a novel graph representation that captures all possible workload transitions in the power-optimal operation. Finally, we derive the power-optimal operation policy for the given workload function $W$.
4.1 Feasibility. In this subsection, we examine under which condition a given system is feasible. First, the system should be schedulable within the given real-time constraint $T$ at every iteration.
Figure 1: Examples of workload transitions: (a) transitions to lower workload levels, drawn against the line $w = t$, and (b) transitions to higher workload levels, drawn against the line $w = k_{\min} t$. The transitions from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ are valid, while the ones from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ are not.
Theorem 1 (schedulability). Given the workload function $W$ and the real-time constraint $T$, the system is not schedulable if $W(t)/t > 1$, $\forall t \in (0, T]$.
Proof. Suppose that the delay at the $i$th iteration is $t_i$. Then $w_{i+1} = W(t_i) > t_i$ and $w_{i+1} = k_{i+1} \cdot t_{i+1}$. Since $k_{i+1} \le 1$, $t_i < t_{i+1}$. That is, the delay increases as the iterations go by and will eventually reach the real-time constraint, $t_j = T$. At the next iteration, the system becomes unschedulable even with the full speed, as $W(t_j = T) = 1 \cdot t_{j+1} > T$ requires $t_{j+1} > T$.
Once the workload gets bigger than $T$, the system is trivially not schedulable afterwards, even with the full speed $k_i = 1$. Thus, the workload must not be bigger than $T$ at any time. Moreover, once the workload reaches $T$, $k$ should remain at the full speed, $k = 1$, afterwards. We can make the upper bound of the workload even tighter if there exists $t'$ such that $W(t) > t$ for all $t \in (t', T]$. In this case, a workload larger than $W(t')$ is not allowable, as it makes the execution delay longer and longer, eventually violating the deadline.
Given the workload function $W$ and the initial workload $w_1$, one can calculate the lower bound of the workload as well. If a value $\hat{t}$ exists which satisfies $W(\hat{t}) = \hat{t}$ and $W(\hat{t}) < w_1$, the workload will never become smaller than $W(\hat{t})$. In other words, even with the full speed, the execution delay never goes below $\hat{t}$.
Then the valid workload levels and the execution delay range during the lifetime of a given system can be formulated as below.
Definition 2 (valid ranges). Given the workload function $W$ and the initial workload $w_1$, the minimum and maximum workload levels of a system are defined to be
$$\mathrm{wl}_{\min} = \begin{cases} W(\max(\hat{t})) \ \text{s.t.}\ \hat{t} = W(\hat{t}) < w_1 & \text{if } \exists \hat{t}, \\ \mathrm{wl}_1 & \text{otherwise}, \end{cases} \tag{8}$$
$$\mathrm{wl}_{\max} = \begin{cases} W(\min(t')) \ \text{s.t.}\ W(t) > t,\ \forall t \in (t', T] & \text{if } \exists t', \\ W(T) & \text{otherwise}. \end{cases} \tag{9}$$
Then the valid range of the execution delay is formulated as $[t_{\min}, t_{\max}]$ according to $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$ with
$$t_{\min} = \min(t) \ \text{s.t.}\ W(t) = \mathrm{wl}_{\min}, \qquad t_{\max} = \min(T, \max(t')) \ \text{s.t.}\ W(t') = \mathrm{wl}_{\max}. \tag{10}$$
4.2 Workload Transitions. In this subsection, we examine when a workload transition between valid workload levels possibly occurs and how it affects the system.
As presented in (4), the workload is a function of the delay taken at the previous iteration. Suppose that the delay taken at an iteration is $t$ with $\mathrm{th}_{l-1} < t \le \mathrm{th}_l$, so that the resulting workload is $\mathrm{wl}_l$. Then, if the system works fast enough to result in a shorter delay $t' \le \mathrm{th}_{l-1}$, the next workload will get smaller than $\mathrm{wl}_l$. Similarly, in case the delay gets longer ($t' > \mathrm{th}_l$), the system will need to handle a larger workload than $\mathrm{wl}_l$ at the successive iteration.
However, such workload transitions can occur only within limited ranges. Figure 1 depicts valid and invalid transitions from one workload level. Figure 1(a) shows two transitions from a workload level $\mathrm{wl}_p$ to lower ones, $\mathrm{wl}_q$ and $\mathrm{wl}_r$ ($r < q < p$). To make the next workload level $\mathrm{wl}_q$, the delay should be in the range of $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. Given the current workload $\mathrm{wl}_p$, the speed scaling factor should be larger than or equal to $\mathrm{wl}_p / \mathrm{th}_q$ from (2). If $\mathrm{wl}_p / \mathrm{th}_q \le k_{\max} = 1$, this workload transition can possibly occur. In contrast, if $\mathrm{wl}_p / \mathrm{th}_r > 1$ for another workload level $\mathrm{wl}_r$, the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ never happens, because the delay never goes below $\mathrm{th}_r$ even with the full processing speed.
The same principle also applies to the transition from a workload level to higher ones. If the delay can be lengthened properly with a speed scaling factor within the range $[k_{\min}, k_{\max}]$, the transitions are valid. Figure 1(b) illustrates that the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ is valid, while the one to $\mathrm{wl}_r$ is not. One can tell whether a transition can happen or not with the following definition.
Definition 3 (valid transition). A workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if $\mathrm{wl}_o / \mathrm{th}_n \le k_{\max} = 1$ and $\mathrm{wl}_o / \mathrm{th}_{n-1} \ge k_{\min}$.
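Definition 3 can be checked with two divisions per candidate transition. The sketch below is illustrative: the function name `is_valid_transition` is hypothetical, and the conventions $\mathrm{th}_0 = 0$ for the first level and $\mathrm{th}_N = \infty$ for the last level are assumptions made here so that the boundary levels can be tested with the same rule.

```python
import math


def is_valid_transition(wl_o, n, thresholds, k_min, k_max=1.0):
    """Check Definition 3 for a transition from workload wl_o to level n (1-based).

    thresholds holds th_1 < ... < th_{N-1} for a system with N workload levels.
    """
    th_lo = thresholds[n - 2] if n >= 2 else 0.0
    th_hi = thresholds[n - 1] if n <= len(thresholds) else math.inf
    # The delay must not exceed th_n, so the factor must be at least wl_o / th_n.
    k_needed = wl_o / th_hi
    # The delay must exceed th_{n-1}, which bounds the factor from above.
    k_limit = wl_o / th_lo if th_lo > 0 else math.inf
    return k_needed <= k_max and k_limit >= k_min


# Hypothetical system with three levels and thresholds th_1 = 2, th_2 = 6.
THRESHOLDS = [2.0, 6.0]
for target in (1, 2, 3):
    print(target, is_valid_transition(4.0, target, THRESHOLDS, k_min=0.5))
```

With a current workload of 4.0, level 1 is unreachable because even the full speed yields a delay of 4.0, which is above $\mathrm{th}_1 = 2$, whereas levels 2 and 3 are reachable for $k_{\min} = 0.5$.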
4.3. Workload Transition Graph. The essential difficulty of the presented power optimization problem lies in the fact that two conflicting forces should be handled at the same time. In order to minimize the power, on one hand, it tries to scale down the speed (and thus lengthen the delay) as much as possible, as described in (2) and (6). On the other hand, the lengthened delay is not desirable, as it imposes a bigger workload in the successive iteration, as shown in (4).

Therefore, no single simple intuition can be exploited to solve the problem. Rather, we need to compare different modes in a comprehensive way. In order to explore all possible execution modes and quantify their effects, we need to devise a data structure that includes elementary information on how workload transitions change the system status and power dissipation. In line with that purpose, we propose a graph representation of the workload evolution, the Workload Transition Graph (WTG), which captures all possible transition scenarios of the workload changes during the lifetime of a system.
A valid workload transition from one workload level to another can be caused by any delay within the corresponding range. A transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ in Figure 1(a), for instance, can be caused by any delay within the range $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. In other words, when handling a workload of $\mathrm{wl}_p$, any scaling factor within the range $[\mathrm{wl}_p/\mathrm{th}_q, \mathrm{wl}_p/\mathrm{th}_{q-1})$ can cause the transition. In the power-optimal execution trace, however, only one specific scaling factor is always chosen for a certain transition, even though it happens many times. We show this in the next theorem.
Theorem 4 (optimal scaling factor). In the power-optimal trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$, if the workload level handled by $k_i$ is $\mathrm{wl}_o$ and the next workload level is $\mathrm{wl}_n$,

$$k_i = \max\left(\frac{\mathrm{wl}_o}{\mathrm{th}_n},\; k_{\min}\right). \tag{11}$$

Proof. We prove this by contradiction. Let us suppose that there exists a power-optimal trace tr where $k_i$ (which results in the transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$) is not equal to $\max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Then we make a new execution trace $\mathrm{tr}'$ from tr by replacing $k_i$ with $k'_i = \max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Note that all other execution modes in $\mathrm{tr}'$ are the same as in tr. That is,

$$\mathrm{tr}(j) = \mathrm{tr}'(j), \quad \forall j \ne i. \tag{12}$$

By the definition of the scaling factor,

$$k_i \ge k_{\min}, \qquad k_i \ge \frac{\mathrm{wl}_o}{\mathrm{th}_n}, \tag{13}$$

since the transition is from $\mathrm{wl}_o$ to $\mathrm{wl}_n$. Thus

$$\mathrm{tr}(i) = k_i > \mathrm{tr}'(i) = k'_i, \tag{14}$$

and accordingly we obtain

$$S(k_i) > S(k'_i). \tag{15}$$

Then, from (6) and (7), the average power in $\mathrm{tr}'$ is lower than that of tr, and this contradicts the assumption that tr is power-optimal.
From Theorem 4, we know that only one speed scaling factor is associated with all transitions of a certain type in the power-optimal operation scenario. So we define the scaling factor of a valid transition as follows.

Definition 5 (scaling factor of a transition). The scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \max\left(\frac{\mathrm{wl}_o}{\min(\mathrm{th}_n, T)},\; k_{\min}\right). \tag{16}$$
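Equation (16) translates into a one-line function. This is only an illustrative sketch; the function name and the use of an infinite `th_n` for the highest workload level are assumptions, not part of the paper.

```python
def scaling_factor(wl_o, th_n, T, k_min):
    """Equation (16): the slowest speed that still finishes the workload
    wl_o within min(th_n, T), but never below k_min.  (hypothetical helper)"""
    return max(wl_o / min(th_n, T), k_min)
```

With `wl_o = 0.5`, `th_n = 0.7`, `T = 1.0`, and `k_min = 0.2`, the result is 0.5/0.7, so the delay lands exactly on the upper end of the target range. When the target is the highest level (`th_n` infinite), the deadline `T` takes over as the bound.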
Now we define WTG.

Definition 6 (Workload Transition Graph). WTG is defined to be a graph $(V, E)$, where $V$ and $E$ are the sets of vertices and edges, respectively. Each valid workload level forms a vertex,

$$V = \{ v_i = \mathrm{wl}_i \mid \mathrm{wl}_{\min} \le \mathrm{wl}_i \le \mathrm{wl}_{\max} \}, \tag{17}$$

while a valid transition from a workload level to another forms an edge between vertices. That is, there exists an edge from $v_i$ to $v_j$ corresponding to a valid transition from $\mathrm{wl}_i$ to $\mathrm{wl}_j$:

$$E = \{ (v_i, v_j) \mid \exists \text{ valid transition from } \mathrm{wl}_i \text{ to } \mathrm{wl}_j \}. \tag{18}$$

We denote the source and destination vertices of an edge $e \in E$ as $\mathrm{src}(e)$ and $\mathrm{dst}(e)$, respectively.
With these definitions, a power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ can be represented as a walk (a sequence of vertices where any pair of consecutive vertices is connected through an edge) of length $|\mathrm{tr}| + 1$ in WTG. In a different form, the execution trace is a sequence of edges in WTG of length $|\mathrm{tr}|$, $(e_1, e_2, \ldots, e_{|\mathrm{tr}|})$, where the workload transition from $\mathrm{src}(e_i)$ to $\mathrm{dst}(e_i)$ is caused by the execution mode $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$.
Algorithm 1 shows how WTG is generated out of the given workload function $W$ and the initial workload $w_1$. After the initialization, all valid workload levels are added as vertices in lines (5)-(6). Then, for each ordered pair of workload levels, including a level paired with itself (lines (7)-(8)), it is checked in line (9) whether the in-between transition is valid. If valid, it is added as an edge in line (10).
Let us take Figure 2 as an example of workload function $W$. Once given the initial workload, one can easily get the valid workload levels according to (8) and (9). When $w_1 = \mathrm{wl}_2$, for instance, $\mathrm{wl}_{\min} = \mathrm{wl}_2$ and $\mathrm{wl}_{\max} = \mathrm{wl}_6$. Figure 3(b) illustrates the corresponding WTG of the workload function shown in Figure 2 with the initial workload of $\mathrm{wl}_2$. It is not a complete graph, as some pairs of vertices cannot be connected directly because the transition between them is not valid according to Definition 3.

Algorithm 1: Generation of WTG.
(1) procedure GenerateWTG($W$, $w_1$)
(2)   $\mathrm{wl}_{\min} \leftarrow$ equation (8)   ⊳ Initializations
(3)   $\mathrm{wl}_{\max} \leftarrow$ equation (9)
(4)   $V \leftarrow \emptyset$, $E \leftarrow \emptyset$
(5)   for all $\mathrm{wl}_{\min} \le \mathrm{wl} \le \mathrm{wl}_{\max}$ do   ⊳ Vertices
(6)     $V \leftarrow V \cup \{\mathrm{wl}\}$
(7)   for all $\mathrm{wl}_{\min} \le \mathrm{wl}_1 \le \mathrm{wl}_{\max}$ do   ⊳ Edges
(8)     for all $\mathrm{wl}_{\min} \le \mathrm{wl}_2 \le \mathrm{wl}_{\max}$ do
(9)       if $\exists$ valid transition from $\mathrm{wl}_1$ to $\mathrm{wl}_2$ then
(10)        $E \leftarrow E \cup \{(\mathrm{wl}_1, \mathrm{wl}_2)\}$

Figure 2: A workload staircase function example with six workload levels ($\mathrm{wl}_1$ to $\mathrm{wl}_6$, thresholds $\mathrm{th}_1$ to $\mathrm{th}_5$, deadline $T$, and guide lines $y = k_{\max} t$ and $y = k_{\min} t$; the feasible delay range of $\mathrm{wl}_4$ is highlighted).
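Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the valid level range `lo..hi` is taken as an input (it stands for the result of equations (8) and (9), which are not reproduced here), levels are identified by 0-based indices, the highest level's delay range is capped by the deadline `T`, and the extra `lower[n] < T` guard is an addition that rejects target ranges lying entirely beyond the deadline.

```python
import math


def generate_wtg(levels, thresholds, T, k_min, lo, hi, k_max=1.0):
    """Build (V, E) for the workload levels levels[lo..hi] (inclusive).

    levels:     wl_1 < ... < wl_n
    thresholds: th_1 < ... < th_{n-1}
    lo, hi:     assumed to come from equations (8) and (9)
    """
    # The delay range of level j is (lower[j], upper[j]].
    lower = [0.0] + list(thresholds)
    upper = list(thresholds) + [math.inf]
    V = list(range(lo, hi + 1))
    E = set()
    for o in V:                      # lines (7)-(8): every ordered pair
        for n in V:
            # Definition 3, with th_n replaced by min(th_n, T) as in (16)
            fast_enough = levels[o] / min(upper[n], T) <= k_max
            slow_enough = lower[n] == 0 or levels[o] / lower[n] >= k_min
            if lower[n] < T and fast_enough and slow_enough:
                E.add((o, n))        # line (10)
    return V, E
```

For a hypothetical three-level function with `levels = [1, 2, 3]`, `thresholds = [2, 4]`, `T = 5`, and `k_min = 0.4`, the lowest level cannot jump directly to the highest one and the highest level cannot drop directly to the lowest one, so the resulting graph has seven of the nine possible edges.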
The feasible delay range to handle workload $\mathrm{wl}_4$ is highlighted in Figure 2, justifying that vertex $\mathrm{wl}_4$ has three outgoing edges, to $\mathrm{wl}_6$, $\mathrm{wl}_5$, and itself. To be more specific, when the workload is given as $\mathrm{wl}_4$, the shortest possible delay is the case when the speed is chosen as $k_{\max}$. Then the delay (the $x$ coordinate of the cross point of $y = \mathrm{wl}_4$ and $y = k_{\max} t$) is between $\mathrm{th}_3$ and $\mathrm{th}_4$, as shown in Figure 2. This means that the lowest possible workload in the next iteration is $\mathrm{wl}_4$. Similarly, the biggest possible workload can be calculated as $\mathrm{wl}_6$, as the biggest possible delay is between $\mathrm{th}_5$ and $T$. Note also that some of the vertices may not have an edge directed to themselves (self-loop), such as vertex $\mathrm{wl}_3$. It has outgoing edges only to higher workload levels. This means that the computation burden of that state is so big that it only results in higher workload levels at the next iteration, even with the full speed.
Different initial workload levels may result in different WTGs, as shown in Figures 3(a)–3(c). The WTG derived from the initial workload level of $\mathrm{wl}_1$ is illustrated in Figure 3(a). In contrast to Figure 3(b), vertex $\mathrm{wl}_1$ is included in the graph. The WTG derived from higher initial workload levels, $\mathrm{wl}_4$, $\mathrm{wl}_5$, and $\mathrm{wl}_6$, is shown in Figure 3(c). Note that the WTGs in Figures 3(a) and 3(c) are not strongly connected. Vertex $\mathrm{wl}_2$ in Figure 3(b), for example, is not reachable from $\mathrm{wl}_4$. However, from the definition of valid workload levels, all vertices are reachable from the initial workload level. This property is important for deriving the optimal operation policy that will be presented in the next subsection.
4.4. Proposed Operation Policy. In this subsection, we present the proposed operation policy that balances the energy-delay tradeoff caused by the delay-workload dependency.

As stated earlier, a power-optimal execution trace tr can be represented as a walk of WTG. Then we have the following definition.
Definition 7 (corresponding walk). Given the power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as

$$\mathrm{cw}^{\mathrm{tr}}_{w_1} = (e_1, e_2, \ldots, e_{|\mathrm{tr}|}), \tag{19}$$

where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also denote it as $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper). The average power consumption of the walk can be formulated as follows:

$$\mathrm{AVG}(\mathrm{cw}^{\mathrm{tr}}_{w_1}) = \frac{\mathrm{EG}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}{\mathrm{DL}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}, \tag{20}$$

where $\mathrm{DL}((e_1, e_2, \ldots, e_n))$ is the total delay elapsed for traversing the walk, $\sum_{i=1}^{n} \mathrm{src}(e_i)/\mathrm{sf}(e_i)$, and $\mathrm{EG}((e_1, e_2, \ldots, e_n))$ is the total energy consumption for the walk, $C_e \sum_{i=1}^{n} \mathrm{src}(e_i) \cdot S(\mathrm{sf}(e_i))$.
A cycle (closed walk) of WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in WTG is a cycle, the trace can be repeated indefinitely. We argue that, when the length of the trace is sufficiently long ($|\mathrm{tr}| \to \infty$), the average power consumption is minimized when the cycle that minimizes (20) repeats over and over again in the trace.
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of WTG,

$$\mathrm{cy}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n), \tag{21}$$

has the minimum value of (20) among all cycles of the WTG. Then, if the length of the trace is long enough, the average power consumption of the optimal trace converges to

$$\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}. \tag{22}$$
Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts a walk example of length 8, where the initial workload level is $\mathrm{wl}_2$ and the last vertex that it traverses is $\mathrm{wl}_6$. If a path from $\mathrm{wl}_2$ to $\mathrm{wl}_6$, $(e_1, e_2, e_5)$, highlighted in the dashed arrows, is removed from the walk, the remaining part is a set of cycles.

Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $\mathrm{wl}_1$, (b) $\mathrm{wl}_2$ or $\mathrm{wl}_3$, and (c) $\mathrm{wl}_4$, $\mathrm{wl}_5$, or $\mathrm{wl}_6$.

Figure 4: A walk example of the WTG shown in Figure 3(b) of length 8 (edges $e_1$ to $e_8$ over the vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$).

Now consider a power-optimal trace tr of length $N$ that starts from the workload level of $w_1$. The corresponding walk of tr can be decomposed into a path pt starting at $w_1$ and a set of cycles CY. Then the average power consumption of the trace tr is

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\mathrm{EG}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{23}$$

Since pt contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(\mathrm{pt})$ and $\mathrm{DL}(\mathrm{pt})$ are upper bounded by a constant. Then, if $N$ is sufficiently large, $\mathrm{EG}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})$ and $\mathrm{DL}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})$. That is, if $|\mathrm{tr}| \to \infty$,

$$\mathrm{AVG}(\mathrm{tr}) \simeq \frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{24}$$

Note also that the average power consumption of every cycle in CY is not smaller than $\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})/\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})$ by definition:

$$\frac{\mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{cy})} \ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}, \quad \forall \mathrm{cy} \in \mathrm{CY}. \tag{25}$$

Therefore, the average power consumption of the power-optimal trace tr, $\mathrm{AVG}(\mathrm{tr})$, will get arbitrarily close to $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$.
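The last step of the proof uses the fact that a ratio of sums cannot be smaller than the smallest of the individual ratios. Spelled out with the symbols of (24) and (25), and assuming only that every cycle has a positive delay, the argument reads:

```latex
\begin{align*}
\mathrm{EG}(\mathrm{cy}) &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) \cdot \mathrm{DL}(\mathrm{cy})
  && \forall\, \mathrm{cy} \in \mathrm{CY}
  \quad \text{(from (25), since } \mathrm{DL}(\mathrm{cy}) > 0\text{)} \\
\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})
  &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})
  && \text{(summing over } \mathrm{CY}\text{)} \\
\frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}
     {\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}
  &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})
  && \text{(dividing by the positive total delay)}
\end{align*}
```

The bound is attained when CY consists only of repetitions of $\mathrm{cy}_{\mathrm{opt}}$, which is why a long power-optimal trace cannot do better than repeating that cycle.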
From Theorem 8, it is understood that changing the DVFS mode of the given system following the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy.

(i) If the current workload level vertex is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$, just follow the cycle repeatedly; that is, at the $i$th iteration, the speed scaling factor is chosen to be $\mathrm{sf}(e)$ such that $e$ is in the optimal cycle and $\mathrm{src}(e) = w_i$.

(ii) Otherwise, take a path $\mathrm{pt}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n)$ that has the minimum $\mathrm{AVG}(\mathrm{pt}_{\mathrm{opt}})$ value and whose $\mathrm{dst}(e_n)$ is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$. In other words, try to get into the optimal cycle with the minimum cost.

Figure 5: Flowchart diagram of the proposed operation policy (input $w_1$ and $W$; generate WTG; calculate $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$; set $i = 1$; at each iteration, if $w_i$ is in $\mathrm{cy}_{\mathrm{opt}}$, set $k_i = \mathrm{sf}(\mathrm{cy}_{\mathrm{opt}}(i))$, otherwise set $k_i = \mathrm{sf}(\mathrm{pt}_{\mathrm{opt}}(i))$; then update $w_{i+1}$ and increment $i$).
The optimal cycle $\mathrm{cy}_{\mathrm{opt}}$ of WTG can be searched by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $\mathrm{pt}_{\mathrm{opt}}$, can also be searched by simple enumeration. It is worthwhile to mention that we cannot simply use a minimum weight cycle searching algorithm, as the objective is not a summation of edge weights but the ratio of two summations, as presented in (20).

Figure 5 shows the proposed operation policy in a flowchart diagram. Given the initial workload and the workload function, we generate WTG, from which $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are derived in the next steps. It is worth mentioning that this can be done in a tractable time and is just a one-time effort taken offline. As long as the initial workload vertex is not included in $\mathrm{cy}_{\mathrm{opt}}$, the system follows the trace represented in $\mathrm{pt}_{\mathrm{opt}}$ until it reaches the optimal cycle of WTG. Then it simply repeats the trace implied in $\mathrm{cy}_{\mathrm{opt}}$ from that iteration on.
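The search for the optimal cycle can be sketched as below. Note that this sketch does not implement Tarjan's algorithm [18]; it swaps in a plain depth-first enumeration of elementary cycles, which is adequate for small graphs. The data layout (a dictionary `sf` mapping `(src_workload, dst_workload)` to the scaling factor of that edge) and all function names are assumptions for illustration.

```python
def elementary_cycles(edges):
    """Yield every elementary cycle once, as a list of (src, dst) edges."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)

    def extend(start, path, visited):
        for v in succ.get(path[-1], []):
            if v == start:
                # closing edge found: emit the cycle as consecutive edges
                yield list(zip(path, path[1:] + [start]))
            elif v > start and v not in visited:
                # only vertices larger than start are explored, so each
                # cycle is reported once, rooted at its smallest vertex
                yield from extend(start, path + [v], visited | {v})

    for s in sorted(succ):
        yield from extend(s, [s], {s})


def optimal_cycle(sf, S, C_e=1.0):
    """Return the elementary cycle that minimizes the ratio in (20)."""
    def avg(cycle):
        energy = C_e * sum(u * S(sf[(u, v)]) for u, v in cycle)
        delay = sum(u / sf[(u, v)] for u, v in cycle)
        return energy / delay
    return min(elementary_cycles(sf), key=avg)
```

Restricting the search to elementary cycles loses nothing, because any closed walk decomposes into elementary cycles and a ratio of sums is never smaller than the smallest individual ratio. For a hypothetical two-level graph with `sf = {(1.0, 1.0): 1.0, (1.0, 2.0): 0.5, (2.0, 1.0): 1.0, (2.0, 2.0): 1.0}` and `S(k) = k**2`, the alternating cycle between the two levels is returned rather than either self-loop.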
5. Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have finite DVFS modes with a set of predefined operation voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.

Let us suppose that we now have a system with a discrete and finite DVFS model, where only $k_i \in K$ can be chosen as a speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: the transition is valid if there is $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if

$$\mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T), \quad \exists k \in K. \tag{26}$$
Likewise, the scaling factor of a valid transition in Definition 5 is also redefined to be the minimum $k \in K$ that keeps the elapsed delay within the range that results in the same transition. One has the following definition.

Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \operatorname*{arg\,min}_{k \in K,\;\; \mathrm{th}_{n-1} < \mathrm{wl}_o / k \,\le\, \min(\mathrm{th}_n, T)} k. \tag{27}$$
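Definitions 9 and 10 can be combined into a single helper that returns the scaling factor of a transition, or nothing when the transition is invalid. This is a sketch with assumed names; returning `None` for an invalid transition is a convention chosen here, not something the paper prescribes.

```python
def discrete_scaling_factor(wl_o, th_prev, th_next, T, K):
    """Smallest k in K with th_prev < wl_o / k <= min(th_next, T),
    or None if no such k exists (the transition is then invalid)."""
    feasible = [k for k in K if th_prev < wl_o / k <= min(th_next, T)]
    return min(feasible) if feasible else None
```

Using the numbers of the synthetic example in Section 6.2 ($K = \{0.4, 0.8, 0.9\}$, $T = 1.0$), a workload of 0.5 can move to the level with delay range $(0.2, 0.6]$ only with $k = 0.9$, while a workload of 0.6 cannot reach that level with any available mode.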
6. Experiments

In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1. A Case Study: Object Tracking. It is first shown that the proposed workload-delay dependency is evidently observed in an object tracking application. The performance of a commonly used object tracking method [19] is profiled as a tracking solution using the publicly available implementation [20]. We choose Exynos5422 [21] as the target mobile embedded computing platform, which has 2 GB of main memory and runs the Linux operating system. The actual power dissipations of the processor are measured individually for five different DVFS modes. That is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with the side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner at every 5 ms. The real-time constraint $T$ is set to 25 ms. The workload function in the shape of a staircase is illustrated in Figure 6, with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of Lucas-Kanade object tracking algorithm (workload in ms versus delay in ms, with the real-time constraint $T$ and the guiding lines $k \cdot t$ for $k = 0.2, 0.4, 0.6, 0.8, 1.0$).
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one with respect to the real-time constraint. The other comparison is made against a stable trace with the maximum speed as another extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in the average power consumption of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms both with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode, $\forall i,\; k_i = 0.6$.
6.2. Stable versus Alternating Operation Modes. There are two kinds of cycles in WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. Other than these self-loops, a WTG can have non-self-loop cycles as well. From the perspective of the operation policy, such a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. In order to illustrate this, we show a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
The average power consumption of all self-loops in the synthetic example is tied to 2.5600 W. On the other hand, an alternating operation mode implied in $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot simply operate on an ideal design point. In case that the theoretically optimal design point cannot be captured by commodity hardware, the proposed technique is particularly useful. It can effectively explore the design space and find the best option among alternating modes.

Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$, and (b) its WTG representation, where the edges are annotated with scaling factors.
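The synthetic example can be replayed with a short script. The sketch below rebuilds the edge labels from Definitions 9 and 10 and compares the self-loops with the alternating cycle using (20). It assumes $C_e = 1$ and plugs $S(k) = 5k^3$ into (20) as written, so the absolute values depend on those assumptions and need not match the reported figures; only the edge set and the ordering of the cycles are checked. Exact fractions are used to avoid rounding at the range boundaries.

```python
from fractions import Fraction

levels = [Fraction("0.3"), Fraction("0.4"), Fraction("0.5"), Fraction("0.6")]  # wl1..wl4
T = Fraction(1)
lower = [Fraction(0), Fraction("0.2"), Fraction("0.6"), Fraction("0.7")]  # th_{n-1}
upper = [Fraction("0.2"), Fraction("0.6"), Fraction("0.7"), T]  # th_n, top level capped by T
K = [Fraction("0.4"), Fraction("0.8"), Fraction("0.9")]


def S(k):
    return 5 * k ** 3


# Definitions 9 and 10: edge (o, n) exists if some k in K puts the delay
# wl_o / k inside (th_{n-1}, min(th_n, T)]; its label is the smallest such k.
sf = {}
for o, wl_o in enumerate(levels):
    for n in range(len(levels)):
        feasible = [k for k in K if lower[n] < wl_o / k <= min(upper[n], T)]
        if feasible:
            sf[(o, n)] = min(feasible)


def average_power(cycle, C_e=1):
    """Equation (20) for a cycle given as 0-based (src, dst) index pairs."""
    energy = C_e * sum(levels[o] * S(sf[(o, n)]) for o, n in cycle)
    delay = sum(levels[o] / sf[(o, n)] for o, n in cycle)
    return energy / delay


self_loops = [[(v, v)] for v in range(len(levels)) if (v, v) in sf]
alternating = [(1, 3), (3, 2), (2, 1)]  # wl2 -> wl4 -> wl3 -> wl2
```

The script yields eight edges whose labels (four times 0.8, twice 0.4, twice 0.9) agree with the annotations listed for Figure 7(b). All three self-loops use $k = 0.8$ and therefore tie, and the alternating cycle evaluates to a lower average power than any of them.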
In principle, a stable operation mode is the case where the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in a near-optimal power consumption. Particularly, in case that only a limited number of DVFS modes are available in a microprocessor, the $k$ that crosses the current workload level may not exist in $K$.
7. Conclusion

This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation called WTG is proposed for exploring all possible workload mode changes. Then it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by the power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode is often the best discipline (a self-loop in WTG). However, alternating modes, where the DVFS modes change over a predefined pattern, sometimes outperform the stable ones.
Disclosure

A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title of "Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff" [11].
Competing Interests

The authors declare that they have no competing interests.
Acknowledgments

This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, "Cyber physical systems: design challenges," Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, "Cyber-physical energy systems: focus on smart buildings," in Proceedings of the 47th Design Automation Conference (DAC '10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, "Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems," IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, "Vision based GPS-denied object tracking and following for unmanned aerial vehicles," in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR '13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, "Time-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformable models," in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, "Six-DoF haptic rendering of contact between geometrically complex reduced deformable models," IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, "Power: a first class design constraint for future architectures," in High Performance Computing - HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, "Modeling and power optimization of cyber-physical systems with energy-workload tradeoff," in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED '15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, "A formal approach to power optimization in CPSs with delay-workload dependence awareness," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, "Cyberphysical systems: workload modeling and design optimization," IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, "Task scheduling for control oriented requirements for cyber-physical systems," in Proceedings of the Real-Time Systems Symposium (RTSS '08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, "Co-design of anytime computation and robust control," in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS '15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, "Co-design of cyber-physical systems via controllers with flexible delay constraints," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, "Combinatorial optimization: polyhedra and efficiency," Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, "Enumeration of the elementary circuits of a directed graph," SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm," vol. 5, pp. 1–10, 2001.
[20] G. Bradski, "The OpenCV library," Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung, Exynos Octa 5422, 2015, http://www.samsung.com.
3.1. System Model

3.1.1. Dynamic Voltage-Frequency Scaling. In this paper, we assume that a system has multiple operation modes due to the DVFS feature, where the operating frequency and voltage can be modulated. For simplicity, we first assume that there are infinitely many operation modes available, among which one is chosen at each iteration. It will be shown in Section 5 that the proposed technique can be applied to a discrete DVFS as well. The operation mode at the $i$th iteration is represented with the speed scaling factor $k_i$, ranging from $k_{\min}$ to $k_{\max} = 1$ ($k_{\min} \le k \le k_{\max} = 1$). Then the operating frequency of the $i$th iteration, $f_i$, is

$$f_i = k_i \cdot f_{\max}, \tag{1}$$

where $f_{\max}$ is the maximum frequency of the microprocessor.
3.1.2. Workload. The workload is defined to be the number of clock cycles elapsed to complete the given computation. We denote the number of cycles elapsed to handle the workload of the $i$th iteration at the full speed of the microprocessor ($k_i = 1$) as $t_{\mathrm{ref},i}$. That is,

$$w_i = t_{\mathrm{ref},i} = k_i \cdot t_i. \tag{2}$$

Note that the elapsed time $t_i$ increases as the speed is scaled down ($k_i < 1$). Then the delay $t_i$ is automatically determined when a speed scaling factor $k_i$ is chosen for the given workload ($t_i = w_i/k_i$).
3.1.3. Real-Time Constraint. The delay $t_i$ cannot be unboundedly long, as the system is associated with a real-time constraint $T$. For all iterations, the elapsed time should be no more than $T$:

$$t_i \le T, \quad \forall i. \tag{3}$$
3.1.4. Delay-Workload Dependency. As stated earlier, the workload is dependent upon the previous execution delay. Usually, the workload is not a continuous function of the delay variation. Rather, the changes happen in a discrete manner. Therefore, the workload at the $i$th iteration is a monotonically increasing staircase function of the delay of the previous iteration $t_{i-1}$: $w_i = W(t_{i-1})$. If the given system has $n$ workload levels, the workload function $W$ can be formulated as follows:

$$W(t) = \begin{cases} \mathrm{wl}_1 & \text{if } 0 < t \le \mathrm{th}_1, \\ \mathrm{wl}_2 & \text{if } \mathrm{th}_1 < t \le \mathrm{th}_2, \\ \;\vdots \\ \mathrm{wl}_{n-1} & \text{if } \mathrm{th}_{n-2} < t \le \mathrm{th}_{n-1}, \\ \mathrm{wl}_n & \text{if } \mathrm{th}_{n-1} < t, \end{cases} \tag{4}$$

in which the workload levels are $\mathrm{wl}_1 < \mathrm{wl}_2 < \cdots < \mathrm{wl}_n$ and the delay thresholds (workload changing moments) are $\mathrm{th}_1 < \mathrm{th}_2 < \cdots < \mathrm{th}_{n-1}$.
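The staircase function in (4) maps naturally onto a binary search over the sorted thresholds. The following sketch is one possible encoding; the factory function name is an assumption, and the numbers in the usage note are borrowed from the synthetic example of Section 6.2.

```python
from bisect import bisect_left


def make_workload_function(levels, thresholds):
    """Return W for levels wl_1 < ... < wl_n and thresholds th_1 < ... < th_{n-1}."""
    def W(t):
        # bisect_left counts the thresholds strictly below t, so t == th_i
        # still maps to wl_i (the intervals in (4) are closed on the right).
        return levels[bisect_left(thresholds, t)]
    return W
```

With `levels = [0.3, 0.4, 0.5, 0.6]` and `thresholds = [0.2, 0.6, 0.7]`, a delay of exactly 0.6 still yields 0.4, while a delay of 0.65 yields 0.5.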
3.1.5. Execution Trace. At the $i$th iteration, the speed scaling factor $k_i$ uniquely defines an execution mode, as the delay is fixed accordingly by (2). The initial workload is assumed to be given as $w_1$. Then an execution trace tr of length $L$ is defined to be a sequence of the speed scaling factors of $L$ iterations:

$$\mathrm{tr} := (k_1, k_2, \ldots, k_L). \tag{5}$$
3.1.6. Average Power Consumption. The dynamic power consumption of CMOS circuits is $c V_{dd}^2 f$, where $c$, $V_{dd}$, and $f$ are capacitance, operating voltage, and frequency, respectively. As the operating frequency is proportional to $(V_{dd} - V_{th})^2 / V_{dd}$ [10], the power consumption is an increasing function of $f$. It is worth noting that the proposed model is not dependent upon any specific DVFS model. We denote the energy consumption of a unit workload at the full speed ($k = 1$) as $C_e$ and assume that energy dissipation grows linearly with the size of the workload. Then the reference energy of a workload $w_i$ at the full speed is $C_e \cdot w_i$. Given a DVFS energy model $S$ as a function of the speed scaling factor $k$, the energy consumption at the $i$th iteration, $\mathrm{EG}_i$, is formulated as follows:

$$\mathrm{EG}_i = C_e \cdot w_i \cdot S(k_i), \tag{6}$$

in which $S(0) = 0$ and $S(1) = 1$. Then the average power consumption of a trace tr can be formulated as follows:

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\sum_{i=1}^{|\mathrm{tr}|} \mathrm{EG}_i}{\sum_{i=1}^{|\mathrm{tr}|} t_i}. \tag{7}$$

It is worthwhile to mention that the proposed technique is not specific to a certain workload-energy model. While we adopt a linear model for the workload-energy relation for ease of presentation, any possibly nonlinear model can be used in (6).
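Equations (2), (3), (4), (6), and (7) together define how a trace is evaluated. A minimal simulation sketch follows; the function name, the choice to return `None` on a deadline miss, and the toy models in the usage note are assumptions for illustration, and a non-empty trace is assumed.

```python
def simulate_trace(trace, w1, W, S, T, C_e=1.0):
    """Average power (7) of a non-empty trace of scaling factors,
    or None if some iteration misses the deadline T."""
    w, energy, delay = w1, 0.0, 0.0
    for k in trace:
        t = w / k                   # (2): delay of this iteration
        if t > T:                   # (3): real-time constraint violated
            return None
        energy += C_e * w * S(k)    # (6): energy of this iteration
        delay += t
        w = W(t)                    # (4): workload of the next iteration
    return energy / delay
```

For a toy workload function that returns 1.0 for delays up to 2.0 and 2.0 otherwise, a hypothetical energy model `S(k) = k**2`, `T = 4.0`, and `w1 = 1.0`, the trace `(0.5, 1.0)` accumulates an energy of 0.25 + 1.0 over a delay of 2.0 + 1.0, while the single-step trace `(0.2,)` misses the deadline.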
3.2. Problem Formulation. Our objective is to minimize the average power consumption of a given system as follows.

Given the modeling constant $C_e$, the DVFS energy modeling function $S$, the workload function $W$, and the real-time constraint $T$, determine an execution trace tr such that the average power consumption formulated in (7) is minimized.
4. Proposed Technique

In this section, we describe the proposed operation management policy as an answer to the problem defined in the previous section. In doing so, we first derive the condition for feasible and schedulable systems. Then we study when the workload changes and how it affects the power dissipation. Based on that, we propose a novel graph representation that captures all possible workload transitions in the power-optimal operation. Finally, we derive the power-optimal operation policy with the given workload function $W$.
4.1 Feasibility. In this subsection, we examine under which condition a given system is feasible. First, the system should be schedulable within the given real-time constraint $T$ at every iteration.

Figure 1: Examples of workload transitions: (a) transitions to lower workload levels and (b) transitions to higher workload levels. The transitions from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ are valid, while the ones from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ are not. [Both panels plot the workload function $W$ against the delay $t$; panel (a) marks the line $w = t$ and panel (b) the line $w = k_{\min} t$.]
Theorem 1 (schedulability). Given the workload function $W$ and the real-time constraint $T$, the system is not schedulable if $W(t)/t > 1$, $\forall t \in (0, T]$.
Proof. Suppose that the delay at the $i$th iteration is $t_i$. Then $w_{i+1} = W(t_i) > t_i$ and $w_{i+1} = k_{i+1} \cdot t_{i+1}$. Since $k_{i+1} \le 1$, $t_i < t_{i+1}$. That is, the delay increases as the iterations go by and will eventually reach the real-time constraint, $t_j = T$. At the next iteration, the system becomes unschedulable even with the full speed, as $W(t_j = T) = 1 \cdot t_{j+1} > T$ requires $t_{j+1} > T$.
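The argument of the proof can be replayed numerically. The sketch below is only an illustration under assumed values: the staircase (thresholds and levels) is hypothetical and chosen so that $W(t) > t$ holds on $(0, T]$, and the helper names are not part of the paper. At full speed ($k = 1$) the next delay equals the next workload, so the delays keep growing until the deadline is exceeded.

```python
from bisect import bisect_left
from typing import Callable, List, Sequence


def make_staircase(thresholds: Sequence[float],
                   levels: Sequence[float]) -> Callable[[float], float]:
    """Build W(t) with W(t) = levels[l] for thresholds[l-1] < t <= thresholds[l]."""
    def w(t: float) -> float:
        return levels[bisect_left(thresholds, t)]
    return w


def full_speed_delays(w: Callable[[float], float], t0: float,
                      deadline: float, max_iter: int = 100) -> List[float]:
    """Iterate t_{i+1} = W(t_i) (k = 1) until the delay exceeds the deadline."""
    delays = [t0]
    while delays[-1] <= deadline and len(delays) < max_iter:
        delays.append(w(delays[-1]))
    return delays


# Hypothetical staircase with W(t) > t on (0, T]; the last threshold equals T.
T = 1.0
W = make_staircase([0.2, 0.5, T], [0.3, 0.6, 1.2])
delays = full_speed_delays(W, 0.1, T)
```

With these assumed numbers the delay sequence is strictly increasing and its last element lies beyond $T$, which is the situation described in Theorem 1.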
Once the workload gets bigger than $T$, the system is trivially not schedulable afterwards even with the full speed $k_i = 1$. Thus, the workload must not be bigger than $T$ at any time. Moreover, once the workload reaches $T$, $k$ should remain at the full speed $k = 1$ afterwards. We can make the upper bound of the workload even tighter if there exists $t'$ such that $W(t) > t$ for all $t \in (t', T]$. In this case, a workload larger than $W(t')$ is not allowable as it makes the execution delay longer and longer, eventually violating the deadline.

Given the workload function $W$ and the initial workload $w_1$, one can calculate the lower bound of the workload as well. If a value $t$ exists which satisfies $W(t) = t$ and $W(t) < w_1$, the workload will never become smaller than $W(t)$. In other words, even with the full speed, the execution delay never goes below $t$.
Then the valid workload levels and the execution delay range during the lifetime of a given system can be formulated as below.

Definition 2 (valid ranges). Given the workload function $W$ and the initial workload $w_1$, the minimum and maximum workload levels of a system are defined to be

$$\mathrm{wl}_{\min} = \begin{cases} W(\max(t)) \ \text{s.t.}\ t = W(t) < w_1, & \text{if } \exists t, \\ w_1, & \text{otherwise}, \end{cases} \tag{8}$$

$$\mathrm{wl}_{\max} = \begin{cases} W(\min(t')) \ \text{s.t.}\ W(t) > t\ \ \forall t \in (t', T], & \text{if } \exists t', \\ W(T), & \text{otherwise}. \end{cases} \tag{9}$$

Then the valid range of the execution delay is formulated as $[t_{\min}, t_{\max}]$ according to $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$, with

$$\begin{aligned} t_{\min} &= \min(t) \ \text{s.t.}\ W(t) = \mathrm{wl}_{\min}, \\ t_{\max} &= \min(T, \max(t')) \ \text{s.t.}\ W(t') = \mathrm{wl}_{\max}. \end{aligned} \tag{10}$$
4.2 Workload Transitions. In this subsection, we examine when a workload transition between valid workload levels can occur and how it affects the system.

As presented in (4), the workload is a function of the delay taken at the previous iteration. Suppose that the delay taken at an iteration is $t$ with $\mathrm{th}_{l-1} < t \le \mathrm{th}_l$, which yields the workload $\mathrm{wl}_l$. If the system then works fast enough to result in a shorter delay $t' \le \mathrm{th}_{l-1}$, the next workload will get smaller than $\mathrm{wl}_l$. Similarly, in case the delay gets longer ($t' > \mathrm{th}_l$), the system will need to handle a larger workload than $\mathrm{wl}_l$ at the successive iteration.
However, such workload transitions can occur only within limited ranges. Figure 1 depicts valid and invalid transitions from one workload level. Figure 1(a) shows two transitions from a workload level $\mathrm{wl}_p$ to lower ones, $\mathrm{wl}_q$ and $\mathrm{wl}_r$ ($r < q < p$). To make the next workload level $\mathrm{wl}_q$, the delay should be in the range of $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. Given the current workload $\mathrm{wl}_p$, the speed scaling factor should be larger than or equal to $\mathrm{wl}_p / \mathrm{th}_q$ from (2). If $\mathrm{wl}_p / \mathrm{th}_q \le k_{\max} = 1$, this workload transition can possibly occur. In contrast, if $\mathrm{wl}_p / \mathrm{th}_r > 1$ for another workload level $\mathrm{wl}_r$, the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ never happens because the delay never goes below $\mathrm{th}_r$ even with the full processing speed.
The same principle also applies to the transition from a workload level to higher ones. If the delay can be lengthened properly with a speed scaling factor within the range $[k_{\min}, k_{\max}]$, the transitions are valid. Figure 1(b) illustrates that the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ is valid while the one to $\mathrm{wl}_r$ is not. One can tell whether a transition can happen or not with the following definition.
Definition 3 (valid transition). A workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if $\mathrm{wl}_o / \mathrm{th}_n \le k_{\max} = 1$ and $\mathrm{wl}_o / \mathrm{th}_{n-1} \ge k_{\min}$.
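The two conditions of Definition 3 can be checked directly, as in the minimal sketch below. The function name, the argument layout (`lower` for $\mathrm{th}_{n-1}$, `upper` for $\mathrm{th}_n$), and the convention that a lower threshold of 0 marks the lowest level are assumptions of this example, as are the sample numbers.

```python
def is_valid_transition(wl_o: float, lower: float, upper: float,
                        k_min: float, k_max: float = 1.0) -> bool:
    """Definition 3 with lower = th_{n-1} and upper = th_n of the target level."""
    # Even the fastest mode must be able to bring the delay down to th_n.
    fast_enough = wl_o / upper <= k_max
    # Even the slowest mode must be able to stretch the delay up to th_{n-1};
    # a lower threshold of 0 (lowest level) imposes no restriction.
    slow_enough = lower == 0 or wl_o / lower >= k_min
    return fast_enough and slow_enough


# Illustrative checks with assumed numbers (k_min = 0.4 or 0.5, k_max = 1).
same_or_lower = is_valid_transition(0.5, 0.6, 0.7, k_min=0.4)   # reachable
too_low = is_valid_transition(0.5, 0.0, 0.2, k_min=0.4)         # needs k > 1
too_high = is_valid_transition(0.3, 0.7, 1.0, k_min=0.5)        # needs k < k_min
```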
4.3 Workload Transition Graph. The essential difficulty of the presented power optimization problem lies in the fact that two conflicting forces should be handled at the same time. In order to minimize the power, on one hand, one tries to scale down the speed (thus lengthening the delay) as much as possible, as described in (2) and (6). On the other hand, the lengthened delay is not desirable as it imposes a bigger workload in the successive iteration, as shown in (4).

Therefore, no single simple intuition can be exploited to solve the problem. Rather, we need to compare different modes in a comprehensive way. In order to explore all possible execution modes and quantify their effects, we need to devise a data structure that includes elementary information on how workload transitions change the system status and power dissipation. In line with that purpose, we propose a graph representation of the workload evolution, the Workload Transition Graph (WTG), which captures all possible transition scenarios of the workload changes during the lifetime of a system.

A valid workload transition from one workload level to another can be caused by any delay within the corresponding range. A transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ in Figure 1(a), for instance, can be caused by any delay within the range of $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. In other words, when handling a workload of $\mathrm{wl}_p$, any scaling factor within the range of $[\mathrm{wl}_p / \mathrm{th}_q, \mathrm{wl}_p / \mathrm{th}_{q-1})$ can cause the transition. In the power-optimal execution trace, however, only one specific scaling factor is always chosen for a certain transition even though it happens many times. We show this in the next theorem.
Theorem 4 (optimal scaling factor). In the power-optimal trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$, if the workload level handled by $k_i$ is $\mathrm{wl}_o$ and the next workload level is $\mathrm{wl}_n$, then

$$k_i = \max\left(\frac{\mathrm{wl}_o}{\mathrm{th}_n}, k_{\min}\right). \tag{11}$$

Proof. We prove this by contradiction. Let us suppose that there exists a power-optimal trace $\mathrm{tr}$ where $k_i$ (which results in the transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$) is not equal to $\max(\mathrm{wl}_o / \mathrm{th}_n, k_{\min})$. Then we make a new execution trace $\mathrm{tr}'$ from $\mathrm{tr}$ by replacing $k_i$ with $k'_i = \max(\mathrm{wl}_o / \mathrm{th}_n, k_{\min})$. Note that all other execution modes in $\mathrm{tr}'$ are the same as in $\mathrm{tr}$. That is,

$$\mathrm{tr}(j) = \mathrm{tr}'(j), \quad \forall j \ne i. \tag{12}$$

By the definition of the scaling factor,

$$k_i \ge k_{\min}, \qquad k_i \ge \frac{\mathrm{wl}_o}{\mathrm{th}_n}, \tag{13}$$

since the transition is from $\mathrm{wl}_o$ to $\mathrm{wl}_n$. Thus,

$$\mathrm{tr}(i) = k_i > \mathrm{tr}'(i) = k'_i, \tag{14}$$

and accordingly we obtain

$$S(k_i) > S(k'_i). \tag{15}$$

Then, from (6) and (7), the average power in $\mathrm{tr}'$ is lower than that of $\mathrm{tr}$, and this contradicts the assumption that $\mathrm{tr}$ is power-optimal.
From Theorem 4, we know that only one speed scaling factor is associated with all transitions of a certain type in the power-optimal operation scenario. So we define the scaling factor of a valid transition as follows.

Definition 5 (scaling factor of a transition). The scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \max\left(\frac{\mathrm{wl}_o}{\min(\mathrm{th}_n, T)}, k_{\min}\right). \tag{16}$$
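Equation (16) translates into a one-line helper. The sketch below is only illustrative: the function name and the sample values are assumptions, and the transition is presumed to have been checked for validity beforehand.

```python
def scaling_factor(wl_o: float, th_n: float, deadline: float,
                   k_min: float) -> float:
    """Eq. (16): slowest admissible speed that still lands in the target range."""
    return max(wl_o / min(th_n, deadline), k_min)


# The delay bound th_n = 0.7 dominates: k = 0.5 / 0.7.
bounded_by_threshold = scaling_factor(0.5, 0.7, 1.0, k_min=0.4)
# The hardware lower bound k_min dominates, and min(th_n, T) clips th_n to T.
bounded_by_k_min = scaling_factor(0.3, 1.2, 1.0, k_min=0.4)
```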
Now we define the WTG.

Definition 6 (Workload Transition Graph). WTG is defined to be a graph $(V, E)$, where $V$ and $E$ are the sets of vertices and edges, respectively. Each valid workload level forms a vertex,

$$V = \{ v_i = \mathrm{wl}_i \mid \mathrm{wl}_{\min} \le \mathrm{wl}_i \le \mathrm{wl}_{\max} \}, \tag{17}$$

while a valid transition from a workload level to another forms an edge between vertices. That is, there exists an edge from $v_i$ to $v_j$ corresponding to a valid transition from $\mathrm{wl}_i$ to $\mathrm{wl}_j$:

$$E = \{ (v_i, v_j) \mid \exists \text{ valid transition from } \mathrm{wl}_i \text{ to } \mathrm{wl}_j \}. \tag{18}$$

We denote the source and destination vertices of an edge $e \in E$ as $\mathrm{src}(e)$ and $\mathrm{dst}(e)$, respectively.
With these definitions, a power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ can be represented as a walk (a sequence of vertices where any pair of consecutive vertices is connected through an edge) of length $|\mathrm{tr}| + 1$ in WTG. In a different form, the execution trace is a sequence of edges in WTG of length $|\mathrm{tr}|$, $(e_1, e_2, \ldots, e_{|\mathrm{tr}|})$, where the workload transition from $\mathrm{src}(e_i)$ to $\mathrm{dst}(e_i)$ is caused by the execution mode $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$.
Algorithm 1 shows how WTG is generated out of the given workload function $W$ and the initial workload $w_1$. After the initialization, all valid workload levels are added as vertices in lines (5)-(6). Then, for each ordered pair of workload levels (lines (7)-(8)), it is checked whether the in-between transition is valid or not in line (9). If valid, it is added as an edge in line (10).

(1) procedure GenerateWTG($W$, $w_1$)
(2)   $\mathrm{wl}_{\min} \leftarrow$ equation (8)    ⊳ Initializations
(3)   $\mathrm{wl}_{\max} \leftarrow$ equation (9)
(4)   $V \leftarrow \emptyset$, $E \leftarrow \emptyset$
(5)   for all $\mathrm{wl}_{\min} \le \mathrm{wl} \le \mathrm{wl}_{\max}$ do    ⊳ Vertices
(6)     $V \leftarrow V \cup \{\mathrm{wl}\}$
(7)   for all $\mathrm{wl}_{\min} \le \mathrm{wl}_1 \le \mathrm{wl}_{\max}$ do    ⊳ Edges
(8)     for all $\mathrm{wl}_{\min} \le \mathrm{wl}_2 \le \mathrm{wl}_{\max}$ do
(9)       if $\exists$ valid transition from $\mathrm{wl}_1$ to $\mathrm{wl}_2$ then
(10)        $E \leftarrow E \cup \{(\mathrm{wl}_1, \mathrm{wl}_2)\}$
Algorithm 1: Generation of WTG.

Figure 2: A workload staircase function example with six workload levels. [The plot shows $W$ against the delay $t$ with levels $\mathrm{wl}_1$ to $\mathrm{wl}_6$, thresholds $\mathrm{th}_1$ to $\mathrm{th}_5$, the deadline $T$, the lines $y = k_{\max} t$ and $y = k_{\min} t$, and the feasible delay range of $\mathrm{wl}_4$.]

Let us take Figure 2 as an example of workload function $W$. Once given the initial workload, one can easily get the valid workload levels according to (8) and (9). When $w_1 = \mathrm{wl}_2$, for instance, $\mathrm{wl}_{\min} = \mathrm{wl}_2$ and $\mathrm{wl}_{\max} = \mathrm{wl}_6$. Figure 3(b) illustrates the corresponding WTG of the workload function shown in Figure 2 with the initial workload of $\mathrm{wl}_2$. It is not a complete graph, as some pairs of vertices cannot be connected directly because the transition between them is not valid according to Definition 3.
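Algorithm 1 can be sketched in Python as follows, assuming a continuous range $[k_{\min}, 1]$ of scaling factors. This is a minimal sketch, not the authors' implementation: the data layout (a list of levels plus the thresholds between them), the function name, and the numbers are assumptions, and $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$ are passed in rather than derived from (8) and (9). Edges are annotated with the scaling factor of Definition 5.

```python
from typing import Dict, List, Sequence, Tuple


def generate_wtg(levels: Sequence[float], thresholds: Sequence[float],
                 deadline: float, k_min: float, wl_min: float, wl_max: float,
                 k_max: float = 1.0
                 ) -> Tuple[List[int], Dict[Tuple[int, int], float]]:
    """Vertices are indices of valid levels; edges map (o, n) to sf(wl_o, wl_n)."""
    uppers = list(thresholds) + [deadline]   # th_n of each level (top ends at T)
    lowers = [0.0] + list(thresholds)        # th_{n-1} of each level
    vertices = [i for i, wl in enumerate(levels) if wl_min <= wl <= wl_max]
    edges: Dict[Tuple[int, int], float] = {}
    for o in vertices:                       # every ordered pair, self-loops included
        for n in vertices:
            fast = levels[o] / uppers[n] <= k_max
            slow = lowers[n] == 0 or levels[o] / lowers[n] >= k_min
            if fast and slow:                # Definition 3
                edges[(o, n)] = max(levels[o] / min(uppers[n], deadline), k_min)
    return vertices, edges


# Assumed four-level staircase, used here only to exercise the function.
vertices, edges = generate_wtg(levels=[0.3, 0.4, 0.5, 0.6],
                               thresholds=[0.2, 0.6, 0.7], deadline=1.0,
                               k_min=0.4, wl_min=0.3, wl_max=0.6)
```

In this assumed configuration no level can return to the lowest one, because even the full speed cannot shorten the delay to 0.2 or less.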
The feasible delay range to handle workload $\mathrm{wl}_4$ is highlighted in Figure 2, justifying that vertex $\mathrm{wl}_4$ has three outgoing edges, to $\mathrm{wl}_6$, $\mathrm{wl}_5$, and itself. To be more specific, when the workload is given as $\mathrm{wl}_4$, the shortest possible delay is obtained when the speed is chosen as $k_{\max}$. Then the delay (the $x$ coordinate of the cross point of $y = \mathrm{wl}_4$ and $y = k_{\max} t$) is between $\mathrm{th}_3$ and $\mathrm{th}_4$, as shown in Figure 2. This means that the lowest possible workload in the next iteration is $\mathrm{wl}_4$. Similarly, the biggest possible workload can also be calculated as $\mathrm{wl}_6$, as the biggest possible delay is between $\mathrm{th}_5$ and $T$. Note also that some of the vertices, such as vertex $\mathrm{wl}_3$, may not have an edge directed to themselves (self-loop). It has outgoing edges only to higher workload levels. This means that the computation burden of that state is so big that it only results in higher workload levels at the next iteration, even with the full speed.
Different initial workload levels may result in different WTGs, as shown in Figures 3(a)–3(c). The WTG derived from the initial workload level of $\mathrm{wl}_1$ is illustrated in Figure 3(a). In contrast to Figure 3(b), vertex $\mathrm{wl}_1$ is included in the graph. The WTG derived from higher initial workload levels, $\mathrm{wl}_4$, $\mathrm{wl}_5$, and $\mathrm{wl}_6$, is shown in Figure 3(c). Note that the WTGs in Figures 3(a) and 3(c) are not strongly connected. Vertex $\mathrm{wl}_2$ in Figure 3(b), for example, is not reachable from $\mathrm{wl}_4$. However, from the definition of valid workload levels, all vertices are reachable from the initial workload level. This property is important for deriving the optimal operation policy that will be presented in the next subsection.
4.4 Proposed Operation Policy. In this subsection, we present the proposed operation policy that balances the energy-delay tradeoff caused by the delay-workload dependency.

As stated earlier, a power-optimal execution trace $\mathrm{tr}$ can be represented as a walk of WTG. Then we have the following definition.
Definition 7 (corresponding walk). Given the power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as

$$\mathrm{cw}^{\mathrm{tr}}_{w_1} = (e_1, e_2, \ldots, e_{|\mathrm{tr}|}), \tag{19}$$

where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also denote it as $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper). The average power consumption of the walk can be formulated as follows:

$$\mathrm{AVG}(\mathrm{cw}^{\mathrm{tr}}_{w_1}) = \frac{\mathrm{EG}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}{\mathrm{DL}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}, \tag{20}$$

where $\mathrm{DL}((e_1, e_2, \ldots, e_n))$ is the total delay elapsed for traversing the walk, $\sum_{i=1}^{n} \mathrm{src}(e_i) / \mathrm{sf}(e_i)$, and $\mathrm{EG}((e_1, e_2, \ldots, e_n))$ is the total energy consumption for the walk, $C_e \sum_{i=1}^{n} \mathrm{src}(e_i) \cdot S(\mathrm{sf}(e_i))$.
A cycle (closed walk) of WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in WTG is a cycle, the trace can be repeated indefinitely. We argue that, in case the length of the trace is sufficiently long ($|\mathrm{tr}| \to \infty$), the average power consumption is minimized when the cycle which minimizes (20) repeats over and over again in the trace.
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of WTG,

$$\mathrm{cy}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n), \tag{21}$$

has the minimum value of (20) among all cycles of the WTG. Then, if the length of the trace is long enough, the average power consumption of the optimal trace converges to

$$\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}. \tag{22}$$
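Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts a walk example of length 8 where the initial workload level is $\mathrm{wl}_2$ and the last vertex that it traverses is $\mathrm{wl}_6$. If a path from $\mathrm{wl}_2$ to $\mathrm{wl}_6$, $(e_1, e_2, e_5)$, highlighted with the dashed arrows, is removed from the walk, the remaining part is a set of cycles.

Now consider a power-optimal trace $\mathrm{tr}$ of length $N$ that starts from the workload level of $w_1$. The corresponding walk of $\mathrm{tr}$ can be decomposed into a path $\mathrm{pt}$ starting at $w_1$ and a set of cycles $\mathrm{CY}$. Then the average power consumption of the trace $\mathrm{tr}$ is

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\mathrm{EG}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{23}$$

Since $\mathrm{pt}$ contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(\mathrm{pt})$ and $\mathrm{DL}(\mathrm{pt})$ can be upper bounded by a certain value. Then, if $N$ is sufficiently large, $\mathrm{EG}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})$ and $\mathrm{DL}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})$. That is, if $|\mathrm{tr}| \to \infty$,

$$\mathrm{AVG}(\mathrm{tr}) \simeq \frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{24}$$

Note also that the average power consumption of every cycle in $\mathrm{CY}$ is not smaller than $\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}}) / \mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})$ by definition:

$$\frac{\mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{cy})} \ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}, \quad \forall \mathrm{cy} \in \mathrm{CY}. \tag{25}$$

Multiplying (25) by $\mathrm{DL}(\mathrm{cy})$ and summing over $\mathrm{CY}$ shows that the right-hand side of (24) is not smaller than $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$, with equality when only $\mathrm{cy}_{\mathrm{opt}}$ is repeated. Therefore, the average power consumption of the power-optimal trace $\mathrm{tr}$, $\mathrm{AVG}(\mathrm{tr})$, will get infinitesimally close to $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$.

Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $\mathrm{wl}_1$, (b) $\mathrm{wl}_2$ or $\mathrm{wl}_3$, and (c) $\mathrm{wl}_4$, $\mathrm{wl}_5$, or $\mathrm{wl}_6$. [Vertices: (a) $\mathrm{wl}_1$ to $\mathrm{wl}_6$; (b) $\mathrm{wl}_2$ to $\mathrm{wl}_6$; (c) $\mathrm{wl}_4$ to $\mathrm{wl}_6$.]

Figure 4: A walk example of the WTG shown in Figure 3(b) of length 8. [Vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$, edges $e_1$ to $e_8$.]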
From Theorem 8, it is understood that changing the DVFS mode of the given system following the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy:

(i) If the current workload level vertex is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$, just follow the cycle repeatedly; that is, at the $i$th iteration, the speed scaling factor is chosen to be $\mathrm{sf}(e)$ such that $e$ is in the optimal cycle and $\mathrm{src}(e) = w_i$.

(ii) Otherwise, take a path $\mathrm{pt}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n)$ that has the minimum $\mathrm{AVG}(\mathrm{pt}_{\mathrm{opt}})$ value and whose last vertex $\mathrm{dst}(e_n)$ is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$. In other words, try to get into the optimal cycle with the minimum cost.

Figure 5: Flowchart diagram of the proposed operation policy. [Steps: input $w_1$ and $W$; generate WTG; calculate $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$; set $i = 1$; if $w_i$ is in $\mathrm{cy}_{\mathrm{opt}}$, set $k_i = \mathrm{sf}(\mathrm{cy}_{\mathrm{opt}}(i))$, otherwise set $k_i = \mathrm{sf}(\mathrm{pt}_{\mathrm{opt}}(i))$; update $w_{i+1}$ and increment $i$.]
The optimal cycle $\mathrm{cy}_{\mathrm{opt}}$ of WTG can be searched by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $\mathrm{pt}_{\mathrm{opt}}$, can also be searched by simple enumeration. It is worthwhile to mention that we cannot simply use a minimum weight cycle searching algorithm, as the cost of a cycle is not a plain summation of edge weights but the ratio of two sums, as presented in (20).
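The search for the optimal cycle can be sketched as below. To keep the example short, it enumerates elementary cycles with a plain depth-first search rather than Tarjan's algorithm [18], which is adequate only for small graphs. The edge set is an assumption: it was worked out by hand by applying the discrete definitions of Section 5 to the parameters of the synthetic example in Section 6.2, and $C_e = 1$ is assumed as well.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def simple_cycles(edges: Dict[Edge, float]) -> List[List[int]]:
    """Enumerate elementary cycles by DFS; each cycle is rooted at its smallest vertex."""
    succ: Dict[int, List[int]] = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    cycles: List[List[int]] = []

    def dfs(start: int, node: int, path: List[int]) -> None:
        for nxt in succ.get(node, []):
            if nxt == start:
                cycles.append(path)              # closing edge node -> start
            elif nxt > start and nxt not in path:
                dfs(start, nxt, path + [nxt])

    for start in sorted(succ):
        dfs(start, start, [start])
    return cycles


def cycle_average_power(cycle: List[int], levels: Sequence[float],
                        edges: Dict[Edge, float],
                        s: Callable[[float], float], c_e: float = 1.0) -> float:
    """Eq. (20) for a cycle given as a vertex list: EG(cy) / DL(cy)."""
    hops = [(u, cycle[(i + 1) % len(cycle)]) for i, u in enumerate(cycle)]
    energy = c_e * sum(levels[u] * s(edges[(u, v)]) for u, v in hops)
    delay = sum(levels[u] / edges[(u, v)] for u, v in hops)
    return energy / delay


def optimal_cycle(levels, edges, s, c_e=1.0):
    """Return (average power, cycle) of the cycle that minimizes eq. (20)."""
    return min(((cycle_average_power(c, levels, edges, s, c_e), c)
                for c in simple_cycles(edges)), key=lambda pair: pair[0])


levels = [0.3, 0.4, 0.5, 0.6]                    # wl_1 .. wl_4 as indices 0 .. 3
edges = {(0, 1): 0.8, (0, 3): 0.4, (1, 1): 0.8, (1, 3): 0.4,
         (2, 1): 0.9, (2, 2): 0.8, (3, 2): 0.9, (3, 3): 0.8}
best_power, best_cycle = optimal_cycle(levels, edges, lambda k: 5 * k ** 3)
```

Under these assumptions the search returns the vertex list `[1, 3, 2]`, that is, the alternating cycle $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$, whose ratio in (20) is lower than that of any self-loop. A minimum weight cycle routine would not be a drop-in replacement here because the objective is a ratio of sums.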
Figure 5 shows the proposed operation policy in a flowchart diagram. Given the initial workload and the workload function, we generate WTG, from which $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are derived in the next steps. It is worth mentioning that this can be done in a tractable time and is just a one-time effort taken offline. If the initial workload vertex is not included in $\mathrm{cy}_{\mathrm{opt}}$, the system follows the trace represented in $\mathrm{pt}_{\mathrm{opt}}$ until it reaches the optimal cycle of WTG. Then it simply repeats the trace implied in $\mathrm{cy}_{\mathrm{opt}}$ from that iteration on.
5 Extension to Discrete DVFS
While we assume a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have finite DVFS modes with a set of predefined operation voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.

Let us suppose that we now have a system with a discrete and finite DVFS model where only $k_i \in K$ can be chosen as a speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: the transition is valid if there is $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if

$$\mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T), \quad \exists k \in K. \tag{26}$$
Likewise, the scaling factor of a valid transition in Definition 5 is also redefined to be the minimum $k \in K$ that keeps the elapsed delay within the range which results in the same transition. One has the following definition.

Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \operatorname*{arg\,min}_{k \in K} k \quad \text{s.t.}\quad \mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T). \tag{27}$$
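Definitions 9 and 10 can be combined into a single pass over all pairs of levels, as in the following sketch. The function name and data layout are assumptions of this example, and for brevity it considers every level instead of restricting the vertices to $[\mathrm{wl}_{\min}, \mathrm{wl}_{\max}]$. The input values are those of the synthetic example in Section 6.2.

```python
from typing import Dict, Sequence, Tuple


def discrete_wtg(levels: Sequence[float], thresholds: Sequence[float],
                 deadline: float, factors: Sequence[float]
                 ) -> Dict[Tuple[int, int], float]:
    """Map each valid transition (o, n) to its scaling factor under a discrete K."""
    uppers = list(thresholds) + [deadline]   # th_n of each level (top ends at T)
    lowers = [0.0] + list(thresholds)        # th_{n-1} of each level
    edges: Dict[Tuple[int, int], float] = {}
    for o, wl in enumerate(levels):
        for n in range(len(levels)):
            # Definition 9: factors whose delay wl / k falls into level n's range.
            admissible = [k for k in factors
                          if lowers[n] < wl / k <= min(uppers[n], deadline)]
            if admissible:
                edges[(o, n)] = min(admissible)   # Definition 10: smallest such k
    return edges


edges = discrete_wtg(levels=[0.3, 0.4, 0.5, 0.6], thresholds=[0.2, 0.6, 0.7],
                     deadline=1.0, factors=[0.4, 0.8, 0.9])
```

For these inputs the sketch yields eight edges, four annotated with 0.8 and two each with 0.4 and 0.9, and every self-loop uses $k = 0.8$.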
6 Experiments
In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1 A Case Study: Object Tracking. It is first shown that the proposed workload-delay dependency is evidently observed in an object tracking application. The performance of a commonly used object tracking method [19] is profiled as a tracking solution using the publicly available implementation [20]. We choose Exynos5422 [21] as the target mobile embedded computing platform, which has 2 GB of main memory and runs the Linux operating system. The actual power dissipations of the processor are measured individually for five different DVFS modes. That is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with the side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner at every 5 ms. The real-time constraint $T$ is set to 25 ms. The workload function in the shape of a staircase is illustrated in Figure 6 with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of the Lucas-Kanade object tracking algorithm. [Axes: delay (ms) on the horizontal axis and workload (ms) on the vertical axis; curves: $W(t)$, the real-time constraint $T$, and the lines $k \cdot t$ for $k = 0.2, 0.4, 0.6, 0.8, 1.0$.]
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one with respect to the real-time constraint. The other comparison is made against a stable trace with the maximum speed as another extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in the average power consumption of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms the others with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode, $\forall i$, $k_i = 0.6$.
6.2 Stable versus Alternating Operation Modes. There are two kinds of cycles in WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. The other kind is a non-self-loop cycle. From the perspective of the operation policy, a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. In order to illustrate this, we present a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, an alternating operation mode implied in $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot always operate at an ideal design point. In case the theoretically optimal design point cannot be realized by commodity hardware, the proposed technique is particularly useful. It can effectively explore the design space and find the best one among alternating modes.

Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$, and (b) its WTG representation where the edges are annotated with scaling factors. [Panel (a) marks $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, $T = 1.0$, and the lines for $k = 0.4$, $0.8$, and $0.9$.]
In principle, a stable operation mode corresponds to the case where the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. Particularly, in case only a limited number of DVFS modes are available in a microprocessor, the $k$ which crosses the current workload level may not exist in $K$.
7 Conclusion
This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation called WTG is proposed for exploring all possible workload mode changes. Then, it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by the power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode is often the best discipline (self-loop in WTG). However, alternating modes, where the DVFS modes change over a predefined pattern, sometimes outperform the stable ones.
Disclosure
A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title of "Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff" [11].
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, "Cyber physical systems: design challenges," Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, "Cyber-physical energy systems: focus on smart buildings," in Proceedings of the 47th Design Automation Conference (DAC '10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, "Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems," IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, "Vision based GPS-denied object tracking and following for unmanned aerial vehicles," in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR '13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, "Time-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformable models," in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, "Six-DoF haptic rendering of contact between geometrically complex reduced deformable models," IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, "Power: a first class design constraint for future architectures," in High Performance Computing - HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, "Modeling and power optimization of cyber-physical systems with energy-workload tradeoff," in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED '15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, "A formal approach to power optimization in cpss with delay-workload dependence awareness," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, "Cyberphysical systems: workload modeling and design optimization," IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, "Task scheduling for control oriented requirements for cyber-physical systems," in Proceedings of the Real-Time Systems Symposium (RTSS '08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, "Co-design of anytime computation and robust control," in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS '15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, "Co-design of cyber-physical systems via controllers with flexible delay constraints," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, "Combinatorial optimization: polyhedra and efficiency," Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, "Enumeration of the elementary circuits of a directed graph," SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, "Pyramidal implementation of the affine lucas kanade feature tracker: description of the algorithm," vol. 5, pp. 1–10, 2001.
[20] G. Bradski, "The opencv library," Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung, Exynos Octa 5422, 2015, http://www.samsung.com.
4 Mobile Information Systems
Figure 1: Examples of workload transitions: (a) transitions to lower workload levels and (b) transitions to higher workload levels. The transitions from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ are valid, while the ones from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ are not. In (a), the line $w = t$ separates the valid case $\mathrm{wl}_p/\mathrm{th}_q \le 1$ from the invalid case $\mathrm{wl}_p/\mathrm{th}_r > 1$; in (b), the line $w = k_{\min} t$ separates the valid case $\mathrm{wl}_p/\mathrm{th}_{q-1} \ge k_{\min}$ from the invalid case $\mathrm{wl}_p/\mathrm{th}_{r-1} < k_{\min}$.
be schedulable within the given real-time constraint $T$ at every iteration.

Theorem 1 (schedulability). Given the workload function $W$ and the real-time constraint $T$, the system is not schedulable if $W(t)/t > 1$, $\forall t \in (0, T]$.

Proof. Suppose that the delay at the $i$th iteration is $t_i$. Then $w_{i+1} = W(t_i) > t_i$ and $w_{i+1} = k_{i+1} \cdot t_{i+1}$. Since $k_{i+1} \le 1$, $t_i < t_{i+1}$. That is, the delay increases from iteration to iteration and will eventually reach the real-time constraint, $t_j = T$. At the next iteration, the system becomes unschedulable even with the full speed, as $W(t_j = T) = 1 \cdot t_{j+1} > T$ requires $t_{j+1} > T$.
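The argument of the proof can be replayed numerically. The following is a minimal sketch, assuming a hypothetical staircase function (the thresholds and levels are illustrative and not taken from the paper) for which $W(t) > t$ holds on the whole interval $(0, T]$; the helper names `make_staircase` and `full_speed_delays` are likewise assumptions.

```python
from bisect import bisect_left


def make_staircase(thresholds, levels):
    """Build W(t) with W(t) = levels[l] for thresholds[l-1] < t <= thresholds[l].

    The last entry of `thresholds` plays the role of the real-time constraint T.
    """
    def W(t):
        return levels[bisect_left(thresholds, t)]
    return W


def full_speed_delays(W, t0, T, max_iters=1000):
    """Delays observed when every iteration runs at full speed (k = 1)."""
    delays = [t0]
    while delays[-1] <= T and len(delays) < max_iters:
        # With k = 1, the next delay equals the next workload W(t_i).
        delays.append(W(delays[-1]))
    return delays


# Hypothetical staircase with W(t) > t everywhere on (0, T], where T = 8.
T = 8.0
W = make_staircase(thresholds=[2.0, 4.0, 6.0, 8.0], levels=[3.0, 5.0, 7.0, 9.0])
delays = full_speed_delays(W, t0=1.0, T=T)
print(delays)  # the sequence grows monotonically until it exceeds T
```

Even at the maximum speed the delay sequence is strictly increasing here, so the deadline is eventually missed, which is the situation Theorem 1 rules out.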
Once the workload gets bigger than $T$, the system is trivially not schedulable afterwards, even with the full speed $k_i = 1$. Thus, the workload must not be bigger than $T$ at any time. Moreover, once the workload reaches $T$, $k$ should remain at the full speed, $k = 1$, afterwards. We can make the upper bound of the workload even tighter if there exists $t'$ such that $W(t) > t$ for all $t \in (t', T]$. In this case, a workload larger than $W(t')$ is not allowable, as it makes the execution delay longer and longer, eventually violating the deadline.

Given the workload function $W$ and the initial workload $w_1$, one can calculate the lower bound of the workload as well. If a value $\hat{t}$ exists which satisfies $W(\hat{t}) = \hat{t}$ and $W(\hat{t}) < w_1$, the workload will never become smaller than $W(\hat{t})$. In other words, even with the full speed, the execution delay never goes below $\hat{t}$.
Then the valid workload levels and the execution delay range during the lifetime of a given system can be formulated as below.

Definition 2 (valid ranges). Given the workload function $W$ and the initial workload $w_1$, the minimum and maximum workload levels of a system are defined to be

$$\mathrm{wl}_{\min} = \begin{cases} W(\max(\hat{t})) \ \text{s.t.}\ \hat{t} = W(\hat{t}) < w_1, & \text{if } \exists\, \hat{t}, \\ \mathrm{wl}_1, & \text{otherwise,} \end{cases} \tag{8}$$

$$\mathrm{wl}_{\max} = \begin{cases} W(\min(t')) \ \text{s.t.}\ W(t) > t,\ \forall t \in (t', T], & \text{if } \exists\, t', \\ W(T), & \text{otherwise.} \end{cases} \tag{9}$$

Then the valid range of the execution delay is formulated as $[t_{\min}, t_{\max}]$ according to $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$, with

$$\begin{aligned} t_{\min} &= \min(t) \ \text{s.t.}\ W(t) = \mathrm{wl}_{\min}, \\ t_{\max} &= \min(T, \max(t')) \ \text{s.t.}\ W(t') = \mathrm{wl}_{\max}. \end{aligned} \tag{10}$$
4.2. Workload Transitions. In this subsection, we examine when a workload transition between valid workload levels can occur and how it affects the system.

As presented in (4), the workload is a function of the delay taken at the previous iteration. Suppose that the delay taken at an iteration is $t$ with $\mathrm{th}_{l-1} < t \le \mathrm{th}_l$, so that the resulting workload level is $\mathrm{wl}_l$. Then, if the system works fast enough to result in a shorter delay $t' \le \mathrm{th}_{l-1}$, the next workload will get smaller than $\mathrm{wl}_l$. Similarly, in case the delay gets longer ($t' > \mathrm{th}_l$), the system will need to handle a larger workload than $\mathrm{wl}_l$ at the successive iteration.
However, such workload transitions can occur only within limited ranges. Figure 1 depicts valid and invalid transitions from one workload level. Figure 1(a) shows two transitions from a workload level $\mathrm{wl}_p$ to lower ones, $\mathrm{wl}_q$ and $\mathrm{wl}_r$ ($r < q < p$). To make the next workload level $\mathrm{wl}_q$, the delay should be in the range $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. Given the current workload $\mathrm{wl}_p$, the speed scaling factor should be larger than or equal to $\mathrm{wl}_p/\mathrm{th}_q$ from (2). If $\mathrm{wl}_p/\mathrm{th}_q \le k_{\max} = 1$, this workload transition can possibly occur. In contrast, if $\mathrm{wl}_p/\mathrm{th}_r > 1$ for another workload level $\mathrm{wl}_r$, the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_r$ never happens, because the delay never goes below $\mathrm{th}_r$ even with the full processing speed.
The same principle also applies to the transition from a workload level to higher ones. If the delay can be lengthened properly with a speed scaling factor within the range $[k_{\min}, k_{\max}]$, the transitions are valid. Figure 1(b) illustrates that the transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ is valid while the one to $\mathrm{wl}_r$ is not. One can tell whether a transition can happen or not with the following definition.
Definition 3 (valid transition). A workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if $\mathrm{wl}_o/\mathrm{th}_n \le k_{\max} = 1$ and $\mathrm{wl}_o/\mathrm{th}_{n-1} \ge k_{\min}$.
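The two conditions of Definition 3 translate directly into a small predicate. The following is a minimal sketch; the function name, its argument order, and the convention that a lower threshold of 0 denotes the lowest workload level (whose delay range starts at 0) are assumptions made for illustration.

```python
def is_valid_transition(wl_o, th_prev, th_n, k_min, k_max=1.0):
    """Check Definition 3 for a transition from workload wl_o to the level
    whose delay range is (th_prev, th_n]."""
    # Fast enough: some speed k <= k_max must bring the delay down to th_n.
    reachable_at_full_speed = wl_o / th_n <= k_max
    # Slow enough: some speed k >= k_min must stretch the delay up to th_prev.
    # th_prev == 0 stands for the lowest level, which needs no stretching.
    reachable_at_min_speed = th_prev == 0 or wl_o / th_prev >= k_min
    return reachable_at_full_speed and reachable_at_min_speed


# Hypothetical numbers: a workload of 5 and candidate target delay ranges.
print(is_valid_transition(5.0, 4.0, 6.0, k_min=0.5))   # target range (4, 6]
print(is_valid_transition(5.0, 0.0, 2.0, k_min=0.5))   # target range (0, 2]
print(is_valid_transition(5.0, 8.0, 12.0, k_min=0.7))  # target range (8, 12]
```

In the second call the delay cannot drop to 2 even at full speed, and in the third call the slowest allowed speed cannot stretch the delay beyond 8, so both transitions are rejected.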
4.3. Workload Transition Graph. The essential difficulty of the presented power optimization problem lies in the fact that two conflicting forces should be handled at the same time. In order to minimize the power, on one hand, it tries to scale down the speed (thus lengthening the delay) as much as possible, as described in (2) and (6). On the other hand, the lengthened delay is not desirable, as it imposes a bigger workload in the successive iteration, as shown in (4).

Therefore, no single simple intuition can be exploited to solve the problem. Rather, we need to compare different modes in a comprehensive way. In order to explore all possible execution modes and quantify their effects, we need to devise a data structure that includes elementary information on how workload transitions change the system status and power dissipation. In line with that purpose, we propose a graph representation of the workload evolution, the Workload Transition Graph (WTG), which captures all possible transition scenarios of the workload changes during the lifetime of a system.

A valid workload transition from one workload level to another can be caused by any delay within the corresponding range. A transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ in Figure 1(a), for instance, can be caused by any delay within the range $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. In other words, when handling a workload of $\mathrm{wl}_p$, any scaling factor within the range $[\mathrm{wl}_p/\mathrm{th}_q, \mathrm{wl}_p/\mathrm{th}_{q-1})$ can cause the transition. In the power-optimal execution trace, however, only one specific scaling factor is always chosen for a certain transition, even though it happens many times. We show this in the next theorem.
Theorem 4 (optimal scaling factor). In the power-optimal trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$, if the workload level handled by $k_i$ is $\mathrm{wl}_o$ and the next workload level is $\mathrm{wl}_n$,

$$k_i = \max\left(\frac{\mathrm{wl}_o}{\mathrm{th}_n}, k_{\min}\right). \tag{11}$$

Proof. We prove this by contradiction. Let us suppose that there exists a power-optimal trace $\mathrm{tr}$ where $k_i$ (which results in the transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$) is not equal to $\max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Then we make a new execution trace $\mathrm{tr}'$ from $\mathrm{tr}$ by replacing $k_i$ with $k'_i = \max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Note that all other execution modes in $\mathrm{tr}'$ are the same as in $\mathrm{tr}$. That is,

$$\mathrm{tr}(j) = \mathrm{tr}'(j), \quad \forall j \ne i. \tag{12}$$

By the definition of the scaling factor,

$$k_i \ge k_{\min}, \qquad k_i \ge \frac{\mathrm{wl}_o}{\mathrm{th}_n}, \tag{13}$$

since the transition is from $\mathrm{wl}_o$ to $\mathrm{wl}_n$. Both lower bounds hold, so $k_i \ge k'_i$, and since $k_i \ne k'_i$ by assumption,

$$\mathrm{tr}(i) = k_i > \mathrm{tr}'(i) = k'_i, \tag{14}$$

and accordingly we obtain

$$S(k_i) > S(k'_i). \tag{15}$$

Then, from (6) and (7), the average power in $\mathrm{tr}'$ is lower than that of $\mathrm{tr}$, and this contradicts the assumption that $\mathrm{tr}$ is power-optimal.
From Theorem 4, we know that only one speed scaling factor is associated with all transitions of a certain type in the power-optimal operation scenario. So we define the scaling factor of a valid transition as follows.

Definition 5 (scaling factor of a transition). The scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \max\left(\frac{\mathrm{wl}_o}{\min(\mathrm{th}_n, T)}, k_{\min}\right). \tag{16}$$
Now we define WTG.

Definition 6 (Workload Transition Graph). WTG is defined to be a graph $(V, E)$, where $V$ and $E$ are the sets of vertices and edges, respectively. Each valid workload level forms a vertex,

$$V = \{v_i = \mathrm{wl}_i \mid \mathrm{wl}_{\min} \le \mathrm{wl}_i \le \mathrm{wl}_{\max}\}, \tag{17}$$

while a valid transition from a workload level to another forms an edge between vertices. That is, there exists an edge from $v_i$ to $v_j$ corresponding to a valid transition from $\mathrm{wl}_i$ to $\mathrm{wl}_j$:

$$E = \{(v_i, v_j) \mid \exists \text{ valid transition from } \mathrm{wl}_i \text{ to } \mathrm{wl}_j\}. \tag{18}$$

We denote the source and destination vertices of an edge $e \in E$ as $\mathrm{src}(e)$ and $\mathrm{dst}(e)$, respectively.
With these definitions, a power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ can be represented as a walk (a sequence of vertices where any pair of consecutive vertices is connected through an edge) of length $|\mathrm{tr}| + 1$ in WTG. In a different form, the execution trace is a sequence of edges in WTG of length $|\mathrm{tr}|$, $(e_1, e_2, \ldots, e_{|\mathrm{tr}|})$, where the workload transition from $\mathrm{src}(e_i)$ to $\mathrm{dst}(e_i)$ is caused by the execution mode $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$.
Algorithm 1 shows how WTG is generated out of the given workload function $W$ and the initial workload $w_1$. After the initialization, all valid workload levels are added as vertices in lines (5)-(6). Then, for each ordered pair of workload levels (lines (7)-(8)), it is checked in line (9) whether the in-between transition is valid. If valid, it is added as an edge in line (10).
Algorithm 1: Generation of WTG.
(1) procedure GenerateWTG($W$, $w_1$)
(2)   $\mathrm{wl}_{\min} \leftarrow$ equation (8)   ⊳ Initializations
(3)   $\mathrm{wl}_{\max} \leftarrow$ equation (9)
(4)   $V \leftarrow \emptyset$, $E \leftarrow \emptyset$
(5)   for all $\mathrm{wl}_{\min} \le \mathrm{wl} \le \mathrm{wl}_{\max}$ do   ⊳ Vertices
(6)     $V \leftarrow V \cup \{\mathrm{wl}\}$
(7)   for all $\mathrm{wl}_{\min} \le \mathrm{wl}_1 \le \mathrm{wl}_{\max}$ do   ⊳ Edges
(8)     for all $\mathrm{wl}_{\min} \le \mathrm{wl}_2 \le \mathrm{wl}_{\max}$ do
(9)       if $\exists$ valid transition from $\mathrm{wl}_1$ to $\mathrm{wl}_2$ then
(10)        $E \leftarrow E \cup \{(\mathrm{wl}_1, \mathrm{wl}_2)\}$

Figure 2: A workload staircase function example with six workload levels ($\mathrm{wl}_1$ to $\mathrm{wl}_6$, thresholds $\mathrm{th}_1$ to $\mathrm{th}_5$, and real-time constraint $T$). The lines $y = k_{\max} t$ and $y = k_{\min} t$ bound the feasible delays, and the feasible delay range of $\mathrm{wl}_4$ is highlighted.

Let us take Figure 2 as an example of the workload function $W$. Once the initial workload is given, one can easily get the valid workload levels according to (8) and (9). When $w_1 = \mathrm{wl}_2$, for instance, $\mathrm{wl}_{\min} = \mathrm{wl}_2$ and $\mathrm{wl}_{\max} = \mathrm{wl}_6$. Figure 3(b) illustrates the corresponding WTG of the workload function shown in Figure 2 with the initial workload of $\mathrm{wl}_2$. It is not a complete graph, as some pairs of vertices cannot be connected directly because the corresponding transitions are not valid according to Definition 3.
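The procedure of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the staircase is passed as a list of levels and thresholds, $\mathrm{wl}_{\min}$ and $\mathrm{wl}_{\max}$ are supplied by the caller instead of being derived from (8) and (9), the top level is assumed to end at $T$, and the numbers in the example are hypothetical rather than those of Figure 2. Edges are annotated with the scaling factor of Definition 5.

```python
def generate_wtg(levels, thresholds, T, k_min, wl_min, wl_max, k_max=1.0):
    """Return (V, E) where E maps (wl_o, wl_n) to the scaling factor of (16).

    levels[l] is the workload for delays in (lower[l], upper[l]].
    """
    lower = [0.0] + list(thresholds)   # th_{l-1} of each level
    upper = list(thresholds) + [T]     # th_l of each level (T for the top one)
    valid = [l for l, wl in enumerate(levels) if wl_min <= wl <= wl_max]

    V = {levels[l] for l in valid}     # lines (5)-(6): vertices
    E = {}
    for o in valid:                    # lines (7)-(8): ordered pairs
        for n in valid:
            wl_o = levels[o]
            # Line (9): validity check of Definition 3.
            fast_ok = wl_o / upper[n] <= k_max
            slow_ok = lower[n] == 0 or wl_o / lower[n] >= k_min
            if fast_ok and slow_ok:    # line (10): add the edge
                E[(wl_o, levels[n])] = max(wl_o / min(upper[n], T), k_min)
    return V, E


# Hypothetical staircase: workload 3 for t in (0, 4], 5 for (4, 6], 7 for (6, 8].
V, E = generate_wtg(levels=[3.0, 5.0, 7.0], thresholds=[4.0, 6.0],
                    T=8.0, k_min=0.4, wl_min=3.0, wl_max=7.0)
for (src, dst), k in sorted(E.items()):
    print(f"{src} -> {dst}: k = {k:.3f}")
```

In this hypothetical instance no edge leads to a lower workload level, because even the full speed cannot shorten the delay enough; the graph is therefore not complete, in the same sense as discussed for Figure 3(b).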
The feasible delay range to handle workload $\mathrm{wl}_4$ is highlighted in Figure 2, justifying that vertex $\mathrm{wl}_4$ has three outgoing edges: to $\mathrm{wl}_6$, $\mathrm{wl}_5$, and itself. To be more specific, when the workload is given as $\mathrm{wl}_4$, the shortest possible delay is obtained when the speed is chosen as $k_{\max}$. Then the delay (the $x$ coordinate of the cross point of $y = \mathrm{wl}_4$ and $y = k_{\max} t$) is between $\mathrm{th}_3$ and $\mathrm{th}_4$, as shown in Figure 2. This means that the lowest possible workload in the next iteration is $\mathrm{wl}_4$. Similarly, the biggest possible workload can be calculated as $\mathrm{wl}_6$, as the biggest possible delay is between $\mathrm{th}_5$ and $T$. Note also that some of the vertices may not have an edge directed to itself (self-loop), such as vertex $\mathrm{wl}_3$. It has outgoing edges only to higher workload levels. This means that the computation burden of that state is so big that it only results in higher workload levels at the next iteration, even with the full speed.
Different initial workload levels may result in different WTGs, as shown in Figures 3(a)–3(c). The WTG derived from the initial workload level of $\mathrm{wl}_1$ is illustrated in Figure 3(a). In contrast to Figure 3(b), vertex $\mathrm{wl}_1$ is included in the graph. The WTG derived from higher initial workload levels, $\mathrm{wl}_4$, $\mathrm{wl}_5$, and $\mathrm{wl}_6$, is shown in Figure 3(c). Note that the WTGs in Figures 3(a) and 3(c) are not strongly connected. Vertex $\mathrm{wl}_2$ in Figure 3(b), for example, is not reachable from $\mathrm{wl}_4$. However, from the definition of valid workload levels, all vertices are reachable from the initial workload level. This property is important for deriving the optimal operation policy that will be presented in the next subsection.
4.4. Proposed Operation Policy. In this subsection, we present the proposed operation policy, which balances the energy-delay tradeoff caused by the delay-workload dependency.

As stated earlier, a power-optimal execution trace $\mathrm{tr}$ can be represented as a walk of WTG. Then we have the following definition.
Definition 7 (corresponding walk). Given the power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as

$$\mathrm{cw}^{\mathrm{tr}}_{w_1} = (e_1, e_2, \ldots, e_{|\mathrm{tr}|}), \tag{19}$$

where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also denote it as $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper). The average power consumption of the walk can be formulated as follows:

$$\mathrm{AVG}(\mathrm{cw}^{\mathrm{tr}}_{w_1}) = \frac{\mathrm{EG}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}{\mathrm{DL}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}, \tag{20}$$

where $\mathrm{DL}((e_1, e_2, \ldots, e_n))$ is the total delay elapsed for traversing the walk, $\sum_{i=1}^{n} \mathrm{src}(e_i)/\mathrm{sf}(e_i)$, and $\mathrm{EG}((e_1, e_2, \ldots, e_n))$ is the total energy consumption for the walk, $C_e \sum_{i=1}^{n} \mathrm{src}(e_i)\, S(\mathrm{sf}(e_i))$.
A cycle (closed walk) of WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in WTG is a cycle, the trace can be repeated forever. We argue that, when the length of the trace is sufficiently long ($|\mathrm{tr}| \to \infty$), the average power consumption is minimized when the cycle which minimizes (20) repeats over and over again in the trace.
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of WTG,

$$\mathrm{cy}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n), \tag{21}$$

has the minimum value of (20) among all cycles of the WTG. Then, if the length of the trace is long enough, the average power consumption of the optimal trace converges to

$$\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}. \tag{22}$$
Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts an example walk of length 8, where the initial workload level is $\mathrm{wl}_2$ and the last vertex that it traverses is $\mathrm{wl}_6$. If a path from $\mathrm{wl}_2$ to $\mathrm{wl}_6$, $(e_1, e_2, e_5)$, highlighted with dashed arrows, is removed from the walk, the remaining part is a set of cycles.
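Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $\mathrm{wl}_1$ (vertices $\mathrm{wl}_1$ to $\mathrm{wl}_6$), (b) $\mathrm{wl}_2$ or $\mathrm{wl}_3$ (vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$), and (c) $\mathrm{wl}_4$, $\mathrm{wl}_5$, or $\mathrm{wl}_6$ (vertices $\mathrm{wl}_4$ to $\mathrm{wl}_6$).

Figure 4: A walk example of the WTG shown in Figure 3(b) of length 8 (edges $e_1$ to $e_8$ over vertices $\mathrm{wl}_2$ to $\mathrm{wl}_6$).

Now consider a power-optimal trace $\mathrm{tr}$ of length $N$ that starts from the workload level of $w_1$. The corresponding walk of $\mathrm{tr}$ can be decomposed into a path $\mathrm{pt}$ starting at $w_1$ and a set of cycles $\mathrm{CY}$. Then the average power consumption of the trace $\mathrm{tr}$ is

$$\mathrm{AVG}(\mathrm{tr}) = \frac{\mathrm{EG}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{23}$$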
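Since $\mathrm{pt}$ contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(\mathrm{pt})$ and $\mathrm{DL}(\mathrm{pt})$ are upper bounded by a certain value. Then, if $N$ is sufficiently large, $\mathrm{EG}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})$ and $\mathrm{DL}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})$. That is, if $|\mathrm{tr}| \to \infty$,

$$\mathrm{AVG}(\mathrm{tr}) \simeq \frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{24}$$

Note also that the average power consumption of every cycle in $\mathrm{CY}$ is not smaller than $\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})/\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})$ by definition:

$$\frac{\mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{cy})} \ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}, \quad \forall \mathrm{cy} \in \mathrm{CY}. \tag{25}$$

Since the ratio of sums in (24) cannot be smaller than the smallest of the per-cycle ratios, it is bounded below by $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$, and the bound is attained when $\mathrm{CY}$ consists of repetitions of $\mathrm{cy}_{\mathrm{opt}}$. Therefore, the average power consumption of the power-optimal trace $\mathrm{tr}$, $\mathrm{AVG}(\mathrm{tr})$, will get arbitrarily close to $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$.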
From Theorem 8, it is understood that changing the DVFS mode of the given system following the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy.

(i) If the current workload level vertex is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$, follow the cycle repeatedly; that is, at the $i$th iteration, the speed scaling factor is chosen to be $\mathrm{sf}(e)$ such that $e$ is in the optimal cycle and $\mathrm{src}(e) = w_i$.

(ii) Otherwise, take a path $\mathrm{pt}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n)$ that has the minimum $\mathrm{AVG}(\mathrm{pt}_{\mathrm{opt}})$ value and whose $\mathrm{dst}(e_n)$ is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$. In other words, try to get into the optimal cycle with the minimum cost.

Figure 5: Flowchart diagram of the proposed operation policy (start; input $w_1$ and $W$; generate WTG; calculate $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$; set $i = 1$; if $w_i$ is in $\mathrm{cy}_{\mathrm{opt}}$, set $k_i = \mathrm{sf}(\mathrm{cy}_{\mathrm{opt}}(i))$, otherwise set $k_i = \mathrm{sf}(\mathrm{pt}_{\mathrm{opt}}(i))$; update $w_{i+1}$ and increment $i$).
The optimal cycle $\mathrm{cy}_{\mathrm{opt}}$ of WTG can be searched by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $\mathrm{pt}_{\mathrm{opt}}$, can also be searched by simple enumeration. It is worthwhile to mention that we cannot simply use a minimum weight cycle searching algorithm, as the weight of a cycle is not simply a summation of the edge weights but a more complex function of them (a ratio of two sums), as presented in (20).

Figure 5 shows the proposed operation policy in a flowchart diagram. Given the initial workload and the workload function, we generate WTG, from which $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are derived in the next steps. It is worth mentioning that this can be done in a tractable time and is just a one-time effort taken offline. As long as the initial workload vertex is not included in $\mathrm{cy}_{\mathrm{opt}}$, the system follows the trace represented in $\mathrm{pt}_{\mathrm{opt}}$ until it reaches the optimal cycle of WTG. Then it simply repeats the trace implied in $\mathrm{cy}_{\mathrm{opt}}$ from that iteration on.
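The offline search for $\mathrm{cy}_{\mathrm{opt}}$ can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: simple cycles are enumerated with a plain depth-first search instead of Tarjan's algorithm [18], the average power follows (20) as written above, and the graph, the function $S(k) = k^2$, and $C_e = 1$ in the example are hypothetical choices.

```python
def simple_cycles(edges):
    """Enumerate the simple cycles of a directed graph by depth-first search.

    Each cycle is reported once, rooted at its smallest vertex, as a list of
    (src, dst) edges. This is a plain DFS meant for small graphs only.
    """
    succ = {}
    for (u, v) in edges:
        succ.setdefault(u, []).append(v)
    cycles = []

    def extend(root, path):
        for v in succ.get(path[-1], []):
            if v == root:
                cycles.append(list(zip(path, path[1:] + [root])))
            elif v > root and v not in path:
                extend(root, path + [v])

    for root in sorted(succ):
        extend(root, [root])
    return cycles


def average_power(cycle, sf, S, C_e=1.0):
    """AVG = EG / DL of equation (20); vertices are workload values."""
    energy = C_e * sum(src * S(sf[(src, dst)]) for src, dst in cycle)
    delay = sum(src / sf[(src, dst)] for src, dst in cycle)
    return energy / delay


# Hypothetical WTG: vertices are workloads, edge values are scaling factors.
sf = {(2.0, 2.0): 0.9, (2.0, 4.0): 0.5, (4.0, 3.0): 1.0,
      (3.0, 2.0): 1.0, (4.0, 4.0): 0.95}
S = lambda k: k ** 2   # hypothetical energy model, not the one of the paper

cycles = simple_cycles(sf)
cy_opt = min(cycles, key=lambda cy: average_power(cy, sf, S))
print(cy_opt, average_power(cy_opt, sf, S))
```

Because the objective is a ratio of two sums, each candidate cycle is scored as a whole; in this hypothetical graph the three-edge cycle beats both self-loops.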
5. Extension to Discrete DVFS

Whilst we assume a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have finite DVFS modes with a set of predefined operation voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.

Let us suppose that we now have a system with a discrete and finite DVFS model where only $k_i \in K$ can be chosen as a speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: the transition is valid if there is $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if

$$\mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T), \quad \exists k \in K. \tag{26}$$
Likewise, the scaling factor of a valid transition in Definition 5 is redefined to be the minimum $k \in K$ that keeps the elapsed delay within the range which results in the same transition. One has the following definition.

Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \operatorname*{arg\,min}_{k \in K,\ \mathrm{th}_{n-1} < \mathrm{wl}_o/k \le \min(\mathrm{th}_n, T)} k. \tag{27}$$
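Definitions 9 and 10 can be combined into one helper that either returns the scaling factor of a transition or reports that the transition is invalid. The sketch below is an illustration under stated assumptions: the function name is hypothetical, $\mathrm{th}_0 = 0$ is assumed for the lowest level, and $T$ is assumed to close the delay range of the top level. The numbers are those of the synthetic example in Section 6.2.

```python
def discrete_scaling_factor(wl_o, th_prev, th_n, T, K):
    """Smallest k in K with th_prev < wl_o / k <= min(th_n, T), else None."""
    feasible = [k for k in K if th_prev < wl_o / k <= min(th_n, T)]
    return min(feasible) if feasible else None


# Synthetic example of Section 6.2 (levels wl_1..wl_4, thresholds th_1..th_3).
levels = [0.3, 0.4, 0.5, 0.6]
bounds = [0.0, 0.2, 0.6, 0.7, 1.0]   # th_0 = 0 (assumed), th_1..th_3, then T
K = [0.4, 0.8, 0.9]
T = 1.0

edges = {}
for o, wl_o in enumerate(levels, start=1):
    for n in range(1, len(levels) + 1):
        k = discrete_scaling_factor(wl_o, bounds[n - 1], bounds[n], T, K)
        if k is not None:
            edges[(o, n)] = k        # edge wl_o -> wl_n annotated with sf

for (o, n), k in sorted(edges.items()):
    print(f"wl{o} -> wl{n}: k = {k}")
```

Under these assumptions the sketch yields eight edges, four with $k = 0.8$, two with $k = 0.4$, and two with $k = 0.9$, which appears consistent with the edge annotations of Figure 7(b).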
6. Experiments

In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1. A Case Study: Object Tracking. It is first shown that the proposed workload-delay dependency is evidently observed in an object tracking application. The performance of a commonly used object tracking method [19] is profiled using the publicly available implementation [20]. We choose Exynos5422 [21], which has 2 GB main memory and runs the Linux operating system, as the target mobile embedded computing platform. The actual power dissipations of the processor are measured individually for five different DVFS modes. That is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with the side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner at every 5 ms. The real-time constraint $T$ is set to 25 ms. The workload function in the shape of a staircase is illustrated in Figure 6 with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of the Lucas-Kanade object tracking algorithm. (Horizontal axis: delay (ms); vertical axis: workload (ms); plotted: $W(t)$, the real-time constraint $T$, and the lines $k t$ for $k = 0.2, 0.4, 0.6, 0.8, 1.0$.)
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one with respect to the real-time constraint. The other comparison is made against a stable trace with the maximum speed as another extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in the average power consumption of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms the others with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode, $\forall i,\ k_i = 0.6$.
6.2. Stable versus Alternating Operation Modes. There are two kinds of cycles in WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. Other than these self-loops, a WTG can have non-self-loop cycles as well. From the perspective of the operation policy, such a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. In order to illustrate this, we show a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$, and (b) its WTG representation, where the edges are annotated with scaling factors.

The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, an alternating operation mode implied in $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot simply operate on an ideal design point. In case the theoretically optimal design point cannot be captured by commodity hardware, the proposed technique is particularly useful. It can effectively explore the design space and find the best one among alternating modes.
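The comparison can be checked by hand under one explicit assumption that is not stated in this section: $S(k)$ is taken as the power drawn while running at scaling factor $k$, so that an iteration with workload $w$ takes $w/k$ and consumes $S(k) \cdot w/k$. With this reading, every self-loop (all of which use $k = 0.8$) gives

$$\mathrm{AVG}_{\text{self-loop}} = S(0.8) = 5 \cdot 0.8^3 = 2.56,$$

which matches the 2.5600 W quoted above. For the alternating cycle, the scaling factors obtained from Definition 10 are $0.4$ for $\mathrm{wl}_2 \to \mathrm{wl}_4$ and $0.9$ for both $\mathrm{wl}_4 \to \mathrm{wl}_3$ and $\mathrm{wl}_3 \to \mathrm{wl}_2$, so that

$$\mathrm{DL} = \frac{0.4}{0.4} + \frac{0.6}{0.9} + \frac{0.5}{0.9} = \frac{20}{9}, \qquad \mathrm{EG} = 0.32 \cdot 1 + 3.645 \cdot \frac{2}{3} + 3.645 \cdot \frac{5}{9} = 4.775,$$

$$\mathrm{AVG}_{\text{alternating}} = \frac{4.775}{20/9} \approx 2.149 < 2.56.$$

The alternating value is a result of this sketch only and is not a figure reported in the paper; it merely illustrates why spending one long iteration at $k = 0.4$ can pay off even though two iterations at $k = 0.9$ follow.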
In principle, a stable operation mode exists when the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. Particularly, in case only a limited number of DVFS modes are available in a microprocessor, the $k$ which crosses the current workload level may not exist in $K$.
7. Conclusion

This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation, called WTG, is proposed for exploring all possible workload mode changes. Then it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode is often the best discipline (a self-loop in WTG). However, alternating modes, where the DVFS modes change over a predefined pattern, sometimes outperform the stable ones.
Disclosure
A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title of “Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff” [11].
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, “Cyber physical systems: design challenges,” Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, “Cyber-physical energy systems: focus on smart buildings,” in Proceedings of the 47th Design Automation Conference (DAC ’10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, “Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems,” IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, “Fast object tracking in digital video,” IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, “Vision based GPS-denied object tracking and following for unmanned aerial vehicles,” in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR ’13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, “Efficient pattern matching over event streams,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, “Time-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformable models,” in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA ’07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, “Six-DoF haptic rendering of contact between geometrically complex reduced deformable models,” IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, “Real-time dynamic voltage scaling for low-power embedded operating systems,” SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, “Power: a first class design constraint for future architectures,” in High Performance Computing - HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, “Modeling and power optimization of cyber-physical systems with energy-workload tradeoff,” in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED ’15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, “A formal approach to power optimization in CPSs with delay-workload dependence awareness,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, “Cyberphysical systems: workload modeling and design optimization,” IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, “Task scheduling for control oriented requirements for cyber-physical systems,” in Proceedings of the Real-Time Systems Symposium (RTSS ’08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, “Co-design of anytime computation and robust control,” in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS ’15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, “Co-design of cyber-physical systems via controllers with flexible delay constraints,” in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, “Combinatorial optimization: polyhedra and efficiency,” Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, “Enumeration of the elementary circuits of a directed graph,” SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, “Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm,” vol. 5, pp. 1–10, 2001.
[20] G. Bradski, “The OpenCV library,” Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung, Exynos Octa 5422, 2015, http://www.samsung.com.
Mobile Information Systems 5
the one to $\mathrm{wl}_r$ is not. Whether a transition can happen or not is determined by the following definition.

Definition 3 (valid transition). A workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if $\mathrm{wl}_o / \mathrm{th}_n \le k_{\max} = 1$ and $\mathrm{wl}_o / \mathrm{th}_{n-1} \ge k_{\min}$.
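The validity test of Definition 3 can be written as a small predicate. The following is a minimal sketch rather than the authors' implementation: the function name, the argument names, and the convention that a lower threshold of 0 means "no lower delay bound" for the lowest level are assumptions made for illustration.

```python
def is_valid_transition(wl_o, th_prev, th_next, k_min, k_max=1.0):
    """Definition 3 for the transition wl_o -> wl_n.

    th_prev is th_{n-1} and th_next is th_n. th_prev == 0 is treated as
    "no lower delay bound" for the lowest level (an assumption of this sketch).
    """
    fast_enough = wl_o / th_next <= k_max                   # wl_o / th_n <= k_max
    slow_enough = th_prev == 0 or wl_o / th_prev >= k_min   # wl_o / th_{n-1} >= k_min
    return fast_enough and slow_enough


# Illustrative numbers only (not taken from the experiments).
print(is_valid_transition(0.4, 0.2, 0.6, k_min=0.4))  # True
print(is_valid_transition(0.5, 0.0, 0.2, k_min=0.4))  # False: even full speed is too slow
```

The second call shows the case in which the target level cannot be reached even at $k_{\max}$, so no edge would be created for it.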
4.3. Workload Transition Graph. The essential difficulty of the presented power optimization problem lies in the fact that two conflicting forces must be handled at the same time. On one hand, to minimize the power, the system tries to scale down the speed (and thus lengthen the delay) as much as possible, as described in (2) and (6). On the other hand, the lengthened delay is undesirable because it imposes a bigger workload in the successive iteration, as shown in (4).

Therefore, no single simple intuition can be exploited to solve the problem. Rather, we need to compare different modes in a comprehensive way. To explore all possible execution modes and quantify their effects, we need a data structure that includes elementary information on how workload transitions change the system status and power dissipation. To that end, we propose a graph representation of the workload evolution, the Workload Transition Graph (WTG), which captures all possible transition scenarios of the workload changes during the lifetime of a system.
A valid workload transition from one workload level to another can be caused by any delay within the corresponding range. A transition from $\mathrm{wl}_p$ to $\mathrm{wl}_q$ in Figure 1(a), for instance, can be caused by any delay within the range $(\mathrm{th}_{q-1}, \mathrm{th}_q]$. In other words, when handling a workload of $\mathrm{wl}_p$, any scaling factor within the range $[\mathrm{wl}_p/\mathrm{th}_q, \mathrm{wl}_p/\mathrm{th}_{q-1})$ can cause the transition. In the power-optimal execution trace, however, only one specific scaling factor is always chosen for a certain transition, even though it happens many times. We show this in the next theorem.
Theorem 4 (optimal scaling factor). In the power-optimal trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$, if the workload level handled by $k_i$ is $\mathrm{wl}_o$ and the next workload level is $\mathrm{wl}_n$, then
$$k_i = \max\left(\frac{\mathrm{wl}_o}{\mathrm{th}_n}, k_{\min}\right). \tag{11}$$

Proof. We prove this by contradiction. Suppose that there exists a power-optimal trace $\mathrm{tr}$ where $k_i$ (which results in the transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$) is not equal to $\max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Then we make a new execution trace $\mathrm{tr}'$ from $\mathrm{tr}$ by replacing $k_i$ with $k'_i = \max(\mathrm{wl}_o/\mathrm{th}_n, k_{\min})$. Note that all other execution modes in $\mathrm{tr}'$ are the same as in $\mathrm{tr}$. That is,
$$\mathrm{tr}(j) = \mathrm{tr}'(j), \quad \forall j \ne i. \tag{12}$$
By the definition of the scaling factor,
$$k_i \ge k_{\min}, \qquad k_i \ge \frac{\mathrm{wl}_o}{\mathrm{th}_n}, \tag{13}$$
since the transition is from $\mathrm{wl}_o$ to $\mathrm{wl}_n$. Thus, because $k_i \ne k'_i$,
$$\mathrm{tr}(i) = k_i > \mathrm{tr}'(i) = k'_i, \tag{14}$$
and accordingly we obtain
$$S(k_i) > S(k'_i). \tag{15}$$
Then, from (6) and (7), the average power in $\mathrm{tr}'$ is lower than that of $\mathrm{tr}$, which contradicts the assumption that $\mathrm{tr}$ is power-optimal.
From Theorem 4, we know that only one speed scaling factor is associated with all transitions of a certain type in the power-optimal operation scenario. So we define the scaling factor of a valid transition as follows.

Definition 5 (scaling factor of a transition). The scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is
$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \max\left(\frac{\mathrm{wl}_o}{\min(\mathrm{th}_n, T)}, k_{\min}\right). \tag{16}$$
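Equation (16) translates directly into code. The sketch below uses hypothetical names and illustrative numbers; it only restates the formula and is not taken from the authors' tooling.

```python
def scaling_factor(wl_o, th_n, T, k_min):
    """Equation (16): slowest admissible speed for the transition wl_o -> wl_n."""
    latest_delay = min(th_n, T)   # longest delay that still leads to wl_n in time
    return max(wl_o / latest_delay, k_min)


# Illustrative numbers only.
k = scaling_factor(wl_o=0.4, th_n=0.6, T=1.0, k_min=0.2)
delay = 0.4 / k                   # sits at the upper end of the delay range of wl_n
```

Choosing the slowest admissible speed places the delay at the upper end of the range that leads to $\mathrm{wl}_n$, unless $k_{\min}$ or the real-time constraint $T$ becomes the binding limit.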
Now we define the WTG.

Definition 6 (Workload Transition Graph). The WTG is defined to be a graph $(V, E)$, where $V$ and $E$ are the sets of vertices and edges, respectively. Each valid workload level forms a vertex,
$$V = \{ v_i = \mathrm{wl}_i \mid \mathrm{wl}_{\min} \le \mathrm{wl}_i \le \mathrm{wl}_{\max} \}, \tag{17}$$
while a valid transition from one workload level to another forms an edge between vertices. That is, there exists an edge from $v_i$ to $v_j$ corresponding to a valid transition from $\mathrm{wl}_i$ to $\mathrm{wl}_j$:
$$E = \{ (v_i, v_j) \mid \exists \text{ valid transition from } \mathrm{wl}_i \text{ to } \mathrm{wl}_j \}. \tag{18}$$
We denote the source and destination vertices of an edge $e \in E$ as $\mathrm{src}(e)$ and $\mathrm{dst}(e)$, respectively.
With these definitions, a power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ can be represented as a walk (a sequence of vertices where any pair of consecutive vertices is connected through an edge) of length $|\mathrm{tr}| + 1$ in the WTG. In a different form, the execution trace is a sequence of edges in the WTG of length $|\mathrm{tr}|$, $(e_1, e_2, \ldots, e_{|\mathrm{tr}|})$, where the workload transition from $\mathrm{src}(e_i)$ to $\mathrm{dst}(e_i)$ is caused by the execution mode $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$.
Algorithm 1 shows how the WTG is generated from the given workload function $W$ and the initial workload $w_1$. After the initialization, all valid workload levels are added as vertices in lines (5)-(6). Then, for each ordered pair of workload levels (lines (7)-(8)), line (9) checks whether the transition between them is valid. If it is valid, it is added as an edge in line (10).

Algorithm 1: Generation of WTG.
(1) procedure GenerateWTG(W, w_1)
(2)   wl_min ← equation (8)   ⊳ Initializations
(3)   wl_max ← equation (9)
(4)   V ← ∅, E ← ∅
(5)   for all wl_min ≤ wl ≤ wl_max do   ⊳ Vertices
(6)     V ← V ∪ {wl}
(7)   for all wl_min ≤ wl_1 ≤ wl_max do   ⊳ Edges
(8)     for all wl_min ≤ wl_2 ≤ wl_max do
(9)       if ∃ valid transition from wl_1 to wl_2 then
(10)        E ← E ∪ {(wl_1, wl_2)}

Let us take Figure 2 as an example of the workload function $W$. Once the initial workload is given, one can easily get the valid workload levels according to (8) and (9). When $w_1 = \mathrm{wl}_2$, for instance, $\mathrm{wl}_{\min} = \mathrm{wl}_2$ and $\mathrm{wl}_{\max} = \mathrm{wl}_6$. Figure 3(b) illustrates the corresponding WTG of the workload function shown in Figure 2 with the initial workload of $\mathrm{wl}_2$. It is not a complete graph, as some pairs of vertices cannot be connected directly because the transition between them is not valid according to Definition 3.

Figure 2: A workload staircase function example with six workload levels ($\mathrm{wl}_1$ to $\mathrm{wl}_6$, delay thresholds $\mathrm{th}_1$ to $\mathrm{th}_5$, and real-time constraint $T$). The lines $y = k_{\max} t$ and $y = k_{\min} t$ bound the feasible delays, and the feasible delay range of $\mathrm{wl}_4$ is highlighted.
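The procedure of Algorithm 1 can be sketched in Python for the continuous model. This is a hedged illustration, not the authors' code: the staircase is assumed to be given as a list of levels plus a list of delay thresholds, the valid range of levels (equations (8) and (9), which are not reproduced in this section) is passed in as the indices `lo` and `hi`, the upper delay bound of each target level is capped by $T$ as in Definition 5, and the validity test follows Definition 3 literally.

```python
def generate_wtg(levels, thresholds, T, k_min, k_max=1.0, lo=0, hi=None):
    """Build (V, E) of the WTG; vertices are indices into `levels`.

    levels[j] is wl_{j+1}; a delay in (lower[j], upper[j]] leads to levels[j]
    in the next iteration. lo/hi stand in for wl_min/wl_max of equations (8)-(9).
    """
    hi = len(levels) - 1 if hi is None else hi
    lower = [0.0] + list(thresholds)                  # th_{n-1} per target level
    upper = [min(th, T) for th in thresholds] + [T]   # min(th_n, T) per target level

    V = list(range(lo, hi + 1))
    E = []
    for o in V:                                       # lines (7)-(8): ordered pairs
        for n in V:
            fast_enough = levels[o] / upper[n] <= k_max
            slow_enough = lower[n] == 0.0 or levels[o] / lower[n] >= k_min
            if fast_enough and slow_enough and lower[n] < upper[n]:
                E.append((o, n))                      # line (10)
    return V, E


# Illustrative staircase with four levels (hypothetical values).
V, E = generate_wtg([0.3, 0.4, 0.5, 0.6], [0.2, 0.6, 0.7], T=1.0, k_min=0.4)
```

In this illustrative staircase the lowest level has no self-loop, because even at full speed its delay exceeds the first threshold; this is the same effect described for vertices whose edges lead only to higher levels.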
The feasible delay range to handle workload $\mathrm{wl}_4$ is highlighted in Figure 2, justifying that vertex $\mathrm{wl}_4$ has three outgoing edges: to $\mathrm{wl}_6$, to $\mathrm{wl}_5$, and to itself. To be more specific, when the workload is given as $\mathrm{wl}_4$, the shortest possible delay is obtained when the speed is chosen as $k_{\max}$. Then the delay (the $x$ coordinate of the crossing point of $y = \mathrm{wl}_4$ and $y = k_{\max} t$) is between $\mathrm{th}_3$ and $\mathrm{th}_4$, as shown in Figure 2. This means that the lowest possible workload in the next iteration is $\mathrm{wl}_4$. Similarly, the biggest possible workload can be calculated as $\mathrm{wl}_6$, as the biggest possible delay is between $\mathrm{th}_5$ and $T$. Note also that some of the vertices may not have an edge directed to themselves (a self-loop), such as vertex $\mathrm{wl}_3$. It has outgoing edges only to higher workload levels. This means that the computation burden of that state is so big that it only results in higher workload levels at the next iteration, even at full speed.
Different initial workload levels may result in different WTGs, as shown in Figures 3(a)-3(c). The WTG derived from the initial workload level of $\mathrm{wl}_1$ is illustrated in Figure 3(a). In contrast to Figure 3(b), vertex $\mathrm{wl}_1$ is included in the graph. The WTG derived from the higher initial workload levels, $\mathrm{wl}_4$, $\mathrm{wl}_5$, and $\mathrm{wl}_6$, is shown in Figure 3(c). Note that the WTGs in Figures 3(a) and 3(c) are not strongly connected. Vertex $\mathrm{wl}_2$ in Figure 3(b), for example, is not reachable from $\mathrm{wl}_4$. However, from the definition of valid workload levels, all vertices are reachable from the initial workload level. This property is important for deriving the optimal operation policy that will be presented in the next subsection.
4.4. Proposed Operation Policy. In this subsection, we present the proposed operation policy, which balances the energy-delay tradeoff caused by the delay-workload dependency.

As stated earlier, a power-optimal execution trace $\mathrm{tr}$ can be represented as a walk of the WTG. Then we have the following definition.
Definition 7 (corresponding walk). Given the power-optimal execution trace $\mathrm{tr} = (k_1, k_2, \ldots, k_{|\mathrm{tr}|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as
$$\mathrm{cw}^{\mathrm{tr}}_{w_1} = (e_1, e_2, \ldots, e_{|\mathrm{tr}|}), \tag{19}$$
where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also write $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper). The average power consumption of the walk can be formulated as
$$\mathrm{AVG}(\mathrm{cw}^{\mathrm{tr}}_{w_1}) = \frac{\mathrm{EG}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}{\mathrm{DL}(\mathrm{cw}^{\mathrm{tr}}_{w_1})}, \tag{20}$$
where $\mathrm{DL}((e_1, e_2, \ldots, e_n)) = \sum_{i=1}^{n} \mathrm{src}(e_i)/\mathrm{sf}(e_i)$ is the total delay elapsed for traversing the walk and $\mathrm{EG}((e_1, e_2, \ldots, e_n)) = C_e \sum_{i=1}^{n} \mathrm{src}(e_i) \, S(\mathrm{sf}(e_i))$ is the total energy consumption for the walk.
A cycle (closed walk) of the WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in the WTG is a cycle, the trace can be repeated indefinitely. We argue that, when the trace is sufficiently long ($|\mathrm{tr}| \to \infty$), the average power consumption is minimized when the cycle that minimizes (20) repeats over and over again in the trace.
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of the WTG,
$$\mathrm{cy}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n), \tag{21}$$
has the minimum value of (20) among all cycles of the WTG. Then, if the trace is long enough, the average power consumption of the optimal trace converges to
$$\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}. \tag{22}$$
Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts a walk example of length 8, where the initial workload level is $\mathrm{wl}_2$ and the last vertex that it traverses is $\mathrm{wl}_6$. If a path from $\mathrm{wl}_2$ to $\mathrm{wl}_6$, $(e_1, e_2, e_5)$, highlighted with dashed arrows, is removed from the walk, the remaining part is a set of cycles.

Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $\mathrm{wl}_1$, (b) $\mathrm{wl}_2$ or $\mathrm{wl}_3$, and (c) $\mathrm{wl}_4$, $\mathrm{wl}_5$, or $\mathrm{wl}_6$.

Figure 4: A walk example of length 8 (edges $e_1$ to $e_8$) in the WTG shown in Figure 3(b).

Now consider a power-optimal trace $\mathrm{tr}$ of length $N$ that starts from the workload level of $w_1$. The corresponding walk of $\mathrm{tr}$ can be decomposed into a path $\mathrm{pt}$ starting at $w_1$ and a set of cycles $\mathrm{CY}$. Then the average power consumption of the trace $\mathrm{tr}$ is
$$\mathrm{AVG}(\mathrm{tr}) = \frac{\mathrm{EG}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{pt}) + \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{23}$$
Since $\mathrm{pt}$ contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(\mathrm{pt})$ and $\mathrm{DL}(\mathrm{pt})$ are bounded above by constants that do not depend on $N$. Then, if $N$ is sufficiently large, $\mathrm{EG}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})$ and $\mathrm{DL}(\mathrm{pt}) \ll \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})$. That is, if $|\mathrm{tr}| \to \infty$,
$$\mathrm{AVG}(\mathrm{tr}) \simeq \frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}. \tag{24}$$
Note also that the average power consumption of every cycle in $\mathrm{CY}$ is not smaller than $\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})/\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})$ by definition:
$$\frac{\mathrm{EG}(\mathrm{cy})}{\mathrm{DL}(\mathrm{cy})} \ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) = \frac{\mathrm{EG}(\mathrm{cy}_{\mathrm{opt}})}{\mathrm{DL}(\mathrm{cy}_{\mathrm{opt}})}, \quad \forall \mathrm{cy} \in \mathrm{CY}. \tag{25}$$
Therefore, the average power consumption of the power-optimal trace $\mathrm{tr}$, $\mathrm{AVG}(\mathrm{tr})$, gets arbitrarily close to $\mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})$.
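The step from (24) and (25) to the conclusion can be spelled out as a short derivation. It relies only on the fact that every $\mathrm{DL}(\mathrm{cy})$ is positive, which holds as long as workloads and scaling factors are positive (an assumption stated here for completeness).

```latex
\begin{align*}
\mathrm{EG}(\mathrm{cy})
  &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) \cdot \mathrm{DL}(\mathrm{cy})
  && \forall\, \mathrm{cy} \in \mathrm{CY}
  \quad \text{(from (25), since } \mathrm{DL}(\mathrm{cy}) > 0\text{)} \\
\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})
  &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}}) \sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})
  && \text{(summing over } \mathrm{CY}\text{)} \\
\frac{\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{EG}(\mathrm{cy})}
     {\sum_{\mathrm{cy} \in \mathrm{CY}} \mathrm{DL}(\mathrm{cy})}
  &\ge \mathrm{AVG}(\mathrm{cy}_{\mathrm{opt}})
  && \text{(dividing by the positive total delay)}
\end{align*}
```

Equality holds when every cycle in $\mathrm{CY}$ attains the minimum ratio, for example when $\mathrm{CY}$ consists only of repetitions of $\mathrm{cy}_{\mathrm{opt}}$. Hence the right-hand side of (24) is minimized by a trace that keeps repeating $\mathrm{cy}_{\mathrm{opt}}$.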
From Theorem 8, it follows that changing the DVFS mode of the given system along the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy:

(i) If the current workload level vertex is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$, just follow the cycle repeatedly; that is, at the $i$th iteration, the speed scaling factor is chosen to be $\mathrm{sf}(e)$ such that $e$ is in the optimal cycle and $\mathrm{src}(e) = w_i$.

(ii) Otherwise, take a path $\mathrm{pt}_{\mathrm{opt}} = (e_1, e_2, \ldots, e_n)$ that has the minimum $\mathrm{AVG}(\mathrm{pt}_{\mathrm{opt}})$ value and whose end vertex $\mathrm{dst}(e_n)$ is in the optimal cycle $\mathrm{cy}_{\mathrm{opt}}$. In other words, try to get into the optimal cycle with the minimum cost.

Figure 5: Flowchart diagram of the proposed operation policy. Given the inputs $w_1$ and $W$, the WTG is generated and $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are calculated. Starting from $i = 1$, each iteration sets $k_i = \mathrm{sf}(\mathrm{cy}_{\mathrm{opt}}(i))$ if $w_i$ is in $\mathrm{cy}_{\mathrm{opt}}$ and $k_i = \mathrm{sf}(\mathrm{pt}_{\mathrm{opt}}(i))$ otherwise, and then updates $w_{i+1}$ and increments $i$.
The optimal cycle $\mathrm{cy}_{\mathrm{opt}}$ of the WTG can be found by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $\mathrm{pt}_{\mathrm{opt}}$, can also be found by simple enumeration. It is worth mentioning that we cannot simply use a minimum-weight cycle search algorithm, as the objective is not a plain summation of edge weights but a ratio of two sums over the edges, as presented in (20).
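The search for the optimal cycle can be sketched as follows. Note that this sketch uses a plain depth-first enumeration of elementary cycles, not Tarjan's algorithm [18]; it is adequate for the small graphs considered here but lacks the complexity guarantees of the original. The graph layout (an adjacency dictionary keyed by workload value and a dictionary of edge scaling factors), the function names, and the placeholder $S$ are assumptions for illustration.

```python
def simple_cycles(adj):
    """Enumerate elementary cycles of a digraph {vertex: [successors]}.

    Each cycle is reported once, as a vertex list rooted at its smallest vertex.
    """
    cycles = []

    def dfs(root, v, path, on_path):
        for nxt in adj.get(v, []):
            if nxt == root:
                cycles.append(list(path))
            elif nxt > root and nxt not in on_path:
                on_path.add(nxt)
                path.append(nxt)
                dfs(root, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for root in sorted(adj):
        dfs(root, root, [root], {root})
    return cycles


def optimal_cycle(adj, sf, S, C_e=1.0):
    """Return (average power, cycle) minimizing EG / DL of equation (20)."""
    best = None
    for cyc in simple_cycles(adj):
        edges = list(zip(cyc, cyc[1:] + cyc[:1]))
        energy = C_e * sum(u * S(sf[(u, v)]) for u, v in edges)
        delay = sum(u / sf[(u, v)] for u, v in edges)
        if best is None or energy / delay < best[0]:
            best = (energy / delay, cyc)
    return best


# Hypothetical two-vertex graph: vertices are workload values.
adj = {1.0: [1.0, 2.0], 2.0: [1.0]}
sf = {(1.0, 1.0): 1.0, (1.0, 2.0): 0.5, (2.0, 1.0): 1.0}
best = optimal_cycle(adj, sf, S=lambda k: k ** 2)
```

Restricting the search to elementary cycles loses nothing: any closed walk decomposes into elementary cycles, and a ratio of sums is never smaller than the smallest of the individual ratios.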
Figure 5 shows the proposed operation policy as a flowchart. Given the initial workload and the workload function, we generate the WTG, from which $\mathrm{cy}_{\mathrm{opt}}$ and $\mathrm{pt}_{\mathrm{opt}}$ are derived in the next steps. It is worth mentioning that this can be done in a tractable time and is a one-time effort taken offline. If the initial workload vertex is not included in $\mathrm{cy}_{\mathrm{opt}}$, the system first follows the trace represented by $\mathrm{pt}_{\mathrm{opt}}$ until it reaches the optimal cycle of the WTG. Then it simply repeats the trace implied by $\mathrm{cy}_{\mathrm{opt}}$ from that iteration on.
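The online part of the policy then reduces to table lookups. The following sketch assumes that the offline step has produced two hypothetical tables, one for the optimal cycle and one for the path leading to it, each mapping a workload level to the scaling factor of the edge to take; the staircase lookup and all numbers are illustrative.

```python
import bisect


def run_policy(w1, levels, thresholds, cycle_sf, path_sf, iterations):
    """Simulate the operation policy for a number of iterations.

    cycle_sf / path_sf map a workload level to the scaling factor of the
    outgoing edge chosen on cy_opt / pt_opt (hypothetical table layout).
    Returns a list of (workload, scaling factor, delay) tuples.
    """
    trace = []
    w = w1
    for _ in range(iterations):
        k = cycle_sf[w] if w in cycle_sf else path_sf[w]
        delay = w / k
        trace.append((w, k, delay))
        # Staircase W: a delay in (th_{j-1}, th_j] leads to levels[j].
        w = levels[bisect.bisect_left(thresholds, delay)]
    return trace


# Illustrative tables: the cycle visits 0.4 -> 0.6 -> 0.5 -> 0.4.
levels = [0.3, 0.4, 0.5, 0.6]
thresholds = [0.2, 0.6, 0.7]
trace = run_policy(0.3, levels, thresholds,
                   cycle_sf={0.4: 0.4, 0.6: 0.9, 0.5: 0.9},
                   path_sf={0.3: 0.8}, iterations=5)
workloads = [w for w, _, _ in trace]
```

In this run the first iteration uses the path table because the initial level is outside the cycle; every later iteration is served from the cycle table.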
5. Extension to Discrete DVFS

While we assumed a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have a finite number of DVFS modes with a set of predefined operating voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.
Let us suppose that we now have a system with a discrete and finite DVFS model, where only $k_i \in K$ can be chosen as the speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: a transition is valid if there is a $k \in K$ that meets the same requirement.

Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if
$$\exists k \in K: \quad \mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T). \tag{26}$$

Likewise, the scaling factor of a valid transition in Definition 5 is redefined to be the minimum $k \in K$ that keeps the elapsed delay within the range that results in the same transition.

Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is
$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \underset{k \in K,\; \mathrm{th}_{n-1} < \mathrm{wl}_o / k \le \min(\mathrm{th}_n, T)}{\arg\min} \; k. \tag{27}$$
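Definitions 9 and 10 can be combined into one helper that returns the scaling factor of a transition, or nothing if the transition is not valid. This is a minimal sketch with hypothetical names; passing an infinite upper threshold for the highest level, so that only $T$ bounds the delay, is a convention chosen here.

```python
def discrete_scaling_factor(wl_o, th_prev, th_next, T, K):
    """Smallest k in K whose delay wl_o / k lies in (th_{n-1}, min(th_n, T)].

    Returns None when no such k exists, i.e. the transition is not valid (26).
    """
    feasible = [k for k in K if th_prev < wl_o / k <= min(th_next, T)]
    return min(feasible) if feasible else None


# Illustrative numbers only.
K = [0.4, 0.8, 0.9]
print(discrete_scaling_factor(0.4, 0.2, 0.6, T=1.0, K=K))           # 0.8
print(discrete_scaling_factor(0.5, 0.7, float("inf"), T=1.0, K=K))  # None
```

In the first call the factor 0.4 would be slower still, but its delay overshoots the target range, so the smallest feasible factor is 0.8.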
6. Experiments

In this section, we validate the proposed model and operation policy with experimental analysis and simulations.

6.1. A Case Study: Object Tracking. We first show that the proposed workload-delay dependency is clearly observed in an object tracking application. A commonly used object tracking method [19] is profiled as the tracking solution, using the publicly available implementation [20]. We choose the Exynos 5422 [21] as the target mobile embedded computing platform, which has 2 GB of main memory and runs the Linux operating system. The actual power dissipation of the processor is measured individually for five different DVFS modes, that is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with a side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner every 5 ms. The real-time constraint $T$ is set to 25 ms. The staircase-shaped workload function is illustrated in Figure 6 with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of the Lucas-Kanade object tracking algorithm. The horizontal axis is the delay (ms) and the vertical axis is the workload (ms); the plot shows $W(t)$, the real-time constraint $T$, and the lines $k \cdot t$ for $k = 0.2$, $0.4$, $0.6$, $0.8$, and $1.0$.
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one that still meets the real-time constraint. The other comparison is made against a stable trace with the maximum speed as the other extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in average power consumptions of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms both, with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode, $\forall i, k_i = 0.6$.
6.2. Stable versus Alternating Operation Modes. There are two kinds of cycles in a WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. Other than self-loops, a WTG can have non-self-loop cycles as well. From the perspective of the operation policy, such a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. To illustrate this, we show a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.

Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$, and (b) its WTG representation, where the edges are annotated with scaling factors.

The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, the alternating operation mode implied by $\mathrm{wl}_2 \to \mathrm{wl}_4 \to \mathrm{wl}_3 \to \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot always operate at an ideal design point. When the theoretically optimal design point cannot be realized by commodity hardware, the proposed technique is particularly useful: it can effectively explore the design space and find the best alternating mode.
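The synthetic example is small enough to re-derive in a few lines. The sketch below rebuilds the edges of Figure 7(b) from the stated levels, thresholds, $T$, and $K$ using the discrete rule of Definition 10, and then evaluates (20) for the self-loops and for the alternating cycle. The value $C_e = 1.0$ is an assumption, and equations (6) and (7) are not reproduced in this section, so the absolute values computed here are not expected to match the reported watts; only the ordering of the candidates is meaningful.

```python
import bisect

LEVELS = [0.3, 0.4, 0.5, 0.6]   # wl_1 .. wl_4 from Figure 7(a)
THRESHOLDS = [0.2, 0.6, 0.7]    # th_1 .. th_3
T = 1.0
K = [0.4, 0.8, 0.9]


def S(k):
    return 5 * k ** 3


def build_wtg():
    """Edges {(wl_o, wl_n): k} with the smallest feasible k in K per transition."""
    edges = {}
    for wl_o in LEVELS:
        for k in sorted(K, reverse=True):     # a later, smaller k overwrites
            delay = wl_o / k
            if delay <= T:                    # real-time constraint
                wl_n = LEVELS[bisect.bisect_left(THRESHOLDS, delay)]
                edges[(wl_o, wl_n)] = k
    return edges


def average_power(cycle, edges, C_e=1.0):
    """EG / DL of equation (20) for a cycle given as a list of workload levels."""
    hops = list(zip(cycle, cycle[1:] + cycle[:1]))
    energy = C_e * sum(u * S(edges[(u, v)]) for u, v in hops)
    delay = sum(u / edges[(u, v)] for u, v in hops)
    return energy / delay


edges = build_wtg()
self_loops = {u: average_power([u], edges) for (u, v) in edges if u == v}
alternating = average_power([0.4, 0.6, 0.5], edges)   # wl_2 -> wl_4 -> wl_3 -> wl_2
```

The rebuilt graph has eight edges whose factors agree with the annotations of Figure 7(b): four edges at 0.8, two at 0.4, and two at 0.9. All three self-loops use $k = 0.8$ and therefore tie, while the alternating cycle evaluates to a lower average under the stated assumptions.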
In principle, a stable operation mode exists when the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. In particular, when only a limited number of DVFS modes are available in a microprocessor, the $k$ that crosses the current workload level may not exist in $K$.
7. Conclusion

This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation called WTG is proposed for exploring all possible workload mode changes. Then it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode (a self-loop in the WTG) is often the best discipline. However, alternating modes, where the DVFS modes change following a predefined pattern, sometimes outperform the stable ones.
Disclosure

A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title “Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff” [11].
Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References

[1] E. A. Lee, “Cyber physical systems: design challenges,” Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, “Cyber-physical energy systems: focus on smart buildings,” in Proceedings of the 47th Design Automation Conference (DAC ’10), pp. 749-754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, “Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems,” IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165-1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, “Fast object tracking in digital video,” IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785-790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, “Vision based GPS-denied object tracking and following for unmanned aerial vehicles,” in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR ’13), pp. 1-6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, “Efficient pattern matching over event streams,” in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD ’08), pp. 147-160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, “Time-critical distributed contact for 6-DoF haptic rendering of adaptively sampled reduced deformable models,” in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA ’07), pp. 171-180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, “Six-DoF haptic rendering of contact between geometrically complex reduced deformable models,” IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39-52, 2008.
[9] P. Padmanabhan and K. G. Shin, “Real-time dynamic voltage scaling for low-power embedded operating systems,” SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89-102, 2001.
[10] T. Mudge, “Power: a first class design constraint for future architectures,” in High Performance Computing: HiPC 2000, pp. 215-224, Springer, 2000.
[11] H. Yang and S. Ha, “Modeling and power optimization of cyber-physical systems with energy-workload tradeoff,” in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED ’15), pp. 315-320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, “A formal approach to power optimization in CPSs with delay-workload dependence awareness,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750-763, 2016.
[13] P. Bogdan and R. Marculescu, “Cyberphysical systems: workload modeling and design optimization,” IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78-87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, “Task scheduling for control oriented requirements for cyber-physical systems,” in Proceedings of the Real-Time Systems Symposium (RTSS ’08), pp. 47-56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, “Co-design of anytime computation and robust control,” in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS ’15), pp. 43-52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, “Co-design of cyber-physical systems via controllers with flexible delay constraints,” in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225-230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, “Combinatorial optimization: polyhedra and efficiency,” Discrete Applied Mathematics, vol. 146, pp. 120-122, 2005.
[18] R. Tarjan, “Enumeration of the elementary circuits of a directed graph,” SIAM Journal on Computing, vol. 2, no. 3, pp. 211-216, 1973.
[19] J.-Y. Bouguet, “Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm,” vol. 5, pp. 1-10, 2001.
[20] G. Bradski, “The OpenCV library,” Doctor Dobbs Journal, vol. 25, no. 11, pp. 120-126, 2000.
[21] Samsung Exynos Octa 5422, 2015, http://www.samsung.com/.
6 Mobile Information Systems
(1) procedure GenerateWTG (1198821199081)
(2) wlmin larr equation (8) ⊳ Initializations(3) wlmax larr equation (9)(4) 119881 larr 120601 119864 larr 120601(5) for all wlmin le wl le wlmax do ⊳ Vertices(6) 119881 larr 119881 cup wl(7) for all wlmin le wl
1le wlmax do ⊳ Edges
(8) for all wlmin le wl2le wlmax do
(9) if exist valid transition from wl1to wl2then
(10) 119864 larr 119864 cup (wl1wl2)
Algorithm 1 Generation of WTG
y
t
T
wl1
wl2
wl3
wl4
wl5
wl6
th1 th2 th3 th4 th5
y = kmaxt
y = kmint
Feasible delay range of wl4
Figure 2 A workload staircase function example with six workloadlevels
shown in Figure 2 with the initial workload of wl2 It is not a
complete graph as some pairs of vertices cannot be connecteddirectly since it is not valid according to Definition 3
The feasible delay range to handle workload wl4is
highlighted in Figure 2 justifying that vertex wl4has three
outgoing edges to wl6 wl5 and itself To be more specific
when the workload is given as wl4 the shortest possible delay
is the case when the speed is chosen as 119896max Then the delay(119909 coordinate of the cross point of 119910 = wl
4and 119910 = 119896max119905) is
between th3and th
4as shown in Figure 2Thismeans that the
lowest possibleworkload in the next iteration iswl4 Similarly
the biggest possible workload can also be calculated as wl6
as the biggest possible delay is between th5and 119879 Note also
that some of the vertices may not have an edge directed toitself (self-loop) such as vertex wl
3 It has outgoing edges only
to higher workload levels This means that the computationburden of that state is so big that it only results in higherworkload levels at the next iteration even with the full speed
Different initial workload levels may result in differentWTGs as shown in Figures 3(a)ndash3(c)TheWTGderived fromthe initial workload level of wl
1is illustrated in Figure 3(a)
In contrast to Figure 3(b) vertex wl1is included in the graph
The WTG derived from higher initial workload levels wl4
wl5 and wl
6 is shown in Figure 3(c) Note that the WTGs in
Figures 3(a) and 3(c) are not strongly connected Vertex wl2in
Figure 3(b) for example is not reachable from wl4 However
from the definition of valid workload levels all vertices arereachable from the initial workload level This property is
important for deriving the optimal operation policy that willbe presented in the next subsection
44 Proposed Operation Policy In this subsection we presentthe proposed operation policy that compromises the energy-delay tradeoff caused by the delay-workload dependency
As stated earlier a power-optimal execution trace tr canbe represented as awalk ofWTGThen we have the followingdefinition
Definition 7 (corresponding walk). Given the power-optimal execution trace $tr = (k_1, k_2, \ldots, k_{|tr|})$ and its initial workload $w_1$, the corresponding walk of the trace is denoted as

$$cw^{tr}_{w_1} = (e_1, e_2, \ldots, e_{|tr|}), \tag{19}$$

where $\mathrm{src}(e_1) = w_1$ and $\mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ (for simplicity, we also denote it as $\mathrm{sf}(e_i) = \mathrm{sf}(\mathrm{src}(e_i), \mathrm{dst}(e_i)) = k_i$ for the rest of the paper).

The average power consumption of the walk can be formulated as follows:

$$\mathrm{AVG}(cw^{tr}_{w_1}) = \frac{\mathrm{EG}(cw^{tr}_{w_1})}{\mathrm{DL}(cw^{tr}_{w_1})}, \tag{20}$$

where $\mathrm{DL}((e_1, e_2, \ldots, e_n))$ is the total delay elapsed for traversing the walk, $\sum_{i=1}^{n} \mathrm{src}(e_i)/\mathrm{sf}(e_i)$, and $\mathrm{EG}((e_1, e_2, \ldots, e_n))$ is the total energy consumption for the walk, $C_e \sum_{i=1}^{n} \mathrm{src}(e_i)\, S(\mathrm{sf}(e_i))$.
A cycle (closed walk) of WTG is a walk whose starting and ending vertices are the same. That is, a walk $(e_1, e_2, \ldots, e_n)$ is a cycle if $\mathrm{src}(e_1) = \mathrm{dst}(e_n)$. Hence, if the corresponding walk of a trace in WTG is a cycle, the trace can be repeated indefinitely. We argue that, when the length of the trace is sufficiently long ($|tr| \to \infty$), the average power consumption is minimized when the cycle that minimizes (20) repeats over and over again in the trace.
Theorem 8 (optimal cycle of WTG). Suppose that a cycle of WTG

$$cy_{opt} = (e_1, e_2, \ldots, e_n) \tag{21}$$

has the minimum value of (20) among all cycles of the WTG. Then, if the length of the trace is long enough, the average power consumption of the optimal trace converges to

$$\mathrm{AVG}(cy_{opt}) = \frac{\mathrm{EG}(cy_{opt})}{\mathrm{DL}(cy_{opt})}. \tag{22}$$
Proof. An arbitrary walk of a WTG, $(e_1, e_2, \ldots, e_n)$, can be decomposed into a path (a walk with distinct vertices) from $\mathrm{src}(e_1)$ to $\mathrm{dst}(e_n)$ and a set of cycles (see, e.g., Section 10.3 of [17]). Figure 4 depicts a walk example of length 8, where the initial workload level is $wl_2$ and the last vertex that it traverses is $wl_6$. If a path from $wl_2$ to $wl_6$, $(e_1, e_2, e_5)$, highlighted in the dashed arrows, is removed from the walk, the remaining part is a set of cycles.
Figure 3: The corresponding WTG of the workload function $W$ shown in Figure 2 with the initial workload of (a) $wl_1$, (b) $wl_2$ or $wl_3$, and (c) $wl_4$, $wl_5$, or $wl_6$.

Figure 4: A walk example of the WTG shown in Figure 3(b) of length 8.

Now consider a power-optimal trace $tr$ of length $N$ that starts from the workload level of $w_1$. The corresponding walk of $tr$ can be decomposed into a path $pt$ starting at $w_1$ and a set of cycles $CY$. Then the average power consumption of the trace $tr$ is

$$\mathrm{AVG}(tr) = \frac{\mathrm{EG}(pt) + \sum_{cy \in CY} \mathrm{EG}(cy)}{\mathrm{DL}(pt) + \sum_{cy \in CY} \mathrm{DL}(cy)}. \tag{23}$$
Since $pt$ contains at most $|V| - 1$ edges by definition, $\mathrm{EG}(pt)$ and $\mathrm{DL}(pt)$ can be upper bounded by a certain value. Then, if $N$ is sufficiently large, $\mathrm{EG}(pt) \ll \sum_{cy \in CY} \mathrm{EG}(cy)$ and $\mathrm{DL}(pt) \ll \sum_{cy \in CY} \mathrm{DL}(cy)$. That is, if $|tr| \to \infty$,

$$\mathrm{AVG}(tr) \simeq \frac{\sum_{cy \in CY} \mathrm{EG}(cy)}{\sum_{cy \in CY} \mathrm{DL}(cy)}. \tag{24}$$

Note also that the average power consumption of every cycle in $CY$ is not smaller than $\mathrm{EG}(cy_{opt})/\mathrm{DL}(cy_{opt})$ by definition:

$$\frac{\mathrm{EG}(cy)}{\mathrm{DL}(cy)} \ge \mathrm{AVG}(cy_{opt}) = \frac{\mathrm{EG}(cy_{opt})}{\mathrm{DL}(cy_{opt})}, \quad \forall cy \in CY. \tag{25}$$

Therefore, the average power consumption of the power-optimal trace $tr$, $\mathrm{AVG}(tr)$, will get infinitesimally close to $\mathrm{AVG}(cy_{opt})$.
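The last step of the proof relies on the fact that a ratio of sums cannot fall below the smallest individual ratio. A short derivation of this step is sketched below. It uses the shorthand $\rho = \mathrm{AVG}(cy_{opt})$, a symbol introduced here only for brevity, and assumes that every cycle has a strictly positive delay.

```latex
\begin{align*}
\mathrm{EG}(cy) &\ge \rho\,\mathrm{DL}(cy) \quad \forall cy \in CY
  && \text{(from (25), since } \mathrm{DL}(cy) > 0\text{)} \\
\sum_{cy \in CY} \mathrm{EG}(cy) &\ge \rho \sum_{cy \in CY} \mathrm{DL}(cy)
  && \text{(summing over } CY\text{)} \\
\frac{\sum_{cy \in CY} \mathrm{EG}(cy)}{\sum_{cy \in CY} \mathrm{DL}(cy)} &\ge \rho = \mathrm{AVG}(cy_{opt})
  && \text{(dividing by the total delay)}
\end{align*}
```

Equality holds when $CY$ consists only of repetitions of $cy_{opt}$, which is why a trace that keeps repeating the optimal cycle attains the bound.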
From Theorem 8, it is understood that changing the DVFS mode of the given system following the optimal cycle presented above results in asymptotically optimal average power consumption. Thus, we propose the following rules for the DVFS operation policy:

(i) If the current workload level vertex is in the optimal cycle $cy_{opt}$, just follow the cycle repeatedly; that is, at the $i$th iteration, the speed scaling factor is chosen to be $\mathrm{sf}(e)$ such that $e$ is in the optimal cycle and $\mathrm{src}(e) = w_i$.

(ii) Otherwise, take a path $pt_{opt} = (e_1, e_2, \ldots, e_n)$ that has the minimum $\mathrm{AVG}(pt_{opt})$ value and whose $\mathrm{dst}(e_n)$ is in the optimal cycle $cy_{opt}$. In other words, try to get into the optimal cycle with the minimum cost.

Figure 5: Flowchart diagram of the proposed operation policy.
The optimal cycle $cy_{opt}$ of WTG can be found by using an existing cycle enumeration algorithm. In this paper, we use the one proposed by Tarjan [18]. The minimum path to the optimal cycle, $pt_{opt}$, can also be found by simple enumeration. It is worthwhile to mention that we cannot simply use a minimum weight cycle search algorithm, as the objective is not a simple summation of the edge weights but a ratio of two sums, as presented in (20).
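As a hedged illustration of the search described above, the following Python sketch enumerates the simple cycles of a small WTG and keeps the one with the smallest ratio of total energy to total delay. It uses a plain depth-first enumeration rather than Tarjan's algorithm [18]. The graph encoding (a dict mapping each vertex to a list of `(dst, delay, energy)` tuples), the function name `min_ratio_cycle`, and the sample numbers are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical WTG encoding: vertex -> list of (dst, delay, energy) edges.
Graph = Dict[str, List[Tuple[str, float, float]]]


def min_ratio_cycle(graph: Graph) -> Tuple[Optional[List[str]], float]:
    """Return (vertices of the best simple cycle, its energy/delay ratio)."""
    order = {v: i for i, v in enumerate(sorted(graph))}
    best_cycle: Optional[List[str]] = None
    best_ratio = float("inf")

    def dfs(start, node, path, on_path, delay, energy):
        nonlocal best_cycle, best_ratio
        for dst, d, e in graph[node]:
            if dst == start:
                # Closing edge found: evaluate EG / DL as in (20).
                ratio = (energy + e) / (delay + d)
                if ratio < best_ratio:
                    best_cycle, best_ratio = path + [start], ratio
            elif order[dst] > order[start] and dst not in on_path:
                # Only visit vertices ranked after `start`, so each simple
                # cycle is enumerated once, from its lowest-ranked vertex.
                on_path.add(dst)
                dfs(start, dst, path + [dst], on_path, delay + d, energy + e)
                on_path.remove(dst)

    for start in graph:
        dfs(start, start, [start], {start}, 0.0, 0.0)
    return best_cycle, best_ratio


# Toy graph with made-up weights: the cycle a -> b -> a beats both self-loops.
toy = {
    "a": [("a", 1.0, 3.0), ("b", 2.0, 1.0)],
    "b": [("b", 1.0, 3.0), ("a", 1.0, 2.0)],
}
cycle, ratio = min_ratio_cycle(toy)
```

Restricting the search to simple cycles loses nothing here, because any closed walk decomposes into simple cycles and its ratio cannot be lower than the best ratio among them.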
Figure 5 shows the proposed operation policy in a flowchart diagram. Given the initial workload and the workload function, we generate the WTG, from which $cy_{opt}$ and $pt_{opt}$ are derived in the next steps. It is worth mentioning that this can be done in a tractable time and is just a one-time effort taken offline. As long as the initial workload vertex is not included in $cy_{opt}$, the system follows the trace represented in $pt_{opt}$ until it reaches the optimal cycle of WTG. Then it simply repeats the trace implied in $cy_{opt}$ from that iteration on.
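The two rules and the flowchart can be sketched as a lookup that is prepared offline and consulted once per iteration. In the sketch below, the tables `cycle_sf` and `path_sf` are hypothetical data structures that map a workload level to the scaling factor of the outgoing edge on $cy_{opt}$ and $pt_{opt}$, and `next_level` merely stands in for the workload update. The sample values are illustrative and are not claimed to be the optimal ones for any example in the paper.

```python
from typing import Dict


def choose_scaling_factor(w_i: str,
                          cycle_sf: Dict[str, float],
                          path_sf: Dict[str, float]) -> float:
    """Pick k_i for the current workload level w_i.

    cycle_sf: workload level -> sf(e) for the edge e of cy_opt leaving it.
    path_sf:  workload level -> sf(e) for the edge e of pt_opt leaving it.
    Both tables are assumed to be computed offline from the WTG.
    """
    if w_i in cycle_sf:          # rule (i): already on the optimal cycle
        return cycle_sf[w_i]
    return path_sf[w_i]          # rule (ii): still heading toward the cycle


# Illustrative tables: pt_opt is wl1 -> wl2, cy_opt is wl2 -> wl4 -> wl3 -> wl2.
cycle_sf = {"wl2": 0.4, "wl4": 0.9, "wl3": 0.9}
path_sf = {"wl1": 0.8}
next_level = {("wl1", 0.8): "wl2", ("wl2", 0.4): "wl4",
              ("wl4", 0.9): "wl3", ("wl3", 0.9): "wl2"}

w, trace = "wl1", []
for _ in range(5):
    k = choose_scaling_factor(w, cycle_sf, path_sf)
    trace.append(k)
    w = next_level[(w, k)]       # stands in for the workload update w_{i+1}
```

Because a simple cycle visits each vertex once, a per-level table is enough to replay it; no iteration counter is needed at run time.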
5 Extension to Discrete DVFS
Whilst we assume a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have finite DVFS modes with a set of predefined operation voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.

Let us suppose that we now have a system with a discrete and finite DVFS model where only $k_i \in K$ can be chosen as a speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: the transition is valid if there is a $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $wl_o$ to $wl_n$ is said to be valid if

$$th_{n-1} < \frac{wl_o}{k} \le \min(th_n, T), \quad \exists k \in K. \tag{26}$$

Likewise, the scaling factor of a valid transition in Definition 5 is redefined as the minimum $k \in K$ that keeps the elapsed delay within the range that results in the same transition. One has the following definition.

Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $wl_o$ to $wl_n$ is

$$\mathrm{sf}(wl_o, wl_n) = \operatorname*{arg\,min}_{k \in K} \left\{ th_{n-1} < \frac{wl_o}{k} \le \min(th_n, T) \right\}. \tag{27}$$
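Definitions 9 and 10 translate directly into a small construction routine. The Python sketch below is one possible reading of them: it assumes that the staircase is given by its workload levels and delay thresholds, with $th_0 = 0$ and the last level bounded only by $T$, and it considers all levels rather than only those reachable from a particular initial workload. The function name `build_wtg` and the data layout are illustrative, and the sample numbers are those of the synthetic example in Section 6.2.

```python
from typing import Dict, List, Tuple


def build_wtg(levels: List[float], th: List[float], T: float,
              K: List[float]) -> Dict[Tuple[int, int], float]:
    """Map each valid transition (o, n), 1-based, to its scaling factor.

    levels[j-1] is wl_j; th[j-1] is th_j, the largest delay that still
    yields wl_j. th_0 is taken as 0 and the last level is capped by T
    (both are assumptions of this sketch).
    """
    lower = [0.0] + th                       # th_{n-1} for n = 1..len(levels)
    upper = th + [float("inf")]              # th_n, open-ended for the last level
    edges: Dict[Tuple[int, int], float] = {}
    for o, wl_o in enumerate(levels, start=1):
        for n in range(1, len(levels) + 1):
            feasible = [k for k in K
                        if lower[n - 1] < wl_o / k <= min(upper[n - 1], T)]
            if feasible:                         # Definition 9: some k in K works
                edges[(o, n)] = min(feasible)    # Definition 10: smallest such k
    return edges


wtg = build_wtg(levels=[0.3, 0.4, 0.5, 0.6], th=[0.2, 0.6, 0.7],
                T=1.0, K=[0.4, 0.8, 0.9])
for (o, n), k in sorted(wtg.items()):
    print(f"wl{o} -> wl{n}: k = {k}")
```

With these inputs the routine yields eight edges whose scaling factors (four of 0.8, two of 0.4, and two of 0.9) agree with the edge annotations of Figure 7(b).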
6 Experiments
In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1. A Case Study: Object Tracking. It is first shown that the proposed workload-delay dependency is evidently observed in an object tracking application. The performance of a commonly used object tracking method [19] is profiled as a tracking solution using the publicly available implementation [20]. We choose Exynos5422 [21] as the target mobile embedded computing platform, which has 2 GB of main memory and runs the Linux operating system. The actual power dissipations of the processor are measured individually for five different DVFS modes. That is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge on the maximum speed of the object, the maximum distance that the object could have moved between two iterations is calculated. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with the side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner at every 5 ms. The real-time constraint $T$ is set to 25 ms. The staircase-shaped workload function is illustrated in Figure 6 with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of Lucas-Kanade object tracking algorithm. [Plot of workload (ms) versus delay (ms) showing the staircase $W(t)$, the real-time constraint $T$, and the guiding lines $k \cdot t$ for $k$ = 0.2, 0.4, 0.6, 0.8, and 1.0.]
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one that still meets the real-time constraint. The other comparison is made against a stable trace with the maximum speed as another extreme (ASAP). When the initial workload is $wl_3$, that is, $w_1 = wl_3 = 9.833$, ASAP and ALAP result in the average power consumption of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms both with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $wl_2$, which implies a stable operation mode, $\forall i, k_i = 0.6$.
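For reference, the two baseline policies used in the comparison can be written in a few lines. The sketch below assumes that ALAP picks the smallest factor in $K$ whose delay $w/k$ still meets $T$, and that ASAP always picks the largest factor. The function names and the sample workload of 9.0 ms are illustrative and are not taken from the measurement setup.

```python
from typing import List


def alap(w: float, K: List[float], T: float) -> float:
    """Slowest scaling factor whose delay w / k still meets the deadline T."""
    feasible = [k for k in K if w / k <= T]
    if not feasible:
        raise ValueError("no scaling factor meets the real-time constraint")
    return min(feasible)


def asap(K: List[float]) -> float:
    """Fastest available scaling factor, regardless of the workload."""
    return max(K)


K = [0.2, 0.4, 0.6, 0.8, 1.0]
# 9 / 0.2 = 45 ms misses T = 25 ms, while 9 / 0.4 = 22.5 ms meets it.
k_alap = alap(w=9.0, K=K, T=25.0)
k_asap = asap(K)
```

Neither baseline looks at the workload of the following iteration, which is the dependency that the WTG-based policy exploits.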
6.2. Stable versus Alternating Operation Modes. There are two kinds of cycles in WTG. The first one is a self-loop, which implies a stable operation mode where no mode changes happen over the edge. Other than these self-loops, the WTG can have a non-self-loop cycle as well. From the perspective of the operation policy, this non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this the alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. However, in principle, the optimal solution cannot be achieved in a stable mode in some configurations. In order to illustrate this, we show a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $th_1 = 0.2$, $th_2 = 0.6$, $th_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $wl_1 = 0.3$, $wl_2 = 0.4$, $wl_3 = 0.5$, and $wl_4 = 0.6$, and (b) its WTG representation, where the edges are annotated with scaling factors.

The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, the alternating operation mode implied in $wl_2 \to wl_4 \to wl_3 \to wl_2$ shows the least average power consumption in the discrete model. This is due to the fact that a computer system cannot always operate at an ideal design point. When the theoretically optimal design point cannot be realized by commodity hardware, the proposed technique is particularly useful. It can effectively explore the design space and find the best solution among alternating modes.
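The gap between the two modes can be checked by hand. The following worked computation assumes that the energy spent on an edge is the power $S(k)$ multiplied by the delay $wl/k$, an assumption made here because it reproduces the 2.5600 W quoted for the self-loops. The scaling factors follow Definition 10, and the resulting value for the alternating cycle is derived under that assumption rather than quoted from the measurements.

```latex
\begin{align*}
% Self-loops: Definition 10 gives k = 0.8 on every self-loop.
\mathrm{AVG}_{\text{self}} &= S(0.8) = 5 \cdot 0.8^{3} = 2.56\ \mathrm{W} \\[4pt]
% Alternating cycle wl_2 -> wl_4 -> wl_3 -> wl_2 with k = 0.4, 0.9, 0.9.
\mathrm{DL} &= \frac{0.4}{0.4} + \frac{0.6}{0.9} + \frac{0.5}{0.9}
             = 1 + \frac{2}{3} + \frac{5}{9} = \frac{20}{9} \approx 2.222 \\
\mathrm{EG} &= S(0.4) \cdot 1 + S(0.9) \cdot \frac{2}{3} + S(0.9) \cdot \frac{5}{9}
             = 0.32 + 2.43 + 2.025 = 4.775 \\
\mathrm{AVG}_{\text{alt}} &= \frac{4.775}{20/9} \approx 2.149\ \mathrm{W} \;<\; 2.56\ \mathrm{W}
\end{align*}
```

Here $S(0.4) = 5 \cdot 0.064 = 0.32$ and $S(0.9) = 5 \cdot 0.729 = 3.645$. The long, slow step from $wl_2$ to $wl_4$ pulls the average down by more than the two faster steps raise it.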
In principle, a stable operation mode corresponds to the case where the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. In particular, when only a limited number of DVFS modes are available in a microprocessor, the $k$ that crosses the current workload level may not exist in $K$.
7 Conclusion
This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation called WTG is proposed for exploring all possible workload mode changes. Then it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode (a self-loop in WTG) is often the best discipline. However, alternating modes, where the DVFS modes change over a predefined pattern, sometimes outperform the stable ones.
Disclosure
A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title of "Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff" [11].
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, "Cyber physical systems: design challenges," Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, "Cyber-physical energy systems: focus on smart buildings," in Proceedings of the 47th Design Automation Conference (DAC '10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, "Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems," IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, "Vision based GPS-denied object tracking and following for unmanned aerial vehicles," in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR '13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, "Time-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformable models," in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, "Six-DoF haptic rendering of contact between geometrically complex reduced deformable models," IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, "Power: a first class design constraint for future architectures," in High Performance Computing – HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, "Modeling and power optimization of cyber-physical systems with energy-workload tradeoff," in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED '15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, "A formal approach to power optimization in CPSs with delay-workload dependence awareness," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, "Cyberphysical systems: workload modeling and design optimization," IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, "Task scheduling for control oriented requirements for cyber-physical systems," in Proceedings of the Real-Time Systems Symposium (RTSS '08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, "Co-design of anytime computation and robust control," in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS '15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, "Co-design of cyber-physical systems via controllers with flexible delay constraints," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, "Combinatorial optimization: polyhedra and efficiency," Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, "Enumeration of the elementary circuits of a directed graph," SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm," vol. 5, pp. 1–10, 2001.
[20] G. Bradski, "The OpenCV library," Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung, Exynos Octa 5422, 2015, http://www.samsung.com.
Mobile Information Systems 7
wl1
wl2 wl3 wl4 wl5 wl6
(a)
wl2
wl3 wl4 wl5 wl6
(b)
wl4 wl5 wl6
(c)
Figure 3 The corresponding WTG of the workload function119882 shown in Figure 2 with the initial workload of (a) wl1 (b) wl
2or wl3 and
(c) wl4 wl5 or wl
6
wl2 wl3 wl4 wl5 wl6e1
e2
e3
e4 e5
e6e7
e8
Figure 4 A walk example of the WTG shown in Figure 3(b) oflength 8
of tr can be decomposed into a path pt starting at 1199081and a
set of cycles CYThen the average power consumption of thetrace tr is
AVG (tr) =EG (pt) + sumcyisinCY EG (cy)DL (pt) + sumcyisinCY DL (cy)
(23)
Since pt contains at most |119881| minus 1 edges by definition EG(pt)and DL(pt) can be upper bounded by a certain value Thenif 119873 is sufficiently large EG(pt) ≪ sumcyisinCY EG(cy) andDL(pt) ≪ sumcyisinCY DL(cy) That is if |tr| rarr infin
AVG (tr) ≃sumcyisinCY EG (cy)sumcyisinCY DL (cy)
(24)
Note also that the average power consumption of every cyclein CY is not smaller than EG(cyopt)119905(cyopt) by definition
EG (cy)DL (cy)
ge AVG (cyopt) =EG (cyopt)
DL (cyopt) forallcy isin CY (25)
Therefore the average power consumption of the power-optimal trace tr AVG(tr) will get infinitesimally close toAVG(cyopt)
From Theorem 8 it is understood that changing theDVFS mode of the given system following the optimal cyclepresented above results in asymptotically optimal averagepower consumption Thus we propose following rules forDVFS operation policy
(i) If the current workload level vertex is in the optimalcycle cyopt just follow the cycle repeatedly ever thatis at the 119894th iteration the speed scaling factor is cho-sen to be sf(119890) such that 119890 is in the optimal cycle andsrc(119890) = 119908
119894
Start
No Yes
Generate WTG
Input w1 and W
i = 1
Calculatecyopt and ptopt
Is wi in cyopt
Set ki = sf(ptopt(i)) Set ki = sf(cyopt(i))
Update wi+1i++
Update wi+1i++
Figure 5 Flowchart diagram of the proposed operation policy
(ii) Otherwise take a path ptopt = (1198901 1198902 119890
119899) that has
the minimum AVG(ptopt) value and dst(119890119899) is in the
optimal cycle cyopt In other words try to get in theoptimal cycle with the minimum cost
The optimal cycle cyopt of WTG can be searched by usingan existing cycle enumeration algorithm In this paper weuse the one proposed by Tarjan [18] The minimum pathto the optimal cycle ptopt can also be searched by simpleenumeration It is worthwhile to mention that we cannotsimply use the minimum weight cycle searching algorithmas the weight is not simply a summation of the weights but acomplex function of them as presented in (20)
Figure 5 shows the proposed operation policy in aflowchart diagram Given the initial workload and the work-load function we generate WTG from which cyopt and ptoptare derived in the next steps It is worth mentioning that thiscan be done in a tractable time and is just one-time efforttaken offline As long as the initial workload vertex is notincluded in cyopt the system follows the trace represented inptopt until it reaches the optimal cycle ofWTGThen it simplyrepeats the trace implied in cyopt from that iteration on
8 Mobile Information Systems
5 Extension to Discrete DVFS
Whilst we assume a continuous DVFS model for easeof presentation and generality modern microprocessors inreality have finite DVFS modes with a set of predefinedoperation voltages and frequencies In this section we showthat the continuity of the model presented in (1) can berelaxed by modifying Definitions 3 and 5 without harmingthe effectiveness of the proposed technique
Let us suppose that we now have a system with a discreteand finite DVFS model where only 119896
119894isin 119870 can be chosen as
a speed scaling factor at the 119894th iteration Valid transitions inDefinition 3 are redefined the transition is valid if there is119896 isin 119870 that meets the same requirement
Definition 9 (valid transition in discrete DVFS) Given a setof feasible scaling factors 119870 a workload transition from wl
119900
to wl119899is said to be valid if
th119899minus1
ltwl119900
119896le min (th
119899 119879) exist119896 isin 119870 (26)
Likewise the scaling factor of a valid transition inDefinition 5 is also reformed to be minimum 119896 isin 119870 thatkeeps the elapsed delay fallen into the range which results inthe same transition One has the following definition
Definition 10 (scaling factor in discrete DVFS) Given a setof feasible scaling factors 119870 the scaling factor of a validworkload transition from wl
119900to wl119899is
sf (wl119900wl119899)
= argmin119896isin119870
th119899minus1
ltwl119900
119896le min (th
119899 119879)
(27)
6 Experiments
In this section we validate the proposedmodel and operationpolicy with experimental analysis and simulations
61 A Case Study Object Tracking It is firstly shown that theproposed workload-delay dependency is evidently observedin an object tracking application The performance of acommonly used object tracking method [19] is profiledas a tracking solution using the publicly available imple-mentation [20] We choose Exynos5422 [21] as the targetmobile embedded computing platform which has 2GBmainmemory running Linux operating system The actual powerdissipations of the processor are measured individually forfive different DVFS modesThat is119870 = 02 04 06 08 10
and at the maximum speed the core is operating at 2GHzLeveraging on a priori knowledge on the maximum speedof the object the maximum distance that the object couldhave moved between two iterations is calculated In thisexperiment it is assumed that the objectrsquos speed neverexceeds 10 pixelsms If an iteration takes 119905 ms for instancethe search area for the next iteration is given as a square withthe side length of (10 sdot 2 sdot 119905 + 125) pixels This search areais growing in a discrete manner at every 5ms The real-time
5 10 15 20 25 30 350Delay (ms)
0
5
10
15
20
25
30
35
Wor
kloa
d (m
s)
Real-time constraint Tkt k = 02
kt k = 04
kt k = 06
kt k = 08
kt k = 10
W(t)
Figure 6 Workload-delay dependency function 119882 of Lucas-Kanade object tracking algorithm
constraint 119879 is set to 25msThe workload function in a shapeof staircase is illustrated in Figure 6 with five guiding lineseach of which denotes 119896 sdot 119905 for 119896 isin 119870
We compare the proposed power management policywith two others The first one is ALAP where the speed ischosen to be the slowest one with respect to the real-timeconstraint The other comparison is made against a stabletrace with the maximum speed as another extreme (ASAP)When the initial workload is wl
3 that is 119908
1= wl3= 9833
ASAP and ALAP result in the average power consumptionof 25032W and 12623W respectively The average powerconsumption of the proposed power management policyoutperforms the others as 07592W The optimal cycle of itsWTG is the self-loop of node wl
2 which implies a stable
operation mode forall119894 119896119894= 06
62 Stable versus Alternating OperationModes There are twokinds of cycles in WTG the first one is a self-loop whichimplies a stable operation mode where no mode changeshappen over the edge Other than these self-loops the WTGhas one non-self-loop cycle as well From the perspectiveof the operation policy this non-self-loop cycle implies apredefined sequence of mode changes that can repeat overand over again We call this alternating operation modeMany examples including a real-life application shown in theprevious subsection aswell as one presented in [11] tend to beoptimal in a stable operationmodeHowever in principle theoptimal solution cannot be achieved in a stablemode in someconfigurations In order to illustrate this we show a counterexample shown in Figure 7 and apply the proposed techniqueto the example with a DVFS power modeling function of119878(119896) = 5 sdot 119896
3 and 119870 = 04 08 09 The delay threshold andthe real-time constraint are th
1= 02 th
2= 06 th
3= 07
and119879 = 10 respectively Figure 7(b) shows the derivedWTGfor the synthetic example
The average power consumption of all self-loops in thesynthetic example is tied to 25600W On the other hand
Mobile Information Systems 9
W
t
wl4 = 06
wl3 = 05
wl1 = 03
wl2 = 04
k = 09 k = 08
k = 04
T = 10
th1 = 02
th2 = 06
th3 = 07
(a)
wl1
wl2
wl3 wl408
08
08
08
04
04
09
09
(b)
Figure 7 (a) A synthetic example 119882(119905) with four workload levels wl1= 03 wl
2= 04 wl
3= 05 and wl
4= 06 and (b) its WTG
representation where the edges are annotated with scaling factors
an alternating operation mode implied in wl2rarr wl
4rarr
wl3rarr wl2shows the least average power consumption in the
discrete model This is due to the fact that a computer systemcannot simply operate on an ideal design point In case thatthe theoretically optimal design point cannot be captured by acommodity hardware the proposed technique is particularlyuseful It can effectively explore design space and find out thebest one in alternating modes
In principle a stable operation mode is the case that thestaircase workload function 119882(119905) has a crossing point with119896 sdot 119905 If this 119896 is sufficiently small it is likely to be a near-optimal operation mode However it does not always resultin a near-optimal power consumption Particularly in casethat only a limited number of DVFS modes are available ina microprocessor this 119896 which crosses the current workloadlevel may not exist in 119870
7 Conclusion
This paper formulates the delay-workload dependency inpower optimization problem of embedded systems as a stair-case function of the delay taken at the previous iteration Inapplying it to the power optimization of DVFS-enabled elec-tronic devices a novel graph representation called WTG isproposed for exploring all possible workloadmode changesThen it is shown that the power optimization problemis equivalent to finding a cycle of the graph that has theminimum average power consumption The effectivenessof the proposed operation policy is proven by the powersimulations of synthetic and real-life examples It has beenobserved that staying in a low speed scaling factor in astable operationmode is often the best discipline (self-loop inWTG) However alternating modes where the DVFS modeschange over a predefined pattern sometimes outperform thestable ones
Disclosure
A preliminary version of this paper appeared in July 2015at the International Symposium on Low Power Electronicsand Design (ISLPED) under the title of ldquoModeling and
Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoffrdquo [11]
Competing Interests
The authors declare that they have no competing interests
Acknowledgments
Thisworkwas supported by ICTRampDprogramofMSIPIITP(B0101-15-0661 the research and development of the self-adaptive software framework for various IoT devices) BasicScience Research Program through the National ResearchFoundation of Korea (NRF) funded by the Ministry of Sci-ence ICTamp Future Planning (NRF-2013R1A2A2A01067907)and the new faculty research fund of Ajou University
References
[1] E A Lee ldquoCyber physical systems design challengesrdquo TechRep UCBEECS-2008-8 EECS Department University of Cal-ifornia Berkeley Calif USA 2008
[2] J Kleissl and Y Agarwal ldquoCyber-physical energy systemsfocus on smart buildingsrdquo in Proceedings of the 47th DesignAutomation Conference (DAC rsquo10) pp 749ndash754 ACM AustinTex USA June 2010
[3] J S Kim D H Yeom and Y H Joo ldquoFast and robustalgorithm of tracking multiple moving objects for intelligentvideo surveillance systemsrdquo IEEE Transactions on ConsumerElectronics vol 57 no 3 pp 1165ndash1170 2011
[4] D K Park H S Yoon and C SunWon ldquoFast object tracking indigital videordquo IEEE Transactions on Consumer Electronics vol46 no 3 pp 785ndash790 2000
[5] J Pestana J L Sanchez-Lopez P Campoy and S SaripallildquoVision based GPS-denied object tracking and following forunmanned aerial vehiclesrdquo in Proceedings of the 2013 IEEEInternational Symposiumon Safety Security andRescueRobotics(SSRR rsquo13) pp 1ndash6 IEEE Linkoping Sweden October 2013
[6] J Agrawal Y Diao D Gyllstrom and N Immerman ldquoEfficientpattern matching over event streamsrdquo in Proceedings of theACM SIGMOD International Conference on Management of
10 Mobile Information Systems
Data (SIGMOD rsquo08) pp 147ndash160 ACM Vancouver CanadaJune 2008
[7] J Barbic and D James ldquoTime-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformablemodelsrdquo in Proceedings of the ACM SIGGRAPHEurographicsSymposium on Computer Animation (SCA rsquo07) pp 171ndash180Eurographics Association San Diego Calif USA August 2007
[8] J Barbic and D L James ldquoSix-DoF haptic rendering of contactbetween geometrically complex reduced deformable modelsrdquoIEEE Transactions on Haptics vol 1 no 1 pp 39ndash52 2008
[9] P Padmanabhan and K G Shin ldquoReal-time dynamic voltagescaling for low-power embedded operating systemsrdquo SIGOPSmdashOperating Systems Review vol 35 no 5 pp 89ndash102 2001
[10] T Mudge ldquoPower a first class design constraint for futurearchitecturesrdquo in High Performance ComputingmdashHiPC 2000pp 215ndash224 Springer 2000
[11] H Yang and S Ha ldquoModeling and power optimization of cyber-physical systemswith energy-workload tradeoffrdquo in Proceedingsof the 20th IEEEACM International Symposium on Low PowerElectronics and Design (ISLPED rsquo15) pp 315ndash320 Rome ItalyJuly 2015
[12] H-CAnH Yang and SHa ldquoA formal approach to power opti-mization in cpss with delay-workload dependence awarenessrdquoIEEE Transactions on Computer-Aided Design of IntegratedCircuits and Systems vol 35 no 5 pp 750ndash763 2016
[13] P Bogdan and R Marculescu ldquoCyberphysical systems work-load modeling and design optimizationrdquo IEEE Design amp Test ofComputers vol 28 no 4 pp 78ndash87 2011
[14] F Zhang K Szwaykowska W Wolf and V Mooney ldquoTaskscheduling for control oriented requirements for cyber-physicalsystemsrdquo in Proceedings of the Real-Time Systems Symposium(RTSS rsquo08) pp 47ndash56 Barcelona Spain December 2008
[15] Y V Pant H Abbas K Mohta T X Nghiem J Devietti andRMangharam ldquoCo-design of anytime computation and robustcontrolrdquo in Proceedings of the 2015 IEEE Real-Time SystemsSymposium (RTSS rsquo15) pp 43ndash52 IEEE San Antonio Tex USADecember 2015
[16] D Goswami R Schneider and S Chakraborty ldquoCo-design ofcyber-physical systems via controllers with exible delay con-straintsrdquo in Proceedings of the 16th Asia and South Pacific DesignAutomation Conference pp 225ndash230 IEEE Press YokohamaJapan January 2001
[17] A Schrijver ldquoCombinatorial optimization polyhedra and effi-ciencyrdquo Discrete Applied Mathematics vol 146 pp 120ndash1222005
[18] R Tarjan ldquoEnumeration of the elementary circuits of a directedgraphrdquo SIAM Journal on Computing vol 2 no 3 pp 211ndash2161973
[19] J-Y Bouguet ldquoPyramidal implementation of the affine lucaskanade feature tracker description of the algorithmrdquo vol 5 pp1ndash10 2001
[20] G Bradski ldquoThe opencv libraryrdquo Doctor Dobbs Journal vol 25no 11 pp 120ndash126 2000
[21] Samsung Exynos Octa 5422 2015 httpwwwsamsungcom
8 Mobile Information Systems
5 Extension to Discrete DVFS
While we have assumed a continuous DVFS model for ease of presentation and generality, modern microprocessors in reality have a finite number of DVFS modes with a set of predefined operating voltages and frequencies. In this section, we show that the continuity of the model presented in (1) can be relaxed by modifying Definitions 3 and 5 without harming the effectiveness of the proposed technique.
Let us suppose that we now have a system with a discrete and finite DVFS model, where only $k_i \in K$ can be chosen as the speed scaling factor at the $i$th iteration. Valid transitions in Definition 3 are redefined: a transition is valid if there is a $k \in K$ that meets the same requirement.
Definition 9 (valid transition in discrete DVFS). Given a set of feasible scaling factors $K$, a workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is said to be valid if

$$\exists k \in K : \quad \mathrm{th}_{n-1} < \frac{\mathrm{wl}_o}{k} \le \min(\mathrm{th}_n, T). \tag{26}$$
Likewise, the scaling factor of a valid transition in Definition 5 is redefined as the minimum $k \in K$ that keeps the elapsed delay within the range that results in the same transition. One has the following definition.
Definition 10 (scaling factor in discrete DVFS). Given a set of feasible scaling factors $K$, the scaling factor of a valid workload transition from $\mathrm{wl}_o$ to $\mathrm{wl}_n$ is

$$\mathrm{sf}(\mathrm{wl}_o, \mathrm{wl}_n) = \operatorname*{arg\,min}_{\substack{k \in K \\ \mathrm{th}_{n-1} < \mathrm{wl}_o / k \,\le\, \min(\mathrm{th}_n, T)}} k. \tag{27}$$
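Definitions 9 and 10 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `scaling_factor` and `build_wtg` are hypothetical, and the boundary values $\mathrm{th}_0 = 0$ and $\mathrm{th}_4 = \infty$ are assumptions made so that every workload level has a delay range. The numbers are those of the synthetic example in Section 6.2.

```python
import math

def scaling_factor(wl_o, n, th, T, K):
    """Definition 10: the smallest k in K whose delay wl_o / k falls in
    (th[n-1], min(th[n], T)]. Returns None when no k in K qualifies,
    i.e. when the transition is not valid in the sense of Definition 9."""
    lo, hi = th[n - 1], min(th[n], T)
    feasible = [k for k in K if lo < wl_o / k <= hi]
    return min(feasible) if feasible else None

def build_wtg(wl, th, T, K):
    """Return {(o, n): sf} for every valid transition wl_o -> wl_n (1-based)."""
    edges = {}
    for o in range(1, len(wl) + 1):
        for n in range(1, len(wl) + 1):
            k = scaling_factor(wl[o - 1], n, th, T, K)
            if k is not None:
                edges[(o, n)] = k
    return edges

# Synthetic configuration of Section 6.2.
wl = [0.3, 0.4, 0.5, 0.6]
th = [0.0, 0.2, 0.6, 0.7, math.inf]  # th_0 = 0 and th_4 = inf are assumptions
T = 1.0
K = [0.4, 0.8, 0.9]

wtg = build_wtg(wl, th, T, K)
for (o, n), k in sorted(wtg.items()):
    print(f"wl{o} -> wl{n}: k = {k}")
```

Under these assumptions the sketch yields eight edges, whose labels (four times 0.8, twice 0.4, twice 0.9) agree with the edge annotations of Figure 7(b).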
6 Experiments
In this section, we validate the proposed model and operation policy with experimental analysis and simulations.
6.1 A Case Study: Object Tracking. We first show that the proposed workload-delay dependency is clearly observed in an object tracking application. A commonly used object tracking method [19] is profiled as the tracking solution, using the publicly available implementation [20]. We choose the Exynos 5422 [21] as the target mobile embedded computing platform, which has 2 GB of main memory and runs the Linux operating system. The actual power dissipation of the processor is measured individually for five different DVFS modes, that is, $K = \{0.2, 0.4, 0.6, 0.8, 1.0\}$, and at the maximum speed the core operates at 2 GHz. Leveraging a priori knowledge of the maximum speed of the object, we calculate the maximum distance that the object could have moved between two iterations. In this experiment, it is assumed that the object's speed never exceeds 10 pixels/ms. If an iteration takes $t$ ms, for instance, the search area for the next iteration is given as a square with a side length of $(10 \cdot 2 \cdot t + 125)$ pixels. This search area grows in a discrete manner every 5 ms. The real-time constraint $T$ is set to 25 ms. The resulting staircase-shaped workload function is illustrated in Figure 6, with five guiding lines, each of which denotes $k \cdot t$ for $k \in K$.

Figure 6: Workload-delay dependency function $W$ of the Lucas-Kanade object tracking algorithm. (Plot of workload in ms against delay in ms, both axes from 0 to 35, showing the staircase $W(t)$, the real-time constraint $T$, and the guiding lines $k \cdot t$ for $k = 0.2, 0.4, 0.6, 0.8, 1.0$.)
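The way the search area turns the previous delay into a stepwise workload can be sketched as follows. This is a hedged illustration only: the function name `search_side_px` and its parameter names are hypothetical, the constants are copied as printed in the text, and rounding the delay up to the next 5 ms boundary is an assumption about how "grows in a discrete manner every 5 ms" is realized.

```python
import math

def search_side_px(t_ms, v_max=10.0, base_px=125.0, step_ms=5.0):
    """Side length (pixels) of the square search area for the next iteration.

    t_ms is the delay of the previous iteration. The delay is rounded UP to
    the next multiple of step_ms (an assumption) before applying the rule
    v_max * 2 * t + base_px from the text.
    """
    t_q = math.ceil(t_ms / step_ms) * step_ms
    return v_max * 2 * t_q + base_px

# Delays inside the same 5 ms bucket give the same search area,
# which is what produces a staircase-shaped workload.
for t in (3.0, 5.0, 7.5, 25.0):
    print(f"delay {t:5.1f} ms -> side {search_side_px(t):.0f} px")
```

Since the workload of the tracker grows with the search area, delays in the same bucket lead to the same workload level, and a longer delay can only move the next iteration to an equal or higher level.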
We compare the proposed power management policy with two others. The first one is ALAP, where the speed is chosen to be the slowest one that still meets the real-time constraint. The other is a stable trace at the maximum speed, which serves as the opposite extreme (ASAP). When the initial workload is $\mathrm{wl}_3$, that is, $w_1 = \mathrm{wl}_3 = 9.833$, ASAP and ALAP result in average power consumptions of 2.5032 W and 1.2623 W, respectively. The proposed power management policy outperforms both, with an average power consumption of 0.7592 W. The optimal cycle of its WTG is the self-loop of node $\mathrm{wl}_2$, which implies a stable operation mode: $\forall i,\ k_i = 0.6$.
6.2 Stable versus Alternating Operation Modes. There are two kinds of cycles in a WTG. The first is a self-loop, which implies a stable operation mode where no mode change happens over the edge. Other than these self-loops, the WTG may have non-self-loop cycles as well. From the perspective of the operation policy, a non-self-loop cycle implies a predefined sequence of mode changes that can repeat over and over again. We call this an alternating operation mode. Many examples, including the real-life application shown in the previous subsection as well as the one presented in [11], tend to be optimal in a stable operation mode. In principle, however, the optimal solution cannot be achieved in a stable mode in some configurations. To illustrate this, we present a counterexample in Figure 7 and apply the proposed technique to it with a DVFS power modeling function of $S(k) = 5 \cdot k^3$ and $K = \{0.4, 0.8, 0.9\}$. The delay thresholds and the real-time constraint are $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, and $T = 1.0$, respectively. Figure 7(b) shows the derived WTG for the synthetic example.
Figure 7: (a) A synthetic example $W(t)$ with four workload levels, $\mathrm{wl}_1 = 0.3$, $\mathrm{wl}_2 = 0.4$, $\mathrm{wl}_3 = 0.5$, and $\mathrm{wl}_4 = 0.6$, and (b) its WTG representation, where the edges are annotated with scaling factors. (Panel (a) marks the thresholds $\mathrm{th}_1 = 0.2$, $\mathrm{th}_2 = 0.6$, $\mathrm{th}_3 = 0.7$, the constraint $T = 1.0$, and the guiding lines for $k = 0.4, 0.8, 0.9$. Panel (b) has the nodes $\mathrm{wl}_1$ to $\mathrm{wl}_4$ and eight edges labeled 0.8 (four edges), 0.4 (two edges), and 0.9 (two edges).)

The average power consumption of all self-loops in the synthetic example is tied at 2.5600 W. On the other hand, the alternating operation mode implied by $\mathrm{wl}_2 \rightarrow \mathrm{wl}_4 \rightarrow \mathrm{wl}_3 \rightarrow \mathrm{wl}_2$ shows the least average power consumption in the discrete model. This is because a computer system cannot always operate at an ideal design point. When the theoretically optimal design point cannot be realized by commodity hardware, the proposed technique is particularly useful: it can effectively explore the design space and find the best solution among the alternating modes.
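The comparison between the self-loops and the alternating cycle can be checked with a short Python sketch. Several things here are assumptions rather than statements of the paper: the delay of an iteration is taken as $\mathrm{wl}_o / k$, the scaling factors along the cycle (0.4, 0.9, 0.9) are re-derived from Definition 10 with the thresholds given above, and the helper name `walk_power` is hypothetical. Because the averaging rule itself is defined outside this section, the sketch reports two plausible conventions, a time-weighted average (energy over elapsed time) and a plain per-iteration mean.

```python
def S(k):
    """DVFS power model of the synthetic example: S(k) = 5 * k^3 (in W)."""
    return 5 * k ** 3

wl = {1: 0.3, 2: 0.4, 3: 0.5, 4: 0.6}  # workload levels of Figure 7(a)

def walk_power(steps):
    """steps: list of (source level, scaling factor) along a closed walk.

    Returns (time_weighted, per_iteration) average power, assuming the
    delay of each iteration is wl[source] / k.
    """
    delays = [wl[o] / k for o, k in steps]
    powers = [S(k) for _, k in steps]
    energy = sum(p * d for p, d in zip(powers, delays))
    return energy / sum(delays), sum(powers) / len(powers)

self_loop = walk_power([(2, 0.8)])                        # wl2 -> wl2
alternating = walk_power([(2, 0.4), (4, 0.9), (3, 0.9)])  # wl2 -> wl4 -> wl3 -> wl2

print("self-loop   :", self_loop)
print("alternating :", alternating)
```

Under either convention the self-loop evaluates to 2.56 W, matching the value quoted above, and the alternating cycle comes out lower, so the ranking does not depend on which of the two averaging rules is assumed.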
In principle, a stable operation mode corresponds to the case where the staircase workload function $W(t)$ has a crossing point with $k \cdot t$. If this $k$ is sufficiently small, it is likely to be a near-optimal operation mode. However, it does not always result in near-optimal power consumption. In particular, when only a limited number of DVFS modes are available in a microprocessor, the $k$ that crosses the current workload level may not exist in $K$.
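To make the crossing-point argument concrete, the following sketch computes, for each workload level of the synthetic example, the interval of continuous scaling factors for which $k \cdot t$ crosses that step of $W(t)$, and then lists which discrete factors of $K$ fall inside it. It is an illustration under assumptions: `stable_range` is a hypothetical helper, and $\mathrm{th}_0 = 0$ and $\mathrm{th}_4 = \infty$ are assumed boundary values.

```python
import math

def stable_range(wl_n, th_prev, th_n, T):
    """Interval [k_min, k_max) of continuous scaling factors whose delay
    wl_n / k stays in (th_prev, min(th_n, T)], i.e. a stable mode at wl_n."""
    k_min = wl_n / min(th_n, T)
    k_max = math.inf if th_prev == 0 else wl_n / th_prev
    return k_min, k_max

wl = [0.3, 0.4, 0.5, 0.6]
th = [0.0, 0.2, 0.6, 0.7, math.inf]  # th_0 = 0 and th_4 = inf are assumptions
T = 1.0
K = [0.4, 0.8, 0.9]

stable_in_K = {}
for n, w in enumerate(wl, start=1):
    k_min, k_max = stable_range(w, th[n - 1], th[n], T)
    stable_in_K[n] = [k for k in K if k_min <= k < k_max]
    print(f"wl{n}: continuous k in [{k_min:.3f}, {k_max:.3f}), "
          f"available in K: {stable_in_K[n]}")
```

In this configuration the smallest crossing factors of the continuous model (0.6 for $\mathrm{wl}_4$, about 0.667 for $\mathrm{wl}_2$) are not members of $K$, so every stable mode has to settle for $k = 0.8$, which is consistent with all self-loops being tied at the same power.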
7 Conclusion
This paper formulates the delay-workload dependency in the power optimization problem of embedded systems as a staircase function of the delay taken at the previous iteration. In applying it to the power optimization of DVFS-enabled electronic devices, a novel graph representation, called WTG, is proposed for exploring all possible workload/mode changes. Then, it is shown that the power optimization problem is equivalent to finding a cycle of the graph that has the minimum average power consumption. The effectiveness of the proposed operation policy is demonstrated by power simulations of synthetic and real-life examples. It has been observed that staying at a low speed scaling factor in a stable operation mode (a self-loop in the WTG) is often the best policy. However, alternating modes, where the DVFS modes change in a predefined pattern, sometimes outperform the stable ones.
Disclosure
A preliminary version of this paper appeared in July 2015 at the International Symposium on Low Power Electronics and Design (ISLPED) under the title "Modeling and Power Optimization of Cyber-Physical Systems with Energy-Workload Tradeoff" [11].
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This work was supported by the ICT R&D program of MSIP/IITP (B0101-15-0661, the research and development of the self-adaptive software framework for various IoT devices), the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A2A2A01067907), and the new faculty research fund of Ajou University.
References
[1] E. A. Lee, "Cyber physical systems: design challenges," Tech. Rep. UCB/EECS-2008-8, EECS Department, University of California, Berkeley, Calif, USA, 2008.
[2] J. Kleissl and Y. Agarwal, "Cyber-physical energy systems: focus on smart buildings," in Proceedings of the 47th Design Automation Conference (DAC '10), pp. 749–754, ACM, Austin, Tex, USA, June 2010.
[3] J. S. Kim, D. H. Yeom, and Y. H. Joo, "Fast and robust algorithm of tracking multiple moving objects for intelligent video surveillance systems," IEEE Transactions on Consumer Electronics, vol. 57, no. 3, pp. 1165–1170, 2011.
[4] D. K. Park, H. S. Yoon, and C. Sun Won, "Fast object tracking in digital video," IEEE Transactions on Consumer Electronics, vol. 46, no. 3, pp. 785–790, 2000.
[5] J. Pestana, J. L. Sanchez-Lopez, P. Campoy, and S. Saripalli, "Vision based GPS-denied object tracking and following for unmanned aerial vehicles," in Proceedings of the 2013 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR '13), pp. 1–6, IEEE, Linkoping, Sweden, October 2013.
[6] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman, "Efficient pattern matching over event streams," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD '08), pp. 147–160, ACM, Vancouver, Canada, June 2008.
[7] J. Barbic and D. James, "Time-critical distributed contact for 6-dof haptic rendering of adaptively sampled reduced deformable models," in Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA '07), pp. 171–180, Eurographics Association, San Diego, Calif, USA, August 2007.
[8] J. Barbic and D. L. James, "Six-DoF haptic rendering of contact between geometrically complex reduced deformable models," IEEE Transactions on Haptics, vol. 1, no. 1, pp. 39–52, 2008.
[9] P. Padmanabhan and K. G. Shin, "Real-time dynamic voltage scaling for low-power embedded operating systems," SIGOPS Operating Systems Review, vol. 35, no. 5, pp. 89–102, 2001.
[10] T. Mudge, "Power: a first class design constraint for future architectures," in High Performance Computing - HiPC 2000, pp. 215–224, Springer, 2000.
[11] H. Yang and S. Ha, "Modeling and power optimization of cyber-physical systems with energy-workload tradeoff," in Proceedings of the 20th IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED '15), pp. 315–320, Rome, Italy, July 2015.
[12] H.-C. An, H. Yang, and S. Ha, "A formal approach to power optimization in CPSs with delay-workload dependence awareness," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 5, pp. 750–763, 2016.
[13] P. Bogdan and R. Marculescu, "Cyberphysical systems: workload modeling and design optimization," IEEE Design & Test of Computers, vol. 28, no. 4, pp. 78–87, 2011.
[14] F. Zhang, K. Szwaykowska, W. Wolf, and V. Mooney, "Task scheduling for control oriented requirements for cyber-physical systems," in Proceedings of the Real-Time Systems Symposium (RTSS '08), pp. 47–56, Barcelona, Spain, December 2008.
[15] Y. V. Pant, H. Abbas, K. Mohta, T. X. Nghiem, J. Devietti, and R. Mangharam, "Co-design of anytime computation and robust control," in Proceedings of the 2015 IEEE Real-Time Systems Symposium (RTSS '15), pp. 43–52, IEEE, San Antonio, Tex, USA, December 2015.
[16] D. Goswami, R. Schneider, and S. Chakraborty, "Co-design of cyber-physical systems via controllers with flexible delay constraints," in Proceedings of the 16th Asia and South Pacific Design Automation Conference, pp. 225–230, IEEE Press, Yokohama, Japan, January 2001.
[17] A. Schrijver, "Combinatorial optimization: polyhedra and efficiency," Discrete Applied Mathematics, vol. 146, pp. 120–122, 2005.
[18] R. Tarjan, "Enumeration of the elementary circuits of a directed graph," SIAM Journal on Computing, vol. 2, no. 3, pp. 211–216, 1973.
[19] J.-Y. Bouguet, "Pyramidal implementation of the affine Lucas Kanade feature tracker: description of the algorithm," vol. 5, pp. 1–10, 2001.
[20] G. Bradski, "The OpenCV library," Doctor Dobbs Journal, vol. 25, no. 11, pp. 120–126, 2000.
[21] Samsung, Exynos Octa 5422, 2015, http://www.samsung.com.