
Modeling hard real-time systems considering inter-task relations, dynamic voltage scaling and overheads


Microprocessors and Microsystems 32 (2008) 460–473


journal homepage: www.elsevier.com/locate/micpro


Eduardo Tavares *, Paulo Maciel, Bruno Silva

Universidade Federal de Pernambuco, Centro de Informática, 50732-970 Recife, Pernambuco, Brazil

Article info

Article history: Available online 8 August 2008

Keywords: Hard real-time system; Scheduling; Formal methods; Dynamic voltage scaling; Energy consumption

0141-9331/$ - see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.micpro.2008.07.001

The authors would like to thank the anonymous reviewers for their valuable comments, which greatly improved the quality of the paper.

* Corresponding author. Tel.: +55 81 2126 8430. E-mail addresses: [email protected] (E. Tavares), [email protected] (P. Maciel), [email protected] (B. Silva).

Abstract

Dynamic voltage scaling (DVS) has been adopted as an effective technique for reducing energy consumption in embedded systems. Although several scheduling approaches have been developed to address voltage scaling together with stringent timing constraints, inter-task relations have been neglected. This work presents a pre-runtime scheduling method for hard real-time systems that considers dynamic voltage scaling, overheads and inter-task relations. The proposed work adopts time Petri nets as a formal model in order to provide a basis for precise schedule generation as well as to allow property analysis and verification. Experimental results demonstrate the feasibility of the proposed approach, in the sense that energy consumption is minimized while system constraints are met.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

When designing embedded systems, several constraints, such as size, reliability, energy consumption and timing constraints, may have to be considered to satisfy system requirements. Lately, considerable attention has been devoted to energy consumption, mainly due to the great expansion of the mobile device market.

During the last decade, DVS (dynamic voltage scaling) has been adopted as one of the most effective techniques for reducing energy consumption in embedded systems. Adjusting the CPU supply voltage has a great impact on energy consumption, since the consumption is proportional to the square of the supply voltage in CMOS microprocessors [11]. However, lowering the supply voltage linearly affects the maximum operating frequency. Therefore, DVS may be seen as a technique for trading off energy consumption and performance.

When considering hard real-time systems, DVS needs to be adopted with caution, since stringent timing constraints may be affected. In this case, equipment damage or even loss of human lives may occur due to timing constraint violations. Thus, several scheduling approaches, mainly based on runtime techniques, have been devised to cope with DVS in time-critical systems. However, system specifications often either oversimplify tasks' relations, such as precedence and exclusion relations, or do not consider them at all. Furthermore, overheads, such as preemptions and voltage/frequency switching, are issues that must be considered during schedule generation. Indeed, if overheads are neglected, tasks' constraints may be affected and even the gains obtained with DVS may be significantly reduced [15].

This paper presents a pre-runtime scheduling method for hard real-time systems that considers DVS, inter-task relations and overheads. More specifically, the contributions are: (i) the proposition of a formal model based on time Petri nets (TPN) that provides the basis for precise schedule generation as well as for the verification/analysis of behavioral and structural properties; (ii) the explicit modeling of inter-task relations and overheads (e.g. voltage/frequency switching), so that they can be considered during schedule generation; and (iii) a pre-runtime scheduling algorithm that finds feasible schedules satisfying timing and energy constraints. In addition, a technique for dealing with dynamic slack times is presented in order to take advantage of new opportunities to further reduce energy consumption during system execution.

One challenge designers have to face when dealing with embedded hard real-time systems is the modeling power of tools. In order to be of practical use, tools should provide means for representing concurrent communicating tasks, synchronization mechanisms and communication primitives, and they should also describe timing constraints and requirements. Furthermore, the availability of precise methods for analysis and verification of systems' representation is a requirement of remarkable importance. Petri nets [16] are a very well suited model for representing real-time embedded systems, since concurrency, synchronization and communication mechanisms are naturally represented. The sound mathematical basis of Petri net analysis methods should also be emphasized.

The rest of the paper is organized as follows: Section 2 summarizes related works. Section 3 presents some preliminaries with the purpose of providing a better comprehension of the proposed approach. Section 4 describes the computational model and Section 5 introduces the formal modeling. Section 6 describes the proposed pre-runtime schedule synthesis and discusses the complexity of the respective algorithm. Section 7 presents a technique for dealing with slack times that may appear during system execution. Section 8 describes experimental results, and Section 9 concludes this paper and introduces future works.

2. Related works

Many scheduling methods [1,2,7,10,15,21,26] have been developed to cope with voltage scaling in time-critical systems. Works such as [2,21] are based on runtime scheduling policies, which can greatly improve energy consumption as shown by their experimental results. Some of them apply a preprocessing step for defining an initial voltage for each task before runtime. This can be viewed as a hybrid approach, which mixes runtime and pre-runtime approaches. However, some of these works do not properly tackle overheads related to voltage/frequency switching, preemption, and runtime calculations, and neglect precedence and exclusion relations. A common approach in dealing with runtime overheads is to include them in tasks' worst-case execution cycles (WCEC). Nevertheless, this approach may be too pessimistic, since the total overhead is not known before schedule generation. In this context, [15,1] explicitly take into account overheads related to voltage/frequency switching during schedule generation without folding them into the WCEC. Nevertheless, dispatcher/scheduler and preemption overheads are disregarded. In [10], the authors propose a technique for reducing the impact of preemptions on system energy consumption. Although interesting results are provided, the technique does not consider inter-task relations and assumes CPUs with continuously variable voltage.

In the literature, few works cope with inter-task relations. The work described in [8] proposes a runtime method for dealing with exclusion relations. Nevertheless, precedence relations are neglected, and preemption and voltage/frequency switching overheads are included in tasks' computation time. In [3], the authors describe a DVS scheduling method for a distributed environment considering precedence relations, but they ignore mutual exclusions and include the scheduler overhead in tasks' worst-case execution time (WCET). Cortés [4] proposes a scheduling method considering precedence relations, assuming that all tasks are non-preemptable. Besides, the adopted task model assumes tasks with mandatory and optional parts, in the sense that optional parts can be left incomplete in order not to violate timing constraints. In relation to formal methods, some works have been proposed over the years to tackle real-time systems with energy constraints. However, in general, they consider soft timing constraints (e.g. [19]) or adopt fixed-priority scheduling (e.g. [13]). In the latter situation, feasible schedules may not be properly generated when considering arbitrary inter-task relations [25].

Fig. 1. Methodology activity diagram (activities: Specification, Measurement, Modeling, Property Analysis/Verification, Scheduling, Code Generation, Validation, Deployment).

As an alternative, this work proposes a pre-runtime scheduling method that considers DVS, inter-task relations and runtime overheads. A formal model based on time Petri nets is adopted to provide a basis for precise schedule generation as well as to allow property analysis and verification.

3. Preliminaries

This section aims at presenting fundamental concepts and a motivational example in order to show the feasibility of the proposed method.

3.1. Problem formulation

Before providing details related to the proposed methodology, it is important to describe its context. The software specification is represented by a set of periodic hard real-time tasks (T) with inter-task relations, such as precedence and mutual exclusion. Additionally, the processor is assumed to vary its voltage/frequency level within a discrete range (vff). The overheads that may occur during system execution are also taken into account, more specifically, voltage/frequency switching, preemption, and runtime calculations (e.g. dispatcher execution). This paper concerns the problem of scheduling those tasks on a DVS-capable processor, such that timing constraints as well as inter-task relations are met and energy consumption is minimized, respecting a given energy constraint ($e_{max}$). Besides, overheads are considered during the scheduling process in order to provide a more realistic system behaviour. Throughout this paper, the term overhead encompasses, as stated previously, voltage/frequency switching, preemption, and dispatcher executions.

3.2. Methodology

Fig. 1 provides an overview of the design methodology for implementing embedded software synthesis, in which the proposed scheduling method is a fundamental activity.

Initially, the designer defines the system specification, which consists of a set of concurrent tasks with their respective constraints, behavioral descriptions, information related to the hardware platform (e.g. voltage/frequency levels and energy consumption) as well as the system energy constraints. A measurement activity may be required if the designer does not possess the tasks' timing information or the information regarding the hardware energy consumption. Next, the specification is translated into an internal model able to represent concurrent activities, timing information, inter-task relations, such as precedence and mutual exclusion, as well as energy constraints. The adopted internal model is a time Petri net extension, labeled with energy consumption values. After generating the internal model (TPN), the designer may first choose to perform property analysis/verification or carry out the scheduling activity. This work adopts a pre-runtime scheduling method in order to find a feasible schedule that satisfies timing, inter-task and energy constraints. Next, the feasible schedule is adopted as input to the automatic code generation mechanism, such that tailored code is obtained together with the respective runtime control, namely, the dispatcher. Finally, the application is validated on a DVS platform in order to check system behaviour as well as the respective constraints. Once the system is validated, it can be deployed to the real environment.

The phases depicted in the methodology diagram are implemented by a set of integrated tools. For instance, schedule generation is implemented by the DENTES tool [22] (Development Environment for Time-Critical Embedded Systems), such that, from a system specification, the respective time Petri net model is automatically generated and the pre-runtime scheduling is performed. Additionally, another tool has been implemented for automatic hardware and software measurement: AMALGHMA [22] (Advanced Measurement Algorithms for Hardware Architectures) allows the capture of tasks' WCEC as well as the energy consumption information related to the target platform.

3.3. Specification model

The specification model is composed of: (i) a set of periodic tasks with bounded discrete time constraints; (ii) inter-task relations, such as precedence and exclusion relations; (iii) a discrete set of supply voltages and their respective maximum frequencies in the CPU; and (iv) the system energy constraint.

Let T be the set of tasks in a system. A periodic task is defined by $\tau_i = (ph_i, r_i, c_i, d_i, p_i)$, where:

• $ph_i$ is the initial phase (delay until the task is requested for the first time after the start of the system);

• $r_i$ is the release time (time interval between the beginning of a period and the earliest time that an execution of task $\tau_i$ can be started in each period);

• $c_i$ is the worst-case execution cycles (WCEC) required for execution of task $\tau_i$;

• $d_i$ is the deadline (time interval between the beginning of a period and the time by which an execution of task $\tau_i$ must be completed in each period);

• $p_i$ is the period (period in which $\tau_i$ should be repeatedly executed).

Fig. 2. Schedules generated according to different scheduling methods (panels a to e; horizontal axis: time, 0 to 13; vertical axis: supply voltage level, 1.04 V, 1.07 V or 1.26 V).

A sporadic task is defined by $\tau_k = (c_k, d_k, min_k)$, in which $min_k$ is the minimum period between two activations of task $\tau_k$. In this work, sporadic tasks are also considered by translating them into equivalent periodic tasks [25].

Tasks may have precedence and exclusion relations between them. A task $\tau_i$ precedes task $\tau_j$ if $\tau_j$ can only start executing after $\tau_i$ has finished. A task $\tau_i$ excludes task $\tau_j$ if no execution of $\tau_j$ can start while task $\tau_i$ is executing. In other words, task $\tau_i$ cannot be preempted by task $\tau_j$.

Let V and F be two sets of discrete CPU supply voltage levels and frequencies, respectively, where $|V| = |F|$, and let $vff : V \to F$ (voltage–frequency function) be a bijective function that maps each voltage level to one, and only one, processor execution frequency, which is the maximum operating frequency at that supply voltage. In this work, voltage/frequency levels that do not provide energy saving due to the leakage current are not considered in the scheduling process.

In addition to the specification above, the system energy constraint ($e_{max}$) needs to be defined, which sets an upper bound on the energy consumption that a schedule must not surpass.
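As an illustration only, the specification model above could be held in a small data structure such as the following sketch. The class names, field names and the `validate` checks are assumptions introduced here (they are not part of the DENTES tool), and the ordering check `release <= deadline <= period` is a sanity check added for the sketch rather than a rule stated in the text.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class PeriodicTask:
    phase: int     # ph_i: initial phase
    release: int   # r_i: release time within each period
    wcec: int      # c_i: worst-case execution cycles
    deadline: int  # d_i: deadline within each period
    period: int    # p_i: period


@dataclass
class Specification:
    tasks: Dict[str, PeriodicTask]
    precedes: Set[Tuple[str, str]]  # (a, b): a precedes b
    excludes: Set[Tuple[str, str]]  # (a, b): a excludes b
    vff: Dict[float, float]         # supply voltage (V) -> max frequency (Hz)
    e_max: float                    # system energy constraint (J)

    def validate(self) -> None:
        # vff must be bijective: no two voltages share a frequency.
        if len(set(self.vff.values())) != len(self.vff):
            raise ValueError("vff must map each voltage to a distinct frequency")
        # Relations may only mention tasks that exist.
        for a, b in self.precedes | self.excludes:
            if a not in self.tasks or b not in self.tasks:
                raise ValueError(f"unknown task in relation ({a}, {b})")
        # Assumed sanity check on the timing parameters.
        for name, t in self.tasks.items():
            if not (0 <= t.release <= t.deadline <= t.period):
                raise ValueError(f"inconsistent timing constraints for {name}")
```

A relation is stored as an ordered pair because both precedence and exclusion are directional in the definitions above: "a excludes b" does not imply "b excludes a".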

3.4. Energy model

As defined in [11], the energy consumed by CMOS microprocessors can be described as $E = A_{sw} \times N \times V_{dd}^2$, in which $N$ is the number of cycles for the execution of task $\tau_i$, $A_{sw}$ is the average switching capacitance per clock cycle, and $V_{dd}$ is the supply voltage. In this work, $A_{sw}$ is considered to be constant for all tasks in T. As energy has a quadratic dependence on the supply voltage, lowering the supply voltage is the most effective way to reduce energy consumption. However, lowering the supply voltage linearly affects the CPU maximum operating frequency ($f \propto V$).

Fig. 3. Petri Net example (panels a and b: places p1 and p2 connected through transition t1 with firing interval [1,3]).

3.5. Example

Assume the following task set $T = \{\tau_1 = (0, 0, 150 \times 10^6, 6, 13),\ \tau_2 = (0, 2, 50 \times 10^6, 3, 13),\ \tau_3 = (0, 0, 100 \times 10^6, 13, 13),\ \tau_4 = (0, 7, 60 \times 10^6, 9, 13)\}$. In addition to timing constraints, the specification contains the following relations: $\tau_1$ excludes $\tau_2$, $\tau_1$ precedes $\tau_3$, $\tau_2$ excludes $\tau_1$, and $\tau_2$ precedes $\tau_4$. For this example, the DVS platform described in [18] is adopted, which utilizes a Philips LPC2106 processor, a 32-bit microcontroller with an ARM7 core. More specifically, the CPU supply voltage/frequency levels adopted for this example are vff = {(1.04 V, 20 MHz), (1.07 V, 30 MHz), (1.26 V, 50 MHz)}. Moreover, considering an average switching capacitance of 0.28 nF per clock cycle, the energy consumption is 0.45 nJ/cycle at 50 MHz, 0.34 nJ/cycle at 30 MHz, and 0.31 nJ/cycle at 20 MHz. These values were obtained from the equation described previously.

Fig. 2a shows a schedule obtained by adopting the optimal offline DVS algorithm defined by [26], which relies on the optimal runtime scheduling method earliest deadline first (EDF). The schedule is invalid, since $\tau_1$ cannot be preempted by $\tau_2$. As an alternative, Fig. 2b shows a schedule obtained by blocking the execution of $\tau_2$ while $\tau_1$ is executing. Again, the schedule is infeasible, since $\tau_2$ misses its deadline due to the earlier release of $\tau_1$. Even discarding DVS, in other words, running every task at the maximal voltage/frequency level, a schedule could not be found if EDF (or another runtime scheduling algorithm) is adopted (see Fig. 2c). On the other hand, Fig. 2d depicts a schedule found using the proposed pre-runtime method. All tasks meet their deadlines, and additional energy savings are obtained compared to a pre-runtime schedule without DVS (Fig. 2e). Fig. 2d presents a schedule that consumes 0.1414 J, while the schedule depicted in Fig. 2e consumes 0.162 J. Additionally, the processor needed to be left idle to find a valid schedule [25]. In this example, context and voltage/frequency switching overheads have not been taken into account, and the CPU is assumed to have a halt instruction that avoids energy consumption in the idle state. Henceforth, overheads are considered in this work.
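The energy figures quoted above can be checked with a few lines of arithmetic. The sketch below uses the per-cycle energies and WCEC values from the example; the per-task level assignment in `with_dvs` is hypothetical, since the text reports only the 0.1414 J total for Fig. 2d and not the level chosen for each task. Reading the time unit as seconds is likewise an assumption.

```python
# Energy per cycle (J) for each operating frequency (Hz), as quoted in the example.
ENERGY_PER_CYCLE = {50e6: 0.45e-9, 30e6: 0.34e-9, 20e6: 0.31e-9}

# Worst-case execution cycles of the four tasks in the example.
WCEC = {"tau1": 150_000_000, "tau2": 50_000_000,
        "tau3": 100_000_000, "tau4": 60_000_000}


def schedule_energy(freq_of_task):
    """Total energy (J) when each task runs all its cycles at one frequency."""
    return sum(WCEC[t] * ENERGY_PER_CYCLE[f] for t, f in freq_of_task.items())


def busy_time(freq_of_task):
    """Total CPU busy time, assuming the example's time unit is seconds."""
    return sum(WCEC[t] / f for t, f in freq_of_task.items())


# Every task at the maximum level (the no-DVS case of Fig. 2e).
no_dvs = {t: 50e6 for t in WCEC}

# Hypothetical assignment (not given in the text) that reproduces the
# 0.1414 J total reported for the DVS schedule of Fig. 2d.
with_dvs = {"tau1": 50e6, "tau2": 50e6, "tau3": 20e6, "tau4": 30e6}

if __name__ == "__main__":
    print(round(schedule_energy(no_dvs), 4))    # about 0.162 J
    print(round(schedule_energy(with_dvs), 4))  # about 0.1414 J
    print(busy_time(with_dvs))                  # 11 of the 13 time units
```

Under this assumed assignment the CPU is busy for 11 of the 13 time units, which is consistent with the remark that the processor must be left idle for part of the schedule.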

Footnote 1: $\mathbb{R}^+$ is the set of positive real numbers.

4. Computational model

The syntax of the computational model is given by a time Petri net [14], and its semantics by a timed labeled transition system.

(Time Petri Net) A time Petri net (TPN) is a bipartite directed graph represented by a tuple $\mathcal{P} = (P, T, F, W, m_0, I)$, where $P$ (set of places) and $T$ (set of transitions) are non-empty disjoint sets of nodes. The edges are represented by $F$, where $F \subseteq A = (P \times T) \cup (T \times P)$. $W : A \to \mathbb{N}$ represents the weight of the edges, such that $W(f) = x \in \mathbb{N}$ if $f \in F$, and $W(f) = 0$ if $f \notin F$. A TPN marking $m_i$ is a vector ($m_i \in \mathbb{N}^{|P|}$), and $m_0$ is the initial marking. $I : T \to \mathbb{N} \times \mathbb{N}$ represents the timing constraints, where $I(t) = [EFT(t), LFT(t)]$ $\forall t \in T$, $EFT(t) \le LFT(t)$, $EFT(t)$ is the Earliest Firing Time, and $LFT(t)$ is the Latest Firing Time.

Considering the previous definition, places (P) represent local states and transitions (T) denote local actions. The set of arcs F represents the relationships between places and transitions, in such a way that arcs connect places to transitions and vice-versa. Function W assigns to each arc a natural number, which may be interpreted as the number of parallel arcs. A marking vector $m_i$ associates to each place a natural number, which represents the number of tokens in the respective place. Places holding tokens are usually called marked places. Typical interpretations treat marked places as the truth of a condition or the availability of resources, and transitions as events or computational steps. Graphically, places are represented by circles, transitions are depicted as bars or rectangles, arcs are represented by directed arrows labeled with the weight, and tokens (the marking) are generally represented by small filled circles. Fig. 3a depicts a Petri net model.

(Time Petri Net with Energy Consumption, TPN $\mathcal{P}_E$) An extended time Petri net with energy consumption values is represented by $\mathcal{P}_E = (\mathcal{P}, \mathcal{E})$. $\mathcal{P}$ is the underlying time Petri net, and $\mathcal{E} : T \to \mathbb{R}^+ \cup \{0\}$ (see footnote 1) is a function that assigns energy consumption values to transitions.

(Enabled Transitions) The set of enabled transitions at marking $m_i$ is denoted by: $ET(m_i) = \{t \in T \mid m_i(p_j) \ge W(p_j, t),\ \forall p_j \in P\}$.

A transition $t \in T$ is enabled if each input place $p \in P$ contains at least $W(p, t)$ tokens. In Place/Transition nets, usually called just Petri nets, this is a sufficient condition to fire a transition (e.g. action execution). Nevertheless, in time Petri nets, timing constraints need to be met in addition to the marking. The time elapsed since the respective transition enabling is denoted by a clock vector $c \in (\mathbb{N} \cup \{\#\})^{|T|}$, where $\#$ represents the null value for disabled transitions. As an example, the clock vector for the net in Fig. 3 contains one element: $c(t_1) = 0$.

At this point, the difference between static and dynamic firing intervals associated with transitions must be made clear. The dynamic firing interval of transition $t$, $I_D(t) = [DLB(t), DUB(t)]$, is dynamically modified whenever the respective clock variable $c(t)$ is incremented and $t$ does not fire. $DLB(t)$ is the Dynamic Lower Bound, and $DUB(t)$ is the Dynamic Upper Bound. The dynamic firing interval is computed in the following way: $I_D(t) = [DLB(t), DUB(t)]$, where $DLB(t) = \max(0, EFT(t) - c(t))$ and $DUB(t) = LFT(t) - c(t)$. Whenever $DLB(t) = 0$, $t$ can fire, and, when $DUB(t) = 0$, $t$ must fire, since strong firing mode is adopted. Initially, at the moment transition $t$ becomes enabled, $c(t)$ is set to 0 and thus $I(t) = I_D(t)$. In Fig. 3, assuming $c(t_1)$ is incremented by one time unit ($c(t_1) = 1$), $I_D(t_1) = [0, 2]$.
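The computation of the dynamic firing interval is small enough to sketch directly. The function names below are illustrative choices, not identifiers from the paper or its tools.

```python
def dynamic_interval(eft, lft, clock):
    """Return (DLB, DUB) for a transition with static interval [eft, lft]
    whose clock has advanced by `clock` time units since it was enabled."""
    dlb = max(0, eft - clock)
    dub = lft - clock
    return dlb, dub


def can_fire(eft, lft, clock):
    # The transition may fire as soon as its dynamic lower bound reaches 0.
    return dynamic_interval(eft, lft, clock)[0] == 0


def must_fire(eft, lft, clock):
    # Under strong firing mode it is forced to fire when DUB reaches 0.
    return dynamic_interval(eft, lft, clock)[1] == 0
```

For transition t1 of Fig. 3, with static interval [1, 3], `dynamic_interval(1, 3, 0)` gives `(1, 3)` and `dynamic_interval(1, 3, 1)` gives `(0, 2)`, matching the values in the text.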

(States) Let $\mathcal{P}_E$ be a time Petri net extended with energy consumption values, $M \subseteq \mathbb{N}^{|P|}$ be the set of reachable markings (i.e. all possible markings) of $\mathcal{P}_E$, $C \subseteq (\mathbb{N} \cup \{\#\})^{|T|}$ be the set of clock vectors, and $E \subseteq \mathbb{R}^+ \cup \{0\}$ be the set of accumulated energy consumptions. The set of states $S$ of $\mathcal{P}_E$ is given by $S \subseteq (M \times C \times E)$, that is, a state is defined by a marking, the respective clock vector, and the accumulated energy consumption from the initial state up to this state.

Considering the Petri net model in Fig. 3, the initial state is $s_0 = (m_0 = [1, 0], c_0 = [0], e_0 = 0)$.

(Fireable Transitions) The set of fireable transitions at state $s \in S$ is defined by: $FT(s, e_{max}) = \{t_i \in ET(m) \mid (e + \mathcal{E}(t_i) \le e_{max}) \wedge (DLB(t_i) \le \min(DUB(t_k))),\ \forall t_k \in ET(m)\}$. $e_{max}$ is the system energy constraint (Section 3) and $e$ is the accumulated energy consumption from the initial state up to state $s$.

(Firing Domain) The firing domain for a transition $t$ at state $s$ is defined by the interval: $FD_s(t) = [DLB(t), \min(DUB(t_k))]$, $\forall t_k \in ET(m)$.
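The two definitions above can be combined in a short sketch that returns, for each fireable transition, its firing domain. The input layout (a dictionary from transition name to a tuple of static bounds, clock value and energy) is an assumption made for this illustration, and transition `t2` in the usage note is hypothetical.

```python
def fireable(enabled, e, e_max):
    """Map each fireable transition to its firing domain (DLB, min DUB).

    enabled: name -> (eft, lft, clock, energy) for every enabled transition.
    e:       energy accumulated from the initial state up to this state.
    e_max:   system energy constraint.
    """
    if not enabled:
        return {}
    dlb = {t: max(0, eft - c) for t, (eft, _lft, c, _en) in enabled.items()}
    # Smallest dynamic upper bound over all enabled transitions: no
    # transition may fire later than the most urgent one must fire.
    min_dub = min(lft - c for (_eft, lft, c, _en) in enabled.values())
    return {
        t: (dlb[t], min_dub)
        for t, (_eft, _lft, _c, en) in enabled.items()
        if e + en <= e_max and dlb[t] <= min_dub
    }
```

For example, with `enabled = {"t1": (1, 3, 0, 2.5), "t2": (4, 6, 0, 1.0)}`, `e = 0` and `e_max = 3`, only `t1` is fireable, with domain `(1, 3)`: `t2` cannot wait until time 4 because `t1` must fire by time 3. Lowering `e_max` to 2 removes `t1` as well, since firing it would exceed the energy constraint.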

Without loss of generality, enabled transitions are only related to the marking (see (Enabled Transitions)), and fireable transitions take into account the marking and their respective clock values (the time elapsed for each enabled transition). Additionally, in the proposed Petri net extension, a transition is only fireable if the respective firing does not surpass the system energy constraint ($e_{max}$). Considering the firing domain, a fireable transition $t$ at state $s$ can only fire in the interval denoted by $FD_s(t)$. Besides, $FT \subseteq ET \subseteq T$. In Fig. 3, at the initial state $s_0 = (m_0 = [1, 0], c_0 = [0], e_0 = 0)$, $t_1$ is fireable when $c_0(t_1) = 1$ and must fire when $c_0(t_1) = 3$ ($FD_{s_0}(t_1) = [1, 3]$), if it has neither been fired nor disabled.

(TLTS) A timed labeled transition system (TLTS) is a quadruple $\mathcal{L} = (S, \Sigma, \to, s_0)$, where $S$ is a finite set of discrete states, $\Sigma$ is an alphabet of labels representing actions, $\to \subseteq S \times \Sigma \times S$ is the transition relation, and $s_0 \in S$ is the initial state.

The TPN semantics is defined by associating a TLTS $\mathcal{L}_{\mathcal{P}_E} = (S, \Sigma, \to, s_0)$, where (i) $S$ is the set of states of TPN $\mathcal{P}_E$; (ii) $\Sigma \subseteq (T \times \mathbb{N})$ is a set of labels $(t, \theta)$ corresponding to the transition $t$ firing at time $\theta$ in the firing interval $FD_s(t)$, $\forall s \in S$; (iii) $\to \subseteq S \times \Sigma \times S$ is the state transition relation; (iv) $s_0$ is the initial state of $\mathcal{P}_E$.

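(Reachable States) Let $\mathcal{L}_{\mathcal{P}_E}$ be a TLTS derived from a TPN $\mathcal{P}_E$, and $s_i = (m_i, c_i, e_i)$ a reachable state. $s_{i+1} = fire(s_i, (t, \theta))$ denotes that, by firing a transition $t$ at time $\theta$ from the state $s_i$, the reached state $s_{i+1} = (m_{i+1}, c_{i+1}, e_{i+1})$ is obtained from:

• $\forall p \in P$, $m_{i+1}(p) = m_i(p) - W(p, t) + W(t, p)$, as usual in Petri nets;

• $e_{i+1} = e_i + \mathcal{E}(t)$;

• $\forall t_j \notin ET(m_{i+1})$, $c_{i+1}(t_j) = \#$;

• $\forall t_k \in ET(m_{i+1})$,

$$c_{i+1}(t_k) = \begin{cases} 0, & \text{if } t_k = t \\ 0, & \text{if } t_k \in ET(m_{i+1}) - ET(m_i) \\ c_i(t_k) + \theta, & \text{otherwise} \end{cases}$$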

The previous definition states that the firing of a transition $t$, at time value $\theta$, in state $s_i$, defines the next state $s_{i+1}$. The first item refers to the marking change and the second item represents the energy consumption for reaching the new state. The third item adjusts the clock vector in order to assign the value $\#$ to the disabled transitions, and the fourth item updates the clock values for each enabled transition. In the latter case, if a transition is fired and remains enabled in the new reached state, the clock value is set to 0. In the same way, newly enabled transitions have initial values set to 0. Nevertheless, if a transition was not fired in the previous state and remains enabled in the new reached state, the respective clock value is incremented. Note that $\theta$ is the elapsed time at state $s_i$, not a global clock. For a better understanding, let us assume that transition $t_1$ fires at state $s_0$, where $s_0 = (m_0 = [1, 0], c_0 = [0], e_0 = 0)$, $\theta = 1$ and $\mathcal{E}(t_1) = 2.5$ J. The new reached state is $s_1 = (m_1 = [0, 1], c_1 = [\#], e_1 = 2.5)$ (see Fig. 3b), where: (i) $m_1(p_1) = 1 - 1 + 0$, $m_1(p_2) = 0 - 0 + 1$; (ii) $e_1 = 0 + 2.5$; and (iii) $c_1(t_1) = \#$. The respective TLTS is $s_0 \xrightarrow{(t_1, 1)} s_1$.

(Feasible Firing Schedule) Let $\mathcal{L}_{\mathcal{P}_E}$ be a timed labeled transition system of a time Petri net $\mathcal{P}_E$, $s_0$ its initial state, $s_n = (m_n, c_n, e_n)$ a final state, and $m_n = M^F$ the desired final marking.

$$s_0 \xrightarrow{(t_0, \theta_0)} s_1 \xrightarrow{(t_1, \theta_1)} s_2 \to \ldots \to s_{n-1} \xrightarrow{(t_k, \theta_{n-1})} s_n$$

is defined as a feasible firing schedule, where $s_{i+1} = fire(s_i, (t_k, \theta_i))$, $i \ge 0$, $t_k \in FT(s_i, e_{max})$, and $\theta_i \in FD_{s_i}(t_k)$.

The system modeling of the proposed methodology guarantees that the final marking MF (see Section 5.1(i)) is well known, since it is explicitly specified.
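The state update of the Reachable States definition can be sketched in a few lines of Python. This is only an illustrative sketch: the names `State`, `enabled` and `fire`, and the matrix encoding (`pre[t][p]` for W(p,t), `post[t][p]` for W(t,p)), are assumptions made here, `None` stands for the clock value #, and the check that θ belongs to the firing domain is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    marking: Tuple[int, ...]            # tokens per place (m)
    clocks: Tuple[Optional[int], ...]   # clock per transition (c), None means '#'
    energy: float                       # accumulated energy (e)


def enabled(marking, pre):
    """Indices of the transitions whose input places hold enough tokens."""
    return {t for t, row in enumerate(pre)
            if all(marking[p] >= w for p, w in enumerate(row))}


def fire(state, t, theta, pre, post, energy_of):
    """Next state after firing transition t, theta time units after entering `state`."""
    # Item 1: marking change m(p) - W(p,t) + W(t,p).
    m_next = tuple(m - pre[t][p] + post[t][p]
                   for p, m in enumerate(state.marking))
    before = enabled(state.marking, pre)
    after = enabled(m_next, pre)
    clocks = []
    for tk in range(len(pre)):
        if tk not in after:
            clocks.append(None)                      # item 3: disabled, clock is '#'
        elif tk == t or tk not in before:
            clocks.append(0)                         # fired or newly enabled
        else:
            clocks.append(state.clocks[tk] + theta)  # still enabled: clock advances
    # Item 2: the energy of the fired transition is accumulated.
    return State(m_next, tuple(clocks), state.energy + energy_of[t])


# The example of the text: t1 fires at theta = 1 with E(t1) = 2.5 J.
s0 = State(marking=(1, 0), clocks=(0,), energy=0.0)
s1 = fire(s0, t=0, theta=1, pre=[[1, 0]], post=[[0, 1]], energy_of=[2.5])
```

With these inputs, `s1` carries the marking [0, 1], the clock vector [#] and an accumulated energy of 2.5 J, as in the example above.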

5. Modeling real-time systems

This section presents the proposed modeling method for describing systems with timing and energy constraints. This work adopts a bottom-up approach, in which a set of composition rules is considered for combining basic building block models. These building blocks represent each aspect of a hard real-time system, more specifically, the tasks’ timing constraints (e.g. release time), inter-task relations, overheads, the energy consumption during a task computation as well as the processor availability. This method generates a time Petri net model from the system specification, so that tools can be adopted to automate the modeling and scheduling processes (see Section 3).

More specifically, the proposed building blocks have been conceived for automatic pre-runtime schedule generation, where the schedule period (PS) corresponds to the least common multiple (LCM) of all tasks’ periods. Within this period, several instances of the same task might be carried out, in which N(τi) = PS/pi gives the number of instances of each task τi. Once a feasible schedule is generated (Section 6.3), the same schedule is repeated indefinitely during system execution.
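As a small illustration of the schedule period and of N(τi) = PS/pi, the sketch below computes both for a list of periods. The function names and the sample periods (4, 6 and 12 time units) are hypothetical and are not taken from the paper.

```python
from functools import reduce
from math import gcd


def schedule_period(periods):
    """Schedule period PS: least common multiple of all task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def instances_per_task(periods):
    """N(tau_i) = PS / p_i for every task, in the order the periods are given."""
    ps = schedule_period(periods)
    return [ps // p for p in periods]


ps = schedule_period([4, 6, 12])
counts = instances_per_task([4, 6, 12])
```

For these periods the schedule period is 12 time units, so the three tasks contribute 3, 2 and 1 instances, respectively.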

Besides, it is important to state that the proposed modeling method is also adopted for the analysis/verification of behavioral and structural properties. For instance, the models generated using the proposed building blocks are bounded [16], in the sense that the respective state space size is finite. In the context of the proposed scheduling method (Section 6.3), this property means that the pre-runtime scheduling algorithm always finishes. Another property intrinsic to the models is the absence of liveness [16]. In other words, the generated models contain deadlock states, which do not allow any transition to be fired. Indeed, deadlock states are important and necessary, since they represent either the desired final marking (an indication that a feasible schedule has been found) or a deadline miss. Obviously, the latter is avoided by the scheduling algorithm. Furthermore, the designer may verify several conditions in the Petri net models by adopting model checking techniques based on temporal logic [23].

5.1. Tasks modeling

The proposed building blocks are depicted in Fig. 4, in which places with dashed lines represent the connections with other building blocks. Without loss of generality, the composition rules (not presented due to lack of space) can be visualized as place merging operators. Each building block is presented as follows.

(a) Fork block. Supposing that the system has n tasks, the fork block (Fig. 4a) is responsible for starting all tasks in the system. This block models the creation of n concurrent tasks and also represents the initial marking. The timing interval of transition tstart is equal to [0, 0].

(b) Periodic task arrival block. This block (Fig. 4b) models the periodic invocation of all task instances in the schedule period (PS). Transition tphi models the initial phase of the first task instance. Similarly, transition tai models the periodic arrival (after the initial phase) of the remaining instances and transition tri represents a task instance release. Note the weight (αi = N(τi) − 1) of the arc (tphi, pwai), which models the invocation of all remaining instances after the first task instance. The timing intervals of transitions tphi and tai are the timing constraints given in the specification, in this case, phi (phase) and pi (period), that is, [phi, phi] and [pi, pi]. For transition tri, the timing interval is [ri, di − Cmin], where ri is the release time, di is the deadline constraint, and Cmin is the computation time of task τi at the highest voltage/frequency level.

(c) Voltage selection block. This block (Fig. 4c) represents every possible voltage selection for executing a task instance. In this block, each available voltage level is represented by a transition tvin, whose timing interval is [0, 0].

Fig. 4. Proposed building blocks: (a) fork; (b) periodic task arrival; (c) voltage selection; (d) non-preemptive task structure; (e) preemptive task structure; (f) non-preemptive task structure with 2 voltages; (g) preemptive task structure with 2 voltages; (h) deadline checking; (i) join.

E. Tavares et al. / Microprocessors and Microsystems 32 (2008) 460–473 465

(d) Non-preemptive task structure block. Considering a non-preemptive scheduling method, the processor is released only after the entire computation is finished. This block models a non-preemptive task computation adopting a specific voltage. In this block, processor granting and task computation are represented by transitions tgin and tcin, respectively. Only after the entire task computation is the processor released by transition tcin. Assuming a voltage v ∈ V and the respective maximum frequency f = vff(v), the task computation time (C) can be obtained by C = ⌈ci/f⌉, where ci is the WCEC of task τi. Fig. 4d shows that the time interval of computation transition tcin has both bounds equal to the task computation time at a specific voltage ([C, C]). The timing intervals of transitions tvsin, tgin and tfvin are equal to [0, 0]. Furthermore, computation transitions have energy consumption values greater than zero, which are calculated using the equation presented in Section 3.

(e) Preemptive task structure block. In this scheduling method (Fig. 4e), tasks are implicitly split into subtasks, in which the computation time of each subtask is exactly equal to one task time unit (TTU). This method allows running other conflicting tasks, in this case, meaning that one task may preempt another task. This is modeled by the time interval of computation transitions ([1, 1]), and the entire computation is modeled through the arc weights. Considering C the task computation time at a specific voltage, C tokens are stored in place pwgin, and the same amount of tokens in place pwfin is needed for firing transition tfvin. In the same way as in the non-preemptive task structure block, processor granting and task computation are represented by transitions tgin and tcin, respectively. However, the processor is released by transition tcin just after the execution of one task time unit related to the computation time.
(f) Non-preemptive task structure with 2 voltages block and (g)

Preemptive task structure with 2 voltages block. If the CPU provides a small number of discrete voltage levels and an ideal voltage is not available (Videal ∉ V), the two immediate neighbor voltages (VidealL, VidealH ∈ V) of the ideal one can be adopted for reducing energy consumption [7]. For a better understanding, a task may be divided into two parts. The first part is executed at the immediate higher voltage in relation to the ideal one (VidealH), and the second part is executed at the immediate lower voltage (VidealL). When the ideal voltage is smaller than any available voltage level, the smallest CPU voltage level is adopted. However, when the ideal voltage is higher than any available voltage level, the task instance cannot be scheduled. The proposed method allows the modeling of a task instance executing at two different voltages considering non-preemptive (Fig. 4f) and preemptive (Fig. 4g) executions. C1 represents the computation time of the first part of the task executing at VidealH, and C2 represents the computation time of the second part of the task executing at VidealL. Without loss of generality, these blocks resemble the task structure blocks presented previously. For further information related to the time instants at which the voltage changes, the reader is referred to [7].

(h) Deadline checking block. A deadline miss is an undesirable situation when considering hard real-time systems. The proposed block (Fig. 4h) checks the occurrence of a deadline miss through transition tdi, which is enabled at the moment a task instance is ready for execution. For each place pwcin and pwc1in in each task structure block related to task τi, a transition tpcin is connected as postcondition. Whenever a task instance is executing and a deadline miss occurs (i.e. tdi fires), the token is removed from place pwcin (or pwc1in) such that it is not possible to fire any other computation transition (e.g. tcin) in the model. In other words, the model enters a deadlock state, since the processor is not released. In this case, during schedule generation, the proposed scheduling algorithm backtracks and selects another voltage/frequency level or task for execution. Besides, the timing interval of each transition tpcin is [0, 0], and that of transition tdi is the deadline constraint di ([di, di]) of task τi.


(i) Join block. Usually, concurrent activities need to synchronize with each other. The join block (Fig. 4i) states that all tasks in the system have concluded their execution in the schedule period. In this block, transition tfin represents the conclusion of a task instance, and, thus, it removes any token enabling transition tdi (deadline missing). After firing each transition of every task model, transition tend is fired, hence a token is stored in place pend (desired final marking, MF). The timing interval of transition tend and of each transition tfn is equal to [0, 0].

(j) Processor block. The processor model consists of a single place pproc, in which a token states the processor availability.
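The timing parameters used by the blocks above follow directly from the task specification. The sketch below is a minimal illustration, not the authors' tool: the function names are assumptions, and the sample values are those of the two-task example of Section 5.2 (WCEC of 240 × 10^6 and 60 × 10^6 cycles, frequencies of 10 MHz and 20 MHz, one-second task time unit).

```python
def computation_time(wcec, freq_hz):
    """C = ceil(c_i / f) in task time units, using integer arithmetic only."""
    return -(-wcec // freq_hz)


def release_interval(release, deadline, wcec, f_max):
    """Timing interval [r_i, d_i - Cmin] of the release transition."""
    c_min = computation_time(wcec, f_max)
    return (release, deadline - c_min)


F_LOW, F_HIGH = 10_000_000, 20_000_000   # 10 MHz and 20 MHz

c_t1_low = computation_time(240_000_000, F_LOW)     # T1 at 10 MHz
c_t1_high = computation_time(240_000_000, F_HIGH)   # T1 at 20 MHz
tr1 = release_interval(0, 20, 240_000_000, F_HIGH)  # release interval of T1
tr2 = release_interval(5, 15, 60_000_000, F_HIGH)   # release interval of T2
```

Under these assumptions T1 needs 24 s at 10 MHz and 12 s at 20 MHz, and the release intervals are [0, 8] for T1 and [5, 12] for T2. In the preemptive block, the computation time C is also the number of tokens (arc weight) that represents the task computation.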

5.2. Model example

To illustrate the modeling process, let us consider the following specification consisting of two tasks: T1 = (0, 0, 240 × 10^6, 20, 20) and T2 = (0, 5, 60 × 10^6, 15, 20). In this specification, the task time unit is one second and the LCM is equal to 20 s, which points out the existence of two task instances (N(T1) + N(T2) = 2). In addition, assume the following supply voltages and the respective maximum frequencies vff = {(1 V, 10 MHz), (2 V, 20 MHz)}. Moreover, an unavailable voltage/frequency level of 1.5 V/15 MHz is considered, which can be “simulated” using the 2 immediate neighboring voltages. Fig. 5 shows the resultant model taking into account the preemptive scheduling method.

In this model, the fork block is responsible for starting tasks T1 and T2, such that, after the firing of transition tstart, both tasks become eligible for execution. Note that each task is modeled considering a combination of arrival, voltage selection, preemptive task structure and deadline blocks. Indeed, these blocks model the timing constraints as well as the energy consumption of a hard real-time task. For a better visualization, the upper blocks model task T1 and the lower blocks model task T2, in such a way that both tasks are assigned to the same processor (place pproc). Such connections have been carried out by merging the common places of each building block. Additionally, note that each voltage selection block takes into account 3 voltage/frequency levels for executing tasks T1 and T2. More specifically, transitions tv11 and tv21 represent the tasks executing at 1 V/10 MHz, transitions tv12 and tv22 depict the execution at 2 V/20 MHz, and, similarly, transitions tv13 and tv23 represent the tasks’ execution at 1.5 V/15 MHz. For instance, the computation time of task T1 at 1 V/10 MHz is 24 s, since C = ⌈(240 × 10^6)/(10 × 10^6)⌉ = 24 s. Furthermore, each task structure block (including the block with two voltages) is connected to the deadline block of the respective task. These connections are required, since, if a deadline miss occurs during a task computation, the execution is no longer possible (see Deadline checking block). Finally, after evaluating each task, transition tend is fired, so that a token is stored in place pend, reporting that a feasible schedule has been found (desired final marking).

Fig. 5. Example model.
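The blocks with two voltages need the computation times C1 (at the higher neighbor level) and C2 (at the lower neighbor level). The paper defers the details to [7]; the sketch below therefore rests on an assumed splitting rule, namely that both parts together last as long as the task would at the ideal frequency and together execute all cycles: C1 + C2 = c/f_ideal and C1·f_high + C2·f_low = c. The function name is hypothetical.

```python
from fractions import Fraction


def two_voltage_split(wcec, f_ideal, f_low, f_high):
    """Return (C1, C2): time at the higher and at the lower neighbor frequency.

    Assumed rule (not stated in the text):
      C1 + C2 = wcec / f_ideal  and  C1 * f_high + C2 * f_low = wcec.
    """
    total = Fraction(wcec, f_ideal)
    c_high = (wcec - f_low * total) / (f_high - f_low)
    c_low = total - c_high
    return c_high, c_low


# Example tasks at the unavailable 15 MHz level, neighbors 10 MHz and 20 MHz.
c1_t1, c2_t1 = two_voltage_split(240_000_000, 15_000_000, 10_000_000, 20_000_000)
c1_t2, c2_t2 = two_voltage_split(60_000_000, 15_000_000, 10_000_000, 20_000_000)
```

Under this assumption T1 runs 8 s at 20 MHz and 8 s at 10 MHz, and T2 runs 2 s at each level, which appears to agree with the arc weights of the two-voltage blocks in Fig. 5. When the result is not an integer number of task time units, some rounding policy is needed, which this sketch does not address.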

5.3. Inter-task relation modeling

Besides the building blocks already described, precedence and exclusion relations between tasks are represented considering the model structures presented in this section.

Fig. 6. Inter-task relations modeling: (a) precedence relation; (b) exclusion relation.

(a) Modeling precedence relations. Precedence relations are defined between pairs of tasks. Suppose that τi PRECEDES τj is specified, that is, task τj can only start its execution after task τi has finished its execution. Fig. 6a shows the TPN model for tasks τi and τj, in which τi PRECEDES τj. For the sake of readability, this figure does not model the processor.

(b) Modeling exclusion relations. Exclusion relations are also defined between pairs of tasks. Suppose that τi EXCLUDES τj is specified. This relation models a situation in which two tasks cannot be concurrently executed. In other words, if task τi starts executing, task τj has to wait until task τi finishes its execution and vice versa. The proposed modeling method adds a single place shared by the two tasks’ models. This place must hold exactly one token as precondition for the exclusive execution of either of these two tasks. Therefore, just one of them can be executing at a time.

Fig. 6b shows the TPN model for both tasks τi and τj, in which the execution of one excludes the possibility of concurrent execution of the other. This figure considers that both tasks are preemptive, but, as in the previous case, the processing resource is not represented.
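The meaning of the two relations can also be stated over a finished schedule. The following sketch is not the TPN construction of Fig. 6; it is a hypothetical checker that assumes one instance per task and a schedule given as (task, start, end) execution segments, with task names and times invented for the illustration.

```python
def span(schedule, task):
    """First start and last end of all execution segments of `task`."""
    segments = [(s, e) for name, s, e in schedule if name == task]
    return min(s for s, _ in segments), max(e for _, e in segments)


def respects_precedence(schedule, first, second):
    """`first` PRECEDES `second`: second starts only after first has finished."""
    return span(schedule, first)[1] <= span(schedule, second)[0]


def respects_exclusion(schedule, a, b):
    """`a` EXCLUDES `b`: neither task executes between start and end of the other."""
    (start_a, end_a), (start_b, end_b) = span(schedule, a), span(schedule, b)
    return end_a <= start_b or end_b <= start_a


# T3 preempts T1 between time 4 and 5; T2 runs after T1 has finished.
schedule = [("T1", 0, 4), ("T3", 4, 5), ("T1", 5, 8), ("T2", 8, 11)]
precedence_ok = respects_precedence(schedule, "T1", "T2")
exclusion_ok = respects_exclusion(schedule, "T1", "T3")
```

Here the precedence between T1 and T2 holds, whereas an exclusion between T1 and T3 would be violated, because T3 executes while T1 has started but not yet finished.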

5.4. Overhead modeling

Whenever designing hard real-time systems, overheads have to be taken into account for guaranteeing system predictability. Assuming a DVS-capable system, additional time and energy costs may occur and have to be considered in some way before runtime, such as: (i) preemptions; (ii) dispatcher or runtime scheduler overhead; and (iii) voltage/frequency switching. It is worth stating that neglecting overheads may lead to timing or energy constraint violations.

Fig. 7. Overhead blocks: (a) preemptive task structure with overhead, single voltage; (b) preemptive task structure with overhead, two voltages.

Therefore, this work explicitly models overheads and considers them during schedule generation, hence providing a more realistic system behavior. Besides the models already described, two additional blocks have to be introduced. These blocks are depicted in Fig. 7.

The first model (Fig. 7a) is a preemptive task structure block with overhead considering a single voltage. Assuming k the number of tasks, places pprochTj, 1 ≤ j ≤ k, represent flags that indicate the current task Tj executing on processor pproch. In a similar manner, place pproch_idle represents the idle state of processor pproch. Overheads are represented by transition toin, whose timing interval is equal to [α, α] and to which an energy consumption value is assigned. In this work, the overhead transition covers: (i) the dispatcher overhead, including context-switching; (ii) the voltage/frequency related to the dispatcher execution; and (iii) the voltage/frequency switching for executing the respective task. The proposed approach assumes that the dispatcher executes at a fixed supply voltage, and it is up to the designer to select the appropriate one. Transitions tginj, 0 ≤ j ≤ k, i ≠ j, represent the processor granting and take into account overheads that may occur in task start-up, context-saving or context-restoring. The overhead is not considered when the same task is executing without interruption, which is represented by transition tgini. As in the preemptive task structure block without overhead, the computation time is modeled using arc weights. Transition tcin represents the execution of one task time unit related to the computation time (C) and notifies the execution of task Ti through place pprochTi. After the last computation time unit (tlcin), there is no need for indicating the task execution, and, next, the processor goes to the idle state (pproch_idle). The processor pproch is not shown for the sake of readability. However, each processor granting transition has an incoming arc from pproch, and each computation transition has an outgoing arc to pproch.
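The effect of the flag places can be mimicked on a processor timeline: the overhead transition is needed whenever a task receives the processor from another task or from the idle state, and it is skipped when the same task simply continues. The sketch below is an illustration under that reading; the function name, the `IDLE` marker and the sample timeline are assumptions.

```python
IDLE = None  # stands for the idle flag of the processor


def dispatch_overheads(timeline):
    """Count the task time units that must be preceded by the overhead transition.

    `timeline` lists, per task time unit, the executing task (or IDLE).
    """
    count, previous = 0, IDLE
    for current in timeline:
        # Overhead applies when the processor comes from another task or from idle.
        if current is not IDLE and current != previous:
            count += 1
        previous = current
    return count


timeline = ["T1", "T1", "T2", "T1", IDLE, "T1"]
overheads = dispatch_overheads(timeline)
```

For this timeline four grants incur overhead (the start of T1, the preemption by T2, the resumption of T1, and the start after the idle unit), so the total overhead time would be 4·α.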


Fig. 7b shows a preemptive task structure considering two voltage levels. As stated previously, this situation can occur when an ideal voltage is unavailable and the two immediate neighbor voltages can be adopted. Without loss of generality, this block may be interpreted as two concatenated instances of the previous block, but with a slight difference. The difference lies in the overhead after executing the first part of the task at the immediate higher voltage. Since only a voltage/frequency switching is required to execute the second part of the task, an additional overhead transition (tovin) and place (pprochTi_2volt) are considered. The timing interval [αv, αv] and the energy consumption value associated with this overhead transition are smaller than those assigned to transition toin, because the services that are unnecessary in this case are not executed. Additionally, as a new flag (pprochTi_2volt) is added, other tasks have to consider, in each preemptive structure block, a new processor granting transition that receives an incoming arc from pprochTi_2volt. Again, pproch is not shown for the sake of readability.

When considering non-preemptable tasks, the blocks presented in Fig. 4d and f may be adopted with only two modifications. The dispatcher overhead is associated with the processor granting transition and, considering the block with two voltages, the additional overhead related to voltage/frequency switching is assigned to transition tfv1in. Although the proposed blocks appear a bit complex to be modeled manually, the DENTES tool [22] has been implemented for supporting the modeling process as well as for accomplishing the automatic time Petri net generation.

Table 1
Choice-priority levels

Choice-priority   Type                 Transition
–                 Release              tri
1                 Final                tfi, tfvin, tfv1in
2                 Arrival              tai, tphi
3                 Voltage              tvin
4                 Others               tvsin, tlin
5                 Precedence           tprecij
6                 Overhead             toin, to1in
7                 Computation          tcin, tc1in, tlcin
8                 Exclusion            texcij
9                 Processor granting   tgin, tg1in

6. Pre-runtime schedule synthesis

This section describes the proposed pre-runtime schedule synthesis: it details the preprocessing, the state space minimization and the scheduling algorithm, comments on the complexity of the problem, and presents an example illustrating the proposed method.

6.1. Preprocessing

Before applying the proposed scheduling and modeling methods, the specification is preprocessed considering an extension of Yao’s algorithm (LPEDF, low-power earliest deadline first), in which a set of discrete voltages is considered [11]. Yao’s algorithm is adopted as a basis for emulating the CPU’s unavailable voltages by the nearest accessible voltage levels as well as a guide for selecting an initial voltage/frequency level for each task instance during schedule generation. Yao’s algorithm does not take into account inter-task relations and overheads. Nevertheless, the proposed method does consider them after carrying out this first phase.

Yao’s algorithm proceeds in the following way. Assume that a set of task instances J has to be executed in a given time interval. A critical interval Ii for J is an interval in which a subset of task instances must be scheduled at maximum constant speed in any optimal schedule for J. The algorithm schedules those task instances at that speed, constructs a subproblem for the remaining instances and solves it recursively. The final result is a set of intervals, in which each interval Ii contains a set of task instances and an associated voltage/frequency level for executing those tasks. The reader should refer to [26,11] for details about LPEDF as well as execution examples. In the proposed method, after executing the LPEDF algorithm, the overheads related to dispatcher calls, voltage/frequency switching, and preemptions are included in the interval Ii. As a consequence, each task instance contained in the interval Ii has a new voltage/frequency level, which is higher than the original one, in order to meet timing constraints. Next, the TPN model is generated also taking into account the CPU’s unavailable voltages, which are modeled using the building blocks depicted in Fig. 7b, Fig. 4f and g.
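The first step of Yao’s algorithm, finding the critical interval, can be sketched as an exhaustive search over intervals bounded by a release time and a deadline. The sketch below covers only that first step (no recursion, no discrete levels, no overheads); the function name and the job encoding are assumptions, and the jobs are the two task instances of Section 5.2 with their absolute release times and deadlines.

```python
from itertools import product


def critical_interval(jobs):
    """Interval with the highest required constant speed.

    `jobs` is a list of (release, deadline, cycles); the speed of an interval
    is the work of all jobs lying entirely inside it divided by its length.
    Returns (start, end, speed) or None if there are no jobs.
    """
    best = None
    starts = {release for release, _, _ in jobs}
    ends = {deadline for _, deadline, _ in jobs}
    for a, b in product(starts, ends):
        if b <= a:
            continue
        work = sum(c for r, d, c in jobs if a <= r and d <= b)
        speed = work / (b - a)
        if best is None or speed > best[2]:
            best = (a, b, speed)
    return best


# T1: released at 0, deadline 20, 240e6 cycles; T2: released at 5, deadline 15, 60e6 cycles.
jobs = [(0, 20, 240_000_000), (5, 15, 60_000_000)]
start, end, speed = critical_interval(jobs)
```

For these two instances the critical interval is [0, 20] with a speed of 15 × 10^6 cycles/s, i.e. the 15 MHz level that Section 6.5 reports as the outcome of the preprocessing.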

The complexity of Yao’s algorithm [26] is O(N log² N), in which N is the number of task instances, whereas a scheduling problem with inter-task relations is NP-hard [5]. Experiments have shown that the preprocessing greatly reduces the schedule generation time as well as the state space size by avoiding inappropriate voltages. For a better understanding, when choosing a voltage/frequency for a task instance, the proposed scheduling algorithm first selects a voltage taking into account the result obtained by the preprocessing phase, and prunes the transitions that represent lower voltages. If the selected voltage always leads to an undesirable state, such as a deadline miss, the transition that represents the voltage is disregarded and the immediate higher voltage/frequency level available on the CPU, which leads to a computation time less than or equal to the deadline, is selected.
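The voltage selection order described above can be sketched as a filter over the available frequency levels. This is an illustrative sketch only: the function name and parameters are assumptions, and the sample values are those of task T1 in Section 5.2 with the preprocessing result of 15 MHz.

```python
def candidate_levels(levels, preprocessed_freq, wcec, deadline):
    """Frequency levels the search may try for a task instance, lowest first.

    Levels below the preprocessing result are pruned, and so are levels whose
    computation time ceil(wcec / f) exceeds the deadline.
    """
    keep = []
    for f in sorted(levels):
        if f < preprocessed_freq:
            continue                     # pruned: below the preprocessing level
        if -(-wcec // f) > deadline:
            continue                     # cannot finish before the deadline
        keep.append(f)
    return keep


levels = [10_000_000, 15_000_000, 20_000_000]
order_t1 = candidate_levels(levels, 15_000_000, 240_000_000, 20)
```

For T1 the 10 MHz level is pruned, 15 MHz is tried first, and 20 MHz remains as the next higher alternative if the first choice always leads to an undesirable state.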

6.2. Minimizing state space size

In the proposed method, the analysis based on the interleaving of actions is a fundamental point to be considered when facing the state space explosion problem. The analysis of n concurrent actions has to tackle all n! possible interleavings of these actions (e.g. the order of transition firings), unless dependencies between these actions are considered. This work proposes three means for minimizing the state space size:

6.2.1. Modeling

The proposed method explicitly models the dependencies between actions, for instance, resource granting/releasing, precedence and exclusion relations between tasks. Therefore, the modeling itself may help in minimizing the state space size, since the amount of concurrent actions is reduced, providing fewer interleaving possibilities.

6.2.2. Partial-order

If actions can be executed in any order, such that the model always reaches the same state, these actions are independent. In other words, it does not matter in which order these actions are executed [6]. Independent actions are related to transitions that do not disable other actions, such as arrival, precedence, processor releasing, and so on. Thus, this method gives the highest choice-priority levels to independent activities, and the lowest levels to dependent activities (e.g. processor granting). More specifically, when changing from one state to another, the highest choice-priority class of transitions is analyzed whereas the other classes are pruned. As a consequence, this technique decreases the state space size and also allows checking the unavailability of feasible schedules.

Table 1 depicts the choice-priority levels of each transition class, where the lowest number indicates the highest priority. In this table, transition tri (release) does not have a specific choice-priority, since it is a special type of transition that, once it is fireable, may fire in any transition class. Within the class in which transition tri is fireable, tri is selected as the first option. Such a mechanism allows tasks to be released for execution at any moment according to the timing interval provided by tri.
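The pruning by choice-priority can be sketched as follows. The priority numbers come from Table 1; the function name, the (name, type) encoding, and the treatment of release transitions (always kept and placed first) are one reading of the text and should be taken as assumptions. The four transitions in the example are a hypothetical set, chosen so that only an arrival transition survives.

```python
# Choice-priority of each transition type (Table 1); lower number, higher priority.
PRIORITY = {
    "final": 1, "arrival": 2, "voltage": 3, "others": 4, "precedence": 5,
    "overhead": 6, "computation": 7, "exclusion": 8, "granting": 9,
}


def prune(fireable):
    """Keep release transitions plus the fireable transitions of the top class.

    `fireable` is a list of (transition_name, type) pairs.
    """
    releases = [name for name, kind in fireable if kind == "release"]
    ranked = [(PRIORITY[kind], name) for name, kind in fireable if kind != "release"]
    if not ranked:
        return releases
    top = min(level for level, _ in ranked)
    # Release transitions, if any, are offered to the search first.
    return releases + [name for level, name in ranked if level == top]


fireable = [("tph2", "arrival"), ("tv11", "voltage"),
            ("tv12", "voltage"), ("tv13", "voltage")]
kept = prune(fireable)
```

With this input only `tph2` is kept, since arrival transitions have a higher choice-priority than voltage selection transitions.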

6.2.3. Removing undesirable states

Section 5.1 presents a building block able to detect a deadline miss, which is an undesirable reachable state. During the TLTS generation, transitions leading to undesirable states are discarded by the scheduling algorithm.

6.3. Pre-runtime scheduling algorithm

The proposed algorithm (Fig. 8) is a depth-first search method for TLTS generation that aims at achieving the stop criterion (reachability of the final marking MF, see Section 5.1(i)) without generating the whole state space. Whenever the stop criterion is achieved, a feasible schedule is generated.

Considering that (i) the Petri net model is surely bounded, (ii) the timing constraints are enclosed by finite intervals, and (iii) the algorithm strictly implements the transition firing rule of time Petri nets, the TLTS is finite and thus the proposed algorithm always finishes, providing as result either a feasible schedule or none, in case no feasible schedule exists.

In this algorithm, S is the current state, MF is the desired final marking, TPN is the time Petri net model, and emax is the system energy constraint (Section 3). The only way the algorithm returns TRUE is when it reaches the desired final marking (MF, stop criterion), implying that a feasible schedule has been found (line 3). The tagging scheme (lines 4 and 9) ensures that no state is visited more than once. The state space generation algorithm incorporates the state space pruning (line 5), in which, for the set of fireable transitions (function fireable), function pruning is executed according to the rules described in Section 6.2 as well as in the preprocessing phase. PT is a set of ordered pairs ⟨t, θ⟩ representing, for each fireable transition (post-pruning), all possible firing times in the firing domain. The function fire (line 8) returns a new generated state (S′) due to the firing of transition t at time θ. The feasible schedule is represented by a timed labeled transition system that is generated by the function add-in-trans-system (line 11). The whole state space is analyzed only when the system does not have a feasible schedule.
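The description above corresponds to a depth-first search with state tagging and backtracking. The sketch below is reconstructed from the prose only, not from Fig. 8: the function and parameter names are assumptions, the pruning and the firing rule are passed in as callables, and the state space is a hypothetical four-state graph in which state 1 is a dead end (for instance, a deadline miss).

```python
def synthesize(state, is_final, candidates, fire, visited, schedule):
    """Depth-first search for a path to a final state.

    candidates(state) yields the pruned (transition, theta) pairs (the set PT);
    fire(state, t, theta) returns the successor state. On success, `schedule`
    holds the labeled transitions of the feasible firing schedule in order.
    """
    if is_final(state):
        return True                       # stop criterion reached
    visited.add(state)                    # tag the state
    for t, theta in candidates(state):
        nxt = fire(state, t, theta)
        if nxt not in visited and synthesize(
                nxt, is_final, candidates, fire, visited, schedule):
            schedule.insert(0, (state, (t, theta), nxt))
            return True
    return False                          # backtrack


# Hypothetical state space: 0 -> 1 is a dead end, 0 -> 2 -> 3 reaches the goal.
edges = {0: {("a", 0): 1, ("b", 0): 2}, 1: {}, 2: {("c", 1): 3}, 3: {}}
schedule = []
found = synthesize(
    0,
    is_final=lambda s: s == 3,
    candidates=lambda s: list(edges[s]),
    fire=lambda s, t, theta: edges[s][(t, theta)],
    visited=set(),
    schedule=schedule,
)
```

The search first explores the dead end, backtracks, and then finds the schedule 0 →(b,0) 2 →(c,1) 3. For large models an explicit stack would be preferable to recursion, since Python limits the recursion depth.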

6.4. Complexity

To provide an estimation of the state space size generated when considering the proposed method, n non-interacting tasks are assumed, each one with k local states. Hence, the respective state space size is O(k^n) [23]. In general, the number of local states (k) of each task is affected by the following attributes: (i) the number of task instances; (ii) the number of available voltages for the task; (iii) the respective release interval; and (iv) when regarding preemptive tasks, the number of tokens that represents the computation time at a specific voltage (which is affected by the adopted task time unit). Inter-task relations and timing constraints, excluding release, have not been taken into account in the state space size computation. Indeed, these are reasonable assumptions, since: (i) when inter-task relations are explicitly modeled, the model may generate fewer reachable states; (ii) time elapsing states, which do not consider transition firing, are discarded; (iii) the set of possible reachable markings generated from a TPN model is a subset of or equal to the marking reachability set of the respective untimed Petri net model [12]; and (iv) undesirable states (e.g. deadline missing) are avoided.

Fig. 8. Schedule synthesis algorithm.

The complexity of the state space is not only related to the adopted formalism, but, primarily, to the scheduling problem in question. For instance, other formal methods, such as process algebras and automata, face similar complexity when tackling this scheduling problem. In the case of automata, k^n states may be required to model all possible situations of a hard real-time system. Nevertheless, the proposed method only reaches the states of interest as a consequence of the state space reduction techniques. For a better visualization, Section 8 provides quantitative results.

6.5. Schedule synthesis example

To demonstrate the method introduced, consider the specification described in Section 5.2. For this specification, assume that the energy consumption is 1 nJ/cycle at 1 V and 2 nJ/cycle at 2 V. Firstly, the preprocessing is performed, which assigns 1.5 V/15 MHz as the initial voltage/frequency level for each task instance. Next, the TPN model is generated (Fig. 5) considering the voltage/frequency levels available on the CPU (1 V/10 MHz, 2 V/20 MHz) and the unavailable level obtained with the preprocessing (1.5 V/15 MHz).
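With the per-cycle energy values given above, the energy of a task instance is the sum of cycles times energy per cycle over the levels it uses. The sketch below works in nanojoules to keep the arithmetic exact. The function name is hypothetical, and the split of T1 into 160 × 10^6 cycles at 2 V and 80 × 10^6 cycles at 1 V is an assumption (8 s at 20 MHz followed by 8 s at 10 MHz), not a value quoted from the paper.

```python
def energy_nj(cycles_at_level, nj_per_cycle):
    """Energy in nanojoules: sum over levels of cycles * energy per cycle."""
    return sum(cycles * nj_per_cycle[level]
               for level, cycles in cycles_at_level.items())


NJ_PER_CYCLE = {"1V": 1, "2V": 2}   # 1 nJ/cycle at 1 V, 2 nJ/cycle at 2 V

# T1 emulating 1.5 V/15 MHz with both neighbor levels (assumed split).
t1_split = energy_nj({"2V": 160_000_000, "1V": 80_000_000}, NJ_PER_CYCLE)
# T1 executed entirely at 2 V/20 MHz.
t1_high = energy_nj({"2V": 240_000_000}, NJ_PER_CYCLE)
```

Under these assumptions the emulated level costs 0.40 J against 0.48 J at 2 V only, while executing T1 entirely at 1 V/10 MHz would take 24 s and miss its 20 s deadline.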

After obtaining the TPN model, the scheduling algorithm is executed (see Table 2). In that table, visited represents the number of states visited up to that point, state shows the identification of the reached state, and energy depicts the accumulated energy consumption (in joules) from the initial state up to that point. The other items are related to the Petri net formalism (e.g. ET) as well as the scheduling algorithm (e.g. PT). Additionally, the transition selected for firing in a specific state (selected column) is followed by the elapsed time (θ). For the sake of readability, the symbol # (null value for disabled transitions) is not presented in the clock vector. The next paragraph comments on the information presented in Table 2.

Initially, at state 0, only transition tstart is enabled, since the initial marking is m(pstart) = m(pproc) = 1. After the firing of tstart, both tasks T1 and T2 become eligible for execution, but task T1 is released first (state 2) due to its timing constraints (r1 = 0). The reader should note the partial-order reduction at state 3, which has 4 fireable transitions. Only transition tph2 is not pruned, since it has the highest choice-priority level (see Table 1). A similar circumstance occurs in further states (e.g. states 17 and 28). In relation to the preprocessing, state 4 depicts an interesting situation. Despite the availability of three voltage/frequency levels for executing T1, transition tv11 is pruned, since the respective voltage/frequency level of 1 V/10 MHz is below the level obtained by Yao’s algorithm (1.5 V/15 MHz, transition tv13). Thus, transition tv13 is selected and T1 acquires the processor granting in order to start its execution (state 6).
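The voltage pruning observed at state 4 can be illustrated with a minimal sketch. It assumes that tv12 denotes the remaining 2 V/20 MHz level (inferred, since tv11 and tv13 are identified in the text) and that the surviving candidates are explored from the slowest level upwards; the names `Level` and `prune_voltage_levels` are hypothetical and do not come from the scheduling algorithm itself.

```python
from typing import List, NamedTuple


class Level(NamedTuple):
    transition: str  # voltage-selection transition in the TPN model
    volts: float
    mhz: float


def prune_voltage_levels(levels: List[Level], preprocessed_mhz: float) -> List[Level]:
    """Drop levels slower than the preprocessed one; return the rest slowest first."""
    kept = [lv for lv in levels if lv.mhz >= preprocessed_mhz]
    return sorted(kept, key=lambda lv: lv.mhz)


# Levels available to T1 (tv12 -> 2 V/20 MHz is an inferred mapping).
t1_levels = [
    Level("tv11", 1.0, 10.0),
    Level("tv12", 2.0, 20.0),
    Level("tv13", 1.5, 15.0),
]

candidates = prune_voltage_levels(t1_levels, preprocessed_mhz=15.0)
print([lv.transition for lv in candidates])
```

Under these assumptions, tv11 is discarded and tv13 is tried before tv12, which agrees with the PT and selected columns of Table 2 at state 4.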

Observe that, after each time unit related to the computation time of task T1 (e.g. state 7), the processor is released, and task T1 needs to take the processor granting again (e.g. state 8) for continuing its computation. Such an approach is required to allow other concurrent tasks to be assigned to the CPU, since the preemptive


Table 2. Algorithm execution

| Visited | State | ET | C | FT | PT | Selected | Energy |
|---|---|---|---|---|---|---|---|
| 1 | 0 | {tstart} | [0] | {tstart} | {tstart} | (tstart,0) | 0.00 |
| 2 | 1 | {tph1,tph2} | [0, 0] | {tph1,tph2} | {tph1,tph2} | (tph1,0) | 0.00 |
| 3 | 2 | {td1,tr1,tph2} | [0, 0, 0] | {tr1,tph2} | {tr1,tph2} | (tr1,0) | 0.00 |
| 4 | 3 | {td1,tv11,tv12,tv13,tph2} | [0, 0, 0, 0, 0] | {tv11,tv12,tv13,tph2} | {tph2} | (tph2,0) | 0.00 |
| 5 | 4 | {td1,tv11,tv12,tv13,td2,tr2} | [0, 0, 0, 0, 0, 0] | {tv11,tv12,tv13} | {tv12,tv13} | (tv13,0) | 0.00 |
| 6 | 5 | {td1,tvs13,td2,tr2} | [0, 0, 0, 0] | {tvs13} | {tvs13} | (tvs13,0) | 0.00 |
| 7 | 6 | {td1,tg113,td2,tr2} | [0, 0, 0, 0] | {tg113} | {tg113} | (tg113,0) | 0.00 |
| 8 | 7 | {td1,tc113,td2,tr2} | [0, 0, 0, 0] | {tc113} | {tc113} | (tc113,1) | 0.00 |
| 9 | 8 | {td1,tg113,td2,tr2} | [1, 0, 1, 1] | {tg113} | {tg113} | (tg113,0) | 0.04 |
| 10 | 9 | {td1,tc113,td2,tr2} | [1, 0, 1, 1] | {tc113} | {tc113} | (tc113,1) | 0.04 |
| 11 | 10 | {td1,tg113,td2,tr2} | [2, 0, 2, 2] | {tg113} | {tg113} | (tg113,0) | 0.08 |
| 12 | 11 | {td1,tc113,td2,tr2} | [2, 0, 2, 2] | {tc113} | {tc113} | (tc113,1) | 0.08 |
| 13 | 12 | {td1,tg113,td2,tr2} | [3, 0, 3, 3] | {tg113} | {tg113} | (tg113,0) | 0.12 |
| 14 | 13 | {td1,tc113,td2,tr2} | [3, 0, 3, 3] | {tc113} | {tc113} | (tc113,1) | 0.12 |
| 15 | 14 | {td1,tg113,td2,tr2} | [4, 0, 4, 4] | {tg113} | {tg113} | (tg113,0) | 0.16 |
| 16 | 15 | {td1,tc113,td2,tr2} | [4, 0, 4, 4] | {tc113} | {tc113} | (tc113,1) | 0.16 |
| 17 | 16 | {td1,tg113,td2,tr2} | [5, 0, 5, 5] | {tg113,tr2} | {tr2,tg113} | (tr2,0) | 0.20 |
| 18 | 17 | {td1,tg113,td2,tv21,tv22,tv23} | [5, 0, 5, 0, 0, 0] | {tg113,tv21,tv22,tv23} | {tv22,tv23} | (tv23,0) | 0.20 |
| 19 | 18 | {td1,tg113,td2,tvs23} | [5, 0, 5, 0] | {tg113,tvs23} | {tvs23} | (tvs23,0) | 0.20 |
| 20 | 19 | {td1,tg113,td2,tg123} | [5, 0, 5, 0] | {tg113,tg123} | {tg123,tg113} | (tg123,0) | 0.20 |
| 21 | 20 | {td1,td2,tc123} | [5, 5, 0] | {tc123} | {tc123} | (tc123,1) | 0.20 |
| 22 | 21 | {td1,tg113,td2,tg123} | [6, 0, 6, 0] | {tg113,tg123} | {tg123,tg113} | (tg123,0) | 0.24 |
| 23 | 22 | {td1,td2,tc123} | [6, 6, 0] | {tc123} | {tc123} | (tc123,1) | 0.24 |
| 24 | 23 | {td1,tg113,td2,tfv123} | [7, 0, 7, 0] | {tg113,tfv123} | {tfv123} | (tfv123,0) | 0.28 |
| 25 | 24 | {td1,tg113,td2,tg23} | [7, 0, 7, 0] | {tg113,tg23} | {tg23,tg113} | (tg23,0) | 0.28 |
| 26 | 25 | {td1,td2,tc23} | [7, 7, 0] | {tc23} | {tc23} | (tc23,1) | 0.28 |
| 27 | 26 | {td1,tg113,td2,tg23} | [8, 0, 8, 0] | {tg113,tg23} | {tg23,tg113} | (tg23,0) | 0.29 |
| 28 | 27 | {td1,td2,tc23} | [8, 8, 0] | {tc23} | {tc23} | (tc23,1) | 0.29 |
| 29 | 28 | {td1,tg113,td2,tfv23} | [9, 0, 9, 0] | {tg113,tfv23} | {tfv23} | (tfv23,0) | 0.30 |
| 30 | 29 | {td1,tg113,td2,tf2} | [9, 0, 9, 0] | {tg113,tf2} | {tf2} | (tf2,0) | 0.30 |
| 31 | 30 | {td1,tg113} | [9, 0] | {tg113} | {tg113} | (tg113,0) | 0.30 |
| 32 | 31 | {td1,tc113} | [9, 0] | {tc113} | {tc113} | (tc113,1) | 0.30 |
| 33 | 32 | {td1,tg113} | [10, 0] | {tg113} | {tg113} | (tg113,0) | 0.34 |
| 34 | 33 | {td1,tc113} | [10, 0] | {tc113} | {tc113} | (tc113,1) | 0.34 |
| 35 | 34 | {td1,tg113} | [11, 0] | {tg113} | {tg113} | (tg113,0) | 0.38 |
| 36 | 35 | {td1,tc113} | [11, 0] | {tc113} | {tc113} | (tc113,1) | 0.38 |
| 37 | 36 | {td1,tfv113} | [12, 0] | {tfv113} | {tfv113} | (tfv113,0) | 0.38 |
| 38 | 37 | {td1,tg13} | [12, 0] | {tg13} | {tg13} | (tg13,0) | 0.42 |
| 39 | 38 | {td1,tc13} | [12, 0] | {tc13} | {tc13} | (tc13,1) | 0.42 |
| 40 | 39 | {td1,tg13} | [13, 0] | {tg13} | {tg13} | (tg13,0) | 0.43 |
| 41 | 40 | {td1,tc13} | [13, 0] | {tc13} | {tc13} | (tc13,1) | 0.43 |
| 42 | 41 | {td1,tg13} | [14, 0] | {tg13} | {tg13} | (tg13,0) | 0.44 |
| 43 | 42 | {td1,tc13} | [14, 0] | {tc13} | {tc13} | (tc13,1) | 0.44 |
| 44 | 43 | {td1,tg13} | [15, 0] | {tg13} | {tg13} | (tg13,0) | 0.45 |
| 45 | 44 | {td1,tc13} | [15, 0] | {tc13} | {tc13} | (tc13,1) | 0.45 |
| 46 | 45 | {td1,tg13} | [16, 0] | {tg13} | {tg13} | (tg13,0) | 0.46 |
| 47 | 46 | {td1,tc13} | [16, 0] | {tc13} | {tc13} | (tc13,1) | 0.46 |
| 48 | 47 | {td1,tg13} | [17, 0] | {tg13} | {tg13} | (tg13,0) | 0.47 |
| 49 | 48 | {td1,tc13} | [17, 0] | {tc13} | {tc13} | (tc13,1) | 0.47 |
| 50 | 49 | {td1,tg13} | [18, 0] | {tg13} | {tg13} | (tg13,0) | 0.48 |
| 51 | 50 | {td1,tc13} | [18, 0] | {tc13} | {tc13} | (tc13,1) | 0.48 |
| 52 | 51 | {td1,tg13} | [19, 0] | {tg13} | {tg13} | (tg13,0) | 0.49 |
| 53 | 52 | {td1,tc13} | [19, 0] | {td1,tc13} | {tc13} | (tc13,1) | 0.49 |
| 54 | 53 | {td1,tfv13} | [20, 0] | {td1,tfv13} | {tfv13} | (tfv13,0) | 0.50 |
| 55 | 54 | {td1,tf1} | [20, 0] | {td1,tf1} | {tf1} | (tf1,0) | 0.50 |
| 56 | 55 | {tend} | [0] | {tend} | {tend} | (tend,0) | 0.50 |

470 E. Tavares et al. / Microprocessors and Microsystems 32 (2008) 460–473

scheduling method is adopted (see Section 5.1(e)). At state 16, task T2 is released for execution (r2 = 5), and the voltage/frequency level of 1.5 V/15 MHz (transition tv23) is selected (state 17). As T2 has an earlier deadline, task T1 is preempted at state 19. Additionally, the reader should notice that the voltage/frequency level is reduced from 2 V/20 MHz to 1 V/10 MHz at state 23, because 1.5 V/15 MHz is "simulated" using both voltage/frequency levels (see Section 5.1(g)). At state 29, task T2 finishes its execution and, consequently, T1 returns from preemption at state 30. From state 52 to state 54, transition td1 is fireable. Nevertheless, since such a transition represents the occurrence of an undesirable state (deadline missing), the pruning is applied (Section 6.2): td1 ∉ PT. Finally, after concluding all tasks’ execution, a token is stored in place pend, indicating that a feasible schedule has been found. In this example, no backtrack occurs; in other words, a schedule has been found in the first attempt.
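The way the unavailable 1.5 V/15 MHz level is emulated with the two available levels, and the resulting total in the energy column of Table 2, can be checked with a short sketch. It assumes that one task time unit lasts one second (inferred from the energy figures, not stated in this section) and splits each task's execution time so that the average frequency equals the target; `split_time` and `energy_nj` are hypothetical helper names.

```python
def split_time(units: int, f_target: int, f_lo: int, f_hi: int):
    """Return (units at f_hi, units at f_lo) whose average frequency is f_target."""
    hi_numerator = units * (f_target - f_lo)
    if hi_numerator % (f_hi - f_lo):
        raise ValueError("split is not a whole number of time units")
    hi = hi_numerator // (f_hi - f_lo)
    return hi, units - hi


# Available levels: (cycles per time unit, nJ per cycle).
# Assumption: one time unit is one second, so 10 MHz yields 10,000,000 cycles.
LO = (10_000_000, 1)   # 1 V / 10 MHz
HI = (20_000_000, 2)   # 2 V / 20 MHz
TARGET = 15_000_000    # 1.5 V / 15 MHz level chosen by the preprocessing


def energy_nj(units: int) -> int:
    """Energy in nJ for a task that runs `units` time units at the emulated level."""
    hi, lo = split_time(units, TARGET, LO[0], HI[0])
    return hi * HI[0] * HI[1] + lo * LO[0] * LO[1]


# Table 2 fires 16 computation transitions for T1 and 4 for T2.
total_nj = energy_nj(16) + energy_nj(4)
print(split_time(16, TARGET, LO[0], HI[0]), total_nj / 1e9)
```

With these assumptions T1 spends eight time units at each level and T2 two at each, and the total matches the 0.50 J reported in the last row of Table 2.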

7. Handling dynamic slack times

During system runtime, slack times (CPU idle times) may appear due to tasks’ early completion. In order to take advantage of such slacks for reducing energy consumption even further, a small runtime scheduler is proposed for adjusting the starting times as well as the voltage/frequency levels associated with each task instance.

Initially, during system runtime, a dispatcher is adopted to execute tasks according to the feasible schedule generated at design time. If a task instance completes its execution earlier than the respective WCEC, the scheduler is executed to advance the next task instance and to adjust the voltage/frequency to a lower level than the one defined for such instance in the pre-runtime schedule. Nevertheless, the adjustment is performed only if the next task can start its execution earlier and the energy saving compensates for the overhead incurred by the scheduling. It is important to state that the scheduler does not violate inter-task relations, since it follows the tasks’ order defined in the pre-runtime schedule. Additionally, the scheduler overhead related to the feasibility test is considered in the pre-runtime method, more specifically, in the overhead block presented in Section 5.4.

Before presenting the runtime scheduler algorithm, some concepts must first be introduced. The pre-runtime schedule is partitioned into several time slices of the same size, in which each slice corresponds to one task time unit, and the total amount is equal to the LCM. For instance, Fig. 2d depicts a feasible schedule, in which the total amount of time slices is equal to 13. These slices can be grouped into segments in such a way that they represent task executions. Such segments are denominated task segments, and each one is represented by an interval ([start, end]). When a task is not completely executed within a segment, the task is preempted; in other words, it is carried out through more segments. Considering Fig. 2d, the segments are: (i) $\tau_2 = [2, 3]$; (ii) $\tau_1 = [3, 6]$; (iii) $\tau_3^1 = [6, 7]$; (iv) $\tau_4 = [7, 9]$; (v) $\tau_3^2 = [9, 13]$. These intervals resemble a pre-runtime schedule table, which stores information about the execution of each task instance. Moreover, a global clock (clock) is adopted for tracking the current time (e.g. the accumulated number of time slices). Taking into account the previous concepts, the runtime scheduler algorithm is depicted in Fig. 9 using a C syntax notation.
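The grouping of time slices into task segments can be sketched as follows. The slice list is reconstructed from the segments listed above for Fig. 2d, with slices 0 to 2 assumed idle; the function name `task_segments` and the task labels are hypothetical.

```python
from itertools import groupby
from typing import List, Optional, Tuple


def task_segments(slices: List[Optional[str]]) -> List[Tuple[str, int, int]]:
    """Group consecutive slices of the same task into (task, start, end) segments.

    slices[i] names the task that owns time slice [i, i + 1), or None when idle.
    """
    segments = []
    start = 0
    for task, run in groupby(slices):
        length = len(list(run))
        if task is not None:
            segments.append((task, start, start + length))
        start += length
    return segments


# 13 time slices; the first two are assumed idle because the first segment starts at 2.
fig_2d = [None, None, "t2", "t1", "t1", "t1", "t3", "t4", "t4", "t3", "t3", "t3", "t3"]
for segment in task_segments(fig_2d):
    print(segment)
```

Task t3 appears twice in the result because it is preempted by t4, which corresponds to the two segments of τ3 in the text.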

In order to check the early completion of a task instance, the runtime scheduler is executed at the end of each segment, in such a way that its execution does not conflict with the dispatcher execution. Firstly, the scheduler verifies which is the next segment (line 2) in the pre-runtime schedule, since it is the candidate for adjusting the respective voltage/frequency level as well as the start

[Fig. 9. Runtime scheduler algorithm.]

[Fig. 10. Runtime scheduler example, panel (a): segments τ1, τ2, τ3, τ2, τ4 and a Return marker over time slices 0 to 10; voltage levels 1.38 V, 1.26 V, 1.07 V and 1.04 V.]

time. If there is no segment to be executed (the remaining segments are returns from preemption of finished instances, or the last segment was already executed), the original pre-runtime schedule is kept (line 4). Also, the pre-runtime schedule is not changed if the start time of the next segment is equal to the release time assigned to the respective task instance. Considering that there is an available segment, the respective start time is set so that the release time is not violated (lines 6 and 8). If the next segment can be promptly started, the start time is tuned to take into account the scheduler WCET (worst-case execution time). It is worth noting that the adjustment is only performed whenever the improvements compensate for the scheduling overhead (line 9).
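The decision logic described above can be outlined with a minimal sketch. It is not the algorithm of Fig. 9: the platform levels, the scheduler WCET, the energy cost of the adjustment, and the names `Segment` and `adjust` are all hypothetical values chosen for illustration, and energy is modeled simply as cycles multiplied by the energy per cycle of the chosen level.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical platform, slowest level first: (cycles per time slice, nJ per cycle).
LEVELS = [(10, 1), (15, 2), (20, 3)]
SCHED_WCET = 1    # assumed scheduler worst-case execution time, in time slices
SCHED_ENERGY = 5  # assumed energy cost of performing the adjustment, in nJ


@dataclass
class Segment:
    start: int    # start time assigned by the pre-runtime schedule
    end: int      # end time assigned by the pre-runtime schedule
    release: int  # release time of the task instance
    cycles: int   # worst-case cycles to execute within the segment
    level: int    # index into LEVELS chosen by the pre-runtime schedule


def adjust(clock: int, nxt: Optional[Segment]) -> Optional[Tuple[int, int]]:
    """Return a new (start, level) for the next segment, or None to keep the schedule."""
    if nxt is None or nxt.start == nxt.release:
        return None  # nothing left to run, or the segment already starts at its release
    start = max(clock + SCHED_WCET, nxt.release)  # never start before the release time
    if start >= nxt.start:
        return None  # the segment cannot be advanced
    window = nxt.end - start
    for level, (speed, nj_per_cycle) in enumerate(LEVELS):
        if speed * window >= nxt.cycles:  # slowest level that still fits the window
            saved = nxt.cycles * (LEVELS[nxt.level][1] - nj_per_cycle)
            return (start, level) if saved > SCHED_ENERGY else None
    return None


next_segment = Segment(start=9, end=10, release=6, cycles=20, level=2)
print(adjust(clock=7, nxt=next_segment))
```

In this hypothetical run the previous segment ends at time 7, so the next segment is advanced from 9 to 8 and stretched over two slices at the slowest level; when the saving does not exceed the assumed overhead, the function leaves the pre-runtime schedule untouched.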

For a better understanding, consider the schedule depicted in Fig. 10a, which is composed of the following segments: (i) $\tau_1 = [1, 2]$; (ii) $\tau_2^1 = [2, 4]$; (iii) $\tau_3 = [4, 6]$; (iv) $\tau_2^2 = [6, 9]$; and (v) $\tau_4 = [9, 10]$. The DVS platform described in the motivational example is adopted, considering an additional voltage/frequency level of 1.38 V/60 MHz and an energy consumption per clock cycle of 0.54 nJ. In this example, if task $\tau_2$ completes its execution earlier at 7, the proposed scheduler attempts to adjust the voltage/frequency level as well as the start time of the next segment ($\tau_4$). Considering that the release time of $\tau_4$ is equal to 6, $\tau_4$ can start its execution earlier and utilize a lower voltage/frequency level (Fig. 10b). Assuming that the WCECs of the tasks are $c_1 = 50 \times 10^6$, $c_2 = 150 \times 10^6$, $c_3 = 100 \times 10^6$ and $c_4 = 60 \times 10^6$, the energy consumption is reduced from 0.1305 J (early completion of task $\tau_2$) to 0.1167 J (Fig. 10b).

8. Experimental results

Experiments were conducted to evaluate the proposed pre-runtime and runtime scheduling algorithms. Firstly, experiments related to the proposed pre-runtime method are presented and, next, results concerning the runtime scheduler are described.

8.1. Pre-runtime scheduling

Table 3 summarizes the experiments adopted to evaluate the pre-runtime scheduling method. In that table, real-world applications as well as custom-built examples (that simulate real-world situations) are taken into account. The column task represents the number of tasks; inst. represents the number of task instances; size depicts a state space size estimation; sch. is the number of states of the feasible schedule; found counts the number of states actually verified for finding a feasible schedule; w/DVS is the energy consumed (in joules) by the found feasible schedule using DVS; o/DVS is the energy consumed (in joules) by an alternative schedule that disregards DVS; lpedf is the energy consumed (in joules) by a schedule generated using the optimal scheduling

[Fig. 10. Runtime scheduler example, panel (b): segments τ1, τ2, τ3, τ2, τ4 and a Ret. marker over time slices 0 to 10; voltage levels 1.38 V, 1.26 V, 1.07 V and 1.04 V.]


[Fig. 11. Scalability: state spaces. State space size (logarithmic axis, from 1 to 10^130) against the number of tasks (6 to 18), with series "State Space" and "Generated Space". Data labels, in the order extracted: 543162, 609295, 611859, 616151, 627355, 666207, 815651.]

Table 3. Experimental results summary

| Case study | Task | Inst. | Size | Sch. | Found | w/DVS | o/DVS | lpedf | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1. Motiv. example | 4 | 4 | 7 × 10^7 | 48 | 141 | 0.1414 | 0.1620 | 0.1249 | 0.001 |
| 2. Example 2 | 6 | 6 | 7 × 10^35 | 4567 | 543162 | 0.0003 | 0.0008 | 0.00028 | 35.726 |
| 3. Example 3 | 12 | 12 | 2 × 10^32 | 551 | 9906 | 267.8400 | 360.0000 | 254.1184 | 0.282 |
| 4. CNC control | 8 | 289 | 9 × 10^70 | 235852 | 1884381 | 0.1190 | 0.3450 | 0.0944 | 291.221 |
| 5. Pulse Oximeter | 14 | 178 | 3 × 10^77 | 1448 | 1448 | 0.0051 | 0.0210 | 0.0045 | 0.039 |
| 6. MP3 & GSM | 8 | 3604 | 3 × 10^68 | 381313 | 381313 | 3.8620 | 4.7660 | 3.8541 | 9.606 |

[Fig. 12. Scalability: execution times. Time (s), axis from 0 to 3500, against the number of tasks (6 to 18).]


method proposed by Yao et al. considering discrete voltage/frequency levels [11]; and time expresses the algorithm execution time (in seconds) for finding the feasible schedule. All experiments were performed on a Pentium D 3 GHz with 4 GB of RAM, running Linux, using the GCC 3.3.2 compiler. For a better comprehension, the following paragraphs give an overview of each case study.

Case studies 1 and 2 are based on example 2 presented in [25], which demonstrates conditions in which pre-runtime approaches would be able to find feasible schedules, whereas runtime methods may fail due to the exclusion relation between tasks. For a complete explanation, Section 3 – Example details the situation, which leads to occasions in which the processor needs to be left idle to find a valid schedule. In this work, Xu’s example is extended with additional tasks and inter-task relations. Case study 1 incorporates 2 additional tasks, and Case study 2 takes into account 4 new tasks. As depicted in Table 3, besides meeting all timing constraints, the proposed approach allows additional energy savings by adopting DVS.

As in the previous cases, Case study 3 is based on Fig. 2 of [24], which also depicts a condition in which runtime scheduling methods may not work. Since tasks may finish their executions earlier than their respective worst-case execution times, a runtime scheduler may promptly start a task for execution due to the release time and postpone another task with an earlier deadline. The generated schedule may be infeasible because, assuming an exclusion relation between these tasks, the task with the earlier deadline is forced to wait for the other task’s execution, and so its timing constraints may be violated. The pre-runtime method proposed, however, avoids this undesirable situation, since the dispatcher always starts tasks’ execution in accordance with the sequence defined in the feasible schedule. To adopt the example, the computation times were tuned for allowing voltage scaling. Results are shown in Table 3.

Case studies 4 [9], 5 [17] and 6 [20] are real-world applications. Case study 4 is the control software of a CNC machine, which is an automatic machining tool adopted for manufacturing user-designed workpieces. The CNC Controller specification is composed of several concurrent tasks with exclusion relations, which makes it an interesting case study for evaluating the proposed method considering DVS. Case study 5 is a pulse oximeter, which is an electronic device responsible for measuring blood oxygen saturation using a non-invasive method. In this case study, the specification is composed of several non-preemptable tasks with precedence constraints. Case study 6 is an application composed of an MP3 player and GSM decoder, where the respective specification contains several tasks with precedence relations. Table 3 shows the energy savings obtained using the proposed approach.

Table 3 depicts the quantitative data for each case study, which shows that the proposed scheduling method has provided meaningful results, since it has significantly reduced the number of visited states, provided feasible schedules in situations where runtime methods may not, and also allowed energy saving by adoption of DVS. Besides, the proposed method generated feasible schedules that consume only 10% more energy (on average) than Yao’s optimal solution. It is important to emphasize that Yao’s method does not consider precedence and exclusion relations. In other words, the values provided in column lpedf assume a set of independent tasks; thus, the respective schedules are not feasible for the adopted case studies. Nevertheless, those values still serve as an interesting parameter for comparison purposes.
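The comparison can be reproduced from the energy columns of Table 3 with a few lines of arithmetic. The sketch below copies the w/DVS, o/DVS and lpedf values from the table and computes, per case study, the saving relative to the schedule without DVS and the excess relative to lpedf; the variable names are illustrative only.

```python
# (case study, w/DVS, o/DVS, lpedf), energy in joules, copied from Table 3.
TABLE_3 = [
    ("Motiv. example", 0.1414, 0.1620, 0.1249),
    ("Example 2", 0.0003, 0.0008, 0.00028),
    ("Example 3", 267.8400, 360.0000, 254.1184),
    ("CNC control", 0.1190, 0.3450, 0.0944),
    ("Pulse Oximeter", 0.0051, 0.0210, 0.0045),
    ("MP3 & GSM", 3.8620, 4.7660, 3.8541),
]

# Percentage of energy saved by DVS relative to the schedule that disregards it.
savings = {name: 100.0 * (1.0 - w / o) for name, w, o, _ in TABLE_3}
# Percentage of extra energy relative to the lpedf reference value.
excess = {name: 100.0 * (w / lp - 1.0) for name, w, _, lp in TABLE_3}
mean_excess = sum(excess.values()) / len(excess)

for name in savings:
    print(f"{name:15s} saving {savings[name]:5.1f}%  excess over lpedf {excess[name]:5.1f}%")
print(f"mean excess over lpedf: {mean_excess:.1f}%")
```

The mean excess obtained this way is close to 11%, in line with the figure of roughly 10% quoted above, while the per-case values range from well under 1% (MP3 & GSM) to about 26% (CNC control).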

8.2. Pre-runtime scheduling: scalability

To provide a better visualization of the scheduling algorithm scalability, this work extended Case study 2 to incrementally accommodate two new tasks with an exclusion relation between them. Figs. 11 and 12 depict the results related to the state space generation and execution time, respectively. Although the state spaces grow exponentially with the number of tasks (Section 6.4), the amount of reached states (dashed bar) is significantly reduced due to state space reduction techniques. Nevertheless, the execution time considerably increases, mainly because of the tagging scheme (Section 6.3) adopted to keep track of the visited states. However, in the proposed work, the major issue is the state space explosion, which is substantially mitigated as described


[Fig. 13. Runtime scheduler experiment. Normalized energy (axis from 0.6 to 1) against the variation of WCEC (100%, 75%, 50%, 25%, 10%), with and without the runtime scheduler.]


before. Besides, although the algorithm execution time increases as a consequence of the tagging scheme, observe that, for a system with a state space size of approximately 10^120 states, the algorithm found a feasible schedule in less than one hour.

8.3. Runtime scheduler

An experiment based on Case study 2 has been adopted to highlight some features related to the proposed runtime scheduler. In this experiment, seven concurrent tasks with precedence and exclusion relations were studied. Fig. 13 depicts the quantitative data, considering that four tasks varied their execution cycles at the same time from 100% to 10% in relation to each task WCEC. Energy values are normalized to aid the comparison between executions with and without the runtime scheduler. Additionally, all tasks executed within the pre-runtime schedule period (LCM).

At 100%, the scheduler provides no energy savings. In fact, the overall energy consumption is increased due to the additional overheads. However, at 75%, the scheduler saves 7% of energy by stretching out other segments. When the execution cycles are reduced below 75%, the runtime scheduler still provides energy savings, but with a lower margin. However, such behaviour depends on the characteristics of the system execution as well as of the pre-runtime schedule. One interesting approach is to examine specific situations at design time (e.g. the average execution cycles of each task) in order to verify the actual feasibility of the runtime scheduler for a given application.

9. Conclusion

This paper presented a method based on time Petri nets for power-aware hard real-time systems schedule generation, considering dynamic voltage scaling, overheads, precedence and exclusion relations. Although many works deal with DVS in time-critical systems, inter-task relations as well as overheads have been neglected. In addition, most works rely on runtime methods, which, although flexible, may fail in finding a feasible schedule, even if such a schedule exists. Predictability is an important concern when considering time-critical systems. For assuring that every critical task meets its deadlines, a pre-runtime scheduling method was proposed. Several experiments have been conducted to demonstrate the feasibility of the proposed method, in such a way that viable schedules were found in situations where runtime methods may fail, and, also, energy consumption was reduced by adopting DVS. Besides, using the proposed formal model, several properties can be verified as well as analyzed.

Currently, we are developing an automatic code generation method, which will provide customized code satisfying timing and energy constraints. As future work, we are planning to extend the proposed scheduling method in order to consider multiple processors.

References

[1] A. Andrei, M. Schmitz, P. Eles, Z. Peng, B.M. Al-Hashimi, Overhead-conscious voltage selection for dynamic and leakage energy reduction of time-constrained systems, IEE Proc. Comp. Digital Tech. 152 (1) (2005) 28–38.

[2] H. Aydin, R. Melhem, D. Mossé, P. Alvarez-Mejía, Power-aware scheduling for periodic real-time tasks, IEEE Trans. Comp. 53 (5) (2004) 584–600.

[3] Y. Cai, M. Schmitz, B. Al-Hashimi, S. Reddy, Workload-ahead-driven online energy minimization techniques for battery-powered embedded systems with time-constraints, ACM Trans. DAES (2006) 1–23.

[4] L. Cortés, P. Eles, Z. Peng, Quasi-static assignment of voltages and optional cycles for maximizing rewards in real-time systems with energy constraints, DAC’05 (2005) 13–17.

[5] M. Garey, D. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W.H. Freeman and Company, 1979.

[6] P. Godefroid, Partial Order Methods for the Verification of Concurrent Systems, PhD Thesis, University of Liege, 1994.

[7] T. Ishihara, H. Yasuura, Voltage Scheduling Problem for Dynamically Variable Voltage Processors, in: ISLPED’98, 1998.

[8] R. Jejurikar, R. Gupta, Energy-aware task scheduling with task synchronization for embedded real-time systems, IEEE Trans. Comput-Aid Des. Integ. Circ. Sys. 25 (6) (2006) 1024–1037.

[9] N. Kim, M. Ryu, S. Hong, M. Saksena, C. Choi, H. Shin, Visual assessment of a real-time systems design: a case study on a cnc controller, in: RTSS’96, 1996, pp. 300–310.

[10] W. Kim, K. Jihong, M. Sang, Preemption-aware dynamic voltage scaling in hard real-time systems, in: ISLPED’04, 2004, pp. 393–398.

[11] W. Kwon, T. Kim, Optimal voltage allocation techniques for dynamically variable voltage processors, in: DAC’03, 2003, pp. 2–6.

[12] N. Leveson, J. Stolzy, Safety analysis using Petri nets, IEEE Trans. Soft. Eng. 13 (3) (1987) 386–397.

[13] G. Madl, N. Dutt, Domain-specific modeling of power aware distributed real-time embedded systems, in: SAMOS’06, 2006.

[14] P. Merlin, D. Faber, Recoverability of communication protocols, IEEE Trans. Comm. 24 (9) (1976) 1036–1043.

[15] B. Mochocki, X. Hu, Q. Gang, Transition-overhead-aware voltage scheduling for fixed-priority real-time systems, ACM Trans. DAES 12 (2) (2007) 1084–4309.

[16] T. Murata, Petri nets: properties, analysis and applications, Proc. IEEE 77 (4) (1989) 541–580.

[17] M. Oliveira Jr., Desenvolvimento de Um Protótipo para a Medida Não Invasiva da Saturação Arterial de Oxigênio em Humanos – Oxímetro de Pulso (in Portuguese), MSc Thesis, UFPE, 1998.

[18] T. Phatrapornnant, M. Pont, Reducing jitter in embedded systems employing a time-triggered software architecture and dynamic voltage scaling, IEEE Trans. Comp. 55 (2) (2006) 113–124.

[19] R. Passos, C. Coelho Jr., A. Loureiro, R. Mini, Dynamic power management in wireless sensor networks: an application-driven approach, in: WONS’05, 2005.

[20] R. Prathipati, Energy efficient scheduling techniques for real-time embedded systems, MSc Thesis, Texas A&M University, 2004.

[21] G. Quan, X. Hu, Energy efficient DVS schedule for fixed-priority real-time systems, ACM Trans. Emb. Comp. Sys. (TECS) 6 (4) (2007).

[22] T. Tavares, P. Maciel, AMALGHMA and DENTES tools. <http://www.cin.ufpe.br/~eagt/tools>, 2008.

[23] A. Valmari, The state explosion problem, LNCS: Lect. Petri Nets I: Basic Models 1491 (1998) 429–528.

[24] J. Xu, On inspection and verification of software with timing requirements, IEEE Trans. Soft. Eng. 29 (8) (2003) 705–720.

[25] J. Xu, D. Parnas, Priority scheduling versus pre-run-time scheduling, Real-Time Syst. 18 (1) (2000) 7–23. Kluwer Academic Publishers.

[26] F. Yao, A. Demers, S. Shenker, A scheduling model for reduced cpu energy, IEEE Ann. Found. Comp. Sci. (1995) 374–382.