Online Energy-Aware I/O Device Scheduling for Hard Real-Time Systems with Shared Resources
Abstract
The challenge in conserving energy in embedded real-time systems is to reduce power consumption while pre-
serving temporal correctness. Previous research has focused on power conservation for the processor, while power
conservation for I/O devices has received little attention. In this paper, we analyze the problem of online energy-
aware I/O scheduling for hard real-time systems based on the preemptive periodic task model with non-preemptive
shared resources. We propose two online energy-aware I/O device scheduling algorithms: Conservative Energy-
Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). The CEA-EDF algorithm makes conserva-
tive predictions for device usage and guarantees that a device is in the active state before or at the time the job that
requires it is released. The EASD algorithm utilizes device slack to perform device power state transitions to save
energy, without jeopardizing temporal correctness. Both algorithms are preemptive but support non-preemptive
shared critical regions. An evaluation of the two approaches shows that both yield significant energy savings with
respect to no Dynamic Power Management (DPM) techniques. The actual savings depend on the task set, shared
devices, and the power requirements of the devices. EASD provides better power savings than CEA-EDF. However,
CEA-EDF has very low overhead and performs comparably to EASD when the system workload is low.
1 Introduction
In recent years, many embedded real-time systems have emerged with energy conservation requirements. Most
of these systems consist of a microprocessor with I/O devices and batteries with limited power capacity. There-
fore, aggressive energy conservation techniques are needed to extend their lifetimes. Traditionally, the research
community has focused on processor-based power management techniques, with many articles published on
processor energy conservation. On the other hand, research on energy conservation for I/O devices has received little
attention. In practice, however, I/O devices are also important power consumers, but typically support fewer power
states than processors.
At an arbitrary instant, most devices can only be in one of two states: active or idle. To increase energy
savings, the time for which a device is idle must be increased, but I/O devices take much longer to perform power
state transitions than processors. A DSP device can take as long as 500 ms to switch states [19]. Furthermore,
energy consumption during state transitions is not negligible. Too many state switches may increase power
consumption rather than decrease it. The problem of saving energy for I/O devices in hard real-time systems is a
dilemma: we want to shut down a device whenever it is not being used, but doing so risks turning devices back on
too late, making some jobs miss their deadlines, or performing unnecessary state switches that waste energy. We
discuss this in detail in Section 3.
In this paper, we analyze the problem of energy-aware I/O scheduling for hard real-time systems based on the
preemptive periodic task model with non-preemptive shared resources. Without knowing the actual job executions
a priori, an optimal solution is not possible for either online or offline scheduling algorithms. Here we define
optimal as the maximum energy savings for a task set. The actual savings depend on the task set, actual execution
times, shared devices, and the power requirements of the devices.
Two online scheduling algorithms that support shared resources are proposed: Conservative Energy-Aware EDF
(CEA-EDF) and Enhanced Aggressive Shut Down (EASD). Both of these algorithms use Earliest Deadline First
(EDF) [6] to schedule jobs, and use Stack Resource Policy (SRP) [2] to control access to shared resources, which
are typically granted to jobs on a non-preemptive basis and used in a mutually exclusive manner. When performing
preemptive scheduling, I/O devices become important shared resources whose access needs to
be carefully managed. For example, a job that performs an uninterruptible I/O operation can block the execution
of all jobs with higher priorities. Thus the time for the uninterruptible I/O operation needs to be treated as a non-
preemptive resource access. Other resources besides I/O devices include critical sections of code, reader/writer
buffers, etc.
The rest of this paper is organized as follows. Section 2 discusses related work. The problem of energy-aware
I/O device scheduling is analyzed in Section 3. Section 4 describes the proposed algorithms. Section 5 describes
how we evaluated our system and presents the results. Section 6 presents our conclusions and describes future
work.
2 Related Work
In the past decade, much research work has been conducted on low-power design methodologies for real-
time embedded systems. For hard real-time systems, the research has focused primarily on reducing the power
consumption of the processor. The research on power conservation technologies for I/O devices, though important,
has received little attention.
Most Dynamic Power Management (DPM) techniques for devices are based on switching a device to a low
power state (or shutdown) during an idle interval. DPM techniques for I/O devices in non-real-time systems focus
on switching the devices into low power states based on various policies (e.g., [9, 10, 8, 5, 23]). These strategies
cannot be directly applied to real-time systems because of their non-deterministic nature.
Some energy-aware I/O scheduling algorithms [18, 19, 20, 21] have been developed for hard real-time systems.
Among them, [18, 19, 20] are non-preemptive methods, which are known to have limitations. With non-preemptive
scheduling, a higher priority task that has been released might have to wait a long time to run (until the current task
gives up the CPU). This reduces the set of tasks that the scheduler can support with hard temporal guarantees. For
example, non-preemptive scheduling algorithms cannot support any task set in which there is a task with a period
shorter than or equal to the Worst Case Execution Time (WCET) of another task. For this reason, most commercial
real-time kernels support preemptive task scheduling.
In [18], Swaminathan et al. presented the Low Energy Device Scheduler (LEDES) for energy-aware I/O device
scheduling for hard real-time systems. LEDES takes as input a pre-determined task schedule and a device-usage list
for each task and generates a sequence of sleep/working states for each device. LEDES determines this sequence
such that the energy consumed by the devices is minimized while guaranteeing that no task misses its deadline.
However, LEDES differs from our work in that it is based on the assumption that scheduling points always occur
at task start or completion times. In other words, it can only support non-preemptive task scheduling. Another
assumption is that the execution times of all tasks are greater than the transition times of required devices. This
assumption may not be valid if some required devices have relatively large transition delays, e.g. disks.
An extension of LEDES to handle I/O devices with multiple power states is presented in [20] by Swaminathan
and Chakrabarty. Multi-state Constrained Low Energy Scheduler (MUSCLES) takes as input a pre-computed task
schedule and a per-task device usage list to generate a sequence of power state switching times for I/O devices
while guaranteeing that real-time constraints are not violated. MUSCLES is also a non-preemptive method.
The pruning-based scheduling algorithm, Energy-optimal Device Scheduler (EDS), is an off-line method in
which jobs are rearranged to find the minimum energy task schedule [19]. EDS generates a schedule tree by
selectively pruning the branches of the tree. Pruning is done based on both temporal and energy constraints.
Similar to LEDES and MUSCLES, EDS can only support non-preemptive scheduling systems.
The only known published energy-aware algorithm for preemptive schedules, Maximum Device Overlap (MDO),
is an offline method proposed by the same authors in [21]. The MDO algorithm uses a real-time scheduling algo-
rithm, e.g., EDF or RM, to generate a feasible real-time job schedule, and then iteratively swaps job segments to
reduce energy consumption in device power state transitions. After the heuristic-based job schedule is generated,
the device schedule is extracted. That is, device power state transition actions and times are recorded prior to
runtime and used at runtime. A deficiency of the MDO algorithm is that it does not explicitly address the issue of
resource blocking. It is usually impossible to estimate offline when resource blocking will occur. Thus it is
hard to integrate a resource access policy into MDO. Without considering resource blocking, it is possible
that a feasible offline heuristic job schedule results in an invalid online schedule, especially with swapping
of job segments. Another problem with MDO is that it does not consider the situation where job executions are
shorter than their WCETs; the schedule is generated using the jobs' WCETs. Even without resource blocking, the actual job
executions can be very different from the pre-generated job schedule. A fixed device schedule cannot effectively
adapt to actual job executions. This problem is further discussed in Section 5.
The CEA-EDF and EASD algorithms proposed in this paper remove these drawbacks by providing energy-
saving scheduling for periodic task sets that have feasible preemptive schedules with blocking for shared resources.
To the best of our knowledge, no previous publication has addressed this problem. Another advantage of CEA-EDF
and EASD over existing algorithms is that they support actual execution times less than WCET. Unused WCET is
dynamically reclaimed to increase energy savings.
3 Problem Description
Modern I/O devices usually have at least two power states: active and idle. To save energy, a device can be
switched to the idle state when it is not in use. In a real-time system, in order to guarantee that jobs will meet their
deadlines, a device cannot be made idle without knowing when it will be requested by a job; however, the precise
time at which an application requests a device from the operating system is usually not known. Even without knowing
the exact time at which requests are made, we can safely assume that devices are requested within the time of
execution of the job making the request.
Throughout the paper, we assume that task scheduling is based on EDF and resource access is based on SRP.
The EDF algorithm is a well-known optimal scheduling algorithm. SRP has two advantages over other resource
accessing policies: (1) it has low context switch overhead. No job is ever blocked once its execution starts, and
no job ever suffers more than two context switches. For other policies such as the Priority-Ceiling Protocol (PCP)
[14], four context switches can occur if a job requires one or more resources. (2) A job can be blocked for at most
the duration of one critical section. Therefore, the blocking time is bounded. The non-preemptable segment of a
job is called a critical section.
3.1 Preliminaries
Suppose that the set of devices and resources required by each task during its execution is specified along with
the temporal parameters of a periodic task set. More formally, given a periodic task set with deadlines equal to
periods, τ = {T1, T2, ...Tn}, let task Ti be specified by the four tuple (P (Ti), wcet(Ti), Dev(Ti), Res(Ti)) where,
P (Ti) is the period, wcet(Ti) is the WCET, Dev(Ti) = {λ1, λ2, ..., λm} is the set of required devices for the task
Ti, and Res(Ti) = {r1, r2, ...rn} is the set of resources required by the task. Note that Dev(Ti) specifies physical
devices required by a task Ti, while Res(Ti) specifies how these devices appear as shared resources to task Ti.
A non-preemptive device may appear as a shared resource with different access times to different tasks; and a
preemptive device may be included in Dev(Ti) but not in Res(Ti). Furthermore, if the I/O operation of device
λi ∈ Dev(Ti) is non-interruptible, then a shared resource representing λi should be put in the resource set of Ti
and all tasks with higher priorities to prevent a possible preemption. For example, suppose that task Ti performs
a non-interruptible I/O operation for 10 ms during its execution, then a resource with access time of 10 ms should
be put in the resource set of Ti; and the same resource with access time of 0 should be put in the resource set of
all tasks that may preempt Ti (e.g., tasks with shorter periods for EDF). Ti can be preempted by higher priority
tasks at any time before or after this I/O operation. In summary, a shared resource is a general concept in this
model. Suppose a section of task Ti is non-preemptive to a subset of tasks, α, then this section should be treated
as a shared resource to Ti and all tasks in α.
A task is an infinite sequence of jobs released every P (Ti) time units. We refer to the jth job of a task Ti
as Ji,j . We let Dev(Ji,j) denote the set of devices that are required by Ji,j . Throughout this paper, we have
Dev(Ji,j)=Dev(Ti). We let et(Ji,j , [t, t′]) denote the execution time of job Ji,j during the interval [t, t′]. It follows
that et(Ji,j , [0, t]) is the actual execution time of Ji,j , if t is equal to or larger than the time job Ji,j finishes its
execution. Furthermore, the priorities of all jobs are assigned according to EDF. For any two jobs, the job with the
earlier deadline has a higher priority. If two jobs have equivalent deadlines, the job with the earlier release time has
a higher priority. The original assigned priority of a job Ji,j is denoted by Org Prio(Ji,j). Note Org Prio(Ji,j)
is the original assigned priority and is not changed during execution, though the actual priority may change due to
priority inheritance with SRP.
Associated with each device λi are the following parameters: the transition time from the idle state to the active
state represented by twu(λi); the transition time from the active state to the idle state represented by tsd(λi); the
energy consumed per unit time in the active state represented by Pactive(λi); the energy consumed per unit time
in the idle state represented by Pidle(λi); the energy consumed per unit time during the transition from the active
state to the idle state represented by Psd(λi); and the energy consumed per unit time during the transition from the
idle state to the active state represented by Pwu(λi). We assume that for any device, the state switch can only be
performed when the device is in a stable state, i.e. the idle state or the active state. We will use these parameters in
the problem discussion and algorithm descriptions.
3.2 Motivation
The generalized problem that we aim to solve in this paper can be stated as, given a periodic task set τ =
{T1, T2, ...Tn}, Ti = (P (Ti), wcet(Ti), Dev(Ti), Res(Ti)), is there a preemptive schedule that meets all deadlines
and also reduces the energy consumed by the I/O system?
It is clear that the total energy consumed by a device λi in the hyperperiod H , is given by,
Eλi = Eactive + Eidle + Esw (1)
where, Eactive is the energy consumed when λi is in the active state; Eidle is the energy consumed by λi when it is
in the idle state; and Esw is the energy consumed when λi is in transition states. Let Tactive(λi), Tidle(λi), Twu(λi),
and Tsd(λi) denote the time that the device λi spends active, idle, and in wake-up and shut-down state transitions,
respectively. Then we have Eactive = Pactive(λi)× Tactive(λi), Eidle = Pidle(λi)× Tidle(λi) and Esw = Pwu(λi)×
Twu(λi) + Psd(λi)× Tsd(λi). In addition, for most devices, we have,
Pactive(λi), Pwu(λi), Psd(λi) > Pidle(λi) (2)
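As an illustration of the decomposition in Equation (1), the per-device energy bookkeeping can be sketched as follows; all power and time figures below are assumed example values, not measurements from the paper.

```python
def device_energy(p_active, p_idle, p_wu, p_sd,
                  t_active, t_idle, t_wu, t_sd):
    """Total device energy over a hyperperiod: E = E_active + E_idle + E_sw."""
    e_active = p_active * t_active      # energy while in the active state
    e_idle = p_idle * t_idle            # energy while in the idle state
    e_sw = p_wu * t_wu + p_sd * t_sd    # energy spent in state transitions
    return e_active + e_idle + e_sw

# Hypothetical device: active 40 ms, idle 50 ms, one wake-up/shut-down
# pair of 5 ms each.
total = device_energy(2.0, 0.1, 1.5, 1.5, 40, 50, 5, 5)  # about 100 energy units
```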
From Equations (1) and (2), it can be seen that to increase energy savings, an energy-aware scheduler should
make it the first priority to decrease Tactive(λi) as well as the number of power state transitions. However, it is
usually hard to decrease both at the same time while not affecting temporal correctness. For example, consider
the obvious approach of aggressively shutting down devices whenever they are not needed, which is called the
Aggressive Shut Down (ASD) algorithm. ASD reduces Tactive as much as possible, but may increase energy
consumption because it may introduce too many device switches.
[Figure 1 omitted: timeline showing jobs J1,1 and J1,2 of T1, job J2,1 of T2, and the state of device λk.]
Figure 1. The device state transition delay can cause system failure even when the system utilization
is low. T1 = {12, 2, ∅, ∅}; T2 = {30, 5, {λk}, ∅}; twu(λk) = tsd(λk) = 8. The system utilization is 33.3%.
The optimal solution to this problem should be that job executions are arranged in a way that has the lowest
energy consumption while still guaranteeing that all tasks meet their deadlines. However, this is an NP-hard
problem, and an efficient optimal solution is not possible for online scheduling due to its huge overhead. At
first glance, an offline approach seems better because it can utilize pre-calculated task schedules. However, it is
difficult to integrate a resource access policy into an offline approach because it is hard to predict offline the
exact points at which jobs access resources. It is possible that a seemingly feasible offline job schedule causes jobs
to miss their deadlines at runtime. Moreover, using an offline approach alone can be inefficient, since the offline
approach can only use the worst-case execution time of each task, and cannot easily adapt to actual job executions.
Compared to offline methods, it is simple to integrate a resource access policy into an online algorithm. More-
over, online algorithms can better adapt to run-time situations. Unused worst case execution time can be exploited
to increase Tidle for devices. In this paper, we try to find online solutions to this problem. As discussed before,
the ASD algorithm is a straightforward online method. However, this method cannot be directly applied to hard
real-time systems due to two inherent constraints:
1. To ensure timing constraints are met using the ASD method, device switch times must be included in a task’s
WCET, which may compromise the system’s schedulability. Figure 1 shows an example where a system is
not schedulable with the ASD algorithm, though the CPU utilization is only 33.3%.
2. The ASD algorithm does not consider the energy consumption associated with device
state transitions. Since the energy consumption associated with the state switch could be high, the ASD
algorithm may waste energy. Consider using a device with very high switch power costs. It is easy to find
scenarios where the ASD algorithm consumes more energy than keeping the device active all the time.
3.3 Approach
Despite the two constraints, the ASD algorithm can achieve excellent energy savings in many systems since it
reduces Tactive of devices as much as possible. The starting point of this work is to conquer the two constraints
of the ASD algorithm and make it applicable to hard real-time systems. To this end, two objectives exist in our
solutions: (1) our algorithms are applicable to all task sets schedulable with EDF and SRP; (2) our algorithms
consider the problem of energy consumption associated with device state switches. Therefore, our algorithms can
guarantee energy savings in any feasible situation.
Two online preemptive scheduling algorithms that support shared resources are proposed: Conservative Energy-
Aware EDF (CEA-EDF) and Enhanced Aggressive Shut Down (EASD). For the first problem, CEA-EDF guar-
antees that a device is in the active state when a job requiring the device releases. Although this seems too
conservative, this algorithm can achieve significant energy savings and can be easily implemented with very little
overhead, thus yielding a good performance/cost ratio. EASD employs a different approach. It keeps track of the
amount of time a device can be kept inactive without causing a job to miss its deadline. This time is called device
slack in this paper. A device is allowed to switch its state only when the device slack is large enough. Detailed
discussion is presented in Section 4.
Both algorithms utilize the concept of break-even time [3], which represents the minimum inactivity time re-
quired to compensate for the cost of entering and exiting the idle state. For example, suppose a device λk is active
at time t and it is known that no job requires it during [t, t + ∆t]. To save energy, λk can be switched to the
idle state at time t and be switched back to the active state at time t + ∆t if ∆t is larger than the transition time
tsd(λk) + twu(λk). The amount of energy expended by the device during this period is the sum of the energy
expended during transitions, Ewu and Esd, and the energy expended in the idle state, Eidle. However, for a device
λk that expends considerable energy during state transitions, it is possible that the device can consume less energy
if it is kept active during this period. That is, Ewu + Esd + Eidle > Pactive(λk) ×∆t. Therefore, λk needs to be
in the idle state long enough to save energy.
Let BE(λi) denote the break-even time of device λi. By knowing the energy expended for transitions, Ewu(λk) =
Pwu(λk) × twu(λk) and Esd(λk) = Psd(λk) × tsd(λk), as well as the transition delay tsw = twu(λk) + tsd(λk),
we can calculate the break-even time, BE(λk), as
Pactive(λk)× BE(λk) = Ewu(λk) + Esd(λk) + Pidle(λk)× (BE(λk)− tsw(λk))

=⇒ BE(λk) = (Ewu(λk) + Esd(λk)− Pidle(λk)× tsw(λk)) / (Pactive(λk)− Pidle(λk))

Note that the break-even time has to be larger than the transition delay, i.e., tsw(λk). So the break-even time is
given by

BE(λk) = Max( tsw(λk), (Ewu(λk) + Esd(λk)− Pidle(λk)× tsw(λk)) / (Pactive(λk)− Pidle(λk)) ) (3)
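Equation (3) translates directly into code. The sketch below mirrors the paper's parameters; the numeric values in the usage line are assumptions for illustration.

```python
def break_even(p_active, p_idle, p_wu, p_sd, t_wu, t_sd):
    """Break-even time: minimum idle interval that repays a shut-down/wake-up
    cycle. Falls back to the raw transition delay t_sw when the energy-based
    break-even point would be shorter than the transition itself."""
    t_sw = t_wu + t_sd                  # total transition delay
    e_wu = p_wu * t_wu                  # wake-up transition energy
    e_sd = p_sd * t_sd                  # shut-down transition energy
    be = (e_wu + e_sd - p_idle * t_sw) / (p_active - p_idle)
    return max(t_sw, be)

# Hypothetical device with costly transitions: the energy break-even point
# (about 12.2 ms) exceeds the 8 ms transition delay, so it dominates.
be = break_even(p_active=2.0, p_idle=0.1, p_wu=3.0, p_sd=3.0, t_wu=4, t_sd=4)
```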
It is clear that if a device is idle for less than the break-even time, it is not worth performing the state switch.
Therefore, our approach makes decisions of device state transition based on the break-even time rather than device
state transition delay. At this point, an obvious improvement to the ASD algorithm can be made so that it utilizes the
break-even time to conquer the second constraint. The Switch-Aware Aggressive Shut Down (SA-ASD) algorithm,
which makes this enhancement to ASD, and a sufficient schedulability condition are presented in Appendix D.
Systems that can satisfy the schedulability condition for the SA-ASD algorithm should use the SA-ASD algorithm
rather than the proposed CEA-EDF and EASD algorithms, because of lower scheduling overhead for SA-ASD.
Note, however, that the SA-ASD algorithm still suffers from the first constraint, which both CEA-EDF and
EASD overcome.
4 Algorithms
This section introduces two OS-directed, real-time, DPM techniques applicable to I/O devices, which are based
on EDF [6] and SRP [2]: CEA-EDF and EASD. However, we first briefly review how SRP works with EDF.
4.1 Review of EDF and SRP
Each task Ti is assigned a preemption level PL(Ti), which is the reciprocal of the period of the task. The
preemption ceiling of any resource ri is the highest preemption level of all the tasks that require ri. We use the
term Π(t) to denote the current ceiling of the system, which is the highest-preemption level ceiling of all the
resources that are in use at time t. Φ is a dummy preemption level that is lower than the lowest preemption
level of all tasks. The rules can be stated as follows [7].
1. Update of the Current Ceiling: Whenever all the resources are free, the preemption ceiling of the system is
Φ; otherwise, the preemption ceiling Π(t) is the highest preemption level ceiling of all the resources that are
in use at time t. The preemption ceiling of the system is thus updated when a resource is allocated or freed
(at the end of a critical section).
2. Scheduling Rule: After a job is released, it is blocked from starting execution until its preemption level is
higher than the current ceiling Π(t) of the system and the preemption level of the executing job. At any time
t, jobs that are not blocked are scheduled on the processor according to their deadlines.
3. Allocation Rule: Whenever a job requests a resource, it is allocated the resource.
4. Priority-Inheritance Rule: When some job is blocked from starting, the blocking job inherits the highest
priority of all blocked jobs.
When scheduling by EDF and SRP, a job can be blocked for at most the duration of one critical section, which
includes regions of shared resource access. The calculation of the maximal blocking duration can be found in [7].
The computation is done off-line and used at runtime. We use B(Ti) to denote the blocking duration for task Ti.
In addition, the maximal blocking duration of each job Ji,j , B(Ji,j), of task Ti is equal to B(Ti).
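The scheduling rule above reduces to a single predicate. The sketch below assumes preemption levels equal to reciprocal periods (as defined at the start of this section) and uses 0.0 to play the role of Φ when no resource is in use; all names are our own, not the paper's.

```python
def can_start(job_period, executing_period, in_use_ceilings):
    """SRP scheduling rule: a released job may start only if its preemption
    level exceeds both the current system ceiling and the preemption level
    of the executing job (if any)."""
    pl = 1.0 / job_period
    ceiling = max(in_use_ceilings, default=0.0)   # 0.0 stands in for Φ
    exec_pl = 1.0 / executing_period if executing_period else 0.0
    return pl > ceiling and pl > exec_pl

# A period-10 job may preempt a period-30 job past a ceiling of 1/20;
# a period-25 job may not, since its preemption level is below the ceiling.
```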
4.2 CEA-EDF
CEA-EDF is a simple, low-overhead energy-aware scheduling algorithm for hard real-time systems. All devices
that a job needs are active at or before the job is released. Thus devices are safely shut down without affecting
the schedulability of tasks. Before describing CEA-EDF, we define the next device request time and time to next
device request, which are used to keep track of the earliest time at which a device is required.
Definition 4.1. Next Device Request Time. The next device request time is denoted by NextDevReqTime(λk, t)
and is the earliest time that a device λk is requested by an uncompleted job. Since a job can only use a device after
the job is released, the next device request time of a device λk is given by

NextDevReqTime(λk, t) = Min(R(Ji,j)) ∀Ji,j such that λk ∈ Dev(Ji,j) and Ji,j is not completed at time t

where R(Ji,j) is the release time of job Ji,j and Dev(Ji,j) is the set of devices required by Ji,j.
Definition 4.2. Time To Next Device Request. The time to next device request for device λk at time t is denoted by
TimeToNextDevReq(λk, t) and is the time from current time t to the next device request time of λk. Therefore,
the time to next device request of a device λk at time t is given by
TimeToNextDevReq(λk, t) = NextDevReqTime(λk, t)− t;
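Definitions 4.1 and 4.2 amount to a minimum over the current jobs. In the sketch below, representing a job as a (required-device-set, release-time) pair is our own assumption for illustration.

```python
def next_dev_req_time(device, current_jobs):
    """Definition 4.1: earliest release time among current (uncompleted)
    jobs that require `device`; None if no current job requires it."""
    releases = [r for devs, r in current_jobs if device in devs]
    return min(releases) if releases else None

def time_to_next_dev_req(device, current_jobs, t):
    """Definition 4.2: distance from the current time t to the next request."""
    n = next_dev_req_time(device, current_jobs)
    return None if n is None else n - t

# Current jobs as (required-device-set, release-time) pairs, mirroring Table 1.
current_jobs = [({"λ1", "λ2"}, 160), ({"λ2"}, 210), ({"λ1", "λ3"}, 200),
                ({"λ3"}, 215), ({"λ1"}, 220), ({"λ3"}, 207)]
```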
Device | Tasks requiring λk | Current jobs | R(Ji,j) | NextDevReqTime(λk, t) | Power-up time Up(λk)
λ1 | {T1, T3, T5} | {J1,20, J3,15, J5,78} | {160, 200, 220} | 160 | −1
λ2 | {T1, T2} | {J1,20, J2,25} | {160, 210} | 160 | −1
λ3 | {T3, T4, T6} | {J3,15, J4,25, J6,18} | {200, 215, 207} | 200 | 180

Table 1. Device Usage Table. The CEA-EDF scheduler uses this table to keep track of the next device
request time for each device. The CEA-EDF scheduler also uses this table to power up an idle device
λk based on the maintained Up(λk).
Function TimeToNextDevReq()
  Input: current system time t and the currently executing job Ji,j
  Output: renewed time to next device request for all devices
  If (t is an instant at which job Ji,j is completed)
    α ← α − Ji,j + Ji,j+1;  // α is the set of current jobs requiring λk
    ∀λk ∈ Dev(Ji,j): NextDevReqTime(λk, t) ← Min(R(Jm,n)), where Jm,n ∈ α;  // update next device request time
    ∀λk ∈ Dev(Ji,j): TimeToNextDevReq(λk, t) ← NextDevReqTime(λk, t) − t;
  End

Figure 2. The pseudocode for the TimeToNextDevReq algorithm. This algorithm updates the time to
next device request for devices.
The CEA-EDF scheduler maintains a table for each device as shown in Table 1. The current job of a task is
the uncompleted job with the earliest deadline among all jobs of the task. For example, suppose J1,1 is the first
job of a task T1, and J1,1 is released at time 0 and is completed at time 10. Then the current job of task T1 is J1,1
during [0, 10). Suppose the second job of T1, J1,2, is released at time 40 and is finished at time 50. Then J1,2 is
the current job of T1 during [10, 50). By Definition 4.1, the next device request time NextDevReqT ime(λk, t) is
the minimal release time of all current jobs that require device λk.
With CEA-EDF, a device λi is switched to the low power state at time t when TimeToNextDevReq(λi, t) >
BE(λi), where BE(λi) is the break-even time for device λi and computed using Equation (3). CEA-EDF sets a
power up time, Up(λi), for device λi when λi is switched to the idle state. For any idle device, it is switched back
to the active state if the power up time Up(λi) is equal to the current time t. The CEA-EDF scheduling algorithm
then can be described as in Figure 3, and is invoked at scheduling points and when a power up time is reached.
We define scheduling points as time instants at which jobs are released, completed, or exit critical sections. An
example of CEA-EDF scheduling is illustrated in Figure 4.
Our experiments in Section 5 show that CEA-EDF scheduling is effective in energy savings, especially when
the system workload is low. Meanwhile, the implementation of CEA-EDF is simple.
Preprocessing:
  Compute break-even time BE(λk) (1 ≤ k ≤ m) for each device, as shown in Equation (3).
  Initialize next device request time NextDevReqTime(λk, 0) (1 ≤ k ≤ m) for each device, as defined in Definition 4.1.
Device scheduling at time t:
  If (t is an instant at which job Ji,j is completed)
    If (∃λk: λk = active and TimeToNextDevReq(λk, t) > BE(λk))
      λk → idle;
      Up(λk) ← NextDevReqTime(λk, t) − twu(λk);  // set the power-up timer for λk
    End
  End
  If (∃λk: λk = idle and Up(λk) = t)  // switch λk to active when the current time is the power-up time
    λk → active;
    Up(λk) ← −1;  // clear the power-up timer for λk
  End
  Schedule jobs by EDF(SRP).

Figure 3. The CEA-EDF algorithm. BE(λk) is the break-even time for λk. twu(λk) is the transition
delay from the idle state to the active state. Up(λk) is the power-up time set for λk, at which the device
will be powered up.
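The device-handling rules of the CEA-EDF scheduler can be sketched as a minimal executable step; the per-device dictionary fields (state, next_req, BE, t_wu, up) are our own naming, not the paper's.

```python
def cea_edf_device_step(t, completed_job_devs, devices):
    """One invocation of the CEA-EDF device rules at time t.

    completed_job_devs: devices of a job that completed at t (empty set otherwise).
    devices: per-device records with fields state, next_req, BE, t_wu, up.
    """
    for name, d in devices.items():
        # Shut-down rule: at a job completion, idle an active device whose
        # time to next request exceeds its break-even time.
        if (name in completed_job_devs and d["state"] == "active"
                and d["next_req"] - t > d["BE"]):
            d["state"] = "idle"
            d["up"] = d["next_req"] - d["t_wu"]  # arm the power-up timer
        # Wake-up rule: reactivate when the power-up timer fires.
        if d["state"] == "idle" and d["up"] == t:
            d["state"] = "active"
            d["up"] = -1                         # clear the timer

# Example: a device needed again at t = 30 with BE = 10 and t_wu = 5 is
# idled at t = 0 and will be powered back up at t = 25.
devices = {"λ1": {"state": "active", "next_req": 30, "BE": 10, "t_wu": 5, "up": -1}}
cea_edf_device_step(0, {"λ1"}, devices)
```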
[Figure 4 omitted: (a) EDF job schedule for J1,1 and J2,1; (b) state timelines for devices λ1 and λ2.]
Figure 4. CEA-EDF scheduling example; (a) the job scheduling from EDF. J1,1 is released at 6 and
uses device λ1. J2,1 is released at 2 and uses device λ2. J1,1 has a higher priority than J2,1. (b) the
device state transition with the CEA-EDF algorithm.
4.3 EASD
As discussed in Section 3, an energy-aware I/O device scheduler should reduce Tactive(λi), which is the time
that a device λi is in the active state, since a device usually has the highest energy consumption rate in the active
state. The ASD algorithm can reduce Tactive(λi) for all λi. However, some task sets may not be able to utilize
ASD because they cannot satisfy schedulablility conditions with WCETs that include device transition time. On
the other hand, the CEA-EDF algorithm can be applied to any schedulable task set, but it is not as efficient as
possible, since it conservatively keeps a device active even while the jobs requiring it are not currently
executing. The EASD algorithm, which is based on the ASD algorithm, addresses these limitations by keeping
track of device slack, which is defined as follows.
[Figure 5 omitted: (a) job schedule and device schedule; (b) the device access delay for λk; (c) the
device dependent system slack for λk; (d) the device slack for λk.]
Figure 5. Device slack examples. T1 = {10, 4, ∅, ∅}; T2 = {30, 4, {λk}, ∅}. That is, λk ∈ Dev(T2). For
device λk, tsd(λk) = twu(λk) = 8; BE(λk) = 18. The device slack shown in (d) is the sum of the
device access delay shown in (b) and the device dependent system slack shown in (c).
Definition 4.3. Device slack. The device slack is the maximal length of time that a device λi can be inactive
without causing any job to miss its deadline. We let DevSlack(λi, t) denote the device slack for a device λi at
time t.
The energy savings provided by EASD is closely related to the amount of available device slack. The more
device slack is exploited, the more opportunities can be created to put devices in the idle state to save energy. Thus
exploiting available device slack is the focus of EASD.
Device slack for a device λk comes from different sources. As discussed in the CEA-EDF algorithm, the time to
next device request is a source of available device slack. The CEA-EDF algorithm utilizes the time to next device
request to keep devices idle to save energy. However, other sources of device slack exist. For example, another
source of device slack for a device λk comes from the execution of higher priority jobs that do not require device
λk. In this case, jobs requiring λk cannot execute and thus create slack for the device. As shown in Figure 5, job
J1,1 occupies the CPU during [0, 4]; and the interval [0, 4] becomes a part of the device slack for device λk. This
kind of slack does not introduce idle intervals and thus does not jeopardize temporal correctness of the system.
In a system for which the utilization is less than 1, the execution of a job might be postponed without jeopardiz-
ing temporal correctness of the system; thus creating additional device slack. As shown in Figure 5, idle intervals
are inserted in the interval [0, 22] because J2,1 is delayed by the state transition of device λk. However, this kind of
slack needs to be carefully managed to maintain system schedulability. The EASD algorithm can utilize this kind
of device slack while still guaranteeing that every job meets its deadline.
In summary, three sources of device slack are identified in this paper: device access delay, device dependent
system slack and time to next device request. The time to next device request is defined in Definition 4.2.
Definition 4.4. Device access delay. The device access delay for a device λk is the time during which jobs
requiring λk cannot execute because of the execution of higher priority jobs that do not need λk. The device access
delay for a device λk at time t is denoted DevAccessDelay(λk, t).
Definition 4.5. Device dependent system slack. The device dependent system slack is the maximum amount of time
that the CPU can be idle before the execution of any jobs requiring device λk without causing any jobs to miss
their deadlines. The device dependent system slack for a device λk at time t is denoted DevDepSysSlack(λk, t).
As shown in Figure 5, the device access delay and the device dependent system slack for device λk can be
combined to create the device slack for λk because they represent non-overlapping device slacks. In the example
shown in Figure 5, the device dependent system slack for device λk at time 0 is 14. That is, idle intervals with total
length of 14 time units can be inserted before the execution of J2,1 without causing any job to miss its deadline.
Note that the 14 units of idle time are separated into two intervals, i.e., interval [4, 10] and interval [14, 22], as
shown in Figure 5(c). If a single idle interval with a length of 14 time units is inserted at time 4, job J1,2 will
miss its deadline. Additional device slack of 8 time units comes from the execution of jobs J1,1 and J1,2, as shown
in Figure 5(b). The two kinds of device slack do not decrease at the same time. In contrast, the time to
next device request cannot be combined with either the device access delay or the device dependent system slack
because they might overlap. Therefore, the device slack of a device λk can be given by,
DevSlack(λk, t) = max(TimeToNextDevReq(λk, t), DevAccessDelay(λk, t) + DevDepSysSlack(λk, t))    (4)
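Equation (4) reduces to a one-line combination rule. The sketch below is ours (the function and argument names are illustrative); the input values are taken from the Figure 5 example at time 0, where the access delay is 8 and the device dependent system slack is 14:

```python
def dev_slack(time_to_next_req, dev_access_delay, dev_dep_sys_slack):
    # Access delay and device dependent system slack never overlap, so they
    # add; the time to next device request may overlap either of them, so it
    # is combined with max() rather than added.
    return max(time_to_next_req, dev_access_delay + dev_dep_sys_slack)

dev_slack(0, 8, 14)   # Figure 5 at time 0: slack of 22 from the sum term
dev_slack(25, 8, 14)  # the time to next request dominates: slack of 25
```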
Given the device slack for each device at time t, it is straightforward to implement the EASD algorithm. The
algorithm is presented in Figure 6. This algorithm contains three parts: (1) update device slack for all devices; (2)
perform device state transitions; and (3) schedule jobs with EDF and SRP.
The EASD Algorithm
// Jexec is the job that is selected to occupy the CPU at time t.
Update the device slack for all devices at t:
    If (t is a scheduling point)
        ∀λk, DevSlack(λk, t) ← Max(TimeToNextDevReq(λk, t), DevAccessDelay(λk, t) + DevDepSysSlack(λk, t));
    Else  // t is not a scheduling point
        ∀λk, DevSlack(λk, t) ← DevSlack(λk, t − 1) − 1;
    End
Perform device state transitions at time t:
    If (∃λk, λk ∉ Dev(Jexec) and λk = active and DevSlack(λk, t) > BE(λk))
        λk ← idle;
        tenterIdle(λk) ← t;  // The time that λk starts the state transition to idle.
    End
    // The next condition makes sure that λk has been idle long enough to compensate for the energy consumed in the state transition.
    If (∃λk, λk = idle and t − tenterIdle(λk) ≥ BE(λk) − twu(λk))
        If (λk ∈ Dev(Jexec) or DevSlack(λk, t) ≤ twu(λk))
            λk ← active;
        End
    End
Schedule job with EDF(SRP);
End
Figure 6. The pseudocode for the EASD algorithm.
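The two transition conditions in Figure 6 can be read as a per-device decision step. The following is an illustrative Python rendering, not the paper's implementation: the state is modeled as a string, and all names are ours. It returns the device's new state and idle-entry time:

```python
def easd_device_step(state, t, t_enter_idle, in_use_by_exec,
                     dev_slack, break_even, t_wakeup):
    """One device's power-state decision at time t, following Figure 6."""
    if state == "active" and not in_use_by_exec and dev_slack > break_even:
        # Enough slack to amortize the transition energy: shut the device down.
        return "idle", t
    if state == "idle" and t - t_enter_idle >= break_even - t_wakeup:
        # The device has been idle long enough to pay for the transition;
        # wake it up if the running job needs it or slack is nearly exhausted.
        if in_use_by_exec or dev_slack <= t_wakeup:
            return "active", t_enter_idle
    return state, t_enter_idle
```

With the Figure 5 parameters (BE = 18, twu = 8), an unused active device with 22 units of slack is switched to idle, and an idle device is woken up once its remaining slack drops to the wake-up time.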
As shown in Equation (4), updating the device slack requires updating of the next device request time, device
access delay and device dependent system slack. Note that the computation for the next device request time, device
access delay and device dependent system slack are only performed at scheduling points, i.e., the time instances
of job completion, job release and exiting critical sections. At time instances other than scheduling points, the
device slack for any device is simply decreased by 1 per time unit.
The algorithms to compute device access delay and device dependent system slack are provided in Appendix A.
In those computations, we assume that there are n tasks in the system, and that the current job of each task Ti at time t
is given by Jci whose absolute deadline is denoted by D(Jci). Without loss of generality, suppose the current job
Jc1 and Jcn have the earliest deadline D(Jc1) and the latest deadline D(Jcn) among all current jobs respectively.
In the example illustrated in Figure 5(a), we have n = 2; Jc1 = J1,2 and Jcn = J2,1 at time 5.
As shown in Appendix A, the update of the device slack for all devices can be done by looking at each job
Ji, Org Prio(Jcn) ≤ Org Prio(Ji) ≤ Org Prio(Jc1). The worst case computational complexity of this algo-
rithm at scheduling points is O(m + n + K), where m is the number of uncompleted jobs with priorities within
[Org Prio(Jcn), Org Prio(Jc1)], n is the total number of tasks in the system and K is the total number of de-
vices. Let Tl be the task with the longest period, and Ts be the task with the shortest period. In the worst case, m
is bounded by ⌊2 × P(Tl)/P(Ts)⌋ × (n − 1) + 1.
4.4. Schedulability
This section presents a sufficient schedulability condition for the CEA-EDF and EASD scheduling algorithms.
The condition is the same condition used for the EDF algorithm with SRP [2]:
Theorem 4.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by CEA-EDF and EASD
if
∀k, 1 ≤ k ≤ n:    Σ_{i=1}^{k} wcet(Ti)/P(Ti) + B(Tk)/P(Tk) ≤ 1,    (5)
where B(Tk) is the maximal length that a job in Tk can be blocked, which is caused by accessing non-preemptive
resources including I/O device resources and non I/O device resources. Note that device state transition delay is
not included.
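The test of Equation (5) is straightforward to evaluate directly. In the sketch below, each task is represented as a hypothetical (period, wcet, max blocking) tuple; the names are ours:

```python
def schedulable(tasks):
    """Sufficient schedulability test of Theorem 4.1 / Equation (5).

    tasks: iterable of (period, wcet, max_blocking) tuples.
    """
    tasks = sorted(tasks, key=lambda task: task[0])  # sort by period
    for k in range(1, len(tasks) + 1):
        # Utilization of the k shortest-period tasks ...
        demand = sum(wcet / period for period, wcet, _ in tasks[:k])
        # ... plus the worst-case blocking of task k, normalized by its period.
        period_k, _, blocking_k = tasks[k - 1]
        if demand + blocking_k / period_k > 1:
            return False
    return True
```

For the Figure 5 task set, T1 = (10, 4) and T2 = (30, 4) with no blocking, the test passes (0.4 ≤ 1 and 0.4 + 4/30 ≤ 1); adding 15 units of blocking to T2 makes it fail.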
With the CEA-EDF algorithm, a device λk is guaranteed to be in the active state when any jobs requiring λk are
released. Therefore, CEA-EDF does not affect the schedulability of any systems. In other words, Theorem 4.1 is
true for CEA-EDF.
The problem for EASD is much more complex than for CEA-EDF. With the EASD algorithm, a device is
switched to the idle state when its device slack is larger than its break-even time. Therefore, there might be some
intervals in which the CPU is idle while pending jobs wait for their required devices to be switched to the
active state. Since the proof for the EASD algorithm requires the knowledge of job slack and device dependent
system slack, the proof is presented in Appendix B.
5 Evaluation
In this section, we present evaluation results for the CEA-EDF and EASD algorithms. Section 5.1 describes
the evaluation methodology used in this study. Section 5.2 describes the evaluation of the algorithms with various
system utilizations. Section 5.3 evaluates the ability of CEA-EDF and EASD to reclaim unused WCETs to save
energy; and compares the performance of MDO with CEA-EDF and EASD. Section 5.4 gives a comparison of the
CEA-EDF and EASD algorithms.
5.1. Methodology
We evaluated the CEA-EDF and EASD algorithms using an event-driven simulator. This approach is consistent
with evaluation approaches adopted by other researchers for energy-aware I/O scheduling [18, 20, 19]. To better
evaluate the two algorithms, we compute the minimal energy requirement, LOW-BOUND, for each simulation. The
Device                            Pactive (W)   Pidle (W)   Pwu, Psd (W)   twu, tsd (ms)1
Realtek Ethernet Chip [13]        0.187         0.085       0.125          10
MaxStream Wireless module [11]    0.75          0.005       0.1            40
IBM Microdrive [17]               1.3           0.1         0.5            12
SST Flash SST39LF020 [16]         0.125         0.001       0.05           1
SimpleTech Flash Card [15]        0.225         0.02        0.1            2

Table 2. Device Specifications.
LOW-BOUND is acquired by assuming that the time and energy overhead of device state transition is 0. A device
is shut off whenever it is not required by the current executing job, and is powered up as soon as a job requiring it
is executing. Therefore, the LOW-BOUND represents an energy consumption level that is not achievable for any
scheduling algorithm.
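Under these assumptions, each device's LOW-BOUND contribution is easy to compute. The sketch below is ours and assumes that "shut off" means the device's low-power state drawing Pidle, with zero-cost transitions:

```python
def low_bound_energy(p_active, p_idle, busy_time, total_time):
    # With zero-cost transitions, the device draws active power only while
    # the running job actually uses it, and idle power otherwise.
    return p_active * busy_time + p_idle * (total_time - busy_time)

# IBM Microdrive from Table 2, busy 5 s out of a 20 s simulation:
# 1.3 W * 5 s + 0.1 W * 15 s = 8.0 J
low_bound_energy(1.3, 0.1, 5.0, 20.0)
```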
The power requirements and state switching times for devices were obtained from data sheets provided by the
manufacturer. The devices used in experiments are listed in Table 2. The normalized energy savings is used to
evaluate the energy savings of the algorithms. The normalized energy savings is the amount of energy saved under
a DPM algorithm relative to the case when no DPM technique is used, wherein all devices remain in the active
state over the entire simulation. The normalized energy savings is computed using Equation (6).
Normalized Energy Savings = 1 − (Energy with CEA-EDF or EASD) / (Energy with No DPM)    (6)
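Equation (6) as a small helper, with a worked example (the numbers are ours, for illustration only):

```python
def normalized_energy_savings(energy_with_dpm, energy_no_dpm):
    # Equation (6): fraction of energy saved relative to keeping every
    # device in the active state for the whole simulation.
    return 1.0 - energy_with_dpm / energy_no_dpm

# e.g. 30 J consumed under a DPM algorithm vs 100 J with no DPM
# gives a normalized energy savings of 0.7 (70%).
normalized_energy_savings(30.0, 100.0)
```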
In all experiments, we used randomly generated task sets to evaluate the performance of the CEA-EDF and
EASD algorithms. All task sets are pretested to satisfy the feasibility condition shown in Equation (5). Each
generated task set contained 1 to 10 tasks. Each task in a task set required a random number (0 to 3) of devices
from Table 2. Critical sections of all jobs were randomly generated. Other characteristics include task periods
and the best/worst case execution ratio, which are specified in each experiment. We repeated each experiment 500
times and present the mean value. In all experiments, all jobs met their deadlines with the CEA-EDF
and EASD algorithms.
Although the worst case computational complexity of EASD is briefly discussed in Section 4.3, it may still be
a concern that EASD has too much scheduling overhead in practice. We did not measure scheduling overhead in
real systems since all algorithms were evaluated with simulations. Instead, we compared the scheduling overhead
of CEA-EDF and EASD with respect to EDF(SRP) in our simulations. We used relative scheduling overhead to
evaluate the scheduling overhead of CEA-EDF and EASD. The relative scheduling overhead is given by
relative scheduling overhead = (scheduling overhead with CEA-EDF or EASD) / (scheduling overhead with EDF(SRP)) − 1
1Most vendors report only a single switching time. Thus we used this time for both twu and tsd.
[Figure 7: three panels plot mean normalized energy savings (y axis) against system utilization from 0.1 to 1.0 (x axis) for CEA-EDF, EASD, and LOW-BOUND. (a) Task sets with periods in [50, 200]. (b) Task sets with periods in [200, 2000]. (c) Task sets with periods in [2000, 8000].]

Figure 7. Normalized energy savings with multiple devices.
The mean value of the relative scheduling overhead of CEA-EDF is 3.1%, verifying that CEA-EDF is a low-
overhead algorithm. The mean value of the relative scheduling overhead of EASD is 59.3%. Considering that the
scheduling overhead of EDF(SRP) is very low, a relative overhead of 59.3% is affordable. For example, if a system
spends 1 time unit out of every 1000 time units to perform scheduling with EDF(SRP), then the system will spend
only 1.593 time units out of every 1000 time units to perform scheduling with EASD. Therefore, the scheduling
overhead of EASD is very low with respect to the whole system.
5.2. Average energy savings
In this experiment, we measured the overall performance of CEA-EDF and EASD. Periods of tasks are chosen
from three groups: [50, 200]; [200, 2000] and [2000, 8000], which represent short-period, mid-period and long-
period groups respectively. The intention of experimenting with different ranges of task periods is to evaluate the
relation of the energy saving and the ratio of task periods to device state transition times. Within each group, task
periods and WCETs were randomly selected such that they are schedulable according to Theorem 4.1.
We first focus on the relationship of normalized energy savings to the system utilization, which is the sum of the
worst case utilization for all tasks. In this experiment, we set the best/worst case execution time ratio to 1. Figure 7
shows the mean normalized energy savings for CEA-EDF and EASD under different system utilizations.
On average, EASD saves more energy than CEA-EDF. In most cases, as the system utilization increases, the
normalized energy savings decreases. The rationale for this is that as tasks execute more, the amount of time
devices can be kept in idle mode decreases. Also it can be seen from the figure that the performance of EASD is
comparable to the LOW-BOUND.
[Figure 8: two panels plot mean normalized energy savings (y axis) against the ratio of best case to worst case execution time from 0.1 to 1.0 (x axis). (a) CEA-EDF, EASD, and LOW-BOUND. (b) CEA-EDF, EASD, MDO, and LOW-BOUND; no resource is shared in this experiment since MDO does not address resource blocking.]

Figure 8. Reclaiming unused WCETs to save energy.
An important, albeit intuitive, finding is that the ratio of device state transition time to task periods greatly affects
the energy savings. Both algorithms perform worst in the experiment with short periods, as shown in Figure 7(a).
This is consistent with our expectations. For example, suppose it takes a long time for a device λk to perform a
state transition; and λk is used in a system in which tasks have very short periods, then λk has little chance to be
switched to the idle state. Furthermore, the performance of EASD is close to optimal when tasks are in the long-
period group, as shown in Figure 7(c). It can be seen that EASD is more sensitive to task periods than CEA-EDF.
This is also consistent with our discussion of device slack in Appendix A, since device slack is closely related to
task periods. With the same system utilization, the mean device slack of devices in a system with longer periods
should be larger than those of devices in a system with shorter periods.
5.3. Reclaiming unused WCETs to save energy
In practice, actual job execution times can be less than their WCETs. Unused WCETs can be reclaimed to save
energy. First, we evaluate the ability of CEA-EDF and EASD to save energy by dynamically reclaiming the slack
coming from unused WCETs. Figure 8(a) shows the normalized energy savings for the CEA-EDF and EASD with
increasing best/worst case execution time ratios. In this experiment, the system utilization is set between 90%
and 100%; and task periods are chosen from the mid-period group, i.e., [200, 2000]. As with the first experiment,
critical sections of all jobs were randomly generated. As shown in Figure 8(a), both CEA-EDF and EASD save
more energy when the ratio of the best/worst case execution time is smaller, showing that both algorithms can
dynamically reclaim unused WCETs to save energy. Moreover, the difference between EASD and LOW-BOUND
remains almost unchanged with different best/worst case execution time ratios, which means that EASD is able to
fully reclaim the slack created by unused WCETs to save energy. Similar results are acquired for the short-period
group and long-period group, and are therefore omitted here.
In addition, we compare the energy saving of CEA-EDF and EASD with MDO, which is the only published
energy-aware device scheduling algorithm for preemptive schedules. This comparison is intended to evaluate the
advantage of online algorithms (CEA-EDF and EASD) over an offline-only algorithm (MDO) in utilizing unused
WCETs to save energy. As discussed in Section 2, MDO cannot support shared non-preemptive resources. There-
fore, no critical section is generated for any job in this experiment. That is, task sets used in this experiment are
fully preemptive. As shown in Figure 8(b), MDO has energy savings of only an additional 0.74% over EASD when
job execution times are equal to their WCETs, i.e., the runtime job execution is exactly as computed with MDO at
the offline phase. However, the energy saving of MDO does not increase when the best/worst case execution time
ratio decreases, because MDO does not utilize unused WCETs to save energy [4]. It can be seen from Figure 8(b)
that MDO even saves less energy than CEA-EDF when the best/worst case execution time ratio is less than 0.45.
5.4. Comparison of CEA-EDF and EASD
The last experiment compares the energy saving of EASD to CEA-EDF. We use the normalized additional
energy savings to evaluate the additional energy savings of the EASD algorithm. The normalized additional en-
ergy savings is the amount of energy saved under the EASD algorithm relative to the CEA-EDF algorithm. It is
computed using Equation (7).
Normalized Additional Energy Savings = 1−Energy with EASD
Energy with CEA-EDF(7)
In this experiment, task periods are chosen from the mid-period group, i.e., [200, 2000]. The best/worst case
execution time ratio is set to 1. The distribution of normalized additional energy savings with three ranges of
system utilization is presented in Figure 9. The results are consistent with previous experiment results. CEA-EDF
performs well when the system workload is low. When the system utilization is less than 10%, CEA-EDF performs
almost the same as EASD; and when the system utilization is less than 60%, CEA-EDF performs close to EASD.
There are a few instances in which the EASD algorithm actually results in more energy being consumed than the
CEA-EDF algorithm. This is because EASD tries to reduce the time that devices are in the active state, but this
causes more device switches.
A remarkable result from these experiments is that CEA-EDF performs well, on average, compared to EASD
[Figure 9: three histograms show the distribution of normalized additional energy savings (x axis, −0.2 to 0.8) against the percentage of simulations (y axis). (a) System utilization 0-10%. (b) System utilization 40-50%. (c) System utilization 90-100%.]

Figure 9. Comparison of CEA-EDF and EASD. The x axis represents the normalized additional energy savings of EASD over CEA-EDF; the y axis represents the percentage of simulations.
when the system workload is low. Even in cases where the system utilization is near 100%, the CEA-EDF algorithm
can still achieve nearly 40% energy savings for I/O devices. Moreover, CEA-EDF can be used together with an
energy-aware processor scheduler without any modification, because CEA-EDF has no influence on processor
scheduling while offering an excellent performance/cost ratio. We will investigate the integration of CEA-EDF and
energy-aware processor scheduling in future research.
6 Conclusion
Two hard real-time scheduling algorithms were presented for conserving energy in device subsystems. Both
algorithms support the preemptive scheduling of periodic tasks with non-preemptive shared resources. The CEA-
EDF algorithm, though a relatively simple extension to EDF scheduling, provides remarkable power savings when
the system workload is low. On the other hand, EASD can produce more energy savings than CEA-EDF. Ulti-
mately, the choice of which energy saving algorithm to choose, if any, depends on the temporal parameters of the
task set and devices utilized.
Although the power management of the processor is not addressed in this paper, our work can be applied
to reduce the leakage power consumption of the processor, which is expected to become an increasingly larger
fraction of the processor energy consumption. Leakage power consumption is reduced by disabling all or parts of
the processor whenever possible. Therefore, with the CPU as a shared device for all tasks, our algorithms can be
applied without any modification.
In general, CEA-EDF and EASD do not result in the minimum energy schedule when multiple devices are
shared. The problem of finding a feasible schedule that consumes minimum I/O device energy is NP-hard. Hence,
our focus was not to find the optimal solution but to create algorithms that reduce the energy consumption of multiple shared devices and that can be executed online to adapt to the workload. This work provides the foundation for
a family of general, online energy saving algorithms that can be applied to systems with hard temporal constraints.
References
[1] Advanced configuration & power interface specification. Advanced Configuration & Power Interface, August 2003,http://www.acpi.info/DOWNLOADS/ACPIspec-2-0c.pdf.
[2] Baker, T.P., “Stack-Based Scheduling of Real-Time Processes,” Real-Time Systems, 3(1):67-99, March 1991.
[3] Benini, L., Bogliolo, A., and Micheli, G., “A survey of design techniques for system-level dynamic power management”,IEEE Trans. VLSI Syst., vol. 8, June 2000.
[4] Chakrabarty, K., Correspondence with the author of the MDO algorithm, May 2005.
[5] Golding, R.A., Bosch, P., Staelin, C., Sullivan, T., and Wilkes, J., "Idleness is not sloth", Proceedings of the Winter USENIX Conference, 1996.
[6] Liu, C.L., and Layland, J.W., "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment", Journal of the ACM, 20(1), January 1973.
[7] Liu, J., Real-time Systems, Prentice Hall, 2000.
[8] Lu, Y.H., and Benini, L., "Power-Aware operating systems for interactive systems", IEEE Transactions on Very Large Scale Integration Systems, 10(2):119-134, April 2002.

[9] Lu, Y.H., Benini, L., and Micheli, G., "Operating-System Directed Power Reduction", International Symposium on Low Power Electronics and Design, 2000.

[10] Lu, Y.H., Benini, L., and Micheli, G., "Requester-Aware Power Reduction", International Symposium on System Synthesis, Stanford University, pages 18-23, September 2000.
[11] Maxstream 9xstream 900mhz wireless OEM module. http://www.maxstream.net/pdf xstreammanual.pdf.
[12] Microsoft OnNow power management architecture. http://www.microsoft.com/ whdc/hwdev/tech/onnow/OnNowAppPrint.mspx.
[13] Realtek ISA full-duplex ethernet controller RTL8019AS. ftp://152.104.125.40/cn/nic/rtl8019as/spec-8019as.zip.
[14] Sha, L., Rajkumar, R., and Lehoczky, J.P., "Priority inheritance protocols: an approach to real-time synchronization", IEEE Transactions on Computers, pages 1175-1185, 1990.
[15] Simpletech compact flash card. http://www.simpletech.com/flash/flash prox.php.
[16] SST multi-purpose flash SST39LF020. http://www.sst.com/downloads/ datasheet/S71150.pdf.
[17] IBM microdrive DSCM-11000. http://www.hgst.com/tech/techlib.nsf/techdocs/ F532791CA062C38F87256AC00060DD49/file/ibm md datasheet.pdf.
[18] Swaminathan, V., Chakrabarty, K., and Iyengar, S.S., "Dynamic I/O Power Management for Hard Real-time Systems", in Proceedings of the Ninth International Symposium on Hardware/Software Codesign, pages 237-242, April 2001, Copenhagen, Denmark.
[19] Swaminathan, V., and Chakrabarty, K., "Pruning-based energy-optimal device scheduling for hard real-time systems", in Proceedings of the Tenth International Symposium on Hardware/Software Codesign, pages 175-180, 2002.
[20] Swaminathan, V., and Chakrabarty, K., "Energy-conscious, deterministic I/O device scheduling in hard real-time systems", IEEE Transactions on Computer-Aided Design of Integrated Circuits & Systems, vol. 22, pages 847-858, July 2003.
[21] Swaminathan, V., and Chakrabarty, K., "Pruning-based, Energy-optimal, Deterministic I/O Device Scheduling for Hard Real-Time Systems", ACM Transactions on Embedded Computing Systems, 4(1):141-167, February 2005.
[22] Tia, T.S., "Utilizing Slack Time for Aperiodic and Sporadic Requests Scheduling in Real-Time Systems", Ph.D. thesis, University of Illinois at Urbana-Champaign, Department of Computer Science, 1995.
[23] Weiser, M., Welch, B., Demers, A.J., and Shenker, S., “Scheduling for Reduced CPU Energy”, Operating SystemsDesign and Implementation, 1994.
A Appendix
In this section, we provide algorithms used to compute device access delay and device dependent system slack.
To reduce computational overhead, the original priority of all jobs are computed offline. We assume that there
are N jobs in the hyperperiod; and all jobs in a hyperperiod are ordered by their priorities, Org Prio(J1) >
Org Prio(J2) > Org Prio(J3) > . . . > Org Prio(JN ). Without loss of generality, suppose that the current
job of each task Ti at time t, is given by Jci whose absolute deadline is denoted by D(Jci). Job Jc1 and Jcn have
the earliest deadline D(Jc1) and the latest deadline D(Jcn) among all current jobs respectively.
In Appendix A.1 and Appendix A.2, we show that updating device access delay and device dependent system
slack can be done by looking at all jobs with priorities between Org Prio(Jc1) and Org Prio(Jcn). Jc1 and
Jcn change at runtime. Thus the set of jobs included in computations is like a sliding window, which is called
the computation window hereafter. For example, suppose a system consists of two tasks: T1 = {10, 2, ∅, ∅} and
T2 = {24, 2, ∅, ∅}, then the computation window at time 0 is {J1,1, J1,2, J2,1}. When J1,1 is completed at time 2,
the computation window becomes {J1,2, J2,1}.
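The sliding computation window can be sketched with EDF priorities encoded as absolute deadlines; the names and data layout below are ours (a dict of uncompleted jobs and their deadlines, plus the current job of each task):

```python
def computation_window(deadlines, current_jobs):
    """Uncompleted jobs whose priority lies between the earliest- and
    latest-deadline current jobs (earlier deadline = higher priority).

    deadlines:    dict mapping each uncompleted job name to its deadline;
    current_jobs: names of the current job of each task.
    """
    earliest = min(deadlines[j] for j in current_jobs)
    latest = max(deadlines[j] for j in current_jobs)
    return {j for j, d in deadlines.items() if earliest <= d <= latest}
```

For the two-task example above, at time 0 the window contains J1,1 (deadline 10), J1,2 (deadline 20) and J2,1 (deadline 24); once J1,1 completes, it shrinks to {J1,2, J2,1}.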
A.1. Device access delay
As defined in Definition 4.4, the device access delay for a device λk is the time during which jobs requiring
λk cannot execute because of the execution of higher priority jobs that do not need λk. The computation needs
the knowledge of the actual job execution time, which is unknown a priori. Therefore, WCET is used in the
computation as an approximation.
The computation of the device access delay for all devices at time t can be done by computing the sched-
ule of all current jobs from time t with their WCETs. The device access delay for a device λk at time t, i.e.,
DevAccessDelay(λk, t), can be acquired by looking at the schedule. For example, if there is at least one uncom-
pleted job requiring λk that is released before or at time t, then DevAccessDelay(λk, t) = t′ − t, where t′ is the
first time instance after time t that any job requiring λk occupies the CPU in the schedule. However, it requires
significant overhead to compute the device access delay in this way. To reduce the computational overhead, we
adopt a simplified algorithm. In this algorithm, we only consider higher priority jobs that have been released and
cannot be blocked by any current jobs requiring λk. The pseudocode for the simplified algorithm to compute the
device access delay for a device λk is shown in Figure 10. An optimized algorithm with lower computational
complexity is shown in Figure 11.
Function DevAccessDelay(λk, t)
    Output: (1) the device access delay for λk, i.e., DevAccessDelay(λk, t); (2) the first device request job, i.e., Jλk,t.
    sum ← 0;  // record device access delay for device λk.
    // α denotes the set of jobs that require device λk.
    Jx ← null;  // The job with the highest priority among all jobs in α.
    Jy ← null;  // The job with the highest priority among all jobs not in α that can be blocked by some job in α.
    ∀Ji, Org Prio(Jcn) ≤ Org Prio(Ji) ≤ Org Prio(Jc1)  // looking at each job within the computation window
        If (λk ∈ Dev(Ji) and Org Prio(Ji) > Org Prio(Jx))
            Jx ← Ji;
        // σ(λk, t) denotes the highest preemption ceiling of resources being held by any job requiring λk.
        Else If (λk ∉ Dev(Ji) and Org Prio(Ji) > Org Prio(Jy) and Res(Ji) ≠ ∅ and PL(Ji) ≤ σ(λk, t) and et(Ji, [0, t]) = 0)
            Jy ← Ji;
        End
    End
    Jλk,t ← Ji, Ji ∈ {Jx, Jy} and Org Prio(Ji) = Max(Org Prio(Jx), Org Prio(Jy));
    ∀Ji, Org Prio(Jcn) ≤ Org Prio(Ji) ≤ Org Prio(Jc1)
        If (Org Prio(Ji) > Org Prio(Jλk,t) and R(Ji) ≤ t)
            sum ← sum + wcet(Ji) − et(Ji, [0, t]);
        End
    End
    DevAccessDelay(λk, t) ← sum;
End

Figure 10. The pseudocode for the simplified DevAccessDelay algorithm. The algorithm is reformatted for readability. The optimized algorithm to reduce computational overhead is shown in Figure 11.
The DevAccessDelay algorithm is as follows. Suppose α is the set of jobs that require λk; Jx is the job with
the highest priority in α; and Jy is the highest priority job of all jobs that can be possibly blocked by some job(s) in
α. Then the device access delay for device λk consists of the remaining WCET of any job Ji, Org Prio(Jcn) ≤
Org Prio(Ji) ≤ Org Prio(Jc1), satisfying following two conditions: (1) Org Prio(Ji) > Org Prio(Jx) and
Org Prio(Ji) > Org Prio(Jy); and (2) the release time of Ji is equal to or less than the current time t, which is
to make sure that Ji can occupy the CPU before any job in α.
In the DevAccessDelay algorithm, the computation is done by looking at all jobs that have higher priorities than
Jx and Jy. Intuitively, any job that has a higher priority than both Jx and Jy can delay the execution of any job
requiring device λk once it is released. In the remainder of this paper, first device request job is used to represent
the job with the higher priority of Jx and Jy. The first device request job is defined as follows.
Definition A.1. First device request job. The first device request job of a device λk, at time t, is denoted Jλk,t and
computed by
Jλk,t = Jx if Org Prio(Jx) ≥ Org Prio(Jy); Jy if Org Prio(Jx) < Org Prio(Jy),    (8)
where Jx is the job with the highest priority of all current jobs requiring device λk; and Jy is the job with the
 1  Function DevAccessDelay()
 2      // Update the device access delay and the first device request job for every device λk, 1 ≤ k ≤ K.
 3      sum ← 0;  // record device access delay.
 4      ToCompute ← (∼0);  // ToCompute is a bit-array used to indicate if the computation for devices is uncompleted.
 5      HoldingResJob ← Head(HoldingResJobList);  // HoldingResJobList is a list of jobs that are holding resources.
 6      ∀Ji, Ji ← Jc1 : Jcn  // Browse the computation window from the highest priority job to the lowest priority job.
 7          D ← ToCompute & DevBits(Ji);  // D is the set of devices that need to be computed and are required by Ji.
 8          ∀λk, λk ∈ D
 9              Jλk,t ← Ji;  // Jλk,t is the first device request job for λk.
10              DevAccessDelay(λk, t) ← sum;
11              ToCompute ← ToCompute & (∼(1 << (k − 1)));
12          End
13          If (Res(Ji) ≠ ∅ and et(Ji, [0, t]) = 0)
14              While (PL(Ji) ≤ σ(HoldingResJob))  // σ(Jx) is the highest preemption ceiling of resources being held by Jx.
15                  D ← ToCompute & DevBits(HoldingResJob);
16                  ∀λk, λk ∈ D
17                      Jλk,t ← Ji;
18                      DevAccessDelay(λk, t) ← sum;
19                      ToCompute ← ToCompute & (∼(1 << (k − 1)));
20                  End
21                  HoldingResJob ← Next(HoldingResJob);
22              End
23          End
24          If (ToCompute > 0 and R(Ji) ≤ t)
25              sum ← sum + wcet(Ji) − et(Ji, [0, t]);
26          Else If (ToCompute = 0)
27              break;  // break the loop
28          End
29      End

Figure 11. The pseudocode for the optimized DevAccessDelay algorithm.
highest priority of all jobs that (1) do not start execution and do not require device λk; (2) require shared resources;
and (3) have equivalent or lower preemption levels than the preemption ceiling of some resources being held by
any job requiring device λk.
The DevAccessDelay algorithm. Figure 10 shows the simplified algorithm for computing the device access delay of one device at time t. In practice, the device access delay of all devices is computed within the same computation window, so the computational complexity can be reduced by combining common computations.
Figure 11 shows the algorithm to update the device access delay for all devices at time t. In this algorithm, three
data structures are used to facilitate the computation:
1. ToCompute. ToCompute is a bit-array marking the devices whose device access delay still needs to be computed. For example, suppose the total number of devices is 8. The initial value of ToCompute is 11111111 (line 4), indicating that the device access delay of every device needs to be computed.
2. DevBits. DevBits is a bit-array representing the devices required by each job. For example, if job J1 requires devices λ1, λ3 and λ4, then DevBits(J1) is 00001101.
3. HoldingResJobList. HoldingResJobList is a list of jobs that are holding resources at time t. HoldingResJob is initialized to the first job in HoldingResJobList (line 5), and σ(HoldingResJob) denotes the highest preemption level ceiling of the resources being held by HoldingResJob (line 14). Suppose the jobs in HoldingResJobList are Jb1, Jb2, . . ., Jbm, with σ(Jb1) > σ(Jb2) > . . . > σ(Jbm); then a job Ji that requires some resources can start its execution only when its preemption level is higher than σ(Jb1). If a resource ri is allocated to Ji at time t, then σ(Ji) > σ(Jb1) if Ji ≠ Jb1, so Ji is placed at the head of HoldingResJobList at time t. With SRP, HoldingResJobList therefore works like a stack: a new job always joins at the head of the list, and the job that releases a resource must also be the first job in the list. Jobs thus join and leave the list in a FILO (First In Last Out) manner. For a system of n tasks, the length of the list is at most n, and the computational complexity of maintaining HoldingResJobList at runtime is O(1).
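The bit-array manipulations above (lines 4, 7 and 11 of Figure 11) can be sketched in Python. The fixed-width initializer `(1 << K) - 1` is this sketch's substitute for the figure's `(∼0)`, and the example device sets are illustrative assumptions.

```python
K = 8  # total number of devices (example value)

def dev_bits(device_indices):
    # Build a DevBits bit-array: bit (k - 1) is set when device
    # lambda_k is required (devices are 1-indexed, as in the paper).
    bits = 0
    for k in device_indices:
        bits |= 1 << (k - 1)
    return bits

# Line 4: initially every device's access delay still needs computing.
to_compute = (1 << K) - 1              # 0b11111111 for K = 8

# Line 7: devices required by J1 that are still uncomputed.
d = to_compute & dev_bits([1, 3, 4])   # J1 requires lambda_1, lambda_3, lambda_4

# Line 11: mark lambda_3 as computed by clearing its bit.
to_compute &= ~(1 << (3 - 1))
```

With these operations, completing the computation for one device is a single mask update, which is what keeps the per-device bookkeeping constant time.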
It can be seen that the computation examines each job Ji in the computation window (line 6) from the highest priority job to the lowest priority job. For any device λk that is required by Ji and whose computation is not yet completed (line 7), Ji is the first device request job of λk (lines 8-12), according to Definition A.1. If Ji can be blocked by some jobs requiring devices (lines 13-14), and the computations for those devices are not completed, then Ji is the first device request job of those devices (lines 16-20), according to Definition A.1. HoldingResJob is then advanced to the next job in HoldingResJobList (line 21). sum records the cumulative unused worst-case execution time of all released jobs (line 25). Once the computations for all devices are done (line 26), the DevAccessDelay algorithm terminates (line 27). Note that only a lower priority job can block a higher priority job. However, the DevAccessDelay algorithm does not compare job priorities (line 14), because if HoldingResJob has a higher priority than Ji, then the device set D (line 15) is ∅ at this point.
The worst-case computational complexity of this algorithm is O(m + n + K), where m is the number of jobs in the computation window, n is the total number of tasks in the system, and K is the total number of devices. Lines 7, 13 and 23-28 execute at most m times; lines 8-12 and lines 16-20 execute at most K times, since each execution completes the computation for at least one device; lines 14, 15, 21 and 22 execute at most n times, since the maximum length of HoldingResJobList is n.
A.2. Device dependent system slack
Before presenting the algorithm to compute the device dependent system slack, we first introduce several con-
cepts used in the computation.
Definition A.2. Initial job slack. The initial slack of a job Jk, at time t = 0, is denoted JobSlack(Jk, 0) and computed by subtracting, from the total time available to execute Jk, the total time required to execute Jk and all higher priority periodic requests, together with the maximum blocking duration for Jk. That is, the slack of a job Jk at t = 0 is given by

JobSlack(Jk, 0) = D(Jk) − Σ_{D(Ji) ≤ D(Jk)} wcet(Ji) − B(Jk)    (9)

where D(Jk) is the absolute deadline of Jk and B(Jk) is the maximal blocking time for Jk, as discussed in Section 4.1.
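Equation (9) translates directly into code. The sketch below uses its own `Job` fields and illustrative numbers (deadlines 10, 16 and 24, as in Figure 12; the WCET and blocking values are assumptions for the example only).

```python
from dataclasses import dataclass

@dataclass
class Job:
    deadline: float  # absolute deadline D(Jk)
    wcet: float      # worst-case execution time
    blocking: float  # maximal blocking time B(Jk)

def initial_job_slack(job, all_jobs):
    # Equation (9): time up to the deadline, minus the demand of all
    # jobs with equal or earlier deadlines, minus worst-case blocking.
    demand = sum(j.wcet for j in all_jobs if j.deadline <= job.deadline)
    return job.deadline - demand - job.blocking
```

Because the sum ranges over all jobs with equal or earlier deadlines, a job's own WCET is always included in the demand term.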
Definition A.3. Job slack. The slack of a job Jk at time t, ∀t > 0, is denoted JobSlack(Jk, t). JobSlack(Jk, t) decreases as it is consumed by CPU idling and by the execution of lower priority jobs, and increases as jobs complete sooner than their WCETs. That is, the job slack of a job Jk at time t is given by

JobSlack(Jk, t) = JobSlack(Jk, 0) − Idle(0, t) − Σ_{D(Ji) > D(Jk), R(Ji) < t} et(Ji, [0, t]) + Σ_{D(Ji) ≤ D(Jk)} Urem(Ji)    (10)

where R(Ji) is the release time of job Ji; Idle(0, t) is the amount of time the CPU has been idle until t; and Σ_{D(Ji) > D(Jk), R(Ji) < t} et(Ji, [0, t]) is the amount of time that jobs with deadlines greater than D(Jk) have executed until t (such jobs must have been released before t). Thus Idle(0, t) + Σ_{D(Ji) > D(Jk), R(Ji) < t} et(Ji, [0, t]) is the total amount of slack consumed until t. Σ_{D(Ji) ≤ D(Jk)} Urem(Ji) is the amount of unused WCET of completed jobs with deadlines equal to or less than D(Jk), which is reclaimed as job slack for Jk.
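Equation (10) can likewise be sketched directly. All field names and the sample values below are this sketch's assumptions; `done` marks jobs completed before t, so that their unused WCET can be reclaimed.

```python
from dataclasses import dataclass

@dataclass
class Job:
    release: float   # R(Ji)
    deadline: float  # D(Ji)
    wcet: float      # worst-case execution time
    executed: float  # et(Ji, [0, t])
    done: bool       # completed before t

def job_slack(job, all_jobs, initial_slack, idle_time, t):
    # Equation (10): start from the initial slack (Equation (9)),
    # subtract CPU idling and the execution of later-deadline jobs,
    # and reclaim the unused WCET of completed earlier-deadline jobs.
    consumed = sum(j.executed for j in all_jobs
                   if j.deadline > job.deadline and j.release < t)
    reclaimed = sum(j.wcet - j.executed for j in all_jobs
                    if j.done and j.deadline <= job.deadline)
    return initial_slack - idle_time - consumed + reclaimed
```

In the example below, one unit of CPU idling is exactly offset by one unit of WCET reclaimed from a job that finished early, so the slack is unchanged from its initial value.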
Intuitively, the job slack of a job Ji at time t is the maximum amount of time that the CPU can be idle at time t without causing Ji itself to miss its deadline. Suppose β = {Jj | ∀Jj, Org Prio(Jj) ≤ Org Prio(Ji)}; then minSlack = min(JobSlack(Jj, t)), ∀Jj ∈ β, is the maximum amount of time that the CPU can be idle at time t without causing any job with a priority equivalent to or lower than Org Prio(Ji) to miss its deadline. If the idle time is inserted only before the execution of jobs in β, without delaying the execution of higher priority jobs, then minSlack becomes the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline. As shown in Figure 5(a), JobSlack(J2,1, 0) = JobSlack(J1,3, 0) = 14. If an idle interval of 14 time units is inserted at time 0, i.e., the CPU is idle during [0, 14], then J1,1 will miss its
Figure 12. λ ∈ Dev(J3); tsd(λ) = twu(λ) = 4; D(J1) = 10; D(J2) = 16 and D(J3) = 24. At time 4, JobSlack(J2, 4) = 4 and JobSlack(J3, 4) = 8. The shaded regions represent shared resource access. In this example, if the CPU is idle for 8 time units before the execution of J3, then J2 misses its deadline because it is blocked by J3. Therefore, Jλ,4 = J2 and DevDepSysSlack(λ, 4) = JobSlack(J2, 4) = 4.
deadline. However, if J1,1 "preempts" the idle interval as shown in Figure 5(a), then every job meets its deadline, and the total idle time before the execution of J2,1 and J1,3 is still 14 time units.

However, the above discussion holds only in the absence of resource blocking. If a job Jj ∈ β is holding a shared resource and thus may block a higher priority job Jy /∈ β, then an idle interval inserted before the execution of Jj can possibly delay the execution of Jy and of all jobs with priority lower than Jy. In this case, Min(JobSlack(Jj, t)), ∀Jj, Org Prio(Jj) ≤ Org Prio(Jy), is the maximum amount of time that the CPU can be idle before the execution of any job in β without causing any job to miss its deadline. As shown in Figure 12, J3 can block J2, so the execution of J2 depends on the execution of J3. As a result, the maximum amount of time that the CPU can be idle before the execution of J3 is the job slack of J2. Note that EASD schedules jobs according to EDF and SRP. If there is a released higher priority job in the system, a lower priority job can execute only if it is the current ceiling task and blocks all released higher priority jobs. In other words, the EASD algorithm does not allow low priority jobs to utilize the idle intervals caused by device transitions for high priority jobs. The reason is that reordering job executions may cause unexpected blocking and thus jeopardize temporal correctness; for example, a job could then be blocked more than once.
Recall that the device dependent system slack is defined to be the maximum amount of time that the CPU can
be idle before the execution of any jobs requiring device λk without causing any jobs to miss their deadlines. It is
now clear that the device dependent system slack is the maximum amount of idle time that can be inserted before
1  Function DevDepSysSlack(λk, t)
2      Output: The device dependent system slack for λk at time t.
3      MinSlack ← +∞; // To record the minimal dynamic job slack.
4      ∀Jx, Org Prio(Jcn) ≤ Org Prio(Jx) ≤ Org Prio(Jc1)
5          JobSlack(Jx, t) ← JobSlack(Jx, 0) − Idle(0, t) − Σ_{D(Ji) > D(Jx), R(Ji) < t} et(Ji, [0, t]) + Σ_{D(Ji) ≤ D(Jx)} Urem(Ji);
6          If (Org Prio(Jx) ≤ Org Prio(Jλk,t) and JobSlack(Jx, t) < MinSlack)
7              MinSlack ← JobSlack(Jx, t);
8          End
9      End
10     DevDepSysSlack(λk, t) ← Min(MinSlack, MinInitSlack(Jcn) + Σ_{D(Ji) ≤ D(Jcn)} Urem(Ji) − Idle(0, t));
Figure 13. The pseudocode for the simplified DevDepSysSlack algorithm. An algorithm with lowercomputational complexity is presented in Figure 15.
the execution of Jλk,t and all lower priority jobs, which can be given by,
DevDepSysSlack(λk, t) = min(JobSlack(Jx, t)) ∀Jx, Org Prio(Jx) ≤ Org Prio(Jλk,t) (11)
To compute DevDepSysSlack(λk, t) for each device, we need the job slack of all uncompleted jobs. The complexity of this computation would be O(N), where N is the number of jobs in a hyperperiod. To reduce computational overhead, the initial slack of all jobs in a hyperperiod is computed offline and kept in a job slack list ordered by deadlines. Let MinInitSlack(Jk) denote the minimal initial job slack over all jobs with original priorities equivalent to or lower than Org Prio(Jk):

MinInitSlack(Jk) = min(JobSlack(Jx, 0)), ∀Jx, Org Prio(Jx) ≤ Org Prio(Jk)

Then the minimum job slack of Jcn and of all jobs with lower priorities than Org Prio(Jcn) is MinInitSlack(Jcn) + Σ Urem(Ji) − Idle(0, t), where Ji is any job completed at or before time t. Note that Jcn has a deadline no less than that of any completed job. In this way, the device dependent system slack can be computed by looking only at the jobs in the computation window. The simplified algorithm for computing the device dependent system slack is presented in Figure 13. A detailed algorithm is presented next.
The DevDepSysSlack algorithm. We are now ready to describe the DevDepSysSlack algorithm. As discussed before, our method involves an off-line phase, in which the initial slack of all jobs in a hyperperiod is computed and kept in a job slack list ordered by priorities. Each entry of the job slack list contains a job's ID, say Jk; the corresponding initial job slack, JobSlack(Jk, 0); and MinInitSlack(Jk). An example of the job slack list of a task set is shown in Figure 14.
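The offline construction of MinInitSlack values is a single backward pass over the priority-ordered slack list. The sketch below is an illustration under assumed inputs (plain initial-slack values already ordered from highest to lowest priority), not the paper's data structure.

```python
def build_job_slack_list(slacks_by_priority):
    # slacks_by_priority: JobSlack(Jk, 0) values ordered from highest
    # to lowest priority. Returns (slack, min_init_slack) pairs, where
    # min_init_slack is the minimum over this job and every job of
    # equal or lower priority (a suffix minimum).
    out = []
    running_min = float('inf')
    for s in reversed(slacks_by_priority):
        running_min = min(running_min, s)
        out.append((s, running_min))
    out.reverse()
    return out
```

Because the suffix minimum is precomputed, the online phase can read MinInitSlack(Jcn) in constant time instead of scanning all lower priority jobs.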
Figure 14. An example of a job slack list. Jobs are ordered by priorities. T1 = {10, 3, Dev(T1), Res(T1)}; T2 = {15, 4, Dev(T2), Res(T2)}; T3 = {20, 5, Dev(T3), Res(T3)}.
The DevDepSysSlack algorithm consists of two parts: (1) updating the job slack of all jobs in the computation window; and (2) updating the device dependent system slack of all devices. The algorithm is invoked at time instants when the currently executing job completes (line 7) or is preempted (line 29). After the job slack of all jobs in the computation window has been updated, the device dependent system slack of every device can be obtained (line 39), according to Equation (11).

The computation of job slack examines each job in the computation window from the lowest priority job to the highest priority job (lines 16 and 30). Thus the computational complexity of updating the job slack of all jobs in the computation window is O(m), where m is the number of jobs in the computation window. The computational complexity of updating the device dependent system slack of all devices is O(K), where K is the total number of devices. Therefore, the computational complexity of the DevDepSysSlack algorithm is O(m + K).
1  Function DevDepSysSlack()
2  Initialize at time t = 0:
3      c ← 0; // c records the cumulative unused WCET of all completed jobs.
4      t′ ← 0; // t′ is the last instant at which the algorithm was invoked.
5      MinSlack ← +∞; // MinSlack is the minimum job slack of all jobs examined so far.
6  Update dynamic job slack at time t:
7  If (t: instant when job Ji,j is completed)
8      c ← c + wcet(Ji,j) − et(Ji,j, [0, t]);
9      If (Ji,j = Jc1) // Need to update Jc1.
10         Jc1 ← the second job in the computation window;
11     End
12     If (Org Prio(Ji,j+1) < Org Prio(Jcn)) // Need to update Jcn.
13         J′cn ← Jcn; // J′cn records the old Jcn.
14         Jcn ← Ji,j+1;
15     End
16     ∀Jx, Jx ← Jcn : Jc1 // Browse the computation window from the lowest priority job to the highest priority job.
17         If (D(Jx) < D(Ji,j)) // The execution time of Ji,j is not included in the initial job slack of Jx.
18             JobSlack(Jx, t) ← JobSlack(Jx, t) − et(Ji,j, [t′, t]);
19         Else If (D(Ji,j) ≤ D(Jx) ≤ D(J′cn))
20             JobSlack(Jx, t) ← JobSlack(Jx, t) + wcet(Ji,j) − et(Ji,j, [0, t]); // Reclaim the unused WCET of Ji,j.
21         Else // D(J′cn) < D(Jx)
22             JobSlack(Jx, t) ← JobSlack(Jx, t) + c − Idle(0, t); // The dynamic job slack of Jx at time t.
23         End
24         MinSlack ← Min(MinSlack, JobSlack(Jx, t), MinInitSlack(Jcn) + c − Idle(0, t));
25         Jx.minSlack ← MinSlack; // Jx.minSlack is the minimum job slack of all jobs with equivalent or lower priorities.
26     End
27     Remove Ji,j from the job slack list; t′ ← t;
28  End
29  If (t: instant when job Ji,j is preempted) // Ji,j can be the idle job; D(idle job) = +∞, et(idle job, [t′, t]) = Idle(t′, t).
30     ∀Jx, Jx ← Jcn : Jc1 // Browse the computation window from the lowest priority job to the highest priority job.
31         If (D(Jx) < D(Ji,j))
32             JobSlack(Jx, t) ← JobSlack(Jx, t) − et(Ji,j, [t′, t]);
33             MinSlack ← Min(MinSlack, JobSlack(Jx, t), MinInitSlack(Jcn) + c − Idle(0, t));
34             Jx.minSlack ← MinSlack;
35         End
36     End
37     t′ ← t;
38  End
39  Update device dependent system slack:
40  ∀λk, DevDepSysSlack(λk, t) ← Jλk,t.minSlack;
Figure 15. The pseudocode for the DevDepSysSlack algorithm.
B Appendix
This section shows that Theorem 4.1 holds for EASD. With the EASD algorithm, a device is switched to the idle state when its device slack is larger than its break-even time. Therefore, there may be intervals in which the CPU is idle while pending jobs wait for their required devices to be switched to the active state. As discussed before, these idle intervals are device dependent system slack; an example is shown in Figure 5(a).
We first consider the relationship between the device slack at time t and the device slack at time t + 1. The device slack at time t means the device slack at the time instant t, while the time unit t means the duration [t, t + 1). Suppose device λinact is not active during time unit t (including the case that λinact begins its state transition from the active state to the idle state at time t). With the following lemmas, we show that the device slack of this device at time t + 1 is at most 1 time unit less than the device slack at time t. Let α denote the set of all uncompleted jobs that require device λinact, and let Jexec be the job that occupies the CPU during time unit t. Therefore, Jexec /∈ α.
Lemma B.1. The first device request job for device λinact at time t and time t + 1 are the same job. That is,
Jλinact,t = Jλinact,t+1
Proof: Suppose that Jx is the job with the highest priority in α; and Jy is the highest priority job of all jobs that
do not require λinact and can be possibly blocked by some job(s) in α at time t. According to Definition A.1,
Jλinact,t is either Jx or Jy.
Firstly, Jexec ≠ Jx because Jx ∈ α and Jexec /∈ α. Thus Jx is still the highest priority job in α at time t + 1. Secondly, Jexec ≠ Jy because Jexec is not blocked by any job. Since Jexec /∈ α, no new resources are acquired by jobs in α, and no resources held by jobs in α are released. Therefore, Jy is still the highest priority job of all jobs that do not require λinact and can possibly be blocked by some job(s) in α at time t + 1. Therefore, Jλinact,t = Jλinact,t+1.
Lemma B.2. The device access delay and the device dependent system slack for device λinact cannot decrease at
the same time. That is, DevAccessDelay(λinact, t+1) ≥ DevAccessDelay(λinact, t) or DevDepSysSlack(λinact, t+
1) ≥ DevDepSysSlack(λinact, t).
Proof: Suppose that DevAccessDelay(λinact, t + 1) < DevAccessDelay(λinact, t). That means Jexec has a higher priority than all jobs in α, and the WCET of Jexec is included in DevAccessDelay(λinact, t). It follows that D(Jexec) ≤ D(Jλinact,t). Thus the execution time of Jexec has already been subtracted from the dynamic job slack of Jλinact,t and of all lower priority jobs, which therefore cannot decrease in this case. With Lemma B.1, we have Jλinact,t+1 = Jλinact,t. Therefore, the device dependent system slack of device λinact does not decrease when the device access delay decreases.
Lemma B.3. The device dependent system slack decreases at most 1 per time unit. That is, DevDepSysSlack(λinact, t+
1) ≥ DevDepSysSlack(λinact, t)− 1.
Proof: From Lemma B.1, we know that Jλinact,t+1 = Jλinact,t. The device dependent system slack at time t+1 is
the minimum dynamic job slack of all jobs with priorities equivalent or lower than Org Prio(Jλinact,t+1). The dy-
namic job slack of any job can decrease at most one during a time unit. It follows that DevDepSysSlack(λinact, t+
1) ≥ DevDepSysSlack(λinact, t)− 1.
Lemma B.4. DevAccessDelay(λinact, t+1)+DevDepSysSlack(λinact, t+1) ≥ DevAccessDelay(λinact, t)+
DevDepSysSlack(λinact, t)− 1.
Proof: We show this in the following two cases: (1) the WCET of Jexec is not included in DevAccessDelay(λinact, t);
and (2) the WCET of Jexec is included in DevAccessDelay(λinact, t).
Case 1: The WCET of Jexec is not included in DevAccessDelay(λinact, t). In this case, DevAccessDelay(λinact, t+
1) ≥ DevAccessDelay(λinact, t). Also DevDepSysSlack(λinact, t+1) ≥ DevDepSysSlack(λinact, t)−
1, according to Lemma B.3. Therefore, DevAccessDelay(λinact, t + 1) +DevDepSysSlack(λinact, t +
1) ≥ DevAccessDelay(λinact, t) +DevDepSysSlack(λinact, t)− 1.
Case 2: The WCET of Jexec is included in DevAccessDelay(λinact, t). In this case, if Jexec is not completed
at time t + 1, then DevAccessDelay(λinact, t + 1) ≥ DevAccessDelay(λinact, t) − 1. According to
Lemma B.2, the device dependent system slack cannot decrease at the same time as the device access delay.
Therefore, Lemma B.4 is true in this case. On the other hand, if Jexec is completed at time t + 1, then
DevAccessDelay(λinact, t+1) ≥ DevAccessDelay(λinact, t)− (wcet(Jexec)−et(Jexec, [0, t+1]))−1.
The unused WCET of Jexec, i.e., wcet(Jexec) − et(Jexec, [0, t + 1]), becomes additional job slack for Jλinact,t
and all lower priority jobs. Therefore, DevDepSysSlack(λinact, t+ 1) ≥ DevDepSysSlack(λinact, t) +
(wcet(Jexec)− et(Jexec, [0, t+ 1])). It follows that Lemma B.4 is true in this case, too.
Lemma B.5. The time to next device request at time t+ 1 is not less than the time to next device request at time t
minus 1. That is, TimeToNextDevReq(λinact, t+ 1) ≥ TimeToNextDevReq(λinact, t)− 1.
Proof: By Definition 4.2, TimeToNextDevReq(λinact, t) = NextDevReqTime(λinact, t) − t. Since the release time of any job is fixed, we have NextDevReqTime(λinact, t + 1) ≥ NextDevReqTime(λinact, t). Therefore, TimeToNextDevReq(λinact, t + 1) ≥ TimeToNextDevReq(λinact, t) − 1.
Lemma B.6. The device slack of device λinact at time t+ 1 is not less than the device slack of the device at time t
minus 1. That is, DevSlack(λinact, t+ 1) ≥ DevSlack(λinact, t)− 1.
Proof: Recall that the device slack for device λinact at time t is the larger of TimeToNextDevReq(λinact, t)
and DevAccessDelay(λinact, t)+DevDepSysSlack(λinact, t). The correctness of Lemma B.6 directly follows
from Lemma B.4 and Lemma B.5.
We have now finished the proof that the device slack for an inactive device decreases at most 1 per time unit.
Next, we provide another lemma before the proof of Theorem 4.1.
Lemma B.7. The dynamic job slack of jobs with equivalent deadlines is the same at any time t. That is, JobSlack(Ji, t) = JobSlack(Jj, t), ∀Ji, Jj, D(Ji) = D(Jj).

Proof: According to Definition A.2, JobSlack(Ji, 0) = JobSlack(Jj, 0), ∀Ji, Jj, D(Ji) = D(Jj); and the update of the dynamic job slack depends only on a job's deadline, as shown in Equation (10). That is, the same amount of slack is added to or subtracted from all jobs with equivalent deadlines at any time t. Therefore, JobSlack(Ji, t) = JobSlack(Jj, t), ∀Ji, Jj, D(Ji) = D(Jj).
Proof of Theorem 4.1: Assume Equation (5) holds but a job misses its deadline when scheduled with EASD. Let Jk be the first job that misses its deadline D(Jk), and let t0 be the last time before D(Jk) such that there are no pending jobs with release times before t0 and deadlines before or at D(Jk). Since no job can be released before the system start time, t0 is well defined. Let ρ be the set of jobs that are released in [t0, D(Jk)] and have deadlines in [t0, D(Jk)]. By the choice of t0 and D(Jk), the jobs that execute in [t0, D(Jk)] are jobs in ρ and possibly a job that blocks a job in ρ. Since devices incur transition delays, there may be idle periods in [t0, D(Jk)].
First of all, there can be at most one job Jb /∈ ρ that blocks jobs in ρ, and this blocking job Jb must be released before t0 and have a deadline larger than D(Jk). This conclusion follows directly from SRP; the proof can be found in [2].
Next, we proceed with our proof in two cases: (1) there are idle intervals during [t0, D(Jk)]; and (2) there is no
idle interval during [t0, D(Jk)].
Case 1: There are some idle intervals during [t0, D(Jk)]. By the choice of t0 and D(Jk), these idle intervals can only arise while jobs are waiting for required devices to become active. An example is shown in Figure 5. If several devices are required at the same time, we consider the device that takes the longest time to perform its state transition.
Suppose an idle interval [t, t′] is caused by the state transition delay of a device λk. It is obvious that
TimeToNextDevReq(λk, t) ≤ 0 and DevAccessDelay(λk, t) = 0. Thus DevSlack(λk, t) is equal to
DevDepSysSlack(λk, t). With the EASD algorithm, t′−t ≤ DevSlack(λk, t). Moreover, DevSlack(λk, t′)
is no less than DevSlack(λk, t) − (t′ − t) according to Lemma B.6. Therefore DevSlack(λk, t′) =
DevDepSysSlack(λk, t′) ≥ 0.
Assume that [t, t′] is the last idle interval during [t0, D(Jk)]. It follows that DevDepSysSlack(λk, t′) ≥ 0 for at least one device λk that is required by some job Jx at some time in [t, t′]. We now show that Jλk,t′ ∈ ρ, in two cases: (i) Jx ∈ ρ; and (ii) Jx /∈ ρ.
Case i: Jx ∈ ρ. According to Definition A.1, we have Org Prio(Jλk,t′) ≥ Org Prio(Jx). It follows that
D(Jλk,t′) ≤ D(Jx) ≤ D(Jk). Therefore, Jλk,t′ ∈ ρ.
Case ii: Jx /∈ ρ. As discussed before, the only job that is not in ρ but can execute during [t0, D(Jk)]
is the blocking job Jb. Thus Jx = Jb; and at least one job in ρ, say Jy, must be blocked by Jx.
According to Definition A.1, Org Prio(Jλk,t′) ≥ Org Prio(Jy). It follows that Jλk,t′ ∈ ρ, since
D(Jλk,t′) ≤ D(Jy) ≤ D(Jk).
Recall that DevDepSysSlack(λk, t′) = Min(JobSlack(Ji, t′)), where Ji is any job with an equivalent or lower priority than Org Prio(Jλk,t′). Since DevDepSysSlack(λk, t′) ≥ 0, we know that the job slack of Jλk,t′ and of all lower priority jobs is at least 0. Next we show that JobSlack(Jk, t′) ≥ 0, where Jk is the first job that misses its deadline. We discuss the following two cases: (i) D(Jλk,t′) < D(Jk); and (ii) D(Jλk,t′) = D(Jk).
Case i: D(Jλk,t′) < D(Jk). In this case, JobSlack(Jk, t′) ≥ 0 because Org Prio(Jλk,t′) > Org Prio(Jk).

Case ii: D(Jλk,t′) = D(Jk). According to Lemma B.7, JobSlack(Jk, t′) = JobSlack(Jλk,t′, t′) ≥ 0.

According to Definition A.3, JobSlack(Jk, D(Jk)) is no less than JobSlack(Jk, t′) − B(Jk), since there are no idle intervals after time t′. Therefore, at time D(Jk),

JobSlack(Jk, D(Jk)) ≥ JobSlack(Jk, t′) − B(Jk) ≥ −B(Jk)
⇒ JobSlack(Jk, D(Jk)) + B(Jk) ≥ 0
⇒ D(Jk) ≥ Σ_{D(Ji) ≤ D(Jk)} wcet(Ji) + Idle(0, D(Jk)) + Σ_{D(Ji) > D(Jk), R(Ji) < t} et(Ji, [0, D(Jk)]) − Σ_{D(Ji) ≤ D(Jk)} Urem(Ji)

This contradicts the assumption that Jk misses its deadline at D(Jk).
Case 2: There is no idle period during [t0, D(Jk)]. In this case, the proof is the same as that presented in [2], and a contradiction can be derived; please refer to [2].

In conclusion, each case contradicts the assumption that Equation (5) holds yet a job misses a deadline. Therefore, Theorem 4.1 holds for EASD.
C Appendix
This section presents a sufficient schedulability condition for the ASD algorithm. Suppose n periodic tasks are sorted by their periods, P(T1) ≤ P(T2) ≤ . . . ≤ P(Tn); and λ̄i is the device with the longest combined state transition delay, i.e., tsw(λ̄i) = tsd(λ̄i) + twu(λ̄i), among all devices required by Ti.
With EDF and SRP, each job suffers two context switches: one when the job starts its execution and another when it finishes. Note that the context switch when a job is preempted is attributed to the preempting job; otherwise, more than two context switches could be attributed to a single job. The context switch cost cannot be ignored under ASD, because of the state transition delays of devices, so context switch costs need to be included in each task's WCET.
We first consider the worst context switch cost for each job starting its execution. For any job Ji,j that is selected
to be executed at time t, the worst case is that λ̄i just starts switching to the idle state at time t− 1 and thus needs
to be switched back to the active state. This procedure takes at most tsw(λ̄i)−1 time units, which is the worst case
context switch overhead for a job starting its execution, as shown in Figure 16(a).
We next consider the worst context switch cost for a job finishing its execution. Suppose that Ji,j starts its
execution at time t and finishes its execution at time t′. If Ji,j does not preempt any job using devices at time t,
then the context switch cost for finishing execution is 0; otherwise, devices required by the preempted job, say
Jm,n, should be switched back to the active state, which takes at most tsw(λ̄m) − 1 time units. This time should
be included in the context switch cost for Ji,j finishing its execution, as shown in Figure 16(b).
With EDF, a task can only preempt tasks of longer periods. Suppose λ̂i is the device with the longest transition
time among all devices required by tasks that have longer periods than Ti. Then a sufficient schedulability condition
for the ASD scheduling algorithm is described in Theorem C.1.
Theorem C.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by the ASD algorithm if

∀k, 1 ≤ k ≤ n,   Σ_{i=1}^{k} (wcet(Ti) + tsw(λ̄i) + tsw(λ̂i) − 2) / P(Ti) + B(Tk) / P(Tk) ≤ 1,

where B(Tk) is the maximal length of time for which a job of Tk can be blocked.
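The test of Theorem C.1 is a straightforward utilization check over each prefix of the period-sorted task set. The sketch below is a minimal illustration: the fields `tsw_bar` and `tsw_hat` stand for tsw(λ̄i) and tsw(λ̂i), and all numeric values in the example are assumptions, not from the paper.

```python
from dataclasses import dataclass

@dataclass
class Task:
    period: float
    wcet: float
    blocking: float  # B(Tk)
    tsw_bar: float   # tsw of the slowest device the task itself uses
    tsw_hat: float   # tsw of the slowest device of longer-period tasks

def asd_schedulable(tasks):
    # Sufficient test of Theorem C.1 (sketch): tasks must already be
    # sorted by period; device transition delays inflate each WCET.
    for k, tk in enumerate(tasks):
        util = sum((t.wcet + t.tsw_bar + t.tsw_hat - 2) / t.period
                   for t in tasks[:k + 1])
        if util + tk.blocking / tk.period > 1:
            return False
    return True
```

Because the condition is only sufficient, a task set that fails this test may still be schedulable in practice.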
In fact, this condition is the same as the condition for the EDF algorithm with SRP proposed in [2]. Since context switch costs are included in the WCETs, the proof of Theorem C.1 directly follows the proof presented in [2] and thus
(a) λ ∈ Dev(J2,1); twu(λ) = tsd(λ) = 4. At time 0, J1,1 starts its execution and λ begins its state transition to the idle state. J1,1 finishes its execution at time 1. The context switch cost for J2,1 starting its execution is 7.

(b) λ ∈ Dev(J1,1) ∩ Dev(J3,1); twu(λ) = tsd(λ) = 2. At time 2, J3,1 is preempted by J2,1, which is in turn preempted by J1,1. At time 15, when J2,1 finishes its execution, λ is performing its state transition to the idle state; it is switched back to the active state at time 16 because J3,1 requires λ. The context switch cost for J2,1 finishing its execution is 3.
Figure 16. Context switch costs in ASD: (a) the context switch cost of a job starting its execution; (b) the context switch cost of a job finishing its execution.
is omitted here. Note that tighter sufficient schedulability conditions may exist; however, addressing other scheduling conditions is beyond the scope of this work.
1  The SA-ASD Algorithm
2  // Jexec is the job that is selected to occupy the CPU at time t.
3  Schedule devices at time t:
4  ∀λk, λk /∈ Dev(Jexec) and λk = active
5      λk ← idle;
6      tenterIdle(λk) ← t; // The time that λk starts the state transition to idle.
7  End
8  ∀λk, λk ∈ Dev(Jexec) and λk = idle
9      If (t − tenterIdle(λk) ≥ BE(λk) − twu(λk)) // The device has been idle long enough.
10         λk ← active;
11     End
12 End
13 Schedule the job with EDF(SRP);
Figure 17. The pseudocode for the SA-ASD algorithm.
D Appendix
Similar to the CEA-EDF and EASD algorithms, the ASD algorithm can utilize the break-even time to avoid wasting energy on devices with high state transition energy costs. That is, an inactive device is switched to the active state only when the energy saved is large enough to compensate for the energy consumed in state transitions. This algorithm is called SA-ASD, and it is presented in Figure 17.
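The device-scheduling step of Figure 17 can be sketched as a small state update. This is an illustrative rendering, not the paper's implementation: the dictionary keys (`state`, `enter_idle`, `BE`, `twu`) are this sketch's names for the quantities in the figure.

```python
def schedule_devices(devices, required, t):
    # One SA-ASD device-scheduling step (Figure 17, sketch).
    # `devices` maps a device name to a dict with its power state,
    # the time it entered idle, its break-even time BE, and its
    # wake-up delay twu; `required` is Dev(Jexec).
    for name, dev in devices.items():
        if name not in required and dev['state'] == 'active':
            dev['state'] = 'idle'
            dev['enter_idle'] = t          # start transition to idle
        elif name in required and dev['state'] == 'idle':
            # Wake only when the device has idled long enough that the
            # saved energy covers the transition cost (lines 9-11).
            if t - dev['enter_idle'] >= dev['BE'] - dev['twu']:
                dev['state'] = 'active'
```

The break-even guard is what distinguishes SA-ASD from plain ASD: a device that has only just gone idle stays idle even if the newly selected job requires it.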
However, as with the ASD algorithm, SA-ASD may jeopardize temporal correctness for some systems. A sufficient schedulability condition for the SA-ASD algorithm can be obtained by replacing the state transition times in the condition of Theorem C.1 with the corresponding break-even times. Therefore, the sufficient schedulability condition for SA-ASD is given by the following theorem.
Theorem D.1. Suppose n periodic tasks are sorted by their periods. They are schedulable by the SA-ASD algorithm if

∀k, 1 ≤ k ≤ n,   Σ_{i=1}^{k} (wcet(Ti) + BE(λ̄i) + BE(λ̂i) − 2) / P(Ti) + B(Tk) / P(Tk) ≤ 1,

where B(Tk) is the maximal length of time for which a job of Tk can be blocked.
As with the ASD algorithm, the proof of Theorem D.1 directly follows the proof presented in [2] and thus is
omitted here.