Proceedings of
5th Real-Time Systems Seminar
Summer 2011
18 July 2011, Kaiserslautern, Germany
Edited by
Raphael Guerra
Gerhard Fohler
Table of Contents: Real-Time Systems Seminar Summer 2011
Real Time properties and applications of wireless sensor networks .............................................. 1
Anoop Bhagyanath, Raphael Guerra
Global view of methods for evaluating end-to-end delays on AFDX ............................................ 6
Canko Canew, Raphael Guerra
Optimal scheduling approaches for FlexRay bus ......................................................................... 10
Piotr Swedrowski, Raphael Guerra
Challenges faced by on-chip network implementation for real-time embedded applications ...... 14
Naga Rajesh Garikiparthi, Raphael Guerra
REAL TIME PROPERTIES AND APPLICATIONS OF WIRELESS SENSOR NETWORKS
Anoop Bhagyanath
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universität Kaiserslautern

Raphael Guerra
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universität Kaiserslautern
Abstract—Wireless Sensor Networks (WSN) consist of spatially distributed nodes that are capable of sensing, gathering, processing and communicating data. A vast majority of WSN applications are real time, i.e. they require a bounded-delay guarantee on packet delivery. We focus on the real time applications of and challenges in WSN. This paper points out the key challenges in WSN from a real time perspective and the real time techniques employed to meet these challenges. In doing so, we focus on two specific real time methods: 1) the Contention-Free Periodic Message Scheduler MAC protocol [3] and 2) RAP, a real time communication architecture for large scale WSN [4]. We also discuss the drawbacks of these techniques and conclude by suggesting future work to overcome these drawbacks.
I. INTRODUCTION
A Wireless Sensor Network (WSN) is built up of a collection of nodes that communicate wirelessly. Each node has processing capability, memory, a radio frequency (RF) transceiver, a power source and various sensors or actuators. Many applications can use such a network, and many of them require real time guarantees. Environmental monitoring, industrial monitoring, military, agriculture, surveillance and the medical field are only a few examples of the variety of possible real time applications of WSN. These applications are categorized as real time because they demand that data transmitted from the source is delivered to the destination within a specified deadline.
In this paper, we address real time challenges in WSN. The most obvious challenge from a real time perspective is to guarantee end to end delay for messages. This must be achieved under the constraints of minimum energy consumption and minimum resource requirements: the former ensures a longer life for the WSN and the latter ensures low cost. Employing WSN satisfactorily in real time applications requires the development of robust real time techniques. The unlimited potential of WSN in various applications, and the fact that the vast majority of these applications are real time, increases the significance of developing robust and reliable real time techniques to address these challenges. The use of WSN in the medical field to connect medical sensors wirelessly to handheld devices or PCs in order to monitor and treat patients is an example of a hard real time WSN system. The wireless medium is a lossy link with high unpredictability and unreliability, which makes the development of robust real time techniques for WSN very difficult. In addition, factors such as the minimization of the energy consumed and the resources used by the sensor nodes, which in turn minimize cost and energy, make this task even more challenging. Most existing protocols end up making many unrealistic assumptions, such as ignoring the possibility of wireless interference or assuming location awareness, which limits their applicability to the real world.
We present in this paper real time applications and challenges in WSN. We then focus on two specific real time techniques: 1) the Contention-Free Periodic Message Scheduler MAC protocol [3] and 2) RAP, a real time communication framework for WSN [4]. Reference [3] describes techniques to derive a contention-free message set from a given periodic message set; a message set is said to be contention-free if only one message is ready to be transmitted at a time. This contention-free periodic message scheduler has lower time and space complexities compared to an EDF scheduler. The latter reference [4] presents a complete real time communication framework for large scale WSN. The key component of this architecture is Velocity Monotonic Scheduling (VMS), which prioritizes messages depending on their deadlines and the distances they have to travel. RAP provides general APIs as a convenient high level service for general sensing applications, and the framework scales well in large scale wireless sensor networks because of the use of localized and efficient algorithms at every layer. Simulations using the RAP communication architecture also show that it considerably reduces end to end deadline misses in multi-hop scenarios. We also analyse and present the drawbacks of these techniques in terms of the impractical assumptions made.
In Section 2, we give an overview of real time applications of WSN. Real time challenges are listed in Section 3. Section 4 focuses on two real time methods: 1) a contention-free periodic message scheduler MAC protocol and 2) RAP, a real time communication architecture. Section 5 analyses the drawbacks of these real time techniques and proposes future work to overcome them. Finally, we conclude the paper by summarizing the contents discussed.
II. REAL TIME APPLICATIONS

In this section, we run through a list of possible real time applications of WSN and also list a few examples in use. Many possible real time applications such as environmental monitoring, industrial monitoring, surveillance, medicine, agriculture and structural monitoring can utilize wireless sensor networks. In environmental monitoring, the sensor nodes are deployed over a region where some phenomenon is to be monitored. When the sensors detect the event being monitored, the event is reported to a base station, which takes appropriate actions. WSN save the cost of wiring and, more importantly, enable access to previously inaccessible locations, such as rotating machinery in industrial monitoring or animals in habitat monitoring.
There are many areas where WSNs are currently used. A WSN is deployed to monitor eruptions at Volcán Tungurahua in central Ecuador: the sensors monitor the infrasonic signals during an eruption, and this data is transmitted to a base station 9 km away. Another example is CodeBlue, a design to support medical sensors connected wirelessly to handheld devices or PCs to monitor and treat patients. The sensors collect heart rate, oxygen saturation and EKG data and transmit them over a 100 m wireless link to a PDA; however, this work is currently not approved for use in practice. PinPtr is a wireless sensor network developed to localize the position of a sniper. The sensors detect the muzzle blast and the acoustic wave that originate from the sound of shooting; the times of arrival of these acoustic waves are used to estimate the sniper's position, which is sent to the base station. In DARPA's self healing minefield, sensor nodes are placed on anti-tank mines, and peer to peer communication between these nodes is used to respond to attacks by activating appropriate anti-tank mines, thus complicating the progress of enemy troops. Intel's wireless vineyard is an example of using WSN for agricultural monitoring. Here, sensor nodes placed in the field collect data, which is redirected to data mules, small devices carried by people (or dogs); these data mules interpret the collected data and use it to make decisions on the presence of parasites and the use of appropriate insecticides [1].
III. REAL TIME CHALLENGES

In this section, we look at the challenges in wireless sensor networks from a real time perspective. The most obvious real time challenge is to guarantee bounded end to end delay for messages. In practice, most end to end message transmissions are multi hop, which is a concern of the WSN routing layer. Scalability of the WSN is another factor that the routing layer must support for most applications. Real time Medium Access Control (MAC) protocols aim at guaranteeing forwarding delays in single hop scenarios. Wireless links are generally lossy and highly unpredictable due to environmental factors, which makes the goal of meeting the real time challenges associated with WSN very difficult. Furthermore, the delay guarantees have to be made under the strict energy and resource constraints of sensor nodes, which makes the task even more challenging.
The lifetime of a WSN is extremely critical for many applications [2]. The energy consumption of the nodes is the main limiting factor of the lifetime of a network; thus, minimizing energy at all the individual layers of the network protocol stack is very important to prolong the life of the system. With regard to resource constraints, real time protocols should be designed to incur minimal overhead, so that the nodes need only minimal processing power and only a small amount of memory to operate. This ensures a low cost for the WSN. The same factors must be considered while designing other software layers, such as the application or transport layer, for a sensor node.
IV. REAL TIME METHODS
There are many common mechanisms that real time MAC protocols and routing protocols use to meet timing constraints. MAC protocols often make use of traffic regulation mechanisms, scheduling of message transmissions, structured network topologies, etc. [1]. Routing protocols often rely on geographic forwarding, multi-path routing, prioritized queuing models, etc. [1]. In the next two subsections, we present in detail two real time techniques designed to be employed in WSN.
A. CONTENTION-FREE PERIODIC MESSAGE SCHEDULER MEDIUM ACCESS CONTROL (MAC) LAYER
Here we take a look at the contention-free periodic message scheduler medium access control described in detail in [3]. It is assumed that the network is single hop and that every node in the network can communicate directly with every other node. As the name indicates, this MAC protocol only deals with periodic messages, and each message consists of a number of fixed-size packets. At the physical layer, the assumption is made that each individual packet takes a fixed network time unit to be transmitted. A message set is harmonic if and only if each message period is a positive integer multiple of all smaller message periods or, equivalently, each message period divides all larger message periods. A synchronization protocol is described in [3], which requires only a small overhead, on the order of 1% of network bandwidth. A single message m_i is modeled by the parameters (φ_i, C_i, D_i, T_i), where φ_i is the phase or offset, C_i the number of packets in the message, D_i the relative deadline, which specifies the length of time after release by which transmission must be complete, and T_i the message period. The utilization of a message set is U_M = Σ_{i=1}^{|M|} C_i/T_i, where |M| is the number of messages in the set. Given a set of messages M, we find a message set M^cf whose schedule is contention-free. A message set is said to be contention-free if only one message is ready to be transmitted at a time.
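As a small illustration of these definitions, the following Python sketch (our own naming, not code from [3]) computes the utilization of a message set and checks whether a set of periods is harmonic:

```python
def utilization(messages):
    """messages: list of (C_i, T_i) pairs; returns U_M = sum of C_i / T_i."""
    return sum(c / t for c, t in messages)

def is_harmonic(periods):
    """True iff every period divides all larger periods."""
    ps = sorted(periods)
    return all(ps[i + 1] % ps[i] == 0 for i in range(len(ps) - 1))
```

For example, the set {(1, 4), (1, 8), (2, 8)} has utilization 0.625, and the periods {4, 8, 8} are harmonic while {2, 3, 6} are not.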
This section describes a centralized sub-optimal attribute assignment algorithm to derive a contention-free message set M^cf from a given message set M consisting of messages m_i = (φ_i, C_i, D_i, T_i). As a first step, we convert these multi-packet messages into single-packet messages. This simplifies the analysis and gives the protocol more degrees of freedom. The period of each single-packet message may be less than or equal to the period of the corresponding multi-packet message: ∀i : T^cf_i ≤ T_i. We also require that the phases of the individual packets of the same message preserve the order among themselves: ∀m_i ∈ M, ∀j,k = 1...C_i : j < k → φ^cf_{i,j} < φ^cf_{i,k}. The phase of each packet is greater than or equal to zero and less than that message's period: ∀i,j : 0 ≤ φ^cf_{i,j} < T^cf_i. The deadline of the original multi-packet message is met if the assigned phases of the individual packets of this message differ by at most the deadline. To prove that this phase-difference constraint holds, we assume for simplicity that a message's deadline equals its period, ∀m_i ∈ M : D_i = T_i, which is a realistic assumption. Suppose ∃i : φ^cf_{i,C_i} − φ^cf_{i,1} > T_i, or equivalently, ∃i : φ^cf_{i,C_i} > T_i + φ^cf_{i,1}. Since φ^cf_{i,1} ≥ 0, we have φ^cf_{i,C_i} > T_i. This contradicts the phase constraint made earlier, which says ∀i,j : 0 ≤ φ^cf_{i,j} < T^cf_i [3]. Thus, the maximum phase difference between the individual packets of a message cannot exceed the deadline of that message, which guarantees that the deadline of the original multi-packet message is preserved even though the message is divided into many single packets. Now we have a set of single-packet messages with constraints on their periods and phases. Next we show how the centralized sub-optimal attribute assignment algorithm derives a contention-free message set from this message set. Algorithm 1 is shown in Figure 1 below.
Fig. 1. Centralized attribute assignment algorithm [3]
The basic idea behind this algorithm is to assign the phases and the periods of the messages so as to obtain a contention-free message set from the given message set. Lines 1 and 2 of Algorithm 1 make the message set harmonic by reducing each period to the largest power of 2 that does not exceed it. Line 3 initializes a variable which holds the next available time slot for assigning the phase of a message. The table S represents the message schedule, in which a true entry corresponds to a time slot already in use by a message for its transmission; line 4 initializes this table to false. In line 5, we iterate through all messages, and in line 6 we assign the current message's phase to the next available free slot. Lines 6 and 7 update the message schedule table S accordingly. Finally, lines 9 and 10 update the phase variable to the next available free slot by scanning the message schedule [3]. This algorithm is executed offline. Online, for each message a node sends or receives, a real time task is created whose attributes are derived from the message attributes by multiplying each by the network time unit t_unit. The scheduling algorithm is then implemented by executing each task when it is released.
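The walkthrough above can be sketched in Python. This is our reconstruction from the textual description, not the authors' code; in particular, we process messages in order of increasing period (which the description leaves implicit) and return None when no feasible phase exists:

```python
def power_of_two_floor(t):
    """Largest power of 2 that does not exceed t (lines 1-2 of Algorithm 1)."""
    p = 1
    while p * 2 <= t:
        p *= 2
    return p

def assign_attributes(periods):
    """Assign (phase, period) to single-packet messages so that at most one
    message is ready at any time slot; periods are in network time units."""
    order = sorted(range(len(periods)), key=lambda i: periods[i])
    T = [power_of_two_floor(t) for t in periods]   # harmonic periods
    hyper = max(T)                                 # hyperperiod
    S = [False] * hyper                            # schedule table (line 4)
    phi = 0                                        # next free slot (line 3)
    result = [None] * len(periods)
    for i in order:
        if phi >= T[i]:            # the phase must satisfy 0 <= phi < T_i
            return None            # no feasible assignment found
        result[i] = (phi, T[i])
        for slot in range(phi, hyper, T[i]):       # mark all transmissions
            S[slot] = True
        while phi < hyper and S[phi]:              # advance (lines 9-10)
            phi += 1
    return result
```

For periods {4, 8, 8} (utilization 1/2), the sketch yields phases 0, 1 and 2, and the marked transmission slots within the hyperperiod are pairwise disjoint.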
From the properties of harmonic message sets, it can be proved that if a message set M has utilization U_M ≤ 1/2, then there exists a contention-free phase and period assignment [3]. A comparison was made between an EDF-scheduled Time Division Multiple Access (TDMA) scheme and the contention-free scheduler TDMA in terms of time and space complexity. Time complexity indicates the complexity of the algorithm that determines the message to be transmitted during a time slot, and space complexity represents the memory requirement at each sensor node. It was shown that the contention-free scheduler TDMA has lower time and space complexities when a node receives less than 70% and 95% of all messages, respectively [3].
B. RAP: A REAL TIME COMMUNICATION ARCHITECTURE FOR WIRELESS SENSOR NETWORKS
From a MAC protocol for real time wireless sensor networks, we now move on to a complete real time communication architecture for WSN. The RAP communication architecture is described in detail in [4], and the framework is shown in Figure 2.
Fig. 2. The RAP Communication Architecture [4]
The applications interact with the RAP communication framework through a set of query/event APIs. In general, a message (query or event) is directed to a set of sensors in an area rather than to a specific sensor address. The Location Addressed Protocol (LAP) converts a request to send a message to a set of sensors in a specific area into the addresses of those sensors; the assumption made is that the routing layer is aware of the physical geography. Geographic Forwarding decides to which immediate neighbor to transmit the message: it transmits a packet to an immediate neighbor if 1) that neighbor has the shortest geographic distance to the packet's destination among all neighbors and 2) it is closer to the destination than the forwarding node itself [4].
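The forwarding rule can be illustrated with a small Python sketch. This is a simplified greedy model under our own naming, not RAP's actual implementation:

```python
import math

def next_hop(current, destination, neighbors):
    """Greedy geographic forwarding: choose the neighbor geographically
    closest to the destination, provided it is strictly closer than the
    current node itself; otherwise report a local minimum (None)."""
    def d(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    if not neighbors:
        return None
    best = min(neighbors, key=lambda n: d(n, destination))
    return best if d(best, destination) < d(current, destination) else None
```

Note that greedy forwarding can fail at a "void": if no neighbor improves on the current node, the sketch returns None rather than routing around the obstacle.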
The key component of the RAP real time communication architecture is Velocity Monotonic Scheduling (VMS), which makes packet scheduling at a node both deadline aware and distance aware. The packet priority is assigned in such a way that the shorter the deadline, the higher the priority of the packet, and the longer the distance, the higher the priority of the packet. VMS assigns the priority of a packet based on its requested velocity, and two priority assignment schemes are discussed. Static Velocity Monotonic (SVM) calculates a requested velocity at the sender of each packet, and this velocity remains fixed on each hop. Assuming a sender location (x_0, y_0), a destination location (x_d, y_d) and an end to end deadline D for the packet, the requested velocity is set to

V = dis(x_0, y_0, x_d, y_d) / D

where dis(x_0, y_0, x_d, y_d) is the distance between sender and destination. Dynamic Velocity Monotonic (DVM) recalculates the requested velocity of the packet upon its arrival at each forwarding sensor node. The velocity is given by

V = dis(x_i, y_i, x_d, y_d) / (D − T_i)

where (x_i, y_i) is the location of the current node and the new term T_i represents the elapsed time, i.e. the time that the packet has been in the network [4].
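The two velocity computations translate directly into code. The sketch below uses Euclidean distance and our own function names, as an illustration of the formulas rather than RAP's implementation:

```python
import math

def dis(x1, y1, x2, y2):
    """Euclidean distance between two points."""
    return math.hypot(x2 - x1, y2 - y1)

def svm_velocity(sender, dest, deadline):
    """Static Velocity Monotonic: computed once at the sender, fixed per hop."""
    (x0, y0), (xd, yd) = sender, dest
    return dis(x0, y0, xd, yd) / deadline

def dvm_velocity(current, dest, deadline, elapsed):
    """Dynamic Velocity Monotonic: recomputed at every hop from the
    remaining distance and the remaining time D - T_i."""
    (xi, yi), (xd, yd) = current, dest
    return dis(xi, yi, xd, yd) / (deadline - elapsed)
```

For instance, a packet 100 m from its destination with a 2 s deadline requests 50 m/s under SVM; if after 1 s of network time it still has 60 m to go, DVM raises the requested velocity to 60 m/s, so packets behind schedule gain priority.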
Finally, the MAC layer ensures that access to the wireless medium is granted according to the priority of the packet. The MAC layer uses the light-weight CSMA/CA protocol, but with a couple of extensions that include the priority of packets in determining medium access. First, the initial wait time after idle, which represents the time a node waits after the channel becomes idle, is given by

DIFS = BASE_DIFS × PRIORITY

where DIFS is a counter value set by the node once the channel becomes idle; the node waits for a random period of time between 0 and DIFS before sending a Request To Send (RTS) packet. Thus, nodes with higher priority packets (corresponding to a smaller value of PRIORITY) choose a smaller waiting time. Second, the backoff window increase function, which represents the increase in the time that a node waits when a transmission collision occurs, is given by

CW = CW × (2 + (PRIORITY − 1)/MAX_PRIORITY)

where MAX_PRIORITY is the maximum value of PRIORITY (corresponding to the lowest priority). Thus, the backoff time for a node with lower priority packets increases faster than for a node with higher priority packets waiting to be transmitted [4]. Detailed simulations of wireless sensor networks demonstrated that RAP considerably reduces end to end deadline misses in multi-hop scenarios, but no hard real time guarantees are provided for the messages.
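The two priority extensions can be sketched as follows. The BASE_DIFS value below is an arbitrary placeholder of ours, not a constant taken from [4]:

```python
BASE_DIFS = 50e-6  # assumed base wait time in seconds (placeholder value)

def initial_wait(priority):
    """Upper bound of the random wait after the channel goes idle;
    a smaller PRIORITY value means higher priority, hence a shorter wait."""
    return BASE_DIFS * priority

def backoff_window(cw, priority, max_priority):
    """Backoff window growth after a collision; lower priority packets
    (larger PRIORITY values) back off more aggressively."""
    return cw * (2 + (priority - 1) / max_priority)
```

With MAX_PRIORITY = 4 and an initial window of 32 slots, the highest priority doubles its window on collision (factor 2), while the lowest priority grows by a factor of 2.75, so low priority traffic yields the medium faster under contention.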
V. DRAWBACKS OF EXISTING TECHNIQUES AND FURTHER WORK PROPOSAL
Most real time techniques make many assumptions which are not practical. One very common misconception concerns the very objective of real time techniques, namely that the goal is to provide hard real time guarantees for each transmitted message. Considering the highly unpredictable and unreliable nature of wireless links, the real time objective should be restated to accommodate, to a large extent, the inherent properties of wireless sensor networks [1].
Here we look at a few unrealistic assumptions made by the real time techniques discussed in the previous section. The contention-free periodic message scheduler MAC protocol assumes that all nodes are in the same wireless range, which is not true in most practical deployment scenarios; most practical WSN deployments rely on multi-hop message transmission. The RAP real time communication architecture assumes that every node is aware of the physical geography; this location awareness is in itself a big problem to be solved. Furthermore, both real time techniques assume that the radio transceiver is either in transmitting, receiving or turned-off mode, without considering the transition time from tx to rx and vice versa. This transition time introduces a large enough interval that it must be considered when designing MAC protocols; otherwise, even TDMA scheduled message transmissions can lead to collisions if the nodes are in the same interference range. Another unrealistic assumption that both techniques rely on is that if no other node is trying to access the medium, the medium is free (ignoring the possibility of wireless interference). This is especially critical in the case of TDMA based MAC protocols: in TDMA, we select one message in one node, from a set of messages in many nodes contending for transmission, and we assume that at this point in time the wireless medium is free; but in real world scenarios, wireless interference can still occur from many other sources. Lastly, the simulators used for evaluating these techniques would have to be studied to understand how realistically they model the WSN deployment environment. It is difficult to develop a realistic radio model due to the deep complexity of a practical WSN environment, and the simple models used by simulators could hide design flaws and limit the applicability of these techniques in the real world [1].
In the future, we need to design and implement timeliness methods for WSN without relying on these unrealistic assumptions. Real time objectives need to be tuned to take the properties of wireless sensor networks into account. MAC protocols are required to be robust enough to deal with unstable and weak links. Routing protocols should be able to support timeliness using minimal resources, and they should be designed for scalability and adaptability. Also, the simulators used for validating these techniques should be closely related to the real world, modeling the wireless environment more realistically [1].
VI. CONCLUSIONS
Wireless sensor networks are currently receiving considerable attention due to their unlimited potential. In this paper, we listed some real time applications of WSN. We presented the challenges in WSN from a real time perspective and introduced a few general techniques used to meet these challenges. Then we focused on two specific real time methods, namely a contention-free periodic message scheduler MAC protocol and RAP, a real time communication framework for WSN. The drawbacks of existing real time techniques were discussed, and future work was proposed to overcome these drawbacks.
REFERENCES
[1] Ramon Serna Oliver and Gerhard Fohler: Timeliness in Wireless Sensor Networks: Common Misconceptions. 9th International Workshop on Real-Time Networks (RTN 2010), Brussels, Belgium, July 2010.
[2] Daniele Puccinelli and Martin Haenggi: Wireless Sensor Networks: Applications and Challenges of Ubiquitous Sensing. IEEE Circuits and Systems Magazine, 2005.
[3] Thomas W. Carley, Moussa A. Ba, Rajeev Barua and David B. Stewart: Contention-Free Periodic Message Scheduler Medium Access Control in Wireless Sensor/Actuator Networks. Real-Time Systems Symposium, 2003.
[4] Chenyang Lu, Brian M. Blum, Tarek F. Abdelzaher, John A. Stankovic and Tian He: RAP: A Real-Time Communication Architecture for Large-Scale Wireless Sensor Networks. IEEE Real-Time and Embedded Technology and Applications Symposium, 2002.
Abstract—"AFDX (Avionics Full Duplex Switched Ethernet) developed for the Airbus 380 represents a major upgrade in both bandwidth and capability" [3]. In order to determine an upper bound on the end-to-end delay of avionic flows in this kind of network, many approaches have been investigated: starting from the basic (pessimistic) certification model, Network Calculus; then, to compare the results, a simulation model; next, Stochastic Network Calculus, which bounds the probability of exceeding the upper bound, and a model checking approach for the exact worst-case delay of each flow. The latest work focuses on improving the Trajectory approach, one designed for distributed systems, which looks closely at priority queuing settings. The object of this paper is to give a global view of these methods.
I. INTRODUCTION
To guarantee the determinism of avionic communications, new mechanisms have been added to switched Ethernet, resulting in AFDX. It is a big challenge in this area to demonstrate that an upper bound can be determined for end-to-end communication delays. In this technology, all of the avionics communications can be statically described: asynchronous multicast communication flows are identified and quantified and, importantly, flows can be statically mapped onto the network of AFDX switches. For a given flow, the end-to-end communication delay of a frame can be described as the sum of the transmission delays on links and the latencies in switches. What is also really useful is that AFDX is full duplex, so it ensures there are no collisions on the physical medium, and no CSMA/CD mechanism is necessary.
The problem that occurs is that the sharing of resources among functions leads to a complex construction. The aim of the proposed approaches is to prove that no frame will be lost by the network (no switch queue will overflow) and to evaluate the end-to-end transfer delay through the network. However, we need to remember that an exact stochastic analysis of an industrial avionics network is unaffordable, due to the number of VLs in such a network configuration.
We find big difficulties due to the increase in the complexity of embedded systems, in terms of the growth in the number of integrated functions and their connectivity. Interestingly, for example, "setting randomly priorities gives worst bounds than using no priorities" [1], which clearly means that selecting the priority policy is key to a good solution. It is also important that the indeterminism problem has been shifted to the switch level, where various flows can compete for the shared output ports of a given switch.
The first step taken in this research was Network Calculus, mainly for certification purposes. It computes a worst-case upper bound for each communication flow. "It allowed the scaling of the switches memory buffers in order to avoid buffer overflow and frame losses" [3], but it is obviously pessimistic in its analysis.
Then, to understand the real behavior of AFDX, a simulation model was used. "This approach allows calculations on modeled network of each flow the end-to-end delay according to representative subset of possible scenarios" [3]. The simulations in this case were made using configurations provided by Airbus. However, this approach obviously cannot be used for certification, because rare events can be missed by the simulations and only some scenarios are exercised.
A useful subsequent approach was Stochastic Network Calculus, which was proposed to compute a probabilistic upper bound. Thanks to this approach, the "computation of the probability p for an end-to-end delay to exceed a given bound" [3] was achieved. The probability p in this case can be interpreted as the acceptable probability that a frame misses its deadline.
The next method, model checking, reviewed in [4], gives us the computation of the exact worst-case delay for each flow. "Unfortunately, it cannot cope with real AFDX configurations, due to the combinatorial explosion problem for large configurations" [4].
The newest approach applies the Trajectory concept [4] with a fixed priority policy, "established in order to provide the bounds needed for a deterministic avionics network with a static priority QoS policy. The idea of this approach is to introduce additional non avionics traffic (with lower priority) for improving the use of available AFDX resources" [4].
While improving all these approaches and inventing new ones, many simulations have been carried out. What was recently discovered is that one can focus only on the parts of the network that influence the end-to-end delay of a given flow; this definitely reduces the network and the amount of calculation. It was also found that "The trajectory approach with SP/FIFO scheduling is able to guarantee worst case end-to-end delay for AFDX networks with static priority flows differentiation QoS mechanism" [4], and that it can be enhanced by using a grouping technique.
The paper is organized as follows. Section II presents the AFDX network architecture. Section III-A covers Network Calculus and possible improvements; Section III-B the simulation approach as a realistic example; Section III-C the model checking approach for determining an exact worst-case end-to-end delay; and Section III-D the Trajectory approach for improving the use of available AFDX resources.

Global View of Methods for Evaluating End-To-End Delays on AFDX
Canko Canew, Raphael Guerra
Technische Universität Kaiserslautern, Germany
II. THE AFDX NETWORK ARCHITECTURE
An illustrative example of an AFDX network is depicted in Figure 1 below.
As we can see, it is composed of five interconnected switches, S1 to S5. The switches do not have any buffers on their input ports, and one FIFO buffer exists on each output port. The inputs and outputs of the network are called End Systems. "Each end system is connected to exactly one switch port and each switch port is connected to at most one end system. Links between are full duplex. The end-to-end avionics traffic characterization is made by definition of Virtual Links." [4]
A Virtual Link (VL) is the concept of a virtual channel for communication. Thanks to VLs, "it is possible to statically define all the flows (VL) which enter the network" [4]. The packets to be exchanged move through the VLs and End Systems. A VL defines a logical unidirectional connection which goes from one source end system to one or more destination end systems. In Figure 1, vx is a unicast VL with path {e3-S3-S4-e8}, while v6 is a multicast VL with paths {e1-S1-S2-e7} and {e1-S1-S4-e8}.
Only one end system inside the AFDX network can be the source of a given VL. The VL definition also includes important parameters: the BAG (Bandwidth Allocation Gap), which is the minimum delay between two consecutive packets of the associated VL, and Smin and Smax, which stand for the minimum and maximum packet lengths. Compliance with the VL parameters is ensured by a shaping unit at the end system level and a traffic policing unit at each switch entry port.
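The shaping/policing contract can be expressed as a simple conformance check on a trace of frames. This is our illustrative sketch, not the actual AFDX policing algorithm:

```python
def vl_conforms(arrival_times, frame_sizes, bag, smin, smax):
    """A VL trace conforms if consecutive frames are spaced by at least
    one BAG and every frame length lies within [Smin, Smax]."""
    gaps_ok = all(t2 - t1 >= bag
                  for t1, t2 in zip(arrival_times, arrival_times[1:]))
    sizes_ok = all(smin <= s <= smax for s in frame_sizes)
    return gaps_ok and sizes_ok
```

A trace violating either the minimum inter-arrival gap or the size bounds would be dropped or flagged by the policing unit.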
The AFDX network allows assigning either a high or a low priority to each VL in every switch output port [1]. The load of a physical link is defined here as "the portion of time a link is busy" [2].
All of the constraints that AFDX adds to vintage Ethernet enable a precise analysis of the network; this allows the computation of an upper bound and the dimensioning of output buffers so that no packet is lost.
III. METHODS FOR BOUNDING END-TO-END DELAY
A. Network Calculus and Stochastic Network Calculus for obtaining the delay
Certification is mandatory in the context of avionics, and we need a probabilistic upper bound on the end-to-end delay of each flow. As mentioned before, "An exact stochastic analysis of an industrial avionics network is unaffordable, due to the number of VLs of such a network configuration." [3] "One way to solve the problem is to use pessimistic stochastic analysis which is a safe approximation of the exact stochastic analysis." [3] The calculated upper bound associated with a given probability is guaranteed to be greater than the exact upper bound. Network Calculus gives the latency bound of any elementary network entity and, for those elements that have a queuing capability, a queue-size bound expressed either in a number of bits or in a number of frames (with a simple majorization using Smin) [1].
If an elementary entity offers a service curve β to an input flow constrained by an arrival curve α, the calculus also gives the arrival curve α* of the output flow: α* = α ⊘ β, where the (min-plus) deconvolution α ⊘ β is defined by

(α ⊘ β)(t) = sup_{u ≥ 0} [α(t + u) − β(u)].
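As a concrete instance (my own illustration, not an example from [1]): for a token-bucket arrival curve α(t) = b + r·t and a rate-latency service curve β(t) = R·max(0, t − T) with r ≤ R, the well-known network-calculus delay bound is T + b/R:

```python
def delay_bound_token_bucket(b, r, R, T):
    """Network-calculus delay bound for a token-bucket flow (burst b bits,
    rate r bit/s) served by a rate-latency node (rate R bit/s, latency T s).
    Valid when r <= R (otherwise the backlog grows without bound)."""
    assert r <= R, "flow rate must not exceed service rate"
    return T + b / R

# A 4000-bit burst at 1 Mb/s through a 10 Mb/s link with 0.1 ms latency:
print(delay_bound_token_bucket(b=4000, r=1e6, R=10e6, T=1e-4))  # 0.5 ms
```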
An improvement of the calculus was the definition of "groups" of VLs: VLs that exit from the same multiplexer and enter another multiplexer together, i.e. VLs that share two segments of their paths, are grouped, which gives bounds tighter by up to 40%. "Frames of these VLs are serialized once exiting the first multiplexer and thus they don't have to be serialized again in the following multiplexers" [1].
Another important improvement of network calculus was the stochastic version of this approach, whose aim is the statistical calculation of delay and backlog bounds. This theory allows the computation of the probability p for an end-to-end delay to exceed a given bound and, very usefully, it guarantees that the probability of exceeding the computed bound is not greater than p.
B. Simulation approach as a realistic example
The main goal of this approach is to approximate the real network behavior. It requires a realistic model of the network and calculates the end-to-end delays of a given flow on a subset of all possible scenarios, yielding an experimental upper bound on that set of scenarios.
All elements of the network are built as queuing stations or object structures representing the simple network elements: one-way links, buffers, demultiplexers, and scheduler multiplexers; "the selected policy of service is FIFO" [2]. Different strategies can be used when building the simulations. For frame generation, frames can be generated periodically, using the BAG as the period, or using the BAG as a minimum inter-emission time. For the phasing between VLs, either a synchronized phasing, where the first frame of every VL is transmitted at the same time, or a phase randomly distributed between 0 and the BAG can be used. For the frame size, every frame of every VL can take the minimum length, the maximum length, the average of the two, or a random length between the minimum and maximum.
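The generation strategies above can be sketched as a toy frame generator (my own sketch under assumed parameters, not the simulator of [2]):

```python
import random

def emission_times(bag_ms, horizon_ms, phasing="null"):
    """Frame emission instants for one VL: periodic with the BAG as the
    period, starting either at t = 0 ("null" phasing, all VLs
    synchronized) or at a random phase drawn uniformly in [0, BAG)
    ("random" phasing)."""
    phase = 0.0 if phasing == "null" else random.uniform(0, bag_ms)
    t, times = phase, []
    while t < horizon_ms:
        times.append(t)
        t += bag_ms
    return times

print(emission_times(4, 20))            # [0.0, 4.0, 8.0, 12.0, 16.0]
print(emission_times(4, 20, "random"))  # same spacing, shifted by a random phase
```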
For analysis purposes, the ratio between the end-to-end delay obtained by simulation and the one calculated with network calculus has also been computed for each path.
With null phasing (the first frame of every VL emitted at the same time, i.e. the VLs are synchronous), most of the paths have a ratio between 5 and 40%. All VL paths with a ratio of at least 70% have a length of 1 (they cross a single switch). This showed that the deterministic upper bound obtained by the network calculus approach is reachable in the case of single-switch communication, and that the null-phasing configuration is quite often close to the worst-case configuration. With random phasing, each VL has a specific random delay before the emission of its first frame.
For example, with a random delay between 0 and the BAG, the ratio is under 20% for all VL paths. The experiments also reveal the influence of BAG occupation on end-to-end delays: having all BAGs occupied does not always lead to the worst-case end-to-end delay. In the case reviewed in [2], the simulation also checks the influence of the frame length, using the minimum length as a reference.
It is not easy in this approach to find a representative subset of scenarios in order to calculate the end-to-end delay distribution. The key idea mentioned was to model only the elements of the network which have an influence on the end-to-end delay distribution of the flow. Using this idea with a classification of VLs (pictured below), we can distinguish the paths that do not have a direct influence on the delay distribution of Vx, which in later works leads to drastic reductions of the simulation space. In order to obtain a larger reduction, however, the classification has to be exploited more effectively.
C. Model checking approach for determining an exact worst-
case end-to-end delay
The approach presented in paper [2] is based on timed automata. "This method explores all the possible states of the system and thus it determines an exact worst-case end-to-end delay. It implies computing whether a property, expressed in a timed logic, is verified or not" [2]. Its aim was also to describe the system behavior over time.
Each action a executed by a first timed automaton corresponds to an action with the same name a executed in parallel by a second timed automaton. In this particular case, performing transitions requires no time, but time can elapse in nodes. These timed automata are extended with so-called committed nodes and shared integer variables.
As depicted above, "A1 performs m1 and simultaneously A2 performs m1. Then A1 performs m2 and simultaneously A3 performs m2. As s2 of A1 is committed, the two transitions m1 and m2 are performed simultaneously without time evolution. This extension allows modeling a broadcast communication mechanism with timed automata" [2]. With the shared integer variables, a set of variables is shared by the timed automata, so that their values can be consulted and updated by any of them.
Reachability analysis is performed by model checking, encoding the property in terms of the reachability of a given node of one of the automata.
To calculate the worst case, the method consists of verifying that a frame is received before a global transmission delay. Generalizing the system considered in [2], two groups of VLs can be defined in the picture below:
"GrVL1 is a group of VLs that all merge in Switch1 by the same input port, cross Switch2 and go out by the same output port. Similarly, GrVL2 is a group of VLs that all merge in Switch2 by the same input port and go out by the same output port."
Model checking shows here that the worst-case end-to-end delay is reached when the GrVL1 and GrVL2 groups are synchronous. This can help the simulation approach choose a phasing between VLs that gives higher end-to-end delays than the null phasing [2].
D. Trajectory approach
This approach is used to obtain a deterministic upper bound on the end-to-end response time in distributed systems. To allow a worst-case delay computation, it identifies, for a packet m, the busy periods and the packets impacting its end-to-end delay on all the visited nodes.
General distributed systems can be depicted as in picture 5. Each flow crossing the system follows a static path, which is an ordered sequence of nodes. "It assumes, with regards to any flow ri following path Pi, that any flow rj following path Pj, with Pj ≠ Pi and Pj ∩ Pi ≠ ∅, never visits a node of path Pi after having left this path." [4]
"Normally flows are scheduled with a combined fixed priority and FIFO algorithm in every visited node (non-preemptive policy). The flows are first sorted according to their fixed priority level, and flows with the same fixed priority are then treated in FIFO order." [4]
The end-to-end response time of a packet is the sum of the times spent in each crossed node and the transmission delays on the links. Paper [4] focuses on the fixed priority policy and on the bounds needed for a deterministic avionics network with a static priority QoS mechanism. The idea was to introduce additional non-avionic traffic (with lower priority) to improve the use of the available AFDX resources.
An optimization of this approach is the serialization of flows with fixed priorities, similar to the grouping technique in network calculus.
IV. DISCUSSION
The network calculus approach is needed since it gives a guaranteed upper bound for the end-to-end delay in AFDX, but unfortunately, because of the assumptions made in this approach, this bound is usually not reachable.
A very good step was the invention of the "group concept" for tightening the bounds of the arrival curves of inter-switch traffic, which, as later checked in simulations, gives results up to 40% better. It was also very useful to extend the network calculus approach into the probabilistic domain. The result of this step is a good candidate for certification since it is guaranteed; it also gives us the probability p for an end-to-end delay to exceed a given bound. However, we cannot forget that, due to its pessimistic assumptions, network calculus is often pessimistic, and because of that I do not think it will be easy to improve this approach much further.
The next approach presented, simulation, gives experimental results for given scenarios. Of course, the bound can be badly exceeded if some rare event is missed, but, more importantly in my view, this approach gives a global estimation of the network load and helps in the search for new approaches and methods for bounding delays in AFDX.
The next idea, model checking, is less useful. It gives an exact delay and the corresponding scenario by exploring all the possible states of the system, but it is not easy to use on a real network configuration. For the models presented in the papers it definitely leads to combinatorial explosion, so this is still an open research topic.
I believe that a better research direction is the last one, the trajectory approach, which allows computation in distributed systems. It can also be nicely improved by a grouping technique like the one in the network calculus version, which really improves the results with a tighter upper bound on the end-to-end delay. A really good step was considering the low-priority flows: their impact can be upper bounded, per switch, by the transmission time of the biggest lower-priority packet.
V. CONCLUSIONS
In this paper we have taken a brief look at the structure of AFDX networks and at the approaches invented to bound the end-to-end delays in avionic systems. This kind of network represents an important upgrade which brings much better results and reductions on the hardware side, such as cabling.
Invented for certification purposes, the network calculus approach was presented here as a fundamental step in the calculations. Improved with the grouping technique, it brings a major upgrade in response time analysis.
The network calculus approach in the probabilistic domain is also mentioned, since it also improves the results by calculating the probability p of exceeding a given bound.
This work also discusses the simulation model, mainly to show how to analyze AFDX in a realistic environment and to obtain a global estimation of the network load. Very important in this approach is the idea of focusing only on the elements of the network which have an influence on the end-to-end delay distribution of the flow. This brings huge reductions for AFDX networks, but in order to obtain larger reductions the classification has to be exploited more effectively.
The third approach presented is based on timed automata. This model checking method determines an exact worst-case end-to-end delay by exploring all the possible states of the system. As it is based on timed automata, its main aim was, unsurprisingly, to describe the system behavior over time.
The last approach, the trajectory approach, was invented mainly for the needs of distributed systems. "It identifies for a packet m the busy periods and the packets impacting its end-to-end delay on all the nodes visited by m. Thus, it allows a worst-case delay computation" [4]. A later key idea of this approach is to introduce additional non-avionic traffic (with lower priority) to improve the use of the available AFDX resources. Combined with the grouping technique mentioned before, this method improves the upper bound computed by network calculus by 10% on average, giving less pessimistic results.
REFERENCES
[1] F. Frances, C. Fraboul, J. Grieu - Using Network Calculus to optimize the AFDX network.
[2] Hussein Charara, Jean-Luc Scharbarg, Jérôme Ermont, Christian Fraboul - Methods for bounding end-to-end delays on an AFDX network.
[3] Jean-Luc Scharbarg, Frédéric Ridouard, Christian Fraboul - A Probabilistic Analysis of End-To-End Delays on an AFDX Avionic Network.
[4] Henri Bauer, Jean-Luc Scharbarg, Christian Fraboul - Applying Trajectory approach with static priority queuing for improving the use of available AFDX resources.
Abstract - The FlexRay communication protocol is expected to become the de facto standard for high-speed in-vehicle communication. FlexRay is a robust, scalable, deterministic digital serial bus system designed for use in automotive applications, and is designed to be faster and more reliable than CAN and TTP. In this paper we present approaches for optimally scheduling messages in the static and dynamic segments of FlexRay networks. We also present how to build an optimal schedule with fault-tolerant communication in the static segment, and finally we present two algorithms for scheduling messages in switched FlexRay networks.
I. INTRODUCTION
FlexRay is a communication protocol for high-speed in-vehicle communication. FlexRay has a "static segment" for time-triggered messages and a "dynamic segment" for event-triggered messages, and can achieve data rates of up to 10 Mb/s. In the newest FlexRay networks, the so-called "switched FlexRay networks", the use of a "switch" instead of an active star provides branch parallelism, which gives us the opportunity to have more than one sender in one FlexRay cycle without collisions.
But with messages in FlexRay comes the issue of scheduling those messages. The static and dynamic segments of FlexRay consist of slots. In those slots we place our messages, which are transmitted in FlexRay frames consisting of message data, as multiples of 2-byte words, plus framing overhead. The first difficulty is to prepare our schedule optimally with respect to the allocated slots: we would like an algorithm which places our messages in FlexRay slots using as few slots as possible.
As we know, automotive networks are hard real-time systems, so we also have to take care of the deadlines of our messages. Our schedule therefore has to be feasible, especially for the event-triggered dynamic segment; moreover, in the dynamic segment we want to minimize the bandwidth reservation (by using as few slots as possible).
In every network we have to be sure that our transmission is reliable. The easiest way to reach a reliability threshold in information exchange networks is to retransmit data. But the problem is which data to retransmit (if there are multiple faults) and how many times, in order to reach the reliability threshold.
The last problem is how to achieve parallelism in FlexRay networks. So far, in one FlexRay cycle there can be only one sender. One approach to achieving multiple senders is the switched FlexRay network.
It is very important to find a scheduling algorithm which takes a reasonable amount of time, because in real-time systems we cannot "buy" more time. We have to schedule our messages in the time we have, and the result should be optimal, or at least reasonable, and of course feasible. To be feasible, all deadlines have to be met. In hard real-time systems (such as those built on FlexRay) missing a single deadline can lead to catastrophic consequences, including loss of human life.
In fault-tolerant communication for the static segment, because of the many possible message retransmissions, it is very important to choose appropriate messages for retransmission and to determine the number of repetitions of those messages needed to reach the reliability threshold.
It is very difficult to implement an algorithm which does all the things mentioned above in a reasonable amount of time; within a reasonable amount of time we can achieve only sub-optimal solutions. For the optimal algorithms we have to deal with a discrete nonlinear optimization problem. And for fault-tolerant communication it is difficult to compute which messages should be retransmitted, and how many times, to reach the reliability threshold p.
The basic idea of a solution for the problem of optimal message scheduling in FlexRay networks is to find an algorithm which will:
- use as few slots as possible (optimization with respect to allocated slots)
- minimize the used bandwidth (by decreasing the number of used slots)
- prepare a feasible schedule scheme (with respect to deadlines)
- for time-triggered messages, minimize the jitter
- for event-triggered messages, minimize the used bandwidth (B) and the cycle load (L), where the bandwidth is the number of reserved slots per cycle for each node and the cycle load is the maximum number of slots reserved for message transmission in one FlexRay cycle (FC)
For switched FlexRay networks, the basic idea for achieving message-sending parallelism is to use a "switch" instead of an "active star" in the network topology.
The algorithms presented in this paper are optimal with respect to the number of used slots, which makes the network faster and the bandwidth utilization higher. All presented algorithms give a feasible schedule, which is the most important property for hard real-time systems. Although in a reasonable amount of time the presented algorithms can achieve only sub-optimal solutions, those solutions are reliable in practice.
II. FLEXRAY BUS
A. Optimization framework for scheduling the FlexRay bus
We now introduce a Mixed Integer Linear Programming (MILP) formulation to solve the FlexRay scheduling problem with respect to the FlexRay protocol rules, the allocated slots of the schedule scheme, and the message deadlines (feasible schedule).
The approach provides equations and restrictions which can be used to formulate an algorithm that optimally computes the schedule scheme. To implement the algorithm we need:
- activation, release and deadline constraints
- job start times and preemptions
- schedulability constraints
- FlexRay protocol rules
- data dependencies
An optimal solution can be found in a reasonable amount of time (within one hour) for a case study taken from an x-by-wire system, which demonstrates the efficiency of this MILP formulation.
B. Static segment of FlexRay bus
Message scheduling for static segment
This process can be divided into two steps, the first of which is the packing of periodic signals into message frames (a nonlinear integer programming, NIP, problem). Fortunately, this NIP problem can be reduced to an ILP problem by exploiting the properties of the FlexRay static segment.
When periodic signals are packed into messages, we first have to observe that only signals from the same node and with the same period are packed into the same message. A very important point here is how the duration of the static segment within the FlexRay cycle (FC) is chosen. The duration of the static segment can be calculated as:

TSS = NSTS * TSTS

where TSS is the duration of the static segment, NSTS is the number of static slots, and TSTS is the duration of one slot in the static segment.
After the signals are packed into messages, we obtain an optimal message set, which we now have to schedule. This is the second step of the approach: scheduling the periodic messages obtained in the first step while obeying the FlexRay protocol operation.
In scheduling our message set we can have two different
scenarios:
1) Message schedule without jitter:
When the messages have no jitter, we only take their periods into account. First, the messages are ordered in such a way that two messages with coprime periods cannot be scheduled with the same FID without jitter (the FID, frame ID, is the number of the slot in which a particular node can transmit a message). After ordering the messages we have a partial order on the message set, and we can then schedule the message set using GLPK (the GNU Linear Programming Kit).
2) Message schedule with jitter:
Here we have to schedule the message set optimally with respect to both the allocated slots and the jitter. To do this, we calculate the jitter for each message and then schedule the message. After scheduling all messages in one FC, we sum up the jitters of all messages and calculate the overall jitter (with some weight p), which is then added to the number of used slots in the formulation of the optimization problem.
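The coprime-period constraint of scenario 1 can be sketched as a simple compatibility test (my own illustration, with periods expressed in FlexRay cycles): two messages may share a FID without jitter only if their periods are not coprime.

```python
from math import gcd

def can_share_fid(period_a, period_b):
    """Two messages (periods in FlexRay cycles) may be placed in the
    same slot (FID) without jitter only if their periods are NOT
    coprime; coprime periods would inevitably force jitter."""
    return gcd(period_a, period_b) > 1

print(can_share_fid(2, 4))  # True  (4 is a multiple of 2)
print(can_share_fid(2, 3))  # False (coprime periods)
```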
The main conclusion about message scheduling for the static segment of a FlexRay network is that frame packing is essential to achieve satisfactory utilization. For example, without frame packing a utilization of U = 0.6 is obtained, while with frame packing the utilization drops to U = 0.11, an improvement of about 83%.
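A toy first-fit packing sketch (assumed numbers, not the ILP formulation of [2]) illustrates why packing several same-node, same-period signals into one frame reduces the number of occupied static slots:

```python
def pack_signals(signal_sizes, frame_payload):
    """Greedy first-fit: pack signals (sizes in bytes) of one node and
    one period into as few frames (i.e. static slots) as possible."""
    frames = []  # remaining free bytes per opened frame
    for size in sorted(signal_sizes, reverse=True):
        for i, free in enumerate(frames):
            if size <= free:          # fits into an existing frame
                frames[i] -= size
                break
        else:                         # no frame fits: open a new one
            frames.append(frame_payload - size)
    return len(frames)

signals = [4, 2, 2, 8, 1, 1]          # six periodic signals of one node
print(pack_signals(signals, 16))      # 2 frames instead of 6 separate slots
```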
Fault-tolerant communication for static segment
In automotive networks many faults can occur, caused by electromagnetic interference, radiation, temperature variations, etc. Such faults can appear for a very short time, cause miscalculations in the logic or data corruption, and then disappear without permanent physical damage to the circuit. To combat this problem, a retransmission approach is proposed.
The first proposal is a CLP-based (constraint logic programming) approach, which is optimal. The main goal is to reach a reliability threshold p, and the schedule scheme has to be feasible and optimal with respect to the number of used slots.
We have to determine the smallest repetition number for each message M such that the reliability threshold p is reached. We then have, for each message, its repetition number ki. From all the ki we can derive upper and lower bounds on ki such that the reliability threshold p is reached, so the set of possible ki becomes much smaller. Now we only need to calculate the finish time of each message in our schedule and check whether retransmission is possible, and whether it is needed.
The CLP-based approach is optimal, but it is very time consuming, so an efficient heuristic approach is proposed. This approach can be divided into three steps:
1) Compute the required number of retransmissions ki for each message, such that the reliability threshold p is reached:
GP ≥ p
2) The second stage involves the scheduling: slots are now assigned to messages. We have to check whether retransmission is needed and possible and, if so, how many times a message should be retransmitted to reach the reliability threshold p.
3) In the third stage the algorithm identifies critical messages (when the scheduler fails to build a schedule and goes back to the first step to compute new ki). In this stage the algorithm tries to minimize the number of changes to the previously computed ki values such that a schedule can be constructed.
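Step 1 can be sketched as follows, under my own simplifying assumption (not the fault model of [3]) that each transmission of a message fails independently with probability f, so that ki retransmissions give a per-message reliability of 1 − f^(ki+1):

```python
def min_retransmissions(fault_prob, threshold):
    """Smallest number of retransmissions k such that the probability
    that at least one of the k+1 copies arrives reaches the threshold:
    1 - fault_prob**(k+1) >= threshold (independent transient faults
    assumed)."""
    k, fail = 0, fault_prob
    while 1 - fail < threshold:
        k += 1
        fail *= fault_prob
    return k

# With a 1% transient fault probability, one retransmission already
# pushes per-message reliability to 0.9999.
print(min_retransmissions(0.01, 0.9999))  # 1
```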
The optimal CLP algorithm is very good, but it is also very complex, and its evaluation time grows exponentially with the number of messages. The optimal CLP is always able to find an optimal solution, but for large test cases it could not give an answer in a reasonable amount of time (within one hour).
The heuristic approach is less complex and its computation time is smaller, but it is not guaranteed to give an optimal result. In 80 test cases, the heuristic algorithm failed only 5 times. In the cases with no feasible schedule, both the optimal CLP and the heuristic reported this. In all cases in which the heuristic was successful, it obtained the same optimization cost as the optimal CLP.
C. Dynamic segment of FlexRay
FlexRay networks have a static segment for time-triggered messages and a dynamic segment for event-triggered messages. The dynamic segment is also divided into slots (like the static segment); here, the smallest entity is the MS (minislot). Messages are mapped to a specific DYS (dynamic slot), and one DYS consists of one or more MS. In the dynamic segment, a DYS reservation is made for each message; transmitting the message during its reserved DYS is sufficient for the message to meet its deadline.
Now that we have a static and a dynamic segment, we have to choose an appropriate Tc (the duration of one FlexRay cycle):

Tc ≥ TcDS + TcSS

where TcDS is the duration of the dynamic segment and TcSS the duration of the static segment in one FlexRay cycle.
The cycle load (Lj) of a FlexRay cycle j denotes the maximum number of minislots (MS) reserved for message transmission in FCj for an arbitrary assignment of FIDs (frame IDs). The FID denotes in which slots a particular node can transmit messages. More than one FID can be assigned to a node, which means that one node can transmit messages in several slots (not only in one). The number of FIDs cannot be greater than the number of slots in the dynamic segment, and the smallest possible number of FIDs is the number of nodes in the network.
In the dynamic segment we can assign multiple messages to the same reservation; as a result, the reservations are better utilized and the bandwidth B is minimized. This process of assigning more than one message to the same reservation is called message grouping.
The basic idea of message grouping in the dynamic segment of a FlexRay network is to group the messages in such a way that messages with the smallest deadlines appear in many groups; in the end, not all groups will be used in the schedule scheme. For example, suppose message M1 has to be transmitted only once during, say, 8 FCs (FlexRay cycles). After M1 is transmitted at the beginning of those 8 FCs, the following cycles still have a DYS reserved for M1, but those DYS will never be used for transmitting M1, because it has already been transmitted. Those reservations can instead be used for transmitting another message that is in the same group and can be transmitted during the reservation for M1.
To optimally schedule messages in the dynamic segment of a FlexRay network we have to minimize the number of DYS reserved for all messages. To do this, we have to select the groups for the schedule scheme in such a way that the DYS reservation for the messages is minimized; in other words, we have to minimize Lmax (the maximum DYS reservation over all messages in the dynamic segment).
This minimization problem is an NIP problem, so to facilitate its solution it is decomposed into two binary integer programming (BIP) minimization problems (steps). The first step is to select the groups in such a way that the bandwidth reservation B is minimized; this step can be performed using Tomlab. After the first step we have the selected groups, for which the bandwidth is minimized.
In the second step we can minimize the cycle load reservation Lmax by computing the offsets for the selected groups.
It has been verified that the NIP and the two-step BIP formulations give the same results for practical message sets. For small message sets the NIP gives an optimal solution, but for larger message sets (more than 16 messages in the dynamic segment) the NIP fails. The two-step BIP approach does not guarantee an optimal solution, but it is suitable for practical examples and is less computationally expensive than the NIP approach. Test cases have shown that message grouping can reduce the bandwidth reservation by about 20%.
III. SWITCHED FLEXRAY NETWORKS
Switched FlexRay networks are the next step in automotive systems. In switched FlexRay networks messages are also transmitted in frames, and many messages can be assigned to one frame. The main difference between an ordinary FlexRay network and a switched FlexRay network is that an ordinary FlexRay network uses an "active star" to connect the branches of the network, whereas a switched FlexRay network uses a "switch" instead.
Fig. 1 The switched FlexRay network structure
Through this replacement, switched FlexRay networks exhibit branch parallelism. Branch parallelism means that we effectively have two (or more, depending on the switch and the network architecture) sub-networks which are independent of each other, so messages in those sub-networks can be scheduled in parallel without any collision. This form of parallelism gives us the opportunity to have multiple senders in one FlexRay cycle (because of the independence of the sub-networks).
In switched FlexRay networks, too, one problem is to create a feasible schedule scheme by packing a set of frames into as few slots as possible (optimization with respect to the number of allocated slots). Another problem is the collisions between frames while packing them.
The first algorithm we would like to present is the Decreasing First-Fit algorithm. It is not an optimal algorithm, but it gives reasonably good schedules in a very short amount of time.
The algorithm works as follows:
1) Frames are sorted in decreasing order of their assigned weight (where the weight denotes the message size).
2) The scheduler tries to place each frame in the first slot in which the frame fits (each time checking whether there would be a collision with previously placed frames).
3) If a frame cannot be placed in any slot, the algorithm fails to find a feasible solution.
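The steps above can be sketched as follows (my own reconstruction, not the implementation of [5]; the collision test is abstracted into a caller-supplied predicate):

```python
def decreasing_first_fit(frames, num_slots, slot_capacity, collides):
    """Place frames (id, weight) into slots by decreasing weight, each
    into the first slot where it fits without exceeding the capacity and
    without colliding with a frame already in that slot.
    Returns {slot: [frame ids]} or None if some frame cannot be placed."""
    slots = {s: [] for s in range(num_slots)}
    load = {s: 0 for s in range(num_slots)}
    for fid, weight in sorted(frames, key=lambda f: -f[1]):
        for s in range(num_slots):
            if load[s] + weight <= slot_capacity and \
               not any(collides(fid, other) for other in slots[s]):
                slots[s].append(fid)
                load[s] += weight
                break
        else:
            return None  # infeasible: frame fits nowhere
    return slots

# Toy collision predicate: frames from the same branch collide,
# frames from independent branches never do.
branch = {"a": 1, "b": 1, "c": 2, "d": 2}
frames = [("a", 8), ("b", 8), ("c", 6), ("d", 4)]
result = decreasing_first_fit(frames, 2, 16,
                              lambda x, y: branch[x] == branch[y])
print(result)  # {0: ['a', 'c'], 1: ['b', 'd']}
```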
The second algorithm we would like to present is the branch-and-price algorithm, which is optimal (with respect to the number of used slots).
This algorithm first tries to find the smallest possible subset of packings that contains all frames; this is called the master problem. The algorithm starts with only a very small set of packings and then iteratively adds packings to this set (a process called column generation); after some time it finishes with only a limited number of all possible packings. From this point on it is called the RMP (Restricted Master Problem). The RMP is solved by iteratively checking feasible sets until an optimal one has been found, which is done by the simplex algorithm. To determine which packing should be included in the smallest subset, a "pricer" is used in each step of the simplex algorithm.
The pricer tries to maximize the product yᵀp by finding a new packing p. In this product, y denotes the dual value vector obtained from the simplex algorithm, in which every frame maps to a value. When the pricer finds a new packing p, it computes the product; if yᵀp > 1, the packing is added to the RMP (the smallest subset of packings), otherwise it is not.
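The pricing test itself is just a dot product (my own illustration; here p is assumed to be a 0/1 vector marking which frames a candidate packing contains):

```python
def accept_packing(y, p):
    """Column-generation acceptance test: add the candidate packing p to
    the restricted master problem iff y^T p > 1, where y holds the dual
    value of each frame obtained from the last simplex step."""
    return sum(yi * pi for yi, pi in zip(y, p)) > 1

duals = [0.6, 0.5, 0.2]                  # one dual value per frame
print(accept_packing(duals, [1, 1, 0]))  # True  (0.6 + 0.5 > 1)
print(accept_packing(duals, [0, 1, 1]))  # False (0.5 + 0.2 <= 1)
```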
We present two kinds of pricers:
1) The First-Fit pricer, which is very fast but not optimal. Its approach is very similar to the DFF (Decreasing First-Fit) scheduler: if the first result is good, take it.
2) The ILP pricer, which is optimal but much slower than the First-Fit pricer. In practice this pricer is only used when the First-Fit pricer cannot find a packing.
The two proposed algorithms are very different when we compare their computation times. The DFF algorithm is very fast and in practice provides reasonably good schedule schemes, but it is not optimal. BP (branch and price) is an optimal algorithm, but due to its ILP pricing its run time is very long (in some cases many hours), and it is very computationally expensive. The BP algorithm with the First-Fit pricer is not optimal, but gives reasonably good solutions in a reasonable amount of time.
IV. CONCLUSIONS
In this paper we presented approaches to scheduling messages in FlexRay networks. FlexRay networks are brand new systems in automotive networking, so some of the algorithms presented in this paper still give non-optimal solutions. We should notice, however, that the solutions those algorithms give are feasible (the most important issue for a hard real-time system) and that the algorithms are able to produce a reasonable schedule scheme in a reasonable amount of time. In switched FlexRay networks we can have multiple senders in one FlexRay cycle, and thus achieve faster information exchange in FlexRay networks.
V. REFERENCES
[1] Haibo Zeng, Wei Zheng, Marco Di Natale, "Scheduling the FlexRay Bus Using Optimization Techniques" [Design Automation Conference, 2009. DAC '09. 46th ACM/IEEE, July 2009]
[2] Ece Guran Schmidt, Klaus Schmidt, "Message Scheduling for the FlexRay Protocol: The Static Segment" [IEEE Transactions on Vehicular Technology, June 2009]
[3] Bogdan Tanasa, Unmesh D. Bordoloi, Petru Eles, Zebo Peng, "Scheduling for Fault-Tolerant Communication on the Static Segment of FlexRay" [31st IEEE Real-Time Systems Symposium (RTSS10), San Diego, CA, USA, November 30 - December 3, 2010]
[4] Ece Guran Schmidt, Klaus Schmidt, "Message Scheduling for the FlexRay Protocol: The Dynamic Segment" [IEEE Transactions on Vehicular Technology, June 2009]
[5] Thijs Schenkelaars, Bart Vermeulen, Kees Goossens, "Optimal Scheduling of Switched FlexRay Networks" [Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, 14-18 March 2011]
CHALLENGES FACED BY ON-CHIP NETWORK IMPLEMENTATION FOR REAL-TIME EMBEDDED SYSTEMS

Naga Rajesh Garikiparthi
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universitat Kaiserslautern

Raphael Guerra
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universitat Kaiserslautern
Abstract—The objective of this paper is to address the challenges of energy consumption, hardware overhead and performance confronted by Network on Chip and the implementations proposed to overcome them. Energy efficiency is one of the most important issues and relies strikingly on task allocation. The existing work does not well illustrate the trade-off between power consumed by the processor and the network links. One goal of this paper is to present a scheme which accounts for the trade-off between communication and processing power, whose neglect results in suboptimal task mappings from the system point of view. From the hardware implementation cost standpoint, a priority share policy for real-time on-chip communication is described which reduces resource overhead. Experiments show that significant resource sharing can be achieved without missing deadlines. To maintain low latency and high throughput for best effort traffic while assuring guaranteed services to traffic with hard deadlines, the paper describes a run-time configurable NoC that enables bandwidth guarantees with minimum impact on the latency of best effort traffic. There is a lot of heterogeneity in embedded system applications (e.g., automotive, avionics and consumer electronics), hence the need for a predictable, fault-tolerant, integrated execution environment for component-based design. This paper also focuses on a Time Triggered NoC architecture that provides a uniform interface to all types of components, which supports component-based design and thus enables reuse. It also offers inherent fault isolation and mechanisms such as Integrated Resource Management and power-aware system behavior.
I. INTRODUCTION
MPSoCs are being used for implementing a wide variety of multi-functional applications in parallel, for example in mobile devices for streaming media and general-purpose productivity applications. Network on Chip (NoC) has emerged as a new paradigm to overcome the limitations of the current bus-based communication infrastructure for System on Chip (SoC) designs. The bandwidth of a NoC scales with the complexity of the SoC network size. Computation is no longer a major problem today in building applications, but communication in SoCs has become a bottleneck in guaranteeing real-time and energy-efficiency constraints.
Energy efficiency is one of the most critical issues in embedded system design. There is a need to evaluate the tradeoff between processing power and communication power at design time [1]. The efficient co-execution of diverse applications on MPSoCs, taking energy consumption into account, is crucial to the success of the architectures employed by product developers. Mobile Internet devices, which are used for communication (hard real time), multimedia playback, content creation and augmented reality (soft real time), as well as office applications (best effort), potentially require bandwidth all at the same time [3].
Wormhole switching has been widely used for real-time communication on NoCs. The non-determinism in routing packets due to contention for channels leads to delays and jitter which violate hard real-time constraints. The major problem of the priority-based approach is precisely that it requires a distinct priority and an exclusive virtual channel for each traffic flow in a router port. This restrictive implementation structure results in high area and energy overhead and heavily limits its employment and development [1]. An increase in latency often reduces General Purpose (GP) application performance dramatically, so GP traffic can be considered latency sensitive [3]. The challenges of communication among nearly autonomous, possibly heterogeneous IP blocks in MPSoCs can be addressed by a novel system architecture which offers a component-based design methodology.
A major roadblock in the MPSoC development process is the mapping and scheduling of tasks onto the platform. In pursuing the system-level optimal solution, it is difficult to consider the trade-off between task processing power and communication power together. Furthermore, sharing priorities among traffic flows can lead to significant blocking and unpredictable network latency.
A unified approach was developed in [1] for efficient computation of the system-wide energy-optimal task allocation by extending the Integer Linear Programming (ILP) and Simulated Annealing formulations. The experimental results show that the new Simulated Annealing (SA) heuristic achieves performance very close to the global optimum and much higher
execution speed than ILP-based solutions. The problem of resource overhead due to the priority-based approach was solved in [2] by a priority share policy, where multiple traffic flows are assigned the same priority and hence share the same virtual channel. The numbers of virtual channels and priorities were reduced by 50% and 70%, respectively. The best possible latency for best effort traffic was achieved by prioritizing best effort traffic over guaranteed throughput traffic while limiting the bandwidth allocated to best effort traffic such that enough resources remain to meet the guarantees. Furthermore, the design allows configuration of the QoS mechanisms at run time for flexible use of the system [3]. A novel architectural framework that supports composability and addresses the challenge of side-effect-free composition of component services into large systems was proposed with a Time-Triggered NoC. This approach contributes an elevated level of design abstraction, determinism through encapsulation, a global time base for the SoC and an Integrated Resource Management.
In this context, this paper addresses various challenges encountered by Network on Chip and solutions proposed by eminent researchers in the domain. Section 2 describes the proposals briefly and outlines their results. Section 3 discusses further points of view and Section 4 draws conclusions.
II. SOLUTIONS FOR VARIOUS CHALLENGES

A. Energy Aware Task Allocation
The idea presented in [1] takes communication power into account to reduce system-level energy consumption. An objective function is defined which has to be minimized in the ILP formulation. In the current framework, different Dynamic Voltage Scaling (DVS) modes of a DVS-enabled processor can also be incorporated to reduce the task processing power. Today's MPSoCs involve complex designs; hence the computational complexity of the ILP formulation grows rapidly. Therefore, [1] targets a scalable algorithm through Simulated Annealing with a Timing Adjustment algorithm. The SA optimization is started from a baseline mapping instead of a random mapping. Before computing the mapping, all tasks are sorted in decreasing order of desirability. The idea of desirability is to consider the tasks with high gain first, in order to prevent their preferred processors from being occupied by other tasks. The task with the highest desirability is first assigned to the processor with the lowest processing energy. Each subsequent task is allocated to the processor that has enough resources and minimizes the total energy consumption, considering the communication with the previously mapped tasks. This procedure is repeated until all tasks are allocated. The optimization process continues until it reaches the total number of iterations N configured by the user [1].
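The greedy baseline mapping that seeds the SA can be sketched as follows. The data layout and cost model here are illustrative, not the exact formulation of [1]: processing energy, inter-task traffic and per-hop link energy are given as plain dictionaries.

```python
def baseline_mapping(tasks, desirability, load, capacity,
                     proc_energy, traffic, link_cost):
    """Greedy baseline mapping sketch after [1].
    desirability[t]   : gain of giving task t its preferred processor
    load[t]           : resource demand of task t
    capacity[p]       : remaining capacity of processor p (mutated)
    proc_energy[p][t] : processing energy of task t on processor p
    traffic[t][u]     : data volume exchanged between tasks t and u
    link_cost[p][q]   : energy per data unit between processors p and q
    """
    mapping = {}
    # Handle high-gain tasks first so their preferred processor stays free.
    for t in sorted(tasks, key=lambda t: desirability[t], reverse=True):
        best_p, best_cost = None, float("inf")
        for p in capacity:
            if capacity[p] < load[t]:
                continue            # processor lacks resources for t
            cost = proc_energy[p][t]
            # Add communication energy to every already-mapped neighbour.
            for u, q in mapping.items():
                cost += traffic[t].get(u, 0) * link_cost[p][q]
            if cost < best_cost:
                best_p, best_cost = p, cost
        mapping[t] = best_p
        capacity[best_p] -= load[t]
    return mapping
```

With this baseline, two heavily communicating tasks tend to land on the same (or a nearby) processor, because the communication term dominates once the first one is placed.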
The SA is extended by a Timing Adjustment (TA) phase, whose aim is to fine-tune the timing of an accepted mapping to meet the timing constraints. This phase keeps the mapping unmodified if it meets the deadline. Otherwise, it examines the neighboring mappings to find a new solution that can improve the timing. The adjusted mapping is then checked for feasibility, and the feasible mappings are compared to find the best mapping so far. This procedure continues until the deadline is met or no improvement can be found for any task. In the former case, the final mapping is the output of the TA; in the latter case, the problem is reported infeasible.

Fig. 1. Experimental results for evaluating the tradeoff between Ep and Ecom [1]
The energy-aware task allocation described in the paper was evaluated using 10 task graphs, as shown in Figure 1. It was evident that by sacrificing processing energy, a larger amount of communication energy can be saved, resulting in a system-wide saving. Averaging over the 10 graphs, a 15% energy saving was achieved. The heuristic algorithm outperforms ILP1, in which communication energy was not considered for minimization, saving 11% energy in comparison. Moreover, while solving an ILP problem may take from several minutes up to several hours, SA-TA takes less than a second.
B. Worm Hole Switching with Priority Share Policy
Fig. 2. A case of traffic flows with Priority Share [2]

Wormhole switching is a very popular cut-through strategy for NoCs. Each packet is divided into a number of flits, and each flit is given a priority. The shared physical link is accessed through virtual channels, a resource allocation technique that incorporates multiple independent buffers to accommodate the flits of each shared link. A priority arbiter decides which virtual channel is given access to the shared physical link. Differing from previous works, multiple traffic flows per virtual channel are supported. Each traffic flow is assigned a natural priority produced by the distinct-priority-per-flow policy, and also a system priority, where all flows competing for the same virtual channel are given the same system priority, as shown in Figure 2. The complicated blocking analysis was avoided by collapsing all flows with the same priority into one single scheduling entity, and a novel schedulability analysis was presented. This requires the consideration of direct and indirect competing relationships. In the direct competing relationship, a traffic flow has at least one physical link in common with the observed traffic flow. In the indirect competing relationship, there is an intervening traffic flow between two traffic flows which do not share a link. Assuming that all natural and system priorities are assigned, a traffic flow may encounter the following interferences and blocking:
• Direct interference from traffic flows with higher system priority
• Indirect interference from traffic flows with higher system priority
• Direct blocking from traffic flows with the same system priority
• Indirect blocking from traffic flows with the same system priority [2].
In [2], a greedy priority allocation policy is introduced which ensures schedulability with reduced time complexity. The intuition of the algorithm is as follows: at each system priority Gk, if there exists a traffic flow τi such that, when τi is mapped to priority Gk, all flows which have been assigned system priority Gk or lower are still schedulable, then τi is assigned priority Gk. If no additional flow mapped to Gk can lead to a schedulable system, the system priority is increased. If a schedulable priority ordering exists under the distinct-priority-per-flow policy, then there must also exist a schedulable priority ordering under priority share [2].
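The greedy allocation can be sketched as follows. The `schedulable` callback stands in for the schedulability analysis of [2], which is not reproduced here; its signature is an assumption made for this sketch.

```python
def greedy_priority_share(flows, schedulable):
    """Greedy system-priority allocation sketch after [2].
    flows       : traffic flows, e.g. ordered by natural priority
    schedulable : callback schedulable(flow, level, assigned) -> bool;
                  True if mapping `flow` to system priority `level`
                  keeps every flow assigned level or lower schedulable.
    Returns {flow: system_priority}, or None if the set is infeasible.
    """
    assigned = {}                 # flow -> system priority Gk
    level = 0
    unplaced = list(flows)
    while unplaced:
        progress = False
        for f in list(unplaced):
            if schedulable(f, level, assigned):
                assigned[f] = level      # f shares the VC at level Gk
                unplaced.remove(f)
                progress = True
        if not progress:
            level += 1                   # nothing more fits at Gk
            if level > len(flows):       # more levels than flows: give up
                return None
    return assigned
```

Flows are packed into the lowest (highest-priority) level that keeps the system schedulable, so virtual channels and priority levels are reused as aggressively as the analysis permits.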
The priority share policy exhibits a remarkable hardware cost saving, consuming only 20.3% of the priority levels and 38.4% of the virtual channels compared with the original approach when the network's maximum link load reaches 0.4.
C. QoS Aware Link Arbitration Scheme
The mechanism that gives priority to best effort (BE) traffic for optimal latency, while limiting its rate to leave enough bandwidth for guaranteed throughput (GT) traffic, is implemented in the switch allocator of each router. Every output port requires a separate arbiter, which consists of a selective priority arbiter and a traffic shaper. The traffic shaper maintains a bucket of tokens, to which tokens are added at an average token rate rtoken. The selective priority arbiter grants requests from BE virtual channels first as long as there are tokens in the shaper's bucket. Consequently, the token rate determines the average rate of prioritized BE traffic. For every BE flit sent, one token is removed from the bucket; when the bucket is depleted of all tokens, GT traffic is prioritized.
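The per-port arbitration rule can be sketched as a token bucket. The class and parameter names are illustrative; the rule that token-less BE traffic still uses otherwise idle cycles is an assumption of this sketch, since shaping only governs its priority over GT traffic.

```python
class SelectivePriorityArbiter:
    """Sketch of the per-output-port arbiter of [3]: best effort (BE)
    flits win over guaranteed throughput (GT) flits only while the
    token bucket is non-empty, capping the average prioritized BE rate."""

    def __init__(self, token_rate, bucket_size):
        self.token_rate = token_rate     # tokens added per cycle (rtoken)
        self.bucket_size = bucket_size   # cap on accumulated tokens
        self.tokens = bucket_size

    def grant(self, be_request, gt_request):
        """Decide one cycle; returns 'BE', 'GT' or None."""
        self.tokens = min(self.bucket_size, self.tokens + self.token_rate)
        if be_request and self.tokens >= 1:
            self.tokens -= 1             # one token per prioritized BE flit
            return "BE"
        if gt_request:                   # bucket empty: GT is prioritized
            return "GT"
        if be_request:                   # no GT contender: BE may still go
            return "BE"
        return None
```

With a small token rate, bursts of BE flits drain the bucket quickly, after which GT traffic wins every contended cycle until tokens accumulate again.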
A GT connection between two routers can be set up by adjusting the traffic shaper settings on the corresponding output ports along the route of packets between those routers. To efficiently shape BE traffic only on the affected routes, routing must be deterministic for GT traffic; [3] implements distributed dimension-ordered XY routing. For the latency-sensitive best effort applications evaluated in the paper, a speed-up of up to 14% was achieved and the latency of BE traffic was improved by 47%.
D. Time Triggered Network on Chip
There is inherent concurrency in a typical embedded application (e.g., automotive electronics, avionics). The central element of the presented SoC architecture, as seen in Figure 3, is a Time Triggered NoC that connects multiple heterogeneous IP blocks called micro components. A micro component is an application subsystem that provides a part of the service of the overall system, for example a braking system. A micro component comprises two parts: a host and a Trusted Interface Sub System (TISS). The behavior of a micro component can disrupt neither the computations nor the communication performed by other micro components. The host implements the application services, and the TISS is a dedicated architectural element that protects the access to the TT-NoC. Each TISS contains a table which stores a priori knowledge concerning the global points in time of all message transmissions and receptions of the respective micro component, and it ensures the temporal ordering and consistent delivery order of the packets.
The purposes of the Time Triggered NoC encompass clock synchronization for the establishment of a global time base, as well as the predictable transport of periodic and sporadic messages. Using TDMA, the available bandwidth of the NoC is divided into periodic conflict-free sending slots. The allocation of sending slots of the time triggered NoC to micro components occurs through a communication primitive called a pulsed data stream. A pulsed data stream is a time-triggered, periodic, unidirectional data stream that transports data in pulses of defined length from one sender to n a priori identified receivers at a specified phase of every cycle of a periodic control system [4].
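The conflict-freedom requirement on the TDMA slots can be illustrated with a small check. This is only a sketch; representing a pulsed data stream as a (phase, length) pair within one cycle is an assumption made here, not the representation of [4].

```python
def slots_conflict_free(streams, cycle_len):
    """Sketch of the TDMA property behind pulsed data streams:
    each stream (phase, length) sends in [phase, phase + length)
    of every cycle of length cycle_len, and no two streams may
    ever claim the same sending slot."""
    occupied = set()
    for phase, length in streams:
        for t in range(phase, phase + length):
            slot = t % cycle_len          # slots repeat every cycle
            if slot in occupied:
                return False              # two pulses collide
            occupied.add(slot)
    return True
```

Because the table of send and receive instants is stored a priori in each TISS, such a check runs entirely at design time; at run time, no arbitration is needed.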
The TT-SoC enables integrated resource management through the Trusted Network Authority (TNA) and the Resource Management Authority (RMA). The RMA computes new resource allocations for the non-safety-critical application subsystems, while the TNA ensures that the new resource allocations have no adverse effect on the behavior of the safety-critical application subsystems. In order to prevent any unintended interference between subsystems, the time-triggered SoC architecture ensures temporal and spatial partitioning with respect to the encapsulated communication channels, which is enforced by the TISS.
The TT-NoC is composed of fragment switches. The TISS of each micro component is connected to exactly one fragment switch via the TTNoC interface. A particular flit called a routing flit, entering an interconnect, carries switching information called a switching op-code, which directly represents a hop. The complete sequence of switching information from the sending TISS to the receiving TISS is called the routing information and defines the route of an encapsulated communication channel. Hence, the sender determines the route that will be taken; this is called source routing [5]. The TTNoC enables the coexistence of several encapsulated communication channels that convey pulsed data streams at the same instant of time across the network topology. At the same time, interference must be prevented, either by avoiding the situation altogether or by interleaving fragments. The difficulty of specifying branched routes and multiple receivers in multicasting was solved by the split-point multicasting of the TTNoC.

Fig. 3. Structure of the Time Triggered SoC Architecture: trusted subsystem (shaded) and non-trusted subsystem (hosts of micro components) [4]

Based on the novel architecture described in the paper, a custom hardware prototype was produced by TTTech1. Considering a 32-bit data bus, a theoretical throughput of 11.2 Gbit/sec could be achieved on a single encapsulated channel.
III. DISCUSSIONS
In the method proposed for energy-aware task allocation in [1], the authors have not addressed the problem of hot spots. When they compute the baseline mapping, dependent tasks are placed close together to reduce the communication power. This can lead to hot spots due to increased local temperatures.
The solution presented in [2] to reduce the resource overhead is very elegant. Priority sharing, through which the number of virtual channels is reduced, decreases the cost, area and also the energy consumption. The traffic models presented and the novel schedulability analysis keep the computational complexity low. The quality of service can be flexibly explored at design time. As the number of traffic flows increases, the resource saving also increases.
1TTTech Computertechnik AG, http://www.tttech.com
The implementation of [3] results in a power overhead. Considering its application in Mobile Internet Devices, power consumption is much more critical than achieving optimal latency for BE traffic, thus thwarting its practical implementation.
The time-triggered NoC architecture raises several interesting points:
• Inherent fault isolation
• Predictability
• Flexibility of heterogeneous component integration
• High throughput.
IV. CONCLUSIONS
With communication gaining dominance over computation in SoC design, at least in consumer electronics, and with the advent of multi-cores in mobile devices, power and area consumption will be the unique selling point for market leaders. With techniques such as [1], substantial energy consumption can be saved. Costs also play a major role in commercial embedded systems; implementations as proposed in [2] reduce the hardware implementation costs. For applications involving concurrent real-time and general-purpose execution, solutions such as [3] improve the latency of BE traffic.
With the increasing complexity of embedded systems, partitioning an application into a set of autonomous concurrent functions allows performance to scale linearly with the number of devices. The Time Triggered NoC offers an increased ease of abstraction for mixed-criticality systems, characterized by safety-critical and less critical micro components. While other approaches represent an evolution of well-known design styles, the TT-SoC engenders a revolution in SoC design.
REFERENCES
[1] Jia Huang, Christian Buckl, Andreas Raabe and Alois Knoll: Energy-Aware Task Allocation for Network-on-Chip Based Heterogeneous Multiprocessor Systems. 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2011.
[2] Zheng Shi and Alan Burns: Real-Time Communication Analysis with a Priority Share Policy in On-Chip Networks. 21st Euromicro Conference on Real-Time Systems, 2009.
[3] Jonas Diemer, Rolf Ernst and Michael Kauschke: Efficient Throughput-Guarantees for Latency-Sensitive Networks-On-Chip. Design Automation Conference, 2009.
[4] Roman Obermaisser, Christian El Salloum, Bernhard Huber and Hermann Kopetz: Time-Triggered System-on-Chip Architecture. Industrial Electronics, 2008.
[5] Christian Paukovits and Hermann Kopetz: Concepts of Switching in the Time-Triggered Network-on-Chip. Embedded and Real-Time Computing Systems and Applications Conference, 2008.