Proceedings of
5th Real-Time Systems Seminar
Summer 2011
18 July 2011, Kaiserslautern, Germany
Edited by
Raphael Guerra
Gerhard Fohler
Table of Contents: Real-Time Systems Seminar Summer 2011
Real Time properties and applications of wireless sensor networks .............................................. 1
Anoop Bhagyanath, Raphael Guerra
Global view of methods for evaluating end-to-end delays on AFDX ............................................ 6
Canko Canew, Raphael Guerra
Optimal scheduling approaches for FlexRay bus ......................................................................... 10
Piotr Swedrowski, Raphael Guerra
Challenges faced by on-chip network implementation for real-time embedded applications ...... 14
Naga Rajesh Garikiparthi, Raphael Guerra
REAL TIME PROPERTIES AND APPLICATIONS OF WIRELESS SENSOR NETWORKS
Anoop Bhagyanath
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universität Kaiserslautern

Raphael Guerra
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universität Kaiserslautern
Abstract—Wireless Sensor Networks (WSN) consist of spatially distributed nodes that are capable of sensing, gathering, processing and communicating data. A vast majority of WSN applications are real time, i.e. they require a bounded-delay guarantee on packet delivery. We focus on the real time applications of and challenges in WSN. This paper points out the key challenges in WSN from a real time perspective and the real time techniques employed to meet these challenges. In doing so, we focus on two specific real time methods: 1) the Contention-Free Periodic Message Scheduler MAC protocol [3] and 2) RAP, a real time communication architecture for large scale WSN [4]. We also discuss the drawbacks of these techniques and conclude by suggesting future work to overcome these drawbacks.
I. INTRODUCTION
A Wireless Sensor Network (WSN) is built up of a collection of nodes that communicate wirelessly. Each node has processing capability, memory, a radio frequency (RF) transceiver, a power source and various sensors or actuators. Many applications can use such a network, and many of them require real time guarantees. Environmental monitoring, industrial monitoring, military, agriculture, surveillance and the medical field are only a few examples of the variety of possible real time applications of WSN. These applications are categorized as real time because they demand that data transmitted from the source is delivered to the destination within a specified deadline.
In this paper, we address real time challenges in WSN. The most obvious challenge from a real time perspective is to guarantee end to end delay for messages. This must be achieved under the constraints of minimum energy consumption and minimum resource requirements: the former ensures a longer life for the WSN and the latter ensures low cost. Employing WSN satisfactorily in real time applications requires the development of robust real time techniques. The unlimited potential of WSN in various applications, and the fact that the vast majority of these applications are real time, increases the significance of developing robust and reliable real time techniques to address these challenges. The use of WSN in the medical field to connect medical sensors wirelessly to handheld devices or PCs in order to monitor and treat patients is an example of a hard real time WSN system. The wireless medium is a lossy link with high unpredictability and unreliability, which makes the development of robust real time techniques for WSN very difficult. In addition, factors such as the minimization of the energy consumed and the resources used by the sensor nodes, which in turn minimize cost and energy, make this task even more challenging. Most existing protocols end up making many unrealistic assumptions, such as ignoring the possibility of wireless interference or assuming location awareness, which limits their applicability to the real world.
We present in this paper real time applications and challenges in WSN. We then focus on two specific real time techniques: 1) the Contention-Free Periodic Message Scheduler MAC protocol [3] and 2) RAP, a real time communication framework for WSN [4]. Reference [3] describes techniques to derive a contention-free message set from a given periodic message set; a message set is said to be contention-free if only one message is ready to be transmitted at a time. This contention-free periodic message scheduler has lower time and space complexities compared to an EDF scheduler. The latter reference [4] presents a complete real time communication framework for large scale WSN. The key component of this architecture is Velocity Monotonic Scheduling (VMS), which prioritizes messages depending on their deadlines and the distances they have to travel. RAP provides general APIs as a convenient high level service for general sensing applications, and the framework scales well in large scale wireless sensor networks because of the use of localized and efficient algorithms at every layer. Simulations using the RAP communication architecture also show that it considerably reduces end to end deadline misses in multi-hop scenarios. We also analyse and present the drawbacks of these techniques in terms of the impractical assumptions made.
In Section 2, we give an overview of real time applications of WSN. Real time challenges are listed in Section 3. Section 4 focuses on two real time methods: 1) a contention-free periodic message scheduler MAC protocol and 2) RAP, a real time communication architecture. Section 5 analyses the drawbacks of these real time techniques and proposes future work to overcome them. Finally, we conclude the paper by summarizing the contents discussed.
II. REAL TIME APPLICATIONS

In this section, we run through a list of possible real time applications of WSN and also list a few examples in use. Many possible real time applications such as environmental monitoring, industrial monitoring, surveillance, medicine, agriculture and structural monitoring can utilize wireless sensor networks. In environmental monitoring, the sensor nodes are deployed over a region where some phenomenon is to be monitored. When the sensors detect the event being monitored, the event is reported to a base station, which takes appropriate actions. WSN save the cost of wiring and, more importantly, enable access to previously inaccessible locations, such as rotating machinery in industrial monitoring or animals in habitat monitoring.
There are many areas where WSNs are currently used. A WSN is deployed to monitor eruptions at Volcán Tungurahua in central Ecuador: the sensors monitor the infrasonic signals during an eruption, and this data is transmitted to a base station 9 km away. Another example is CodeBlue, a design to support medical sensors connected wirelessly to handheld devices or PCs to monitor and treat patients. The sensors collect heart rate, oxygen saturation and EKG data and transmit them over a 100 m wireless link to a PDA; however, this work is currently not approved for use in practice. PinPtr is a wireless sensor network developed to localize the position of a sniper. The sensors detect the muzzle blast and the acoustic wave that originate from the sound of shooting; the times of arrival of these acoustic waves are used to estimate the sniper's position, which is sent to the base station. In DARPA's self healing minefield, sensor nodes are placed on anti-tank mines, and peer to peer communication between these nodes is used to respond to attacks by activating appropriate anti-tank mines, thus complicating the progress of enemy troops. Intel's wireless vineyard is an example of using WSN for agricultural monitoring. Here, sensor nodes placed in the field collect data, which is redirected to data mules, small devices carried by people (or dogs); these data mules interpret the collected data and use it to make decisions on the presence of parasites and the use of appropriate insecticides [1].
III. REAL TIME CHALLENGES

In this section, we look at the challenges in wireless sensor networks from a real time perspective. The most obvious real time challenge is to guarantee bounded end to end delay for messages. In practice, most end to end message transmissions are multi hop, which is a concern of the WSN routing layer. Scalability of the WSN is another factor that the routing layer must support for most applications. Real time Medium Access Control (MAC) protocols aim at guaranteeing forwarding delays in single hop scenarios. Wireless links are generally lossy and highly unpredictable due to environmental factors, which makes the goal of meeting the real time challenges associated with WSN very difficult. Furthermore, the delay guarantees have to be made under the strict energy and resource constraints of sensor nodes, which makes the task even more challenging.
The lifetime of a WSN is extremely critical for many applications [2]. The energy consumption of the nodes is the main limiting factor of the lifetime of a network; thus, minimizing energy at all the individual layers of the network protocol stack is very important to prolong the life of the system. With regard to resource constraints, real time protocols should be designed to incur minimal overhead, so that the nodes need only minimal processing power and only a small amount of memory to operate. This ensures a low cost for the WSN. The same factors must be considered while designing other software layers, such as the application or transport layer, for a sensor node.
IV. REAL TIME METHODS
There are many common mechanisms that real time MAC protocols and routing protocols use to meet timing constraints. MAC protocols often make use of traffic regulation mechanisms, scheduling of message transmissions, structured network topologies, etc. [1]. Routing protocols often rely on geographic forwarding, multi-path routing, prioritized queuing models, etc. [1]. In the next two subsections, we present in detail two real time techniques designed to be employed in WSN.
A. CONTENTION-FREE PERIODIC MESSAGE SCHEDULER MEDIUM ACCESS CONTROL (MAC) LAYER
Here we take a look at the contention-free periodic message scheduler medium access control described in detail in [3]. It is assumed that the network is single hop and that every node in the network can communicate directly with every other node. As the name indicates, this MAC protocol only deals with periodic messages, and each message consists of a number of fixed-size packets. At the physical layer, the assumption is made that each individual packet takes a fixed network time unit to be transmitted. A message set is harmonic if and only if each message period is a positive integer multiple of all smaller message periods or, equivalently, each message period divides all larger message periods. A synchronization protocol is described in [3], which requires only a small overhead, on the order of 1% of network bandwidth. A single message m_i is modeled by the parameters (φ_i, C_i, D_i, T_i), where φ_i is the phase or offset, C_i the number of packets in the message, D_i the relative deadline, which specifies the length of time after release by which transmission must be complete, and T_i the message period. The utilization of a message set is U_M = Σ_{i=1}^{|M|} C_i/T_i, where |M| is the number of messages in the set. Given a set of messages M, we find a message set M^cf whose schedule is contention-free. A message set is said to be contention-free if only one message is ready to be transmitted at a time.
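As a small illustration of these definitions, the following Python sketch (our own naming, not code from [3]) computes the utilization of a message set and checks whether a set of periods is harmonic:

```python
def utilization(messages):
    """messages: list of (C_i, T_i) pairs; returns U_M = sum of C_i / T_i."""
    return sum(c / t for c, t in messages)

def is_harmonic(periods):
    """True iff every period divides all larger periods."""
    ps = sorted(periods)
    return all(ps[i + 1] % ps[i] == 0 for i in range(len(ps) - 1))
```

For example, the set {(1, 4), (1, 8), (2, 8)} has utilization 0.625, and the periods {4, 8, 8} are harmonic while {2, 3, 6} are not.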
This section describes a centralized sub-optimal attribute assignment algorithm to derive a contention-free message set M^cf from a given message set M consisting of messages m_i = (φ_i, C_i, D_i, T_i). As a first step, we convert these multi-packet messages into single-packet messages. This simplifies the analysis and gives the protocol more degrees of freedom. The period of each single-packet message may be less than or equal to the period of the corresponding multi-packet message: ∀i : T^cf_i ≤ T_i. We also require that the phases of the individual packets of the same message preserve the order among themselves: ∀m_i ∈ M, ∀j,k = 1...C_i : j < k → φ^cf_{i,j} < φ^cf_{i,k}. The phase of each packet is greater than or equal to zero and less than that message's period: ∀i,j : 0 ≤ φ^cf_{i,j} < T^cf_i. The deadline of the original multi-packet message is met if the assigned phases of the individual packets of this message differ by at most the deadline. To prove that this phase-difference constraint holds, we assume for simplicity that a message's deadline equals its period, ∀m_i ∈ M : D_i = T_i, which is a realistic assumption. Suppose ∃i : φ^cf_{i,C_i} − φ^cf_{i,1} > T_i, or equivalently, ∃i : φ^cf_{i,C_i} > T_i + φ^cf_{i,1}. Since φ^cf_{i,1} ≥ 0, we have φ^cf_{i,C_i} > T_i. This contradicts the phase constraint made earlier, which says ∀i,j : 0 ≤ φ^cf_{i,j} < T^cf_i [3]. Thus, the maximum phase difference between the individual packets of a message cannot exceed the deadline of that message, which guarantees that the deadline of the original multi-packet message is preserved even though the message is divided into many single packets. Now we have a set of single-packet messages with constraints on their periods and phases. Next we show how the centralized sub-optimal attribute assignment algorithm derives a contention-free message set from this message set. Algorithm 1 is shown in Figure 1 below.
Fig. 1. Centralized attribute assignment algorithm [3]
The basic idea behind this algorithm is to assign the phases and the periods of the messages so as to obtain a contention-free message set from the given message set. Lines 1 and 2 of Algorithm 1 make the message set harmonic by reducing each period to the largest power of 2 that does not exceed it. Line 3 initializes a variable which holds the next available time slot for assigning the phase of a message. The table S represents the message schedule, in which a true entry corresponds to a time slot already in use by a message for its transmission; line 4 initializes this table to false. In line 5, we iterate through all messages, and in line 6 we assign the current message's phase to the next available free slot. Lines 6 and 7 update the message schedule table S accordingly. Finally, lines 9 and 10 update the phase variable to the next available free slot by scanning the message schedule [3]. This algorithm is executed offline. Online, for each message a node sends or receives, a real time task is created whose attributes are derived from the message attributes by multiplying each by the network time unit t_unit. The scheduling algorithm is then implemented by executing each task when it is released.
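The walkthrough above can be sketched in Python. This is our reconstruction from the textual description, not the authors' code; in particular, we process messages in order of increasing period (which the description leaves implicit) and return None when no feasible phase exists:

```python
def power_of_two_floor(t):
    """Largest power of 2 that does not exceed t (lines 1-2 of Algorithm 1)."""
    p = 1
    while p * 2 <= t:
        p *= 2
    return p

def assign_attributes(periods):
    """Assign (phase, period) to single-packet messages so that at most one
    message is ready at any time slot; periods are in network time units."""
    order = sorted(range(len(periods)), key=lambda i: periods[i])
    T = [power_of_two_floor(t) for t in periods]   # harmonic periods
    hyper = max(T)                                 # hyperperiod
    S = [False] * hyper                            # schedule table (line 4)
    phi = 0                                        # next free slot (line 3)
    result = [None] * len(periods)
    for i in order:
        if phi >= T[i]:            # the phase must satisfy 0 <= phi < T_i
            return None            # no feasible assignment found
        result[i] = (phi, T[i])
        for slot in range(phi, hyper, T[i]):       # mark all transmissions
            S[slot] = True
        while phi < hyper and S[phi]:              # advance (lines 9-10)
            phi += 1
    return result
```

For periods {4, 8, 8} (utilization 1/2), the sketch yields phases 0, 1 and 2, and the marked transmission slots within the hyperperiod are pairwise disjoint.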
From the properties of harmonic message sets, it can be proved that if a message set M has utilization U_M ≤ 1/2, then there exists a contention-free phase and period assignment [3]. A comparison was made between an EDF-scheduled Time Division Multiple Access (TDMA) scheme and the contention-free scheduler TDMA in terms of time and space complexity. Time complexity indicates the complexity of the algorithm that determines the message to be transmitted during a time slot, and space complexity represents the memory requirement at each sensor node. It was shown that the contention-free scheduler TDMA has lower time and space complexities when a node receives less than 70% and 95% of all messages, respectively [3].
B. RAP: A REAL TIME COMMUNICATION ARCHITECTURE FOR WIRELESS SENSOR NETWORKS
From a MAC protocol for real time wireless sensor networks, we now move on to a complete real time communication architecture for WSN. The RAP communication architecture is described in detail in [4], and the framework is shown in Figure 2.
Fig. 2. The RAP Communication Architecture [4]
The applications interact with the RAP communication framework through a set of query/event APIs. In general, a message (query or event) is directed to a set of sensors in an area rather than to a specific sensor address. The Location Addressed Protocol (LAP) converts a request to send a message to a set of sensors in a specific area into the addresses of those sensors; the assumption made is that the routing layer is aware of the physical geography. Geographic Forwarding decides to which immediate neighbor to transmit the message: it transmits a packet to an immediate neighbor if 1) that neighbor has the shortest geographic distance to the packet's destination among all neighbors and 2) it is closer to the destination than the forwarding node itself [4].
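The forwarding rule can be illustrated with a small Python sketch. This is a simplified greedy model under our own naming, not RAP's actual implementation:

```python
import math

def next_hop(current, destination, neighbors):
    """Greedy geographic forwarding: choose the neighbor geographically
    closest to the destination, provided it is strictly closer than the
    current node itself; otherwise report a local minimum (None)."""
    def d(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    if not neighbors:
        return None
    best = min(neighbors, key=lambda n: d(n, destination))
    return best if d(best, destination) < d(current, destination) else None
```

Note that greedy forwarding can fail at a "void": if no neighbor improves on the current node, the sketch returns None rather than routing around the obstacle.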
The key component of the RAP real time communication architecture is Velocity Monotonic Scheduling (VMS), which makes packet scheduling at a node both deadline aware and distance aware. The packet priority is assigned in such a way that the shorter the deadline, the higher the priority of the packet, and the longer the distance, the higher the priority of the packet. VMS assigns the priority of a packet based on its requested velocity, and two priority assignment schemes are discussed. Static Velocity Monotonic (SVM) calculates a requested velocity at the sender of each packet, and this velocity remains fixed on each hop. Assuming a sender location (x_0, y_0), a destination location (x_d, y_d) and an end to end deadline D for the packet, the requested velocity is set to

V = dis(x_0, y_0, x_d, y_d) / D

where dis(x_0, y_0, x_d, y_d) is the distance between sender and destination. Dynamic Velocity Monotonic (DVM) recalculates the requested velocity of the packet upon its arrival at each forwarding sensor node. The velocity is given by

V = dis(x_i, y_i, x_d, y_d) / (D − T_i)

where (x_i, y_i) is the location of the current node and the new term T_i represents the elapsed time, i.e. the time that the packet has been in the network [4].
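The two velocity computations translate directly into code. The sketch below uses Euclidean distance and our own function names, as an illustration of the formulas rather than RAP's implementation:

```python
import math

def dis(x1, y1, x2, y2):
    """Euclidean distance between two points."""
    return math.hypot(x2 - x1, y2 - y1)

def svm_velocity(sender, dest, deadline):
    """Static Velocity Monotonic: computed once at the sender, fixed per hop."""
    (x0, y0), (xd, yd) = sender, dest
    return dis(x0, y0, xd, yd) / deadline

def dvm_velocity(current, dest, deadline, elapsed):
    """Dynamic Velocity Monotonic: recomputed at every hop from the
    remaining distance and the remaining time D - T_i."""
    (xi, yi), (xd, yd) = current, dest
    return dis(xi, yi, xd, yd) / (deadline - elapsed)
```

For instance, a packet 100 m from its destination with a 2 s deadline requests 50 m/s under SVM; if after 1 s of network time it still has 60 m to go, DVM raises the requested velocity to 60 m/s, so packets behind schedule gain priority.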
Finally, the MAC layer ensures that access to the wireless medium is granted according to the priority of the packet. The MAC layer uses the light-weight CSMA/CA protocol, but with a couple of extensions that include the priority of packets in determining medium access. First, the initial wait time after idle, which represents the time a node waits after the channel becomes idle, is given by

DIFS = BASE_DIFS × PRIORITY

where DIFS is a counter value set by the node once the channel becomes idle; the node waits for a random period of time between 0 and DIFS before sending a Request To Send (RTS) packet. Thus, nodes with higher priority packets (corresponding to a smaller value of PRIORITY) choose a smaller waiting time. Second, the backoff window increase function, which represents the increase in the time that a node waits when a transmission collision occurs, is given by

CW = CW × (2 + (PRIORITY − 1)/MAX_PRIORITY)

where MAX_PRIORITY is the maximum value of PRIORITY (corresponding to the lowest priority). Thus, the backoff time for a node with lower priority packets increases faster than for a node with higher priority packets waiting to be transmitted [4]. Detailed simulations of wireless sensor networks demonstrated that RAP considerably reduces end to end deadline misses in multi-hop scenarios, but no hard real time guarantees are provided for the messages.
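The two priority extensions can be sketched as follows. The BASE_DIFS value below is an arbitrary placeholder of ours, not a constant taken from [4]:

```python
BASE_DIFS = 50e-6  # assumed base wait time in seconds (placeholder value)

def initial_wait(priority):
    """Upper bound of the random wait after the channel goes idle;
    a smaller PRIORITY value means higher priority, hence a shorter wait."""
    return BASE_DIFS * priority

def backoff_window(cw, priority, max_priority):
    """Backoff window growth after a collision; lower priority packets
    (larger PRIORITY values) back off more aggressively."""
    return cw * (2 + (priority - 1) / max_priority)
```

With MAX_PRIORITY = 4 and an initial window of 32 slots, the highest priority doubles its window on collision (factor 2), while the lowest priority grows by a factor of 2.75, so low priority traffic yields the medium faster under contention.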
V. DRAWBACKS OF EXISTING TECHNIQUES AND FURTHER WORK PROPOSAL
Most real time techniques make many assumptions which are not practical. One very common misconception concerns the very objective of real time techniques, namely that the goal is to provide hard real time guarantees for each transmitted message. Considering the highly unpredictable and unreliable nature of wireless links, the real time objective should be restated to accommodate, to a large extent, the inherent properties of wireless sensor networks [1].
Here we look at a few unrealistic assumptions made by the real time techniques discussed in the previous section. The contention-free periodic message scheduler MAC protocol assumes that all nodes are in the same wireless range, which is not true in most practical deployment scenarios; most practical WSN deployments rely on multi-hop message transmission. The RAP real time communication architecture assumes that every node is aware of the physical geography; this location awareness is in itself a big problem to be solved. Furthermore, both real time techniques assume that the radio transceiver is either in transmitting, receiving or turned-off mode, without considering the transition time from tx to rx and vice versa. This transition time introduces a large enough interval that it must be considered when designing MAC protocols; otherwise, even TDMA scheduled message transmissions can lead to collisions if the nodes are in the same interference range. Another unrealistic assumption that both techniques rely on is that if no other node is trying to access the medium, the medium is free (ignoring the possibility of wireless interference). This is especially critical in the case of TDMA based MAC protocols: in TDMA, we select one message in one node, from a set of messages in many nodes contending for transmission, and we assume that at this point in time the wireless medium is free; but in real world scenarios, wireless interference can still occur from many other sources. Lastly, the simulators used for evaluating these techniques would have to be studied to understand how realistically they model the WSN deployment environment. It is difficult to develop a realistic radio model due to the deep complexity of a practical WSN environment, and the simple models used by simulators could hide design flaws and limit the applicability of these techniques in the real world [1].
In the future, we need to design and implement timeliness methods for WSN without relying on these unrealistic assumptions. Real time objectives need to be tuned to take the properties of wireless sensor networks into account. MAC protocols are required to be robust enough to deal with unstable and weak links. Routing protocols should be able to support timeliness using minimal resources, and they should be designed for scalability and adaptability. Also, the simulators used for validating these techniques should be closely related to the real world, modeling the wireless environment more realistically [1].
VI. CONCLUSIONS
Wireless sensor networks are currently receiving considerable attention due to their unlimited potential. In this paper, we listed some real time applications of WSN. We presented the challenges in WSN from a real time perspective and introduced a few general techniques used to meet these challenges. Then we focused on two specific real time methods, namely a contention-free periodic message scheduler MAC protocol and RAP, a real time communication framework for WSN. The drawbacks of existing real time techniques were discussed, and future work was proposed to overcome these drawbacks.
REFERENCES
[1] Ramon Serna Oliver and Gerhard Fohler: Timeliness in Wireless Sensor Networks: Common Misconceptions. 9th International Workshop on Real-Time Networks (RTN 2010), Brussels, Belgium, July 2010.
[2] Daniele Puccinelli and Martin Haenggi: Wireless Sensor Networks: Applications and Challenges of Ubiquitous Sensing. IEEE Circuits and Systems Magazine, 2005.
[3] Thomas W. Carley, Moussa A. Ba, Rajeev Barua and David B. Stewart: Contention-Free Periodic Message Scheduler Medium Access Control in Wireless Sensor/Actuator Networks. Real-Time Systems Symposium, 2003.
[4] Chenyang Lu, Brian M. Blum, Tarek F. Abdelzaher, John A. Stankovic and Tian He: RAP: A Real-Time Communication Architecture for Large-Scale Wireless Sensor Networks. IEEE Real-Time and Embedded Technology and Applications Symposium, 2002.
Abstract—"AFDX (Avionics Full Duplex Switched Ethernet) developed for the Airbus 380 represents a major upgrade in both bandwidth and capability" [3]. In order to determine an upper bound on the end-to-end delay of avionic flows in this kind of network, many approaches have been investigated: starting from the basic (pessimistic) certification model, Network Calculus; then, to compare the results, a simulation model; next, Stochastic Network Calculus, which bounds the probability of exceeding the upper bound, and a model checking approach for the exact worst-case delay of each flow. The latest work focuses on improving the Trajectory approach, one designed for distributed systems, which looks closely at priority queuing settings. The object of this paper is to give a global view of these methods.
I. INTRODUCTION
To guarantee the determinism of avionic communications, new mechanisms have been added to switched Ethernet, resulting in AFDX. It is a big challenge in this area to demonstrate that an upper bound can be determined for end-to-end communication delays. In this technology, all of the avionics communications can be statically described: asynchronous multicast communication flows are identified and quantified and, importantly, flows can be statically mapped onto the network of AFDX switches. For a given flow, the end-to-end communication delay of a frame can be described as the sum of the transmission delays on links and the latencies in switches. What is also really useful is that AFDX is full duplex, so it ensures there are no collisions on the physical medium, and no CSMA/CD mechanism is necessary.
The problem that occurs is that the sharing of resources among functions leads to a complex construction. The aim of the proposed approaches is to prove that no frame will be lost by the network (no switch queue will overflow) and to evaluate the end-to-end transfer delay through the network. However, we need to remember that an exact stochastic analysis of an industrial avionics network is unaffordable, due to the number of VLs in such a network configuration.
We find big difficulties due to the increase in the complexity of embedded systems, in terms of the growth in the number of integrated functions and their connectivity. Interestingly, for example, "setting randomly priorities gives worst bounds than using no priorities" [1], which clearly means that selecting the priority policy is key to a good solution. It is also important that the indeterminism problem has been shifted to the switch level, where various flows can compete for the shared output ports of a given switch.
The first step taken in this research was Network Calculus, mainly for certification purposes. It computes a worst-case upper bound for each communication flow. "It allowed the scaling of the switches memory buffers in order to avoid buffer overflow and frame losses" [3], but it is obviously pessimistic in its analysis.
Then, to understand the real behavior of AFDX, a simulation model was used. "This approach allows calculations on modeled network of each flow the end-to-end delay according to representative subset of possible scenarios" [3]. The simulations in this case were made using configurations provided by Airbus. However, this approach obviously cannot be used for certification, because rare events can be missed by the simulations and only some scenarios are exercised.
A useful subsequent approach was Stochastic Network Calculus, which was proposed to compute a probabilistic upper bound. Thanks to this approach, the "computation of the probability p for an end-to-end delay to exceed a given bound" [3] was achieved. The probability p in this case can be interpreted as the acceptable probability that a frame misses its deadline.
The next method, model checking, reviewed in [4], gives us the computation of the exact worst-case delay for each flow. "Unfortunately, it cannot cope with real AFDX configurations, due to the combinatorial explosion problem for large configurations" [4].
The newest approach applies the Trajectory concept [4] with a fixed priority policy, "established in order to provide the bounds needed for a deterministic avionics network with a static priority QoS policy. The idea of this approach is to introduce additional non avionics traffic (with lower priority) for improving the use of available AFDX resources" [4].
While improving all these approaches and inventing new ones, many simulations have been carried out. What was recently discovered is that one can focus only on the parts of the network that influence the end-to-end delay of a given flow; this definitely reduces the network and the amount of calculation. It was also found that "The trajectory approach with SP/FIFO scheduling is able to guarantee worst case end-to-end delay for AFDX networks with static priority flows differentiation QoS mechanism" [4], and that it can be enhanced by using a grouping technique.
The paper is organized as follows. Section II presents the AFDX network architecture. Section III-A covers Network Calculus and possible improvements; Section III-B the simulation approach as a realistic example; Section III-C the model checking approach for determining an exact worst-case end-to-end delay; and Section III-D the Trajectory approach for improving the use of available AFDX resources.

Global View of Methods for Evaluating End-To-End Delays on AFDX
Canko Canew, Raphael Guerra
Technische Universität Kaiserslautern, Germany
II. THE AFDX NETWORK ARCHITECTURE
An illustrative example of an AFDX network is depicted in Figure 1 below.
As we can see, it is composed of five interconnected switches, S1 to S5. The switches do not have any buffers on their input ports, and one FIFO buffer exists on each output port. The inputs and outputs of the network are called End Systems. "Each end system is connected to exactly one switch port and each switch port is connected to at most one end system. Links between are full duplex. The end-to-end avionics traffic characterization is made by definition of Virtual Links." [4]
A Virtual Link (VL) is the concept of a virtual channel for communication. Thanks to VLs, "it is possible to statically define all the flows (VL) which enter the network" [4]. The packets to be exchanged move through the VLs and End Systems. A VL defines a logical unidirectional connection which goes from one source end system to one or more destination end systems. In Figure 1, vx is a unicast VL with path {e3-S3-S4-e8}, while v6 is a multicast VL with paths {e1-S1-S2-e7} and {e1-S1-S4-e8}.
Only one end system inside the AFDX network can be the source of a given VL. The VL definition also includes important parameters: the BAG (Bandwidth Allocation Gap), which is the minimum delay between two consecutive packets of the associated VL, and Smin and Smax, which stand for the minimum and maximum packet lengths. Compliance with the VL parameters is ensured by a shaping unit at the end system level and a traffic policing unit at each switch entry port.
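The shaping/policing contract can be expressed as a simple conformance check on a trace of frames. This is our illustrative sketch, not the actual AFDX policing algorithm:

```python
def vl_conforms(arrival_times, frame_sizes, bag, smin, smax):
    """A VL trace conforms if consecutive frames are spaced by at least
    one BAG and every frame length lies within [Smin, Smax]."""
    gaps_ok = all(t2 - t1 >= bag
                  for t1, t2 in zip(arrival_times, arrival_times[1:]))
    sizes_ok = all(smin <= s <= smax for s in frame_sizes)
    return gaps_ok and sizes_ok
```

A trace violating either the minimum inter-arrival gap or the size bounds would be dropped or flagged by the policing unit.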
The AFDX network allows assigning either a high or a low priority to each VL in every switch output port [1]. The load of a physical link is defined here as "the portion of time a link is busy" [2].
All of the constraints that AFDX adds to vintage Ethernet enable a precise analysis of the network; this allows the computation of an upper bound and the dimensioning of output buffers so that no packet is lost.
III. METHODS FOR BOUNDING END-TO-END DELAY
A. Network Calculus and Stochastic Network Calculus for obtaining the delay
Certification is mandatory in the context of avionics, and we need a probabilistic upper bound on the end-to-end delay of each flow. As mentioned before, "An exact stochastic analysis of an industrial avionics network is unaffordable, due to the number of VLs of such a network configuration." [3] "One way to solve the problem is to use pessimistic stochastic analysis which is a safe approximation of the exact stochastic analysis." [3] The calculated upper bound associated with a given probability is guaranteed to be greater than the exact upper bound. Network Calculus gives the latency bound of any elementary network entity and, for those elements that have a queuing capability, a queue-size bound expressed either in a number of bits or in a number of frames (with a simple majorization using Smin) [1].
If an elementary entity offers a service curve β to an input flow constrained by an arrival curve α, the calculus also gives the arrival curve α* of the output flow: α* = α ⊘ β, where the (min-plus) deconvolution α ⊘ β is defined by

(α ⊘ β)(t) = sup_{u ≥ 0} [α(t + u) − β(u)].
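As a concrete instance (my own illustration, not an example from [1]): for a token-bucket arrival curve α(t) = b + r·t and a rate-latency service curve β(t) = R·max(0, t − T) with r ≤ R, the well-known network-calculus delay bound is T + b/R:

```python
def delay_bound_token_bucket(b, r, R, T):
    """Network-calculus delay bound for a token-bucket flow (burst b bits,
    rate r bit/s) served by a rate-latency node (rate R bit/s, latency T s).
    Valid when r <= R (otherwise the backlog grows without bound)."""
    assert r <= R, "flow rate must not exceed service rate"
    return T + b / R

# A 4000-bit burst at 1 Mb/s through a 10 Mb/s link with 0.1 ms latency:
print(delay_bound_token_bucket(b=4000, r=1e6, R=10e6, T=1e-4))  # 0.5 ms
```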
An improvement of the calculus was the definition of "groups" of VLs: VLs that exit from the same multiplexer and enter another multiplexer together, i.e. VLs that share two segments of their paths, are grouped, which gives bounds tighter by up to 40%. "Frames of these VLs are serialized once exiting the first multiplexer and thus they don't have to be serialized again in the following multiplexers" [1].
Another important improvement of network calculus was the stochastic version of this approach, whose aim is the statistical calculation of delay and backlog bounds. This theory allows the computation of the probability p for an end-to-end delay to exceed a given bound and, very usefully, it guarantees that the probability of exceeding the computed bound is not greater than p.
B. Simulation approach as a realistic example
The main goal of this approach is to approximate the real network behavior. It requires a realistic model of the network and calculates the end-to-end delays of a given flow on a subset of all possible scenarios, yielding an experimental upper bound on that set of scenarios.
All elements of the network are built as queuing stations or object structures representing the simple network elements: one-way links, buffers, demultiplexers, and scheduler multiplexers; "the selected policy of service is FIFO" [2]. Different strategies can be used when building the simulations. For frame generation, frames can be generated periodically, using the BAG as the period, or using the BAG as a minimum inter-emission time. For the phasing between VLs, either a synchronized phasing, where the first frame of every VL is transmitted at the same time, or a phase randomly distributed between 0 and the BAG can be used. For the frame size, every frame of every VL can take the minimum length, the maximum length, the average of the two, or a random length between the minimum and maximum.
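The generation strategies above can be sketched as a toy frame generator (my own sketch under assumed parameters, not the simulator of [2]):

```python
import random

def emission_times(bag_ms, horizon_ms, phasing="null"):
    """Frame emission instants for one VL: periodic with the BAG as the
    period, starting either at t = 0 ("null" phasing, all VLs
    synchronized) or at a random phase drawn uniformly in [0, BAG)
    ("random" phasing)."""
    phase = 0.0 if phasing == "null" else random.uniform(0, bag_ms)
    t, times = phase, []
    while t < horizon_ms:
        times.append(t)
        t += bag_ms
    return times

print(emission_times(4, 20))            # [0.0, 4.0, 8.0, 12.0, 16.0]
print(emission_times(4, 20, "random"))  # same spacing, shifted by a random phase
```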
For analysis purposes, the ratio between the end-to-end delay obtained by simulation and the one calculated with network calculus has also been computed for each path.
With null phasing (the first frame of every VL emitted at the same time, i.e. the VLs are synchronous), most of the paths have a ratio between 5 and 40%. All VL paths with a ratio of at least 70% have a length of 1 (they cross a single switch). This showed that the deterministic upper bound obtained by the network calculus approach is reachable in the case of single-switch communication, and that the null-phasing configuration is quite often close to the worst-case configuration. With random phasing, each VL has a specific random delay before the emission of its first frame.
For example, with a random delay between 0 and the BAG, the ratio is under 20% for all VL paths. The experiments also reveal the influence of BAG occupation on end-to-end delays: having all BAGs occupied does not always lead to the worst-case end-to-end delay. In the case reviewed in [2], the simulation also checks the influence of the frame length, using the minimum length as a reference.
It is not easy in this approach to find a representative subset of scenarios in order to calculate the end-to-end delay distribution. The key idea mentioned was to model only the elements of the network which have an influence on the end-to-end delay distribution of the flow. Using this idea with a classification of VLs (pictured below), we can distinguish the paths that do not have a direct influence on the delay distribution of Vx, which in later works leads to drastic reductions of the simulation space. In order to obtain a larger reduction, however, the classification has to be exploited more effectively.
C. Model checking approach for determining an exact worst-
case end-to-end delay
The approach presented in paper [2] is based on timed automata. "This method explores all the possible states of the system and thus it determines an exact worst-case end-to-end delay. It implies computing whether a property, expressed in a timed logic, is verified or not" [2]. Its aim was also to describe the system behavior over time.
Each action a executed by a first timed automaton corresponds to an action with the same name a executed in parallel by a second timed automaton. In this particular case, performing transitions requires no time, but time can elapse in nodes. These timed automata are extended with so-called committed nodes and shared integer variables.
As depicted above, "A1 performs m1 and simultaneously A2 performs m1. Then A1 performs m2 and simultaneously A3 performs m2. As s2 of A1 is committed, the two transitions m1 and m2 are performed simultaneously without time evolution. This extension allows modeling a broadcast communication mechanism with timed automata" [2]. With the shared integer variables, a set of variables is shared by the timed automata, so that their values can be consulted and updated by any of them.
Reachability analysis is performed by model checking, encoding the property in terms of the reachability of a given node of one of the automata.
To calculate the worst case, the method consists of verifying that a frame is received before a global transmission delay. Generalizing the system considered in [2], two groups of VLs can be defined in the picture below:
"GrVL1 is a group of VLs that all merge in Switch1 by the same input port, cross Switch2 and go out by the same output port. Similarly, GrVL2 is a group of VLs that all merge in Switch2 by the same input port and go out by the same output port."
Model checking shows here that the worst-case end-to-end delay is reached when the GrVL1 and GrVL2 groups are synchronous. This can help the simulation approach choose a phasing between VLs that gives higher end-to-end delays than the null phasing [2].
D. Trajectory approach
This approach is used to obtain a deterministic upper bound on the end-to-end response time in distributed systems. To allow a worst-case delay computation, it identifies, for a packet m, the busy periods and the packets impacting its end-to-end delay on all the visited nodes.
General distributed systems can be depicted as in picture 5. Each flow crossing the system follows a static path, which is an ordered sequence of nodes. "It assumes, with regards to any flow ri following path Pi, that any flow rj following path Pj, with Pj ≠ Pi and Pj ∩ Pi ≠ ∅, never visits a node of path Pi after having left this path." [4]
"Normally flows are scheduled with a combined fixed priority and FIFO algorithm in every visited node (non-preemptive policy). The flows are first sorted according to their fixed priority level, and flows with the same fixed priority are then treated in FIFO order." [4]
The end-to-end response time of a packet is the sum of the times spent in each crossed node and the transmission delays on the links. Paper [4] focuses on the fixed priority policy and on the bounds needed for a deterministic avionics network with a static priority QoS mechanism. The idea was to introduce additional non-avionic traffic (with lower priority) to improve the use of the available AFDX resources.
An optimization of this approach is the serialization of flows with fixed priorities, similar to the grouping technique in network calculus.
IV. DISCUSSION
The network calculus approach is needed since it gives a guaranteed upper bound for the end-to-end delay in AFDX, but unfortunately, because of the assumptions made in this approach, this bound is usually not reachable.
A very good step was the invention of the "group concept" for tightening the bounds of the arrival curves of inter-switch traffic, which, as later checked in simulations, gives results up to 40% better. It was also very useful to extend the network calculus approach into the probabilistic domain. The result of this step is a good candidate for certification since it is guaranteed; it also gives us the probability p for an end-to-end delay to exceed a given bound. However, we cannot forget that, due to its pessimistic assumptions, network calculus is often pessimistic, and because of that I do not think it will be easy to improve this approach much further.
The next approach presented, simulation, gives experimental results for given scenarios. Of course, the bound can be badly exceeded if some rare event is missed, but, more importantly in my view, this approach gives a global estimation of the network load and helps in the search for new approaches and methods for bounding delays in AFDX.
The next idea, model checking, is less useful. It gives an exact delay and the corresponding scenario by exploring all the possible states of the system, but it is not easy to use on a real network configuration. For the models presented in the papers it definitely leads to combinatorial explosion, so this is still an open research topic.
I believe that a better research direction is the last one, the trajectory approach, which allows computation in distributed systems. It can also be nicely improved by a grouping technique like the one in the network calculus version, which really improves the results with a tighter upper bound on the end-to-end delay. A really good step was considering the low-priority flows: their impact can be upper bounded, per switch, by the transmission time of the biggest lower-priority packet.
V. CONCLUSIONS
In this paper we have taken a brief look at the structure of AFDX networks and at the approaches invented to bound the end-to-end delays in avionic systems. This kind of network represents an important upgrade which brings much better results and reductions on the hardware side, such as cabling.
Invented for certification purposes, the network calculus approach was presented here as a fundamental step in the calculations. Improved with the grouping technique, it brings a major upgrade in response time analysis.
The network calculus approach in the probabilistic domain is also mentioned, since it also improves the results by calculating the probability p of exceeding a given bound.
This work also discusses the simulation model, mainly to show how to analyze AFDX in a realistic environment and to obtain a global estimation of the network load. Very important in this approach is the idea of focusing only on the elements of the network which have an influence on the end-to-end delay distribution of the flow. This brings huge reductions for AFDX networks, but in order to obtain larger reductions the classification has to be exploited more effectively.
The third approach presented is based on timed automata. This model checking method determines an exact worst-case end-to-end delay by exploring all the possible states of the system. As it is based on timed automata, its main aim was, unsurprisingly, to describe the system behavior over time.
The last approach, the trajectory approach, was invented mainly for the needs of distributed systems. "It identifies for a packet m the busy periods and the packets impacting its end-to-end delay on all the nodes visited by m. Thus, it allows a worst-case delay computation" [4]. A later key idea of this approach is to introduce additional non-avionic traffic (with lower priority) to improve the use of the available AFDX resources. Combined with the grouping technique mentioned before, this method improves the upper bound computed by network calculus by 10% on average, giving less pessimistic results.
REFERENCES
[1] F. Frances, C. Fraboul, J. Grieu - Using Network Calculus to optimize the AFDX network.
[2] Hussein Charara, Jean-Luc Scharbarg, Jérôme Ermont, Christian Fraboul - Methods for bounding end-to-end delays on an AFDX network.
[3] Jean-Luc Scharbarg, Frédéric Ridouard, Christian Fraboul - A Probabilistic Analysis of End-To-End Delays on an AFDX Avionic Network.
[4] Henri Bauer, Jean-Luc Scharbarg, Christian Fraboul - Applying Trajectory approach with static priority queuing for improving the use of available AFDX resources.
Abstract - The FlexRay communication protocol is expected to become the de facto standard for high-speed in-vehicle communication. FlexRay is a robust, scalable, deterministic digital serial bus system designed for use in automotive applications, and is designed to be faster and more reliable than CAN and TTP. In this paper we present approaches for optimally scheduling messages in the static and dynamic segments of FlexRay networks. We also present how to build an optimal schedule with fault-tolerant communication in the static segment, and finally we present two algorithms for scheduling messages in switched FlexRay networks.
I. INTRODUCTION
FlexRay is a communication protocol for high-speed in-vehicle communication. FlexRay has a "static segment" for time-triggered messages and a "dynamic segment" for event-triggered messages, and can achieve data rates of up to 10 Mb/s. In the newest FlexRay networks, the so-called "switched FlexRay networks", the use of a "switch" instead of an active star provides branch parallelism, which gives us the opportunity to have more than one sender in one FlexRay cycle without collisions.
But with messages in FlexRay comes the issue of scheduling those messages. The static and dynamic segments of FlexRay consist of slots. In those slots we place our messages, which are transmitted in FlexRay frames consisting of message data, as multiples of 2-byte words, plus framing overhead. The first difficulty is to prepare our schedule optimally with respect to the allocated slots: we would like an algorithm which places our messages in FlexRay slots using as few slots as possible.
As we know, automotive networks are hard real-time systems, so we also have to take care of the deadlines of our messages. Our schedule therefore has to be feasible, especially for the event-triggered dynamic segment; moreover, in the dynamic segment we want to minimize the bandwidth reservation (by using as few slots as possible).
In every network we have to be sure that our transmission is reliable. The easiest way to reach a reliability threshold in information exchange networks is to retransmit data. But the problem is which data to retransmit (if there are multiple faults) and how many times, in order to reach the reliability threshold.
The last problem is how to achieve parallelism in FlexRay networks. So far, in one FlexRay cycle there can be only one sender. One approach to achieving multiple senders is the switched FlexRay network.
It is very important to find a scheduling algorithm which takes a reasonable amount of time, because in real-time systems we cannot "buy" more time. We have to schedule our messages in the time we have, and the result should be optimal, or at least reasonable, and of course feasible. To be feasible, all deadlines have to be met. In hard real-time systems (such as those built on FlexRay) missing a single deadline can lead to catastrophic consequences, including loss of human life.
In fault-tolerant communication for the static segment, because of the many possible message retransmissions, it is very important to choose appropriate messages for retransmission and to determine the number of repetitions of those messages needed to reach the reliability threshold.
It is very difficult to implement an algorithm which does all the things mentioned above in a reasonable amount of time; within a reasonable amount of time we can achieve only sub-optimal solutions. For the optimal algorithms we have to deal with a discrete nonlinear optimization problem. And for fault-tolerant communication it is difficult to compute which messages should be retransmitted, and how many times, to reach the reliability threshold p.
The basic idea of a solution for the problem of optimal message scheduling in FlexRay networks is to find an algorithm which will:
- use as few slots as possible (optimization with respect to allocated slots)
- minimize the used bandwidth (by decreasing the number of used slots)
- prepare a feasible schedule scheme (with respect to deadlines)
- for time-triggered messages, minimize the jitter
- for event-triggered messages, minimize the used bandwidth (B) and the cycle load (L), where the bandwidth is the number of reserved slots per cycle for each node and the cycle load is the maximum number of slots reserved for message transmission in one FlexRay cycle (FC)
For switched FlexRay networks, the basic idea for achieving message-sending parallelism is to use a "switch" instead of an "active star" in the network topology.
The algorithms presented in this paper are optimal with respect to the number of used slots, which makes the network faster and the bandwidth utilization higher. All presented algorithms give a feasible schedule, which is the most important property for hard real-time systems. Although in a reasonable amount of time the presented algorithms can achieve only sub-optimal solutions, those solutions are reliable in practice.
II. FLEXRAY BUS
A. Optimization framework for scheduling the FlexRay bus
We now introduce a Mixed Integer Linear Programming (MILP) formulation to solve the FlexRay scheduling problem with respect to the FlexRay protocol rules, the allocated slots of the schedule scheme, and the message deadlines (feasible schedule).
The approach provides equations and restrictions which can be used to formulate an algorithm that optimally computes the schedule scheme. To implement the algorithm we need:
- activation, release and deadline constraints
- job start times and preemptions
- schedulability constraints
- FlexRay protocol rules
- data dependencies
An optimal solution can be found in a reasonable amount of time (within one hour) for a case study taken from an x-by-wire system, which demonstrates the efficiency of this MILP formulation.
B. Static segment of FlexRay bus
Message scheduling for static segment
This process can be divided into two steps, the first of which is the packing of periodic signals into message frames (a nonlinear integer programming, NIP, problem). Fortunately, this NIP problem can be reduced to an ILP problem by exploiting the properties of the FlexRay static segment.
When periodic signals are packed into messages, we first have to observe that only signals from the same node and with the same period are packed into the same message. A very important point here is how the duration of the static segment within the FlexRay cycle (FC) is chosen. The duration of the static segment can be calculated as:

TSS = NSTS * TSTS

where TSS is the duration of the static segment, NSTS is the number of static slots, and TSTS is the duration of one slot in the static segment.
After the signals are packed into messages, we obtain an optimal message set, which we now have to schedule. This is the second step of the approach: scheduling the periodic messages obtained in the first step while obeying the FlexRay protocol operation.
In scheduling our message set we can have two different
scenarios:
1) Message schedule without jitter:
When the messages have no jitter, we only take their periods into account. First, the messages are ordered in such a way that two messages with coprime periods cannot be scheduled with the same FID without jitter (the FID, frame ID, is the number of the slot in which a particular node can transmit a message). After ordering the messages we have a partial order on the message set, and we can then schedule the message set using GLPK (the GNU Linear Programming Kit).
2) Message schedule with jitter:
Here we have to schedule the message set optimally with respect to both the allocated slots and the jitter. To do this, we calculate the jitter for each message and then schedule the message. After scheduling all messages in one FC, we sum up the jitters of all messages and calculate the overall jitter (with some weight p), which is then added to the number of used slots in the formulation of the optimization problem.
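The coprime-period constraint of scenario 1 can be sketched as a simple compatibility test (my own illustration, with periods expressed in FlexRay cycles): two messages may share a FID without jitter only if their periods are not coprime.

```python
from math import gcd

def can_share_fid(period_a, period_b):
    """Two messages (periods in FlexRay cycles) may be placed in the
    same slot (FID) without jitter only if their periods are NOT
    coprime; coprime periods would inevitably force jitter."""
    return gcd(period_a, period_b) > 1

print(can_share_fid(2, 4))  # True  (4 is a multiple of 2)
print(can_share_fid(2, 3))  # False (coprime periods)
```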
The main conclusion about message scheduling for the static segment of a FlexRay network is that frame packing is essential to achieve satisfactory utilization. For example, without frame packing a utilization of U = 0.6 is obtained, while with frame packing the utilization drops to U = 0.11, an improvement of about 83%.
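A toy first-fit packing sketch (assumed numbers, not the ILP formulation of [2]) illustrates why packing several same-node, same-period signals into one frame reduces the number of occupied static slots:

```python
def pack_signals(signal_sizes, frame_payload):
    """Greedy first-fit: pack signals (sizes in bytes) of one node and
    one period into as few frames (i.e. static slots) as possible."""
    frames = []  # remaining free bytes per opened frame
    for size in sorted(signal_sizes, reverse=True):
        for i, free in enumerate(frames):
            if size <= free:          # fits into an existing frame
                frames[i] -= size
                break
        else:                         # no frame fits: open a new one
            frames.append(frame_payload - size)
    return len(frames)

signals = [4, 2, 2, 8, 1, 1]          # six periodic signals of one node
print(pack_signals(signals, 16))      # 2 frames instead of 6 separate slots
```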
Fault-tolerant communication for static segment
In automotive networks many faults can occur, caused by electromagnetic interference, radiation, temperature variations, etc. Such faults can appear for a very short time, cause miscalculations in the logic or data corruption, and then disappear without permanent physical damage to the circuit. To combat this problem, a retransmission approach is proposed.
The first proposal is a CLP-based (constraint logic programming) approach, which is optimal. The main goal is to reach a reliability threshold p, and the schedule scheme has to be feasible and optimal with respect to the number of used slots.
We have to determine the smallest repetition number for each message M such that the reliability threshold p is reached. We then have, for each message, its repetition number ki. From all the ki we can derive upper and lower bounds on ki such that the reliability threshold p is reached, so the set of possible ki becomes much smaller. Now we only need to calculate the finish time of each message in our schedule and check whether retransmission is possible, and whether it is needed.
The CLP-based approach is optimal, but it is very time consuming, so an efficient heuristic approach is proposed. This approach can be divided into three steps:
1) Compute the required number of retransmissions ki for each message, such that the reliability threshold p is reached:
GP ≥ p
2) The second stage involves the scheduling: slots are now assigned to messages. We have to check whether retransmission is needed and possible and, if so, how many times a message should be retransmitted to reach the reliability threshold p.
3) In the third stage the algorithm identifies critical messages (when the scheduler fails to build a schedule and goes back to the first step to compute new ki). In this stage the algorithm tries to minimize the number of changes to the previously computed ki values such that a schedule can be constructed.
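Step 1 can be sketched as follows, under my own simplifying assumption (not the fault model of [3]) that each transmission of a message fails independently with probability f, so that ki retransmissions give a per-message reliability of 1 − f^(ki+1):

```python
def min_retransmissions(fault_prob, threshold):
    """Smallest number of retransmissions k such that the probability
    that at least one of the k+1 copies arrives reaches the threshold:
    1 - fault_prob**(k+1) >= threshold (independent transient faults
    assumed)."""
    k, fail = 0, fault_prob
    while 1 - fail < threshold:
        k += 1
        fail *= fault_prob
    return k

# With a 1% transient fault probability, one retransmission already
# pushes per-message reliability to 0.9999.
print(min_retransmissions(0.01, 0.9999))  # 1
```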
The optimal CLP algorithm is very good, but it is also very complex, and its evaluation time grows exponentially with the number of messages. The optimal CLP is always able to find an optimal solution, but for large test cases it could not give an answer in a reasonable amount of time (within one hour).
The heuristic approach is less complex and its computation time is smaller, but it is not guaranteed to give an optimal result. In 80 test cases, the heuristic algorithm failed only 5 times. In the cases with no feasible schedule, both the optimal CLP and the heuristic reported this. In all cases in which the heuristic was successful, it obtained the same optimization cost as the optimal CLP.
C. Dynamic segment of FlexRay
FlexRay networks have a static segment for time-triggered messages and a dynamic segment for event-triggered messages. The dynamic segment is also divided into slots (like the static segment); here, the smallest entity is the MS (minislot). Messages are mapped to a specific DYS (dynamic slot), and one DYS consists of one or more MS. In the dynamic segment, a DYS reservation is made for each message; transmitting the message during its reserved DYS is sufficient for the message to meet its deadline.
Now that we have a static and a dynamic segment, we have to choose an appropriate Tc (the duration of one FlexRay cycle):

Tc ≥ TcDS + TcSS

where TcDS is the duration of the dynamic segment and TcSS the duration of the static segment in one FlexRay cycle.
The cycle load (Lj) of a FlexRay cycle j denotes the maximum number of minislots (MS) reserved for message transmission in FCj for an arbitrary assignment of FIDs (frame IDs). The FID denotes in which slots a particular node can transmit messages. More than one FID can be assigned to a node, which means that one node can transmit messages in several slots (not only in one). The number of FIDs cannot be greater than the number of slots in the dynamic segment, and the smallest possible number of FIDs is the number of nodes in the network.
In the dynamic segment we can assign multiple messages to the same reservation; as a result, the reservations are better utilized and the bandwidth B is minimized. This process of assigning more than one message to the same reservation is called message grouping.
The basic idea of message grouping in the dynamic segment of a FlexRay network is to group the messages in such a way that messages with the smallest deadlines appear in many groups; in the end, not all groups will be used in the schedule scheme. For example, suppose message M1 has to be transmitted only once during, say, 8 FCs (FlexRay cycles). After M1 is transmitted at the beginning of those 8 FCs, the following cycles still have a DYS reserved for M1, but those DYS will never be used for transmitting M1, because it has already been transmitted. Those reservations can instead be used for transmitting another message that is in the same group and can be transmitted during the reservation for M1.
To optimally schedule messages in the dynamic segment of a FlexRay network we have to minimize the number of DYS reserved for all messages. To do this, we have to select the groups for the schedule scheme in such a way that the DYS reservation for the messages is minimized; in other words, we have to minimize Lmax (the maximum DYS reservation over all messages in the dynamic segment).
This minimization problem is an NIP problem, so to facilitate its solution it is decomposed into two binary integer programming (BIP) minimization problems (steps). The first step is to select the groups in such a way that the bandwidth reservation B is minimized; this step can be performed using Tomlab. After the first step we have the selected groups, for which the bandwidth is minimized.
In the second step we can minimize the cycle load reservation Lmax by computing the offsets for the selected groups.
It has been verified that the NIP and the two-step BIP formulations give the same results for practical message sets. For small message sets the NIP gives an optimal solution, but for larger message sets (more than 16 messages in the dynamic segment) the NIP fails. The two-step BIP approach does not guarantee an optimal solution, but it is suitable for practical examples and is less computationally expensive than the NIP approach. Test cases have shown that message grouping can reduce the bandwidth reservation by about 20%.
III. SWITCHED FLEXRAY NETWORKS
Switched FlexRay networks are the next step in automotive systems. In switched FlexRay networks messages are also transmitted in frames, and many messages can be assigned to one frame. The main difference between an ordinary FlexRay network and a switched FlexRay network is that an ordinary FlexRay network uses an "active star" to connect the branches of the network, whereas a switched FlexRay network uses a "switch" instead.
Fig. 1 The switched FlexRay network structure
Through this replacement, switched FlexRay networks exhibit branch parallelism. Branch parallelism means that we effectively have two (or more, depending on the switch and the network architecture) sub-networks which are independent of each other, so messages in those sub-networks can be scheduled in parallel without any collision. This form of parallelism gives us the opportunity to have multiple senders in one FlexRay cycle (because of the independence of the sub-networks).
In switched FlexRay networks, too, one problem is to create a feasible schedule scheme by packing a set of frames into as few slots as possible (optimization with respect to the number of allocated slots). Another problem is the collisions between frames while packing them.
The first algorithm we would like to present is the Decreasing First-Fit algorithm. It is not an optimal algorithm, but it gives reasonably good schedules in a very short amount of time.
The algorithm works as follows:
1) Frames are sorted in decreasing order of their assigned weight (where the weight denotes the message size).
2) The scheduler tries to place each frame in the first slot in which the frame fits (each time checking whether there would be a collision with previously placed frames).
3) If a frame cannot be placed in any slot, the algorithm fails to find a feasible solution.
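The steps above can be sketched as follows (my own reconstruction, not the implementation of [5]; the collision test is abstracted into a caller-supplied predicate):

```python
def decreasing_first_fit(frames, num_slots, slot_capacity, collides):
    """Place frames (id, weight) into slots by decreasing weight, each
    into the first slot where it fits without exceeding the capacity and
    without colliding with a frame already in that slot.
    Returns {slot: [frame ids]} or None if some frame cannot be placed."""
    slots = {s: [] for s in range(num_slots)}
    load = {s: 0 for s in range(num_slots)}
    for fid, weight in sorted(frames, key=lambda f: -f[1]):
        for s in range(num_slots):
            if load[s] + weight <= slot_capacity and \
               not any(collides(fid, other) for other in slots[s]):
                slots[s].append(fid)
                load[s] += weight
                break
        else:
            return None  # infeasible: frame fits nowhere
    return slots

# Toy collision predicate: frames from the same branch collide,
# frames from independent branches never do.
branch = {"a": 1, "b": 1, "c": 2, "d": 2}
frames = [("a", 8), ("b", 8), ("c", 6), ("d", 4)]
result = decreasing_first_fit(frames, 2, 16,
                              lambda x, y: branch[x] == branch[y])
print(result)  # {0: ['a', 'c'], 1: ['b', 'd']}
```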
The second algorithm we would like to present is the branch-and-price algorithm, which is optimal (with respect to the number of used slots).
This algorithm first tries to find the smallest possible subset of packings that contains all frames; this is called the master problem. The algorithm starts with only a very small set of packings and then iteratively adds packings to this set (a process called column generation); after some time it finishes with only a limited number of all possible packings. From this point on it is called the RMP (Restricted Master Problem). The RMP is solved by iteratively checking feasible sets until an optimal one has been found, which is done by the simplex algorithm. To determine which packing should be included in the smallest subset, a "pricer" is used in each step of the simplex algorithm.
The pricer tries to maximize the product yᵀp by finding a new packing p. In this product, y denotes the dual value vector obtained from the simplex algorithm, in which every frame maps to a value. When the pricer finds a new packing p, it computes the product; if yᵀp > 1, the packing is added to the RMP (the smallest subset of packings), otherwise it is not.
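The pricing test itself is just a dot product (my own illustration; here p is assumed to be a 0/1 vector marking which frames a candidate packing contains):

```python
def accept_packing(y, p):
    """Column-generation acceptance test: add the candidate packing p to
    the restricted master problem iff y^T p > 1, where y holds the dual
    value of each frame obtained from the last simplex step."""
    return sum(yi * pi for yi, pi in zip(y, p)) > 1

duals = [0.6, 0.5, 0.2]                  # one dual value per frame
print(accept_packing(duals, [1, 1, 0]))  # True  (0.6 + 0.5 > 1)
print(accept_packing(duals, [0, 1, 1]))  # False (0.5 + 0.2 <= 1)
```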
We present two kinds of pricers:
1) The First-Fit pricer, which is very fast but not optimal. Its approach is very similar to the DFF (Decreasing First-Fit) scheduler: if the first result is good, take it.
2) The ILP pricer, which is optimal but much slower than the First-Fit pricer. In practice this pricer is only used when the First-Fit pricer cannot find a packing.
The two proposed algorithms are very different when we compare their computation times. The DFF algorithm is very fast and in practice provides reasonably good schedule schemes, but it is not optimal. BP (branch and price) is an optimal algorithm, but due to its ILP pricing its run time is very long (in some cases many hours), and it is very computationally expensive. The BP algorithm with the First-Fit pricer is not optimal, but gives reasonably good solutions in a reasonable amount of time.
IV. CONCLUSIONS
In this paper we presented approaches to scheduling messages in FlexRay networks. FlexRay networks are brand new systems in automotive networking, so some of the algorithms presented in this paper still give non-optimal solutions. We should notice, however, that the solutions those algorithms give are feasible (the most important issue for a hard real-time system) and that the algorithms are able to produce a reasonable schedule scheme in a reasonable amount of time. In switched FlexRay networks we can have multiple senders in one FlexRay cycle, and thus achieve faster information exchange in FlexRay networks.
V. REFERENCES
[1] Haibo Zeng, Wei Zheng, Marco Di Natale, "Scheduling the FlexRay Bus Using Optimization Techniques" [Design Automation Conference, 2009. DAC '09. 46th ACM/IEEE, July 2009]
[2] Ece Guran Schmidt, Klaus Schmidt, "Message Scheduling for the FlexRay Protocol: The Static Segment" [IEEE Transactions on Vehicular Technology, June 2009]
[3] Bogdan Tanasa, Unmesh D. Bordoloi, Petru Eles, Zebo Peng, "Scheduling for Fault-Tolerant Communication on the Static Segment of FlexRay" [31st IEEE Real-Time Systems Symposium (RTSS10), San Diego, CA, USA, November 30 - December 3, 2010]
[4] Ece Guran Schmidt, Klaus Schmidt, "Message Scheduling for the FlexRay Protocol: The Dynamic Segment" [IEEE Transactions on Vehicular Technology, June 2009]
[5] Thijs Schenkelaars, Bart Vermeulen, Kees Goossens, "Optimal Scheduling of Switched FlexRay Networks" [Design, Automation & Test in Europe Conference & Exhibition (DATE), 2011, 14-18 March 2011]
CHALLENGES FACED BY ON-CHIP NETWORK IMPLEMENTATION FOR REAL-TIME EMBEDDED SYSTEMS

Naga Rajesh Garikiparthi
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universitat Kaiserslautern

Raphael Guerra
Chair of Real Time Systems
Department of Electrical and Computer Engineering
Technische Universitat Kaiserslautern
Abstract—The objective of this paper is to address the challenges of energy consumption, hardware overhead and performance confronted by Network on Chip and the implementations proposed to overcome them. Energy efficiency is one of the most important issues and relies strikingly on task allocation. The existing work does not well illustrate the trade-off between power consumed by the processor and the network links. One goal of this paper is to present a scheme which accounts for the trade-off between communication and processing power, whose neglect results in suboptimal task mappings from the system point of view. From the hardware implementation cost standpoint, a priority share policy for real-time on-chip communication is described which reduces resource overhead. Experiments show that significant resource sharing can be achieved without missing deadlines. To maintain low latency and high throughput for best effort traffic while assuring guaranteed services to traffic with hard deadlines, the paper describes a run-time configurable NoC that enables bandwidth guarantees with minimum impact on the latency of best effort traffic. There is a lot of heterogeneity in embedded system applications (e.g., automotive, avionics and consumer electronics), hence the need for a predictable, fault-tolerant, integrated execution environment for component-based design. This paper also focuses on a Time Triggered NoC architecture that provides a uniform interface to all types of components, which supports component-based design and thus enables reuse. It also offers inherent fault isolation and mechanisms such as Integrated Resource Management and power-aware system behavior.
I. INTRODUCTION
MPSoCs are being used for implementing a wide variety of multi-functional applications in parallel, for example in mobile devices for streaming media and general-purpose productivity applications. Network on Chip (NoC) has emerged as a new paradigm to overcome the limitations of the current bus-based communication infrastructure for System on Chip (SoC) designs. The bandwidth of a NoC scales with the complexity of the SoC network size. Computation is no longer a major problem today in building applications, but communication in SoCs has become a bottleneck in guaranteeing real-time and energy-efficiency constraints.
Energy efficiency is one of the most critical issues in embedded system design. There is a need to evaluate the tradeoff between processing power and communication power at design time [1]. The efficient co-execution of diverse applications on MPSoCs, taking energy consumption into account, is crucial to the success of the architectures employed by product developers. Mobile Internet devices, which are used for communication (hard real time), multimedia playback, content creation and augmented reality (soft real time), as well as office applications (best effort), potentially require bandwidth all at the same time [3].
Wormhole switching has been widely used for real-time communication on NoCs. The non-determinism in routing packets due to contention for channels leads to delays and jitter which violate hard real-time constraints. The major problem of the priority-based approach is precisely that it requires a distinct priority and an exclusive virtual channel for each traffic flow in a router port. This restrictive implementation structure results in high area and energy overhead and heavily limits its employment and development [1]. An increase in latency often reduces General Purpose (GP) application performance dramatically, so GP traffic can be considered latency sensitive [3]. The challenges of communication among nearly autonomous, possibly heterogeneous IP blocks in MPSoCs can be addressed by a novel system architecture which offers a component-based design methodology.
A major roadblock in the MPSoC development process is the mapping and scheduling of tasks onto the platform. In pursuing the system-level optimal solution, it is difficult to consider the trade-off between task processing power and communication power together. Furthermore, sharing priorities among traffic flows can lead to significant blocking and unpredictable network latency.
A unified approach was developed in [1] for efficient computation of the system-wide energy-optimal task allocation by extending the Integer Linear Programming (ILP) and Simulated Annealing formulations. The experimental results show that the new Simulated Annealing (SA) heuristic achieves performance very close to the global optimum and much higher
execution speed than ILP-based solutions. The problem of resource overhead due to the priority-based approach was solved in [2] by a priority share policy, where multiple traffic flows are assigned the same priority and hence share the same virtual channel. The numbers of virtual channels and priorities were reduced by 50% and 70%, respectively. The best possible latency for best effort traffic was achieved by prioritizing best effort traffic over guaranteed throughput traffic while limiting the bandwidth allocated to best effort traffic such that enough resources remain to meet the guarantees. Furthermore, the design allows configuration of the QoS mechanisms at run time for flexible use of the system [3]. A novel architectural framework that supports composability and addresses the challenge of side-effect-free composition of component services into large systems was proposed with a Time-Triggered NoC. This approach contributes an elevated level of design abstraction, determinism through encapsulation, a global time base for the SoC and an Integrated Resource Management.
In this context, this paper addresses various challenges encountered by Network on Chip and solutions proposed by eminent researchers in the domain. Section 2 describes the proposals briefly and outlines their results. Section 3 discusses further points of view and Section 4 draws conclusions.
II. SOLUTIONS FOR VARIOUS CHALLENGES

A. Energy Aware Task Allocation
The idea presented in [1] takes communication power into account to reduce system-level energy consumption. An objective function is defined which has to be minimized in the ILP formulation. In the current framework, different Dynamic Voltage Scaling (DVS) modes of a DVS-enabled processor can also be incorporated to reduce the task processing power. Today's MPSoCs involve complex designs; hence the computational complexity of the ILP formulation grows rapidly. Therefore, [1] targets a scalable algorithm through Simulated Annealing with a Timing Adjustment algorithm. The SA optimization is started from a baseline mapping instead of a random mapping. Before computing the mapping, all tasks are sorted in decreasing order of desirability. The idea of desirability is to consider the tasks with high gain first, in order to prevent their preferred processors from being occupied by other tasks. The task with the highest desirability is first assigned to the processor with the lowest processing energy. Each subsequent task is allocated to the processor that has enough resources and minimizes the total energy consumption, considering the communication with the previously mapped tasks. This procedure is repeated until all tasks are allocated. The optimization process continues until it reaches the total number of iterations N configured by the user [1].
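The greedy baseline mapping that seeds the SA can be sketched as follows. The data layout and cost model here are illustrative, not the exact formulation of [1]: processing energy, inter-task traffic and per-hop link energy are given as plain dictionaries.

```python
def baseline_mapping(tasks, desirability, load, capacity,
                     proc_energy, traffic, link_cost):
    """Greedy baseline mapping sketch after [1].
    desirability[t]   : gain of giving task t its preferred processor
    load[t]           : resource demand of task t
    capacity[p]       : remaining capacity of processor p (mutated)
    proc_energy[p][t] : processing energy of task t on processor p
    traffic[t][u]     : data volume exchanged between tasks t and u
    link_cost[p][q]   : energy per data unit between processors p and q
    """
    mapping = {}
    # Handle high-gain tasks first so their preferred processor stays free.
    for t in sorted(tasks, key=lambda t: desirability[t], reverse=True):
        best_p, best_cost = None, float("inf")
        for p in capacity:
            if capacity[p] < load[t]:
                continue            # processor lacks resources for t
            cost = proc_energy[p][t]
            # Add communication energy to every already-mapped neighbour.
            for u, q in mapping.items():
                cost += traffic[t].get(u, 0) * link_cost[p][q]
            if cost < best_cost:
                best_p, best_cost = p, cost
        mapping[t] = best_p
        capacity[best_p] -= load[t]
    return mapping
```

With this baseline, two heavily communicating tasks tend to land on the same (or a nearby) processor, because the communication term dominates once the first one is placed.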
The SA is extended by a Timing Adjustment (TA) phase, whose aim is to fine-tune the timing of an accepted mapping to meet the timing constraints. This phase keeps the mapping unmodified if it meets the deadline. Otherwise, it examines the neighboring mappings to find a new solution that can improve the timing. The adjusted mapping is then checked for feasibility, and the feasible mappings are compared to find the best mapping so far. This procedure continues until the deadline is met or no improvement can be found for any task. In the former case, the final mapping is the output of the TA; in the latter case, the problem is reported infeasible.

Fig. 1. Experimental results for evaluating the tradeoff between Ep and Ecom [1]
The energy-aware task allocation described in the paper was evaluated using 10 task graphs, as shown in Figure 1. It was evident that by sacrificing processing energy, a larger amount of communication energy can be saved, resulting in a system-wide saving. Averaging over the 10 graphs, a 15% energy saving was achieved. The heuristic algorithm outperforms ILP1, in which communication energy was not considered for minimization, saving 11% energy in comparison. Moreover, while solving an ILP problem may take from several minutes up to several hours, SA-TA takes less than a second.
B. Worm Hole Switching with Priority Share Policy
Fig. 2. A case of traffic flows with Priority Share [2]

Wormhole switching is a very popular cut-through strategy for NoCs. Each packet is divided into a number of flits, and each flit is given a priority. The shared physical link is accessed through virtual channels, a resource allocation technique that incorporates multiple independent buffers to accommodate the flits of each shared link. A priority arbiter decides which virtual channel is given access to the shared physical link. Differing from previous works, multiple traffic flows per virtual channel are supported. Each traffic flow is assigned a natural priority produced by the distinct-priority-per-flow policy, and also a system priority, where all flows competing for the same virtual channel are given the same system priority, as shown in Figure 2. The complicated blocking analysis was avoided by collapsing all flows with the same priority into one single scheduling entity, and a novel schedulability analysis was presented. This requires the consideration of direct and indirect competing relationships. In the direct competing relationship, a traffic flow has at least one physical link in common with the observed traffic flow. In the indirect competing relationship, there is an intervening traffic flow between two traffic flows which do not share a link. Assuming that all natural and system priorities are assigned, a traffic flow may encounter the following interferences and blocking:
• Direct interference from traffic flows with higher system priority
• Indirect interference from traffic flows with higher system priority
• Direct blocking from traffic flows with the same system priority
• Indirect blocking from traffic flows with the same system priority [2].
In [2], a greedy priority allocation policy is introduced which ensures schedulability with reduced time complexity. The intuition of the algorithm is as follows: at each system priority Gk, if there exists a traffic flow τi such that, when τi is mapped to priority Gk, all flows which have been assigned system priority Gk or lower are still schedulable, then τi is assigned priority Gk. If no additional flow mapped to Gk can lead to a schedulable system, the system priority is increased. If a schedulable priority ordering exists under the distinct-priority-per-flow policy, then there must also exist a schedulable priority ordering under priority share [2].
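The greedy allocation can be sketched as follows. The `schedulable` callback stands in for the schedulability analysis of [2], which is not reproduced here; its signature is an assumption made for this sketch.

```python
def greedy_priority_share(flows, schedulable):
    """Greedy system-priority allocation sketch after [2].
    flows       : traffic flows, e.g. ordered by natural priority
    schedulable : callback schedulable(flow, level, assigned) -> bool;
                  True if mapping `flow` to system priority `level`
                  keeps every flow assigned level or lower schedulable.
    Returns {flow: system_priority}, or None if the set is infeasible.
    """
    assigned = {}                 # flow -> system priority Gk
    level = 0
    unplaced = list(flows)
    while unplaced:
        progress = False
        for f in list(unplaced):
            if schedulable(f, level, assigned):
                assigned[f] = level      # f shares the VC at level Gk
                unplaced.remove(f)
                progress = True
        if not progress:
            level += 1                   # nothing more fits at Gk
            if level > len(flows):       # more levels than flows: give up
                return None
    return assigned
```

Flows are packed into the lowest (highest-priority) level that keeps the system schedulable, so virtual channels and priority levels are reused as aggressively as the analysis permits.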
The priority share policy exhibits a remarkable hardware cost saving, consuming only 20.3% of the priority levels and 38.4% of the virtual channels compared with the original approach when the network's maximum link load reaches 0.4.
C. QoS Aware Link Arbitration Scheme
The mechanism that gives priority to best effort (BE) traffic for optimal latency, while limiting its rate to leave enough bandwidth for guaranteed throughput (GT) traffic, is implemented in the switch allocator of each router. Every output port requires a separate arbiter, which consists of a selective priority arbiter and a traffic shaper. The traffic shaper maintains a bucket of tokens, to which tokens are added at an average token rate rtoken. The selective priority arbiter grants requests from BE virtual channels first as long as there are tokens in the shaper's bucket. Consequently, the token rate determines the average rate of prioritized BE traffic. For every BE flit sent, one token is removed from the bucket; when the bucket is depleted of all tokens, GT traffic is prioritized.
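The per-port arbitration rule can be sketched as a token bucket. The class and parameter names are illustrative; the rule that token-less BE traffic still uses otherwise idle cycles is an assumption of this sketch, since shaping only governs its priority over GT traffic.

```python
class SelectivePriorityArbiter:
    """Sketch of the per-output-port arbiter of [3]: best effort (BE)
    flits win over guaranteed throughput (GT) flits only while the
    token bucket is non-empty, capping the average prioritized BE rate."""

    def __init__(self, token_rate, bucket_size):
        self.token_rate = token_rate     # tokens added per cycle (rtoken)
        self.bucket_size = bucket_size   # cap on accumulated tokens
        self.tokens = bucket_size

    def grant(self, be_request, gt_request):
        """Decide one cycle; returns 'BE', 'GT' or None."""
        self.tokens = min(self.bucket_size, self.tokens + self.token_rate)
        if be_request and self.tokens >= 1:
            self.tokens -= 1             # one token per prioritized BE flit
            return "BE"
        if gt_request:                   # bucket empty: GT is prioritized
            return "GT"
        if be_request:                   # no GT contender: BE may still go
            return "BE"
        return None
```

With a small token rate, bursts of BE flits drain the bucket quickly, after which GT traffic wins every contended cycle until tokens accumulate again.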
A GT connection between two routers can be set up by adjusting the traffic shaper settings on the corresponding output ports along the route of packets between those routers. To efficiently shape BE traffic only on the affected routes, routing must be deterministic for GT traffic; [3] implements distributed dimension-ordered XY routing. For the latency-sensitive best effort applications evaluated in the paper, a speed-up of up to 14% was achieved and the latency of BE traffic was improved by 47%.
D. Time Triggered Network on Chip
There is inherent concurrency in a typical embedded application (e.g., automotive electronics, avionics). The central element of the presented SoC architecture, as seen in Figure 3, is a Time Triggered NoC that connects multiple heterogeneous IP blocks called micro components. A micro component is an application subsystem that provides a part of the service of the overall system, for example a braking system. A micro component comprises two parts: a host and a Trusted Interface Sub System (TISS). The behavior of a micro component can disrupt neither the computations nor the communication performed by other micro components. The host implements the application services, and the TISS is a dedicated architectural element that protects the access to the TT-NoC. Each TISS contains a table which stores a priori knowledge concerning the global points in time of all message transmissions and receptions of the respective micro component, and it ensures the temporal ordering and consistent delivery order of the packets.
The purposes of the Time Triggered NoC encompass clock synchronization for the establishment of a global time base, as well as the predictable transport of periodic and sporadic messages. Using TDMA, the available bandwidth of the NoC is divided into periodic conflict-free sending slots. The allocation of sending slots of the time triggered NoC to micro components occurs through a communication primitive called a pulsed data stream. A pulsed data stream is a time-triggered, periodic, unidirectional data stream that transports data in pulses of defined length from one sender to n a priori identified receivers at a specified phase of every cycle of a periodic control system [4].
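The conflict-freedom requirement on the TDMA slots can be illustrated with a small check. This is only a sketch; representing a pulsed data stream as a (phase, length) pair within one cycle is an assumption made here, not the representation of [4].

```python
def slots_conflict_free(streams, cycle_len):
    """Sketch of the TDMA property behind pulsed data streams:
    each stream (phase, length) sends in [phase, phase + length)
    of every cycle of length cycle_len, and no two streams may
    ever claim the same sending slot."""
    occupied = set()
    for phase, length in streams:
        for t in range(phase, phase + length):
            slot = t % cycle_len          # slots repeat every cycle
            if slot in occupied:
                return False              # two pulses collide
            occupied.add(slot)
    return True
```

Because the table of send and receive instants is stored a priori in each TISS, such a check runs entirely at design time; at run time, no arbitration is needed.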
The TT-SoC enables integrated resource management through the Trusted Network Authority (TNA) and the Resource Management Authority (RMA). The RMA computes new resource allocations for the non-safety-critical application subsystems, while the TNA ensures that the new resource allocations have no adverse effect on the behavior of the safety-critical application subsystems. In order to prevent any unintended interference between subsystems, the time-triggered SoC architecture ensures temporal and spatial partitioning with respect to the encapsulated communication channels, which is enforced by the TISS.
The TT-NoC is composed of fragment switches. The TISS of each micro component is connected to exactly one fragment switch via the TTNoC interface. A particular flit called a routing flit, entering an interconnect, carries switching information called a switching op-code, which directly represents a hop. The complete sequence of switching information from the sending TISS to the receiving TISS is called the routing information and defines the route of an encapsulated communication channel. Hence, the sender determines the route that will be taken; this is called source routing [5]. The TTNoC enables the coexistence of several encapsulated communication channels that convey pulsed data streams at the same instant of time across the network topology. At the same time, interference must be prevented, either by avoiding the situation altogether or by interleaving fragments. The difficulty of specifying branched routes and multiple receivers in multicasting was solved by the split-point multicasting of the TTNoC.

Fig. 3. Structure of the Time Triggered SoC Architecture: trusted subsystem (shaded) and non-trusted subsystem (hosts of micro components) [4]

Based on the novel architecture described in the paper, a custom hardware prototype was produced by TTTech1. Considering a 32-bit data bus, a theoretical throughput of 11.2 Gbit/sec could be achieved on a single encapsulated channel.
III. DISCUSSIONS
In the method proposed for energy-aware task allocation in [1], the authors have not addressed the problem of hot spots. When they compute the baseline mapping, dependent tasks are placed close together to reduce the communication power. This can lead to hot spots due to increased local temperatures.
The solution presented in [2] to reduce the resource overhead is very elegant. Priority sharing, through which the number of virtual channels is reduced, decreases the cost, area and also the energy consumption. The traffic models presented and the novel schedulability analysis keep the computational complexity low. The quality of service can be flexibly explored at design time. As the number of traffic flows increases, the resource saving also increases.
1TTTech Computertechnik AG, http://www.tttech.com
The implementation of [3] results in a power overhead. Considering its application in Mobile Internet Devices, power consumption is much more critical than achieving optimal latency for BE traffic, thus thwarting its practical implementation.
The time-triggered NoC architecture raises several interesting points:
• Inherent fault isolation
• Predictability
• Flexibility of heterogeneous component integration
• High throughput.
IV. CONCLUSIONS
With communication gaining dominance over computation in SoC design, at least in consumer electronics, and with the advent of multi-cores in mobile devices, power and area consumption will be the unique selling point for market leaders. With techniques such as [1], substantial energy consumption can be saved. Costs also play a major role in commercial embedded systems; implementations as proposed in [2] reduce the hardware implementation costs. For applications involving concurrent real-time and general-purpose execution, solutions such as [3] improve the latency of BE traffic.
With the increasing complexity of embedded systems, partitioning an application into a set of autonomous concurrent functions allows performance to scale linearly with the number of devices. The Time Triggered NoC offers an increased ease of abstraction for mixed-criticality systems, characterized by safety-critical and less critical micro components. While other approaches represent an evolution of well-known design styles, the TT-SoC engenders a revolution in SoC design.
REFERENCES
[1] Jia Huang, Christian Buckl, Andreas Raabe and Alois Knoll: Energy-Aware Task Allocation for Network-on-Chip Based Heterogeneous Multiprocessor Systems. 19th International Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2011.
[2] Zheng Shi and Alan Burns: Real-Time Communication Analysis with a Priority Share Policy in On-Chip Networks. 21st Euromicro Conference on Real-Time Systems, 2009.
[3] Jonas Diemer, Rolf Ernst and Michael Kauschke: Efficient Throughput-Guarantees for Latency-Sensitive Networks-On-Chip. Design Automation Conference, 2009.
[4] Roman Obermaisser, Christian El Salloum, Bernhard Huber and Hermann Kopetz: Time-Triggered System-on-Chip Architecture. Industrial Electronics, 2008.
[5] Christian Paukovits and Hermann Kopetz: Concepts of Switching in the Time-Triggered Network-on-Chip. Embedded and Real-Time Computing Systems and Applications Conference, 2008.