1 of 141/34
Embedded Systems Design:Optimization Challenges
Paul PopEmbedded Systems Lab (ESLAB)
Linköping University, Sweden
2 of 142/34
Outline
Embedded systems Example area: automotive electronics
Embedded systems design
Optimization problems Fault-tolerant mapping and scheduling Voltage scaling Communication delay analysis
Assessment and message
3 of 143/34
1%
99%
Embedded Systems
General purpose systems Embedded systems
Microprocessor market shares
4 of 144/34
Example Area: Automotive Electronics
What is “automotive electronics”? Vehicle functions
implemented with electronics Body electronics System electronics: chassis,
engine Information/entertainment
5 of 145/34
Automotive Electronics Market Size
8.9Market
($billions)
10.5
13.1 14.1 15.8 17.4 19.3 21.0
0
200
400
600
800
1000
1200
1400
1998 1999 2000 2001 2002 2003 2004 2005
Cost of Electronics / Car ($)
90% of future innovations in vehicles:based on electronic embedded systems
2006: 25% of the total cost
of a car will be electronics
6 of 146/34
Automotive Electronics Platform Example
Source: Expanding automotive electronic systems, IEEE Computer, Jan. 2002
7 of 147/34
Outline
Embedded systems Example area: automotive electronics
Embedded systems design
Optimization problems Fault-tolerant mapping and scheduling Voltage scaling Communication delay analysis
Assessment and message
8 of 148/34
Embedded Systems Design
Model of system implementation
Estimation:
exec. time
System platform
System model
System-leveldesign tasks
Analysis
Softwaresynthesis
Hardware
synthesis
Growing complexityConstraints
Time, energy, sizeCost, time-to-marketSafety, reliability
HeterogeneousHardware components
Comm. protocols
Mapping and scheduling
Voltage scaling
Communicationdelay analysis
9 of 149/34
Embedded System Design, Cont.
Goal: automated design optimization techniques Successfully manage the complexity of embedded systems Meet the constraints imposed by the application domain Shorten the time-to-market Reduce development and manufacturing costs
Optimization: the key to successful design
10 of 1410/34
Outline
Embedded systems Example area: automotive electronics
Embedded systems design
Optimization problems Fault-tolerant mapping and scheduling Voltage scaling Communication delay analysis
Assessment and message
11 of 1411/34
Optimization Problems
1. Mapping and scheduling1.1 Mapping to minimize communication1.2 Mapping and scheduling1.3 Fault-tolerant mapping and scheduling
2. Voltage scaling2.1 Continuous voltage scaling2.2 Discrete voltage scaling
3. Communication delay analysis
12 of 1412/34
Given Application: set of interacting
processes Platform: set of nodes
Problem #1.1: Mapping
P1
P4
P2 P3
m1 m2
m3 m4
Assessment Optimal solutions even for large problem sizes
N1 N2
P1
P4
P2 P3
m1 m2
m3 m4Determine
Mapping of processes to nodes Such that the communication is
minimized
13 of 1413/34
Problem #1.2: Mapping and Scheduling
Given Application: set of interacting processes Platform: set of nodes Timing constraints: deadlines
Determine Mapping of processes and messages Schedule tables for processes and messages
Such that the timing constraints are satisfied
S2 S1
P1
P4
P2
m1 m2m3 m4
P3
N1
N2
Bus
Schedule
table
Deadline
P1
P4
P2 P3
m1 m2
m3 m4
N1 N2
14 of 1414/34
Problem #1.2: Assessment
Scheduling is NP-complete even in simpler context D. Ullman, “NP-Complete Scheduling Problems”, Journal
of Computer Systems Science, volume 10, pages 384–393, 1975.
ILP formulation Can’t obtain optimal solutions for large problem sizes
Alternative: divide the problem Scheduling
Heuristic: List scheduling Mapping
Simulated annealing Tabu-search Problem-specific greedy algorithms
15 of 1415/34
Messages: Schedule tables
Messages: Fault-tolerant protocol
Processes: Schedule tables
Fault-Tolerant Mapping and Scheduling
...
Transient faults
TDMA bus: TTP
Processes: Re-execution and replication
16 of 1416/34
Fault-Tolerance Techniques
P1 P1 P1
Re-execution
N1
P1
P1
P1
Replication
N1
N2
N3
P1
P1
N1
N2
P1
Re-executed replicas
2
17 of 1417/34
Problem #1.3: Formulation
Application: set of process graphs Architecture: time-triggered system
...
Fault-model: transient faults
Given Application: set of interacting processes Platform: set of nodes Timing constraints: deadlines Fault model: number of transient faults in the system period
Determine Mapping of processes and messages Schedule tables for processes and messages Fault-tolerance policy assignment
Such that the timing constraints are satisfied
18 of 1418/34
P1N1
N2
TTP
P2
P3
S1S2
P4
m 2
Missed
Fault-Tolerance Policy Assignment
P1P2P3
N1 N2
40506060
8080
P4 4050
1
N1 N2P1
P4P2
P3
m1
m2
m3
P1N1
N2
TTP
P2
P3
S1S2 m 2
P4
N1
N2
P1
P3
S1S2
P4P2
P1
m 1m 1TTP m 2m 2
P2
m 3m 3
P3
P4
Missed
P1N1
N2
TTP
P2
P3
S1S2 m 2
P4No fault-
tolerance:application crashes
N1
N2
P1
P3
S1S2
P4P2
P1
m 2m 1TTP
Met
Optimization of fault-tolerance
policy assignment
Deadline
19 of 1419/34
Tabu-Search: Policy Assignment & Mapping
P1P2P3
N1 N2
40506060
7575
P4 4050
1
N1 N2P1
P4P2
P3
m1
m2
m3
N1
N2
TTP
P1
P3
S1S2 S2
P4
m2
P2
1101Wait0021TabuP4P3P2P1
1101Wait0021TabuP4P3P2P1 Current
solution
Designtransformations
N1
N2
TTP
P1 P3
S1S2
P4P2
m1
1101Wait0021TabuP4P3P2P1
1101Wait0021TabuP4P3P2P1
Tabu move &worse than best-so-far
S1
N1
N2
P1
P3
S1S2
P4P2
P1
m2
m1
TTP
1200Wait0012TabuP4P3P2P1
1200Wait0012TabuP4P3P2P1
Tabu move &better than best-so-far
N1
N2
TTP
P1 P3
S1S2 S2
P4
m2
P2
1101Wait0021TabuP4P3P2P1
1101Wait0021TabuP4P3P2P1
Non-tabu &worse than best-so-far
N1
N2
TTP
P1
P3
S1S2 S2
P4
m2
P2 P3
1101Wait0021TabuP4P3P2P1
1101Wait0021TabuP4P3P2P1
Non-tabu &worse than best-so-far
N1
N2
P1
P3
S1S2
P4P2
P1
m2
m1
TTP
1200Wait0012TabuP4P3P2P1
1200Wait0012TabuP4P3P2P1 Current
solution
Designtransformations
20 of 1420/34
Problem #2: Voltage Scaling
GSM Phone:SearchRadio link controlTalking
MP3 Player
Digital Camera:Take photoRestore photo
Timing constraints
0
10
20
30
40
50
60
70
199719992002200520082011
Battery powerChip power
Power constraints
21 of 1421/34
Problem #2: Voltage Scaling
deadlinet
P
1 2 3
Energy/speed trade-offs:
varying the voltagesVbs
CPUVdd
f1 f2 f3
Different voltages:different frequencies
Mapping and scheduling: given
(fastest freq.)
Power
deadlinet1 2 3
Slack
22 of 1422/34
0 2 3
5
4
1CPU1
CPU2
CPU3
BusVdd
VbsVdd
Vbs
Vdd
Vbs
Problem #2.1: Continuous Voltage Scaling
Given Application: set of interacting processes Platform: set of nodes, each having
supply voltage (Vdd) and body bias voltage (Vbs) inputs
Mapping and schedule table (including timing constraints)
P
deadlinet1 2 3
Slack
Architecture and mapping Schedule table / processor
23 of 1423/34
Problem #2.1: Continuous Voltage Scaling
Determine Voltage levels Vdd and Vbs for each process
Such that system energy is minimized and Deadlines are satisfied
deadlinet
P
1 2 3
P
deadlinet1 2 3
Slack
Output
InputHeight:
voltage level
Area:energy
Assessment Convex nonlinear problem
Polynomial time solvable with an arbitrary good precision A. Andrei, “Overhead-Conscious Voltage Selection for
Dynamic and Leakage Energy Reduction of Time-Constrained Systems”, technical report, Linköping University, 2004
24 of 1424/34
Problem #2.1: Formulation
Minimize energy E[0] + E[1] + E[2] + E_OH[1-2]
Such that Tstart[0] + Texe[0] Tstart[1]
Tstart[1] + Texe[1] + Toh[1-2] Tstart[2]
Tstart[2] + Texe[2] DL[2]
Energy due to processes
Overhead due to
voltage changes
Precedencerelationships
Deadlines
CPU1
0
1
2
BUS CPU2CPU1
0
1
2
BUS CPU2
25 of 1425/34
Problem #2.2: Discrete Voltage Scaling
Problem formulationGiven discrete execution frequencies
Processors can operate using a frequency from a fixed discrete set Changing the frequency incurs a delay and an energy penalty
Determine the set of frequencies for each task Such that system energy is minimized and Deadlines are satisfied
t
P deadline
2 31
f2 f2f2f3f3 f1
Discrete
26 of 1426/34
Given 1 processor: f{50, 100, 150} MHz
3 processes
1: P={10, 20, 30} mW, dl=1ms, NC=100 cycles
2: P={12, 22, 32} mW, dl=1.5ms , NC=100 cycles
3: P={15, 25, 35} mW, dl=2ms , NC=100 cycles
Schedule: execution order is 1, 2, 3
Determine For each process, number of clock cycles to be executed at each
frequency
such that the energy is minimized
),,(),,,(),,,( 33
23
13
32
22
12
31
21
11 ccccccccc
Problem #2.2: Example
27 of 1427/34
Problem #2.2: Example, Cont.
Assessment: Strongly NP-hard problem
The frequencies are now a set of integers; identical to: P. De, “Complexity of the Discrete Time-Cost Tradeoff Problem for
Project Networks”, Operations Research, 45(2):302–306, March 1997.
iiii NCccc 321 Each task has to execute the given number of cycles
iiii tf
c
f
c
f
c
3
3
2
2
1
1
Task execution time
1 iii ttstart Precedence constraints
iii dltstart Deadline constraints
i
ii
ii
ii P
f
cP
f
cP
f
c 33
32
2
21
1
1
Minimize energy
MILP formulation for the optimal solution
28 of 1428/34
Embedded Systems Design
Mapped andscheduled model
Estimation:
exec. time
System platform
System model
System-leveldesign tasks
Analysis
Softwaresynthesis
Hardware
synthesis
Communication protocols
Communicationdelay analysis
29 of 1429/34
Problem #3: FlexRay Analysis
FlexRay communication protocol Becoming de-facto standard in automotive electronics
BMW, DaimlerChrysler, General Motors, Volkswagen, Bosch, Motorola, Philips
Deterministic data transmission, fault-tolerant, high data-rate
...
Communication protocol: FlexRay
S R
Worst-casecommunication delay
ProblemGiven
Application: set of interacting processes Platform: set of nodes connected by FlexRay Implementation: Mapping and scheduling
Determine Worst-case communication delays for messages
30 of 1430/34
Problem #3: FlexRay Analysis, Cont.
Static Dynamic Static Dynamic
GeneralizedTime-division
multiple access
FlexibleTime-division
multiple access
Bus cycle
m4
m5
m6m7
Arrive dynamically
Off-line: Worst-case analysis
Off-line:Schedule table
m2 m3m1
Statically assigned
31 of 1431/34
Problem #3: Formulation and Example
Given FlexRay bus
Length of the static phase Length of the dynamic phase
Dynamically arriving messages Priorities
Determine for each message Worst-case communication delay
Static Static StaticDynamic Dynamic Dynamic
Fixed size bin
m4
m5
m6
m7
Priority
DynamicDynamicDynamic
Analyze this!
m4 m5m6m7Static Static Static
Bin covering: two bins
32 of 1432/34
Problem #3: Assessment
”Classic” bin covering problemGiven
Set of bins of fixed integer size Set of items of integer size
Determine Maximum number of bins that can be filled with the items
Assessment Asymptotic fully polynomial time approximation
FlexRay dynamic phase analysis ≠ ”classic” bin covering Bins have an upper limit: size of the dynamic phase Assessment
Approximation algorithm does not exist MILP formulation feasible up to 60 messages
Wanted:better
solution
33 of 1433/34
Outline
Embedded systems Example area: automotive electronics
Embedded systems design
Optimization problems Fault-tolerant mapping and scheduling Voltage scaling Communication delay analysis
Assessment and message
34 of 1434/34
Message
Optimization Key to successful embedded systems
design
Challenges Classify the problems Divide the problem into sub-problems Formulate the problems Solve the problems optimally Fast and accurate heuristics for specific
problems