christian-zurlo
Wireless Sensor Networks
Low Power Design
Outline
Introduction – Importance of Low Power Design
Power and Energy
Low Power at various levels of circuit design:
  System and Architecture Level
  Register Transfer and Logic Level
  Physical Level
Conclusion
Importance of Low Power Design
Power is considered the most important constraint in embedded systems
Low power design is essential in:
  high-performance systems (reason: excessive power dissipation reduces reliability and increases the cost of cooling systems and packaging)
  portable systems (reason: battery technology cannot keep pace with the demand for devices with light batteries and long times between recharges)
Sources of Power Consumption
The three major sources of power consumption in digital CMOS circuits are:
$$P_{avg} = p_t \, C_L \, V_{dd}^2 \, f_{clk} + I_{sc} \, V_{dd} + I_{leakage} \, V_{dd} = P_1 + P_2 + P_3$$
where:
P1 – capacitive switching power
P2 – short circuit power
P3 – leakage current power
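The three terms can be evaluated directly. The following is a minimal sketch, assuming p_t is the switching activity and C_L the switched load capacitance; the function name and the operating point are hypothetical values chosen only for illustration.

```python
def average_power(p_t, c_load, v_dd, f_clk, i_sc, i_leak):
    """Return (P1, P2, P3, total) in watts for the three-term CMOS power model."""
    p_switching = p_t * c_load * v_dd ** 2 * f_clk  # P1: capacitive switching power
    p_short_circuit = i_sc * v_dd                   # P2: short circuit power
    p_leakage = i_leak * v_dd                       # P3: leakage current power
    return p_switching, p_short_circuit, p_leakage, (
        p_switching + p_short_circuit + p_leakage
    )


# Hypothetical operating point (not taken from the slides).
p1, p2, p3, p_total = average_power(
    p_t=0.2,        # switching activity
    c_load=1e-9,    # 1 nF of total switched capacitance
    v_dd=1.2,       # supply voltage in volts
    f_clk=100e6,    # 100 MHz clock
    i_sc=1e-3,      # 1 mA average short circuit current
    i_leak=0.5e-3,  # 0.5 mA leakage current
)
print(f"P1={p1 * 1e3:.1f} mW  P2={p2 * 1e3:.1f} mW  P3={p3 * 1e3:.1f} mW")
print(f"total={p_total * 1e3:.1f} mW")
```

With these assumed numbers the switching term dominates, which is why it is the main target of the techniques discussed in this deck.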
Trends in Power Management
Reducing power is now a mainstream design issue
Power and Energy
Power and Energy are related (E=∫Pdt)
Minimizing the power consumption is important for:
  the design of the power supply
  the design of voltage regulators
  the dimensioning of interconnect
  short term cooling
Minimizing the energy consumption is important due to:
  restricted availability of energy (mobile systems)
    limited battery capacities (only slowly improving)
    very high costs of energy (solar panels, in space)
  cooling
    high costs
    limited space
  long lifetimes, low temperatures
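The distinction between power and energy can be made concrete with two load profiles. This is a small sketch, assuming piecewise-constant power so that the integral E=∫Pdt becomes a sum; both profiles are hypothetical.

```python
def energy_joules(profile):
    """Integrate a piecewise-constant power profile of (duration_s, power_w) pairs."""
    return sum(duration * power for duration, power in profile)


def peak_power_watts(profile):
    """Return the highest power level that occurs in the profile."""
    return max(power for _, power in profile)


# Two hypothetical profiles that do the same work over 10 seconds.
burst = [(1.0, 0.400), (9.0, 0.0)]  # 1 s at 400 mW, then off
steady = [(10.0, 0.040)]            # 10 s at 40 mW

for name, profile in (("burst", burst), ("steady", steady)):
    print(name, energy_joules(profile), "J, peak", peak_power_watts(profile), "W")
```

Both profiles draw the same energy from the battery, but the burst profile needs a power supply and cooling sized for ten times the peak power.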
Low Power at various levels of circuit design
[Figure: design levels and their low power techniques; higher levels offer higher impact and more options]
System Level: Design partitioning, Power Down
Algorithm Level: Complexity, Concurrency, Locality, Regularity, Data representation
Architecture Level: Voltage scaling, Parallelism, Instruction set, Signal correlations
Circuit Level: Transistor sizing, Logic optimization, Activity Driven Power Down, Low-swing logic, Adiabatic switching
Process/Device Level: Threshold Reduction, Multi-threshold
The design of low power circuits can be tackled at different levels, from system to technology
Power and Synthesis Flow
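[Figure: accuracy of power estimation versus potential for power savings at the Behavioral, RTL, Gate, and Switch levels; figure labels: 20%, 400%, 50%, 10%]
Expectations
[Figure: expected power savings by design level. Levels: Algorithmic, Behavioral, RT Level, Technology independent, Technology dependent, Layout. Techniques: power management, algorithm selection, concurrency, memory, clock control, structural transformation, extraction/decomposition, technology mapping, gate sizing, placement. Savings labels, in the order shown: orders of magnitude, several times, 10-90%, 10-15%, 15%, 20%, 20%, 20%]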
System and Architecture Level
Given a certain application, there are several possibilities for low power optimizations of the system:
Selection of an optimum algorithm with respect to the cost function of the design
Partitioning into building blocks
Voltage/Frequency scaling
Dynamic power management
Minimize waste and overhead (indirectly) – increase regularity, locality
System and Architecture Level
Instruction set selection: Mult-Add vs. Mult, Add
Module selection: Ripple Adder vs. Carry Select
Memory management: Global Flow Selection, Memory Selection, Memory Assignment
Allocation (How many?): 2 MULTs (M1, M2), 2 ADDERs (A1, A2)
Assignment (Which HW?)
Scheduling (When?)
[Figure: a data-flow graph of additions (+) and delays (D) bound to adders A1, A2 and multipliers M1, M2, with execution units plotted against time. Input: Algorithm, Hardware Library. Output: Architecture]
System and Architecture Level
Algorithm selection and optimization
The first choice in design flow is usually the selection of an optimum algorithm with respect to the cost function
The term cost depends on the application and typically includes the number of operations, memory accesses and the memory size that is required by this algorithm
Power reduction is achieved with:
Scheduling of operations
Adaptive implementations of certain algorithms
System and Architecture Level
Optimizations for Memory Accesses
A paradigm for energy efficient software:
  Avoid using memory operands as far as possible
  Improve register utilization
Example of heapsort program [Jan M. Rabaey ‘97]:
  Hand tuning for performance: 15% reduction in time, 13.5% reduction in energy
  Register allocation of temporaries: 5% reduction in current, 7% reduction in time, 11.4% reduction in energy
  Further optimization: further 22.4% reduction
Total: 40.6% reduction in energy cost
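The per-stage energy reductions compound rather than add up. A quick check, assuming each percentage is measured relative to the energy left after the previous stage (the slide does not state this explicitly):

```python
# Energy reductions quoted for the heapsort example, applied one after another.
stage_reductions = {
    "hand tuning": 0.135,
    "register allocation": 0.114,
    "further optimization": 0.224,
}

remaining = 1.0  # fraction of the original energy still consumed
for stage, reduction in stage_reductions.items():
    remaining *= 1.0 - reduction
    print(f"after {stage}: {remaining:.3f} of original energy")

total_reduction = 1.0 - remaining
print(f"total reduction: {total_reduction:.1%}")
```

Under this assumption the compounded figure comes out at roughly 40.5%, which is consistent with the quoted 40.6% if the per-stage figures were rounded. Simply adding the three percentages would give 47.3%.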
System and Architecture Level
Design partitioning
Optimum partitioning of the design can result in orders of magnitude power reduction
Examples of partitioning for low power:
Partitioning the design in such a way as to confine the operations involving maximum switching activity to a single block
Partitioning the memory and distributing it to different blocks instead of centralized memory
Hardware/Software partitioning
Optimum partition of a design into analog and digital sections
System and Architecture Level
Design partitioning – Interconnections
Interconnect power is important
Interconnect may contribute a large percentage of the total power dissipation and of the total achievable reduction
Interconnect power is greatly affected by architecture level design decisions
System and Architecture Level
Design partitioning
[Figure: spatially global versus spatially local architecture]
Spatially global: all communications use long global buses
Spatially local: cheap localized communication, few global bus accesses
Benefits of spatial locality:
  Reduced # of global bus accesses
  Reduced buffer power
  Reduced # of multiplexers
System and Architecture Level
Design partitioning – spectral partitioning
[Figure: nodes of an 8th-order IIR cascade filter placed on a 1-D axis running from -0.2 to 0.2]
Spectral partitioning places computational nodes on a 1-D axis based on “closeness” and so identifies candidates for clustering
Partitioning may lead to extra hardware units. This does not necessarily mean an increase in area!
System and Architecture Level
Design partitioning - Result
[Figure: non-local layout of datapath (DP) and control (ctl) blocks, global bus: 4700; local layout, global bus: 2400]
               Non-local        Local
Units          4 add, 3 shift   4 add, 4 shift
Global buses   106 accesses     6 accesses
Bus power      2 mW             0.3 mW
Total power    21.3 mW          16.3 mW
Area           8.78 mm²         7.46 mm²
Average: power reduction 18.5%, area reduction 1%
System and Architecture Level
Exploiting Regularity
Coarse-grained regularity (usually evident to user):
  • Subroutines
  • Loops
Fine-grained regularity (not obvious to user):
  • Similar code fragments
[Figure: two similar operator fragments built from +, *, - and >>]
Regular implementations typically reduce interconnect and/or controller requirements [Mehru96]
System and Architecture Level
Common Design Approaches
[Figure: Processor Usage Model. Desired throughput versus time, up to the maximum processor speed (TMAX), with three kinds of load: 1) compute-intensive and short-latency processes, 2) background and long-latency processes, 3) system idle]
In order to reduce power, the following design approaches can be used:
Compute ASAP
Clock Frequency Reduction
Voltage Scaling
System and Architecture Level
Compute ASAP
In this approach the processor always performs the desired computation at maximum throughput
This is the simplest approach
[Figure: delivered throughput, desired throughput, and energy/operation plotted against time]
System and Architecture Level
Clock Frequency Reduction
A common low power design technique is to reduce the clock frequency, fclk
This in turn reduces the throughput and the power dissipation by a proportional amount
The energy consumed per operation remains unchanged
This approach is less energy efficient, because the processor delivers the same amount of computation per battery life, but at a lower level of peak throughput
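Why the energy does not change follows from the capacitive switching term P1 alone. A short derivation, assuming switching power dominates and that an operation takes a fixed number of clock cycles N_cyc (a symbol introduced here only for illustration):

```latex
P_1 = p_t \, C_L \, V_{dd}^2 \, f_{clk}
\qquad\Longrightarrow\qquad
E_{op} = P_1 \cdot \frac{N_{cyc}}{f_{clk}} = p_t \, C_L \, V_{dd}^2 \, N_{cyc}
```

The clock frequency cancels out: lowering fclk stretches each operation in time by exactly the factor by which it lowers the power, so only a change in Vdd, C_L, or p_t reduces the energy per operation.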
[Figure: delivered throughput, desired throughput, and energy/operation plotted against time]
System and Architecture Level
Voltage Scaling
When fclk is reduced, the processor’s circuits have a longer cycle time to complete their computation
When the supply voltage Vdd is scaled down, the delay of the circuits increases
But the energy/operation, which is a quadratic function of Vdd, decreases
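The trade-off between delay and energy can be sketched numerically. The example below is illustrative only: it assumes a delay model of the form Vdd / (Vdd - Vt)^2, and the nominal supply and threshold voltages are hypothetical values, not figures from the slides.

```python
V_NOM = 3.3  # hypothetical nominal supply voltage in volts
V_T = 0.8    # hypothetical threshold voltage in volts


def relative_speed(v_dd):
    """Maximum clock rate relative to nominal, assuming delay ~ Vdd / (Vdd - Vt)^2."""
    def speed(v):
        return (v - V_T) ** 2 / v
    return speed(v_dd) / speed(V_NOM)


def relative_energy_per_op(v_dd):
    """Energy per operation relative to nominal; quadratic in Vdd."""
    return (v_dd / V_NOM) ** 2


def min_voltage_for(throughput_fraction, tol=1e-6):
    """Bisect for the lowest Vdd that still meets the required throughput."""
    lo, hi = V_T, V_NOM
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if relative_speed(mid) < throughput_fraction:
            lo = mid  # too slow, raise the voltage
        else:
            hi = mid  # fast enough, try a lower voltage
    return hi


required = 0.5  # the workload needs half of the peak throughput
v_scaled = min_voltage_for(required)
energy_by_approach = {
    "compute ASAP": relative_energy_per_op(V_NOM),
    "clock frequency reduction": relative_energy_per_op(V_NOM),
    "voltage scaling": relative_energy_per_op(v_scaled),
}
for approach, energy in energy_by_approach.items():
    print(f"{approach}: {energy:.2f} x nominal energy/operation")
print(f"scaled Vdd: {v_scaled:.2f} V")
```

Under these assumptions only voltage scaling lowers the energy per operation; the other two approaches leave it at the nominal value because Vdd is unchanged.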
[Figure: delivered throughput, desired throughput, and energy/operation plotted against time]
System and Architecture Level
Voltage Scaling
Minimizing the delay penalty due to voltage scaling
Architecture-level
  speedup (pipelining, concurrency), then downscale supply voltage, or
  match supply voltage with throughput requirement
  multiple supply voltages in the same design
  one supply voltage for each block
Circuit-level
  lowering threshold voltage
  heavily process-dependent
System and Architecture Level
Dynamic Power Management
Dynamic power management is a design methodology that dynamically reconfigures an electronic system to provide the requested services and performance levels with a minimum number of active components or a minimum load on such components
[Figure: Power Manager. An observer collects workload information (observations) from the system, and a controller issues commands to it]
[Figure: Power State Machine. States: RUN (P = 400 mW), IDLE (P = 50 mW, wait for interrupt), SLEEP (P = 0.16 mW, wait for wake-up event). Transition time labels: ~10 µs, ~90 µs, 160 ms]
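The benefit of a power state machine can be estimated from the per-state power figures in the diagram. This is a rough sketch: the time fractions are hypothetical, and the energy and latency cost of the state transitions is ignored.

```python
# Per-state power in mW, as given in the power state machine figure.
STATE_POWER_MW = {"RUN": 400.0, "IDLE": 50.0, "SLEEP": 0.16}


def average_power_mw(time_fractions):
    """Average power for a mapping of state name to fraction of time spent there."""
    if abs(sum(time_fractions.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(STATE_POWER_MW[state] * share for state, share in time_fractions.items())


# Hypothetical workload: busy 10% of the time.
idle_only = average_power_mw({"RUN": 0.1, "IDLE": 0.9})
with_sleep = average_power_mw({"RUN": 0.1, "IDLE": 0.2, "SLEEP": 0.7})

print(f"RUN/IDLE only: {idle_only:.2f} mW")
print(f"with SLEEP:    {with_sleep:.2f} mW")
```

A real power manager also has to weigh the wake-up latency of each state against the expected length of the idle period, which this sketch leaves out.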
Register Transfer and Logic Level
Low-power techniques at RTL and Logic Level can be subdivided into:
techniques for lowering the capacitance and the switched voltage
  minimizing global communication
  logic optimization by synthesis tools (area, speed)
techniques to reduce the toggle rate of nodes with a high relative capacitance
  guarding techniques
  pipelining
  reorganization of logic gates and operators
Register Transfer and Logic Level
Reducing switching activity
Guarding technique (clock gating)
Clock gating shuts down the clock for a certain group of registers under a certain guard condition
advantages: implemented with minor overhead in area and design effort
disadvantages: reduced testability
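The effect of clock gating can be illustrated with a small behavioral model. This is a sketch in Python rather than in a hardware description language; the function name and the input data are hypothetical.

```python
def run_register(data_stream, enables, gated):
    """Simulate one register and count the clock edges that reach it.

    Ungated: the register is clocked every cycle and holds its value
    through a feedback multiplexer when enable is low.
    Gated: the clock is suppressed whenever enable (the guard) is low.
    """
    value, clock_edges, trace = 0, 0, []
    for data, enable in zip(data_stream, enables):
        if gated:
            if enable:
                clock_edges += 1
                value = data
        else:
            clock_edges += 1
            value = data if enable else value
        trace.append(value)
    return trace, clock_edges


data = [3, 7, 7, 1, 9, 4]
enable = [1, 0, 0, 1, 0, 0]

plain_trace, plain_edges = run_register(data, enable, gated=False)
gated_trace, gated_edges = run_register(data, enable, gated=True)

print(plain_trace, plain_edges)
print(gated_trace, gated_edges)
```

Both variants store the same sequence of values, but the gated register sees a clock edge only in the cycles where the guard condition holds.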
Register Transfer and Logic Level
Examples of guarding technique
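[Figure: guarded adder. Labels: two latches in front of an adder, inputs A and B, latched inputs L_A and L_B, Datain, Sel, multiplexer inputs 1 and 0, registers R1 and R2]
[Figure: N-bit binary comparator computing Y = A > B. Labels: An, Bn, A1..An-1, B1..Bn-1, Ctrl]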
Register Transfer and Logic Level
Reducing switching activity
Pipelining
  reduces critical path (enables savings due to voltage scaling, or slower but energy-efficient algorithms)
  reduces glitches
  disadvantages: area overhead (with an implicit increase of capacitances and increase in clock power)
Register Transfer and Logic Level
Reducing switching activity
Reorganization of logic gates and operators
  manual (reorganization of logic cells and reordering inputs)
  automatic (performed by synthesis tools):
    combinatorial
      don’t care optimization
      path balancing
      factorization
    sequential
      state encoding
      retiming
Register Transfer and Logic Level
Reducing switching activity – examples of reorganization
[Figure: two arrangements of adders (+)]
Flattening
[Figure: logic network with inputs A, B, C, D before and after flattening]
Factoring
Idea: Remove common expressions to reduce capacitance
Caveat: This may increase activity!
Pa = 0.1, Pb = 0.5, Pc = 0.5
Don’t Care Optimization
Example: inputs a, b, c
Activity is maximized for P(1) = 0.5!
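Both statements can be checked with a simple probabilistic model. The sketch below assumes independent inputs and takes the activity of a node to be p(1 - p), where p is the probability that the node is 1. The function f = ab + ac is a hypothetical stand-in for the factoring example, since the slide's own expression did not survive; the input probabilities are the ones listed above.

```python
def activity(p_one):
    """0->1 transition probability of a node, assuming temporally independent values."""
    return p_one * (1.0 - p_one)


def p_and(p_x, p_y):
    """Probability that x AND y is 1 for independent x, y."""
    return p_x * p_y


def p_or(p_x, p_y):
    """Probability that x OR y is 1 for independent x, y."""
    return 1.0 - (1.0 - p_x) * (1.0 - p_y)


P_A, P_B, P_C = 0.1, 0.5, 0.5

# Unfactored form f = a*b + a*c: internal nodes a*b and a*c.
unfactored_nodes = {"a*b": p_and(P_A, P_B), "a*c": p_and(P_A, P_C)}
# Factored form f = a*(b + c): a single internal node b + c.
factored_nodes = {"b+c": p_or(P_B, P_C)}

unfactored_activity = sum(activity(p) for p in unfactored_nodes.values())
factored_activity = sum(activity(p) for p in factored_nodes.values())

# Activity peaks when a node is 1 half of the time.
peak_percent = max(range(101), key=lambda k: activity(k / 100))

print(f"unfactored internal activity: {unfactored_activity:.4f}")
print(f"factored internal activity:   {factored_activity:.4f}")
print(f"activity peaks at P(1) = {peak_percent / 100}")
```

Under these assumptions the factored form has one internal node fewer, but that node switches more often than both unfactored nodes combined, which is the caveat the slide warns about.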
Sequential logic optimization
State encoding
  seems to be of minimal impact in general
Data encoding in data paths
  e.g. use of sign-magnitude, one-hot, or redundant representations
  mostly ad hoc
Retiming for low power
  registers can be strategically placed to reduce glitching, or to perform path balancing
Physical Level
On this level of abstraction the number of manually guided optimizations is quite limited
The place and route tools automatically minimize the wire length (and wire capacitances) according to the timing constraints
This doesn’t represent the optimum concerning power consumption
There are some design tasks which can nevertheless be exploited to save power:
  partitioning (taking into account the interconnections between the layout blocks)
  back-annotating of layout capacitances together with switching activity information from gate level simulation to the synthesis tool (enables reoptimization of logic for low-power)
Conclusion
Power is a distributed problem that spans all design disciplines: standards (GSM, OS), software, digital and analog hardware, process
Power related design decisions must be weighed against all of the system constraints (size, cost, performance, testability, time to market, …) to develop a successful system
Low power design techniques have to be implemented at different levels of system design in order to achieve the best results