Vhdl Lp MAPLD2004

  • 8/12/2019 Vhdl Lp MAPLD2004

    1/54

    VHDL Design Tips and

    Low Power Design Techniques

    Jonathan Alexander

    Applications Consulting Manager

    Actel Corporation

    MAPLD 2004

    Agenda

    Advanced VHDL

    ProASICPlus Synthesis, Options and Attributes

    Timing Specifications

    Design Hints

    Power-Conscious Design Techniques

    Summary

    Actel ProASICPlus Design Flow

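    [Figure: design flow. VHDL source and directives feed Synthesis (logic optimization, then technology mapping), which is guided by attributes and timing constraints. Technology implementation follows in Place & Route, which is guided by timing, pin, and placement constraints.]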

    What is Synthesis?

    The mapping of a behavioral description to a specific target technology, i.e. it generates a structural netlist from an HDL description

    Includes optimization steps

    Optimizes the design implementation for:

      Higher speed

      Smaller area

      Lower power

    ProASICPlus HDL Attributes and Directives

    Attributes are used to direct the way your design is optimized and mapped during synthesis.

    Directives control the way your design is analyzed prior to synthesis. Because of this, directives must be included in your VHDL source code.

    Three important ProASICPlus attributes or directives are available:

    syn_maxfan (attribute)

    syn_keep (directive)

    syn_encoding (attribute)

    ProASICPlus HDL Attributes and Directives (cont'd)

    syn_maxfan = Value

    Value Range > 4

    Can be assigned to an input port, register output, or a net

    Overrides the global Fanout Limit setting

    The tool will replicate the signal if this attribute is associated with it

    Syntax

    In the HDL code

      attribute syn_maxfan of data_in : signal is 1000;

    In the constraint file

      define_attribute {clk} syn_maxfan {200}
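
The HDL form of syn_maxfan can be placed in context as follows. This is a minimal sketch, not taken from the slides: the entity `fanout_demo`, its ports, and the choice of an internal signal are hypothetical. The one point it illustrates is that VHDL needs the attribute to be declared (here as an integer) before the specification line can be written.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are for illustration only.
entity fanout_demo is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q   : out std_logic_vector(7 downto 0)
  );
end entity fanout_demo;

architecture rtl of fanout_demo is
  signal data_in : std_logic;

  -- The attribute is declared first, then attached to the signal.
  attribute syn_maxfan : integer;
  attribute syn_maxfan of data_in : signal is 1000;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      data_in <= d;
      -- data_in fans out to every bit of q
      q <= (others => data_in);
    end if;
  end process;
end architecture rtl;
```

The attribute is attached to an internal signal here because a strict VHDL analyzer expects attribute specifications for ports to sit in the entity declaration rather than in the architecture.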

    ProASICPlus HDL Attributes and Directives (cont'd)

    syn_keep = 1

    When associated with a signal, this directive prevents Synplify from combining or collapsing the node.

    This directive can be associated with combinatorial signals only

    Syntax

    In the HDL code

      attribute syn_keep : boolean;
      attribute syn_keep of st : signal is true;

    In the constraint file

      define_attribute {st} syn_keep {1};
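
A short sketch of syn_keep on a combinatorial node follows. The entity `keep_demo` and its ports are hypothetical; the boolean declaration is the form commonly used with Synplify and is an assumption here, since the slide text was garbled at this point.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are for illustration only.
entity keep_demo is
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity keep_demo;

architecture rtl of keep_demo is
  signal st : std_logic;

  -- Ask the synthesis tool to preserve st as a distinct net.
  attribute syn_keep : boolean;
  attribute syn_keep of st : signal is true;
begin
  -- Without syn_keep, st could be absorbed into the logic for y.
  st <= a and b;
  y  <= st or c;
end architecture rtl;
```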

    Agenda

    Advanced VHDL

    ProASICPlus Synthesis, Options and Attributes

    Timing Specifications

    Design Hints

    Power-Conscious Design Techniques

    Summary

    Timing Constraints Specification

    The Synplify ProASICPlus mapper allows specification of the following:

    Global Design Frequency

    Multi-clock design

    Skew between two clocks

    Input and output delays

    Functional multi-cycle and false paths

    All these timing specifications are available in the GUI; this presentation covers the sdc constructs only.

    Design Frequency Specification

    Multiple Clocks

    The Graphical User Interface Frequency item allows specification of a global value for all clocks

      This setting influences the operator architecture selection (speed or area) during mapping

      This value should be set to the highest frequency required in the design

    To specify individual values for different clocks, use the following sdc construct

      define_clock {clock_1} -freq
      define_clock {clock_2} -freq

    Skew Specification in Synplify

    To define a skew between two clocks, use the following constraint:

      define_clock_delay -rise {clock1} -rise {clock2} value

    Example

      define_clock_delay -rise {CLK19M} -rise {MPU_CLK} 1.0
      define_clock_delay -rise {MPU_CLK} -rise {CLK19M} 2.0

    Input Delay

    Specifies the input arrival time of a signal in relation to the clock.

    It is used at the input ports to model the interface of the FPGA inputs with the outside environment.

    The value entered should represent the delay outside of the chip before the signal arrives at the input pin

    To specify the input delay on an input port, use the following constraint:

      define_input_delay {InputPortName} Value

    Output Delay

    Specifies the delay of the logic outside the FPGA driven by the top-level outputs.

    Used to model the interface of the FPGA outputs with the outside environment.

    To specify the output delay, use the following constraint:

      define_output_delay {OutputPortName} Value

    Functional False Path

    define_false_path allows the user to specify paths which will be ignored for timing analysis, but will still be optimized, without priority, within Synplify.

    The following options are available:

      -from <a register or input pin>
      -to
      -through

    Example

      define_false_path -from Register_A
      define_false_path -to Register_B
      # Paths to Register_B are ignored
      define_false_path -through test_net
      # Paths through test_net are ignored

    Agenda

    Advanced VHDL

    ProASICPlus Synthesis, Options and Attributes

    Timing Specifications

    Design Hints

    Power-Conscious Design Techniques

    Summary

    Late Arrival Signal

    -- Initial Description
    case State is
      when WAIT =>
        if Critical then
          Target

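    Late Arrival Signal: Another Hint!

    [Figure: two implementations of the same comparison, each driving an output mux that selects C or D. In the first, A_late passes through an adder (A_late + B) and then a comparator (>= Max). In the second, B and Max are combined first, so A_late feeds only the comparator.]

    Original description:

      begin
        if ((A_late + B) >= Max) then Out = C;
        else Out = D;
        end if;
      end process;

    Rewritten so that A_late passes through fewer logic levels:

      if (A_late >= (Max - B)) then Out = C;
      else Out = D;
      end if;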
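
The late-arrival rewrite can be sketched in synthesizable VHDL as below. This is an illustration under assumptions, not the slide's code: the entity `late_compare`, the 8-bit widths, and the port names are hypothetical (`out` is a reserved word in VHDL, so the output is called `result`, and Max becomes `max_val`). The subtraction is done in signed arithmetic with extra bits so that `max_val - b` cannot wrap around.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity late_compare is
  port (
    a_late  : in  unsigned(7 downto 0);  -- arrives late
    b       : in  unsigned(7 downto 0);
    max_val : in  unsigned(7 downto 0);
    c, d    : in  std_logic;
    result  : out std_logic
  );
end entity late_compare;

architecture rtl of late_compare is
  -- Widened and signed so that max_val - b may be negative without wrapping.
  signal early_diff : signed(9 downto 0);
begin
  -- Built from early signals only, so it can settle before a_late arrives.
  early_diff <= signed(resize(max_val, 10)) - signed(resize(b, 10));

  -- a_late + b >= max_val is equivalent to a_late >= max_val - b.
  -- Only the comparator and the output mux remain on the a_late path.
  result <= c when signed(resize(a_late, 10)) >= early_diff else d;
end architecture rtl;
```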

    Signal vs Variable

    Variable assignments are sensitive to order.

      Variables are updated immediately

    Signal assignments are order independent.

      Signal assignments are scheduled

    process (Clk)
    begin
      if (Clk'event and Clk = '1') then
        Trgt1
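
Because the slide's example is cut off, here is a separate minimal sketch of the same point. The entity `sig_vs_var` and all of its names are hypothetical. The same input is passed through a signal and through a variable inside one clocked process, and the two outputs end up with different latencies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are for illustration only.
entity sig_vs_var is
  port (
    clk      : in  std_logic;
    din      : in  std_logic;
    q_signal : out std_logic;  -- din delayed by two clock edges
    q_var    : out std_logic   -- din delayed by one clock edge
  );
end entity sig_vs_var;

architecture rtl of sig_vs_var is
  signal stage : std_logic;
begin
  process (clk)
    variable tmp : std_logic;
  begin
    if rising_edge(clk) then
      -- Signal assignments are scheduled: q_signal reads the value
      -- stage had before this edge, so two flip-flops are inferred.
      stage    <= din;
      q_signal <= stage;

      -- Variable assignments take effect immediately: q_var reads the
      -- value just written to tmp, so one flip-flop is inferred.
      tmp   := din;
      q_var <= tmp;
    end if;
  end process;
end architecture rtl;
```

Swapping the two variable statements would change the behavior, while swapping the two signal statements would not.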

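    Resource Sharing and Operand Alignment

    [Figure: three implementations of the same HDL code, each producing Res from X, Y, Z and Sel. With resource sharing (smaller): two muxes controlled by Sel feed a single multiplier. Without resource sharing (larger and slower): two multipliers feed one mux. With operand alignment (faster*): one mux selects between X and Z, and its output is multiplied by Y.]

    HDL Code

    process (X, Y, Z, Sel)
    begin
      if (Sel = 0) then
        Res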
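
The operand-aligned form can be written out explicitly. This sketch assumes that the truncated code computes X*Y when Sel is 0 and Y*Z otherwise, which is inferred from the figure labels rather than stated in the surviving text. The entity `shared_mult` and the 8-bit widths are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity shared_mult is
  port (
    x, y, z : in  unsigned(7 downto 0);
    sel     : in  std_logic;
    res     : out unsigned(15 downto 0)
  );
end entity shared_mult;

architecture rtl of shared_mult is
  signal operand : unsigned(7 downto 0);
begin
  -- Y is common to both products, so only the other operand is muxed.
  operand <= x when sel = '0' else z;

  -- A single multiplier serves both cases.
  res <= operand * y;
end architecture rtl;
```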

    Resource Sharing to Avoid

    Buses

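    [Figure: Eq is computed from X, Y, Z, T and Sel. With resource sharing (larger and slower), two 16-bit muxes controlled by Sel feed a single "=" comparator that produces the 1-bit Eq.]

    VHDL Code

    process (X, Y, Z, T, Sel)
    begin
      if (Sel = 0) then
        Eq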
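
One way to steer the tool away from the shared comparator is to name the two comparison results and select between them afterwards. The sketch below assumes the truncated code compares X with Y when Sel is 0 and Z with T otherwise; the entity `eq_select` and the 16-bit widths are hypothetical, and a synthesis tool may still restructure the logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity eq_select is
  port (
    x, y, z, t : in  std_logic_vector(15 downto 0);
    sel        : in  std_logic;
    eq         : out std_logic
  );
end entity eq_select;

architecture rtl of eq_select is
  signal eq_xy, eq_zt : std_logic;
begin
  -- Two comparators work directly on the 16-bit operands.
  eq_xy <= '1' when x = y else '0';
  eq_zt <= '1' when z = t else '0';

  -- Only a 1-bit mux is needed, instead of two 16-bit muxes.
  eq <= eq_xy when sel = '0' else eq_zt;
end architecture rtl;
```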

    Internal Three-state Buffers

    At the VHDL level, either:

      Use the multiplexer-based modified VHDL code, or

      Replace the three-state structure with the following equivalent AND-OR structure

    [Figure: four three-state buffers (data tri_in1 to tri_in4, enables tri_en1 to tri_en4) driving the shared net tri_out, shown next to a multiplexer equivalent (mux_in1 to mux_in4, mux_en1 to mux_en3, mux_out) and an AND-OR equivalent built from the same tri_in and tri_en signals.]
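
The AND-OR replacement can be sketched as follows, using the signal names from the figure. The entity name `and_or_bus` is hypothetical. The sketch assumes that at most one enable is active at a time; when no enable is active the output is '0' rather than high impedance.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; port names follow the figure labels.
entity and_or_bus is
  port (
    tri_en1, tri_en2, tri_en3, tri_en4 : in  std_logic;
    tri_in1, tri_in2, tri_in3, tri_in4 : in  std_logic;
    mux_out                            : out std_logic
  );
end entity and_or_bus;

architecture rtl of and_or_bus is
begin
  -- Each enable masks its data input; the OR merges the masked values.
  -- Assumes the enables are mutually exclusive.
  mux_out <= (tri_en1 and tri_in1) or
             (tri_en2 and tri_in2) or
             (tri_en3 and tri_in3) or
             (tri_en4 and tri_in4);
end architecture rtl;
```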

    Agenda

    Advanced VHDL

    Power-Conscious Design Techniques

    Data Path Selection

    FSM Encoding

    Gating Clocks and Signals

    Advanced Power Design Practices

    Summary

    Sources of Dynamic Power Consumption

    Switching

      CMOS circuits dissipate power during switching

      The more logic levels used, the more switching activity needed

    Frequency

      Dynamic power increases linearly with frequency

    Loading

      Dynamic power increases with capacitive loading

    Glitch Propagation

      Glitches cause excessive switching to occur at relatively high frequencies.

    Clock Trees

      Clock trees operate at high frequency under heavy loading, so they contribute significantly to the total power consumption.
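
These dependencies match the usual first-order model of CMOS dynamic power. The formula is not on the slide; it is the standard textbook approximation, with $\alpha$ the switching activity factor, $C_L$ the switched load capacitance, $V_{DD}$ the supply voltage and $f$ the clock frequency:

```latex
P_{\mathrm{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f
```

Glitches raise the effective $\alpha$ of a net, and clock trees combine $\alpha = 1$ with a large $C_L$, which is why both appear in the list above.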

    Data Path Elements Selection

    Basic block selection is critical as the power/speed tradeoff has to be well identified

    Power is switching activity dependent, thus input data pattern dependent

    Watch the architecture of the basic arithmetic and logic blocks

    Check area/speed and fanout distribution/number of logic levels

    High fanout + large number of logic levels = higher glitch propagation

    Investigate pipelining effect on power dissipation

    Impact on clock tree power consumption

    Impact on block fanout distribution

    Review: Ripple Adder

    Carry signal switching propagates through all the stages and consumes power

    Review: Carry Look-Ahead Adder

    Carry signal switching propagates through fewer stages

    However, a higher number of logic levels

    Carry Select Adder Overview

    Principle: do it twice (considering Carry=0 and Carry=1), then, when the actual Carry is ready, select the appropriate result

    Carry signal switching propagates through fewer stages

    However, higher duplication and complexity
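
The principle can be sketched for an 8-bit adder split into two 4-bit halves. This is an illustrative sketch only: the entity `carry_select_add8`, the split point, and the signal names are hypothetical, and a synthesis tool is free to re-optimize the structure.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity carry_select_add8 is
  port (
    a, b : in  unsigned(7 downto 0);
    cin  : in  std_logic;
    sum  : out unsigned(7 downto 0);
    cout : out std_logic
  );
end entity carry_select_add8;

architecture rtl of carry_select_add8 is
  signal cin_vec  : unsigned(4 downto 0);
  signal lo       : unsigned(4 downto 0);  -- low half sum, bit 4 is its carry out
  signal hi_if_c0 : unsigned(4 downto 0);  -- high half assuming carry = 0
  signal hi_if_c1 : unsigned(4 downto 0);  -- high half assuming carry = 1
  signal hi       : unsigned(4 downto 0);
begin
  cin_vec <= (0 => cin, others => '0');

  lo <= resize(a(3 downto 0), 5) + resize(b(3 downto 0), 5) + cin_vec;

  -- "Do it twice": both candidate results are computed in parallel.
  hi_if_c0 <= resize(a(7 downto 4), 5) + resize(b(7 downto 4), 5);
  hi_if_c1 <= resize(a(7 downto 4), 5) + resize(b(7 downto 4), 5) + 1;

  -- Select the appropriate result once the actual carry is known.
  hi <= hi_if_c1 when lo(4) = '1' else hi_if_c0;

  sum  <= hi(3 downto 0) & lo(3 downto 0);
  cout <= hi(4);
end architecture rtl;
```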

    Adder Architectures

    [Figure: two charts comparing the RPL, CLA, CLF and BK adders for bit widths 4 to 32: delay (ns) versus width, and area (# tiles) versus bit width.]

    Forward Carry Look Ahead (CLF): Fastest but also largest

    Brent and Kung (BK): Almost same speed as CLF but drastically smaller

    Carry Look Ahead (CLA): Relatively small and slow

    Ripple (RPL): Smallest but slowest

    Brent and Kung: Best area/speed tradeoff

    Adders Power Dissipation

    Brent and Kung: Lowest Power Dissipation

      Lowest logic levels

      Lowest fanout

    [Figure: Power Consumption of 32 bit Adder (Speed). Power (mW) versus frequency for the RPL, CLA, BK and CLF adders.]

    Data Path Architectures

    Adders Architectures

    Architecture Evaluation

    Test Results

    Multipliers

    Architectures and Power Implications

    Pipelined Configurations

    Pipeline Effect on Power

    Pipelining vs re-Timing

    Multipliers Power Consumption

    Wallace Advantages Over Carry-Save Multiplier (CSM)

      Uniform switching propagation

      Fewer logic levels

      Lower average fanout

    [Figure: 32 Bit Multipliers. Curves for the CSA and Wallace multipliers plotted against frequency.]

    Data Path Architectures

    Adders Architectures

    Architecture Evaluation

    Test Results

    Multiplier

    Architectures and Power Implications

    Pipelined Configurations

    Pipeline Effect on Power

    Pipelining vs re-Timing

    Pipelining for Glitch Reduction

    A logically deep internal net is typically affected by more primary inputs switching, and is therefore more susceptible to glitches

    Pipelining shortens the depth of combinatorial logic by inserting pipeline registers

    Pipelining is very effective for data path elements such as parity trees and multipliers

    [Figure: a processing unit between two registers, compared with a pipelined version in which the processing unit is split into two halves separated by an additional register.]
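
A parity tree is one of the data path elements named above, so it makes a compact illustration. In this sketch the entity `parity_pipe`, the 16-bit width, and the helper function `xor_reduce` are all hypothetical. Each byte's parity is registered before the final XOR, which halves the combinatorial depth at the cost of one extra clock of latency.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity parity_pipe is
  port (
    clk    : in  std_logic;
    data   : in  std_logic_vector(15 downto 0);
    parity : out std_logic  -- valid two clock edges after data
  );
end entity parity_pipe;

architecture rtl of parity_pipe is
  -- Local XOR reduction helper (hypothetical, written out for portability).
  function xor_reduce (v : std_logic_vector) return std_logic is
    variable r : std_logic := '0';
  begin
    for i in v'range loop
      r := r xor v(i);
    end loop;
    return r;
  end function xor_reduce;

  signal half_lo, half_hi : std_logic;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Stage 1: parity of each byte, captured in pipeline registers.
      half_lo <= xor_reduce(data(7 downto 0));
      half_hi <= xor_reduce(data(15 downto 8));
      -- Stage 2: combine the registered, glitch-free partial results.
      parity  <= half_lo xor half_hi;
    end if;
  end process;
end architecture rtl;
```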

    Pipelining Effect on Power

    [Figure: power (mW) versus frequency for a pipelined and a non-pipelined FFT, with the clock tree contribution of each plotted separately (series FFT_10, ClockTree_10, FFT30, ClockTree_30).]

    Pipelining increases clock tree power, but overall power is lowered

    Agenda

    Advanced VHDL

    Power Conscious Design Techniques

    Data Path Selection

    FSM Encoding

    Gating Clocks and Signals

    Advanced Power Design Practices

    Summary

    FSM and Counter Encoding: Impact on Power

    State                                  One Hot    Gray   Binary
    S0                                     00000001   000    000
    S1                                     00000010   001    001
    S2                                     00000100   011    010
    S3                                     00001000   010    011
    S4                                     00010000   110    100
    S5                                     00100000   111    101
    S6                                     01000000   101    110
    S7                                     10000000   100    111
    Total Number of Transitions            16         8      11
    Maximum Transitions Per Clock Cycle    2          1      3
    Clock Load                             8          3      3
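
The encoding of an enumerated state register can be requested with the syn_encoding attribute listed earlier among the ProASICPlus attributes. The sketch below is illustrative: the entity `gray_fsm`, its ports, and the four-state machine are hypothetical, and the value "gray" is used on the assumption that it is among the values the Synplify mapper accepts (alongside, for example, "onehot" and "sequential").

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are for illustration only.
entity gray_fsm is
  port (
    clk, rst, go : in  std_logic;
    busy         : out std_logic
  );
end entity gray_fsm;

architecture rtl of gray_fsm is
  type state_t is (S0, S1, S2, S3);
  signal state : state_t;

  -- Request Gray encoding for the state register.
  attribute syn_encoding : string;
  attribute syn_encoding of state : signal is "gray";
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        state <= S0;
      else
        case state is
          when S0 => if go = '1' then state <= S1; end if;
          when S1 => state <= S2;
          when S2 => state <= S3;
          when S3 => state <= S0;
        end case;
      end if;
    end if;
  end process;

  busy <= '0' when state = S0 else '1';
end architecture rtl;
```

Gray encoding gives one register toggle per transition only when the state sequence is mostly linear, as it is here.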

    Counters and FSMs: State Register Transitions

    [Figure: number of state register toggles versus number of states (4, 8, 16, 32, 64) for Gray, Binary and One Hot encodings.]

    Counters Power Measurement on ProASIC

    Binary vs Gray

    [Figure: power consumption (mW) versus frequency for the binary and Gray counters.]

    Power dissipation for 200 instances of 8-bit counters

    As expected, Gray counters dissipate less power (~25%)
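
For reference, an 8-bit Gray counter can be sketched as below. This is not the design that was measured for the slide; the entity `gray_counter8` and the helper functions are hypothetical. The state register itself holds the Gray code, so exactly one register bit toggles per clock, and the binary value exists only as combinatorial logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names are for illustration only.
entity gray_counter8 is
  port (
    clk, rst : in  std_logic;
    q        : out std_logic_vector(7 downto 0)
  );
end entity gray_counter8;

architecture rtl of gray_counter8 is
  signal gray : unsigned(7 downto 0) := (others => '0');

  -- Gray to binary: each binary bit is the XOR of all higher Gray bits.
  function gray_to_bin (g : unsigned) return unsigned is
    variable gn : unsigned(g'length - 1 downto 0) := g;
    variable b  : unsigned(g'length - 1 downto 0);
  begin
    b(b'high) := gn(gn'high);
    for i in b'high - 1 downto 0 loop
      b(i) := b(i + 1) xor gn(i);
    end loop;
    return b;
  end function gray_to_bin;

  -- Binary to Gray: XOR the value with itself shifted right by one.
  function bin_to_gray (b : unsigned) return unsigned is
  begin
    return b xor shift_right(b, 1);
  end function bin_to_gray;
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        gray <= (others => '0');
      else
        -- Increment in binary, store the result in Gray code.
        gray <= bin_to_gray(gray_to_bin(gray) + 1);
      end if;
    end if;
  end process;

  q <= std_logic_vector(gray);
end architecture rtl;
```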

    FSM Encoding: Effects on Power

    [Figure: 170 States FSM Power Consumption. Power (mW) versus frequency for the series Binary, Binary Clock, Gray, One Hot and One Hot Clock.]

    Agenda

    Advanced VHDL

    Power Conscious Design Techniques

    Data Path Selection

    FSM Encoding and Effect on Power

    Gating Clocks & Signals

    Advanced Power Design Practices

    Summary

    Gating Clocks

    Most used mechanism to gate clocks

    [Figure: two N-bit register structures that load New_Data into Data_Out under FSM control. One is clocked directly by CLK and controlled by LD_Enable; the other uses a LATCH on LD_Enable to derive a gated clock CLK_En from CLK.]

    Gating clock signals with combinatorial logic is not recommended.

    Glitches are easily created by the clock gate, which may result in incorrect triggering of the register
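
The register that is loaded under LD_Enable without touching the clock can be sketched as follows. The entity `load_reg` and the generic N are hypothetical; the port names follow the figure labels. This form avoids clock glitches entirely, although the clock tree itself keeps toggling, so it does not save clock tree power the way a latch-based gated clock does.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; port names follow the figure labels.
entity load_reg is
  generic (N : positive := 8);
  port (
    clk       : in  std_logic;
    ld_enable : in  std_logic;
    new_data  : in  std_logic_vector(N - 1 downto 0);
    data_out  : out std_logic_vector(N - 1 downto 0)
  );
end entity load_reg;

architecture rtl of load_reg is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- The clock is never gated; the enable decides whether to load.
      if ld_enable = '1' then
        data_out <= new_data;
      end if;
    end if;
  end process;
end architecture rtl;
```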

    Gating Signals: Address Decoder Example

    [Figure: a decoder with inputs IN0 and IN1 and outputs OUT0 to OUT3, shown without and with an Enable/Select input.]

    Switching activity on one of the decoder inputs induces a large number of toggling outputs

    The Enable/Select signal prevents the propagation of this switching activity
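
A 2-to-4 decoder with an enable can be sketched as below. The entity `decoder2to4` and its port names are hypothetical. While `enable` is low, all outputs are held at '0', so activity on the select inputs does not reach the downstream logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are for illustration only.
entity decoder2to4 is
  port (
    enable  : in  std_logic;
    sel     : in  std_logic_vector(1 downto 0);  -- IN1 & IN0
    outputs : out std_logic_vector(3 downto 0)   -- OUT3 .. OUT0
  );
end entity decoder2to4;

architecture rtl of decoder2to4 is
begin
  process (enable, sel)
  begin
    -- Default: every output inactive, which also covers enable = '0'.
    outputs <= (others => '0');
    if enable = '1' then
      case sel is
        when "00"   => outputs(0) <= '1';
        when "01"   => outputs(1) <= '1';
        when "10"   => outputs(2) <= '1';
        when "11"   => outputs(3) <= '1';
        when others => null;
      end case;
    end if;
  end process;
end architecture rtl;
```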

    Agenda

    Advanced VHDL

    Power Conscious Design Techniques

    Data Path Selection

    FSM Encoding and Effect on Power

    Gating Clocks and Signals

    Advanced Practices

    Summary

    VHDL Coding Effect on Power

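    [Figure: two arrangements of a pair of cascaded muxes fed by a stable expression and a glitchy expression; the arrangements swap the positions of the stable and glitchy expressions.]

    Example: IF ... THEN ... ELSE ...;

    Re-organizing the code helps to prevent propagation of switching activity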

    Delay Balancing

    If all primary inputs have the same arrival time and the same switching probability, balancing trees eliminates switching propagation

    [Figure: an un-balanced chain of adders and a balanced tree of adders, each summing X, Y, Z and T.]
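
In VHDL the balanced structure can be suggested by naming the partial sums. The sketch below is illustrative: the entity `balanced_sum` and the 8-bit widths are hypothetical, and a synthesis tool may still re-associate the additions. The operands are widened so that no carry is lost.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity balanced_sum is
  port (
    x, y, z, t : in  unsigned(7 downto 0);
    sum        : out unsigned(9 downto 0)
  );
end entity balanced_sum;

architecture rtl of balanced_sum is
  signal xy, zt : unsigned(8 downto 0);
begin
  -- First level: two independent adders with equal input arrival times.
  xy <= resize(x, 9) + resize(y, 9);
  zt <= resize(z, 9) + resize(t, 9);

  -- Second level: both operands arrive after one adder delay, unlike
  -- the chain ((x + y) + z) + t, where t waits on two earlier adders.
  sum <= resize(xy, 10) + resize(zt, 10);
end architecture rtl;
```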

    Guarded Evaluation

    Technique used to reduce switching activity by adding latches or floating gates at the inputs of combinatorial blocks if their outputs are not used.

    Example: the result of a multiplier may or may not be used depending on a condition. Adding transparent latches or AND gates on the inputs avoids power dissipation, as they mask useless input activity.

    [Figure: a multiplier feeding a mux selected by Condition, shown without and with a latch controlled by Condition on the multiplier inputs.]
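
The AND-gate variant of guarded evaluation can be sketched as below. The entity `guarded_mult`, the `bypass` input, and the widths are hypothetical. The sketch assumes `use_product` is stable well before the data inputs change; if the guard signal itself glitches, the benefit is lost. AND-gating forces the operands to zero, whereas the latch variant would hold their previous values.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity guarded_mult is
  port (
    a, b        : in  unsigned(7 downto 0);
    bypass      : in  unsigned(15 downto 0);
    use_product : in  std_logic;  -- the "Condition" of the figure
    result      : out unsigned(15 downto 0)
  );
end entity guarded_mult;

architecture rtl of guarded_mult is
  signal a_guard, b_guard : unsigned(7 downto 0);
begin
  -- Guards: when the product is not needed, the multiplier inputs are
  -- held at zero, so activity on a and b does not propagate into it.
  a_guard <= a when use_product = '1' else (others => '0');
  b_guard <= b when use_product = '1' else (others => '0');

  result <= a_guard * b_guard when use_product = '1' else bypass;
end architecture rtl;
```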

    Pre-computation Based Power Reduction

    [Figure: pre-computation architecture. Labels: Pre-Computation Input, Gated Input, registers R1 and R2 on a Common Clock, Pre-Computation Logic, Combinatorial Logic, Outputs.]

    Operator Reduction

    Based on transformations of operations into computationally equivalent implementations

    Example: Distributive Multiplication over Addition (resource sharing)

    (X*Y) + (Z*Y) = (X+Z) * Y

    [Figure: two multipliers (X*Y and Z*Y) feeding an adder, compared with one adder (X+Z) feeding a single multiplier by Y.]
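
The reduced form can be sketched in VHDL as below. The entity `operator_reduction` and the 8-bit widths are hypothetical. The sum X+Z is widened by one bit so that the result equals (X*Y) + (Z*Y) exactly.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example entity; names and widths are for illustration only.
entity operator_reduction is
  port (
    x, y, z : in  unsigned(7 downto 0);
    res     : out unsigned(16 downto 0)
  );
end entity operator_reduction;

architecture rtl of operator_reduction is
  signal sum_xz : unsigned(8 downto 0);
begin
  -- One 9-bit adder replaces the wide adder that followed the products.
  sum_xz <= resize(x, 9) + resize(z, 9);

  -- One multiplier instead of two: (X + Z) * Y, 9 + 8 = 17 result bits.
  res <= sum_xz * y;
end architecture rtl;
```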

    Input Signals Ordering

    Never forget that adders are commutative and associative

    Amplitude of IN is larger than the amplitude of IN >> 7 and IN >> 8

    [Figure: two orderings of the adders that sum IN, IN >> 7 and IN >> 8, with a plot of switching probability versus bit number for IN, IN >> 7 and IN >> 8 showing sign bit correlation.]

    Summary

    Advanced VHDL Design Tips

      Identify critical and late arrival signals in your design

      Write code in a way that reduces the logic levels for such signals

      Perform functions such as state determination while waiting for late signals

    Low Power Design Techniques

      Reduce switching activity per clock cycle

      Reduce propagation of switching activity

      Use power-efficient architecture and encoding

      Disable logic blocks whose outputs are not used

      Re-evaluate expressions to achieve the above

    Additional Resources

    Documents available on http://www.actel.com

    Low Power Resource Center

    http://www.actel.com/products/rescenter/power/index.html

    Power Conscious Design with ProASIC

    http://www.actel.com/documents/PowerConscious.pdf

    Low Power Design for Antifuse FPGAs

    http://www.actel.com/documents/lowpower.pdf