ECE 368: A Tour by Example of Non-Trivial Circuit Design and VHDL Description. Lecture Notes #4. Shantanu Dutt, Electrical & Computer Eng., University of Illinois at Chicago


  • Outline
      • Circuit Design Problem
      • Solution Approaches:
          • Truth Table (TT) vs. Computational/Algorithmic. Yes, hardware, just like software, can implement any algorithm!
          • Flat vs. Divide-&-Conquer
          • Divide-&-Conquer: associative operations/functions; general operations/functions
      • Expressing the hardware soln. using programming-language constructs, incl. recursions and iterations
      • Circuit Synthesis: translation of a program-language description to a digital ckt.
      • Summary

  • Circuit Design Problem

    Design an 8-bit greater-than comparator that compares two 8-bit #s available in two registers A[7..0] and B[7..0] and that o/ps F = 1 if A > B and F = 0 if A <= B.

  • Circuit Design Problem (contd)

    Approach 2: Think computationally/algorithmically about what the ckt is supposed to compute.

    Approach 2(a): Flat algorithmic approach. Note: a TT can be expressed as a sequence of if-then-elses:

        if A = 00000000 and B = 00000000 then F = 0
        else if A = 00000000 and B = 00000001 then F = 0
        ...
        else if A = 00000001 and B = 00000000 then F = 1
        ...

    This is essentially a re-hashing of the TT, with the same problems as the TT approach. We need to think computationally & structurally (i.e., based on the structure of the problem at hand) at a higher level!

  • Circuit Design Problem (contd)

    Approach 2(b): Structural algorithmic approach. Be more innovative: think of the structure/properties of the computational problem. E.g., think about whether the problem can be solved in a hierarchical or divide-&-conquer (D&C) manner.

    D&C approach:
      • See if the problem can be broken up into 2 or more smaller subproblems whose solns. can be stitched up to give a soln. to the parent prob.
      • Do this recursively for each large subprob. until the subprobs. are small enough for a TT-based solution, i.e., until the subprob. size is s.t. a TT-based design is doable.
      • If the subprobs. are of a similar kind (but of smaller size) to the root prob., then the breakup and stitching will also be similar at every level.

  • Shift Gears: Design of a Parity Detection Circuit

    [Figure: (a) a series (linear chain) of XORs; (b) a 16-bit parity tree]

    Design (a): No concurrency in the design. The actual problem has available concurrency, but it is not exploited in this linear design. Complete sequentialization leads to a delay that is linear in the # of bits n: delay = (n-1)*td, where td = delay of 1 gate.

    Design (b): All the available concurrency is exploited in a parity tree. Delay = (# of levels in the XOR tree) * td = log2(n) * td.

    This is an example of simple designer ingenuity: a bad design would have resulted in a linear delay that the VHDL code & the synthesis tool would have been at the mercy of.

    Question: When can we have a tree-structured circuit for a chain of the same operation on multiple operands?

    Answer: (1) First of all, when the operation makes sense for any # of operands. (2) It should be possible to break it down into smaller-size operations. (3) Finally, when the operation is associative. An operation x is said to be associative if a x b x c = (a x b) x c = a x (b x c).

    Thus if we have 4 operands, a x b x c x d, we can perform this either as a x (b x (c x d)) [getting a linear delay of 3 units] or as (a x b) x (c x d) [getting a logarithmic (base 2) delay of 2 units and exploiting the available concurrency due to the fact that x is associative]. We can extend this idea to n operands (& n-1 operations) to perform as many of the pairwise operations as possible in parallel (& do this recursively for every level of remaining operations), similar to design (b) for the parity detector [xor is an associative operation!], and thus get a (log2 n) delay.

    The 16-bit parity tree of design (b) written as an expression:

        f = (((x(15) xor x(14)) xor (x(13) xor x(12))) xor ((x(11) xor x(10)) xor (x(9) xor x(8))))
            xor
            (((x(7) xor x(6)) xor (x(5) xor x(4))) xor ((x(3) xor x(2)) xor (x(1) xor x(0))))
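
    As a minimal sketch of the chain-versus-tree point for n = 4 inputs, the following VHDL describes both groupings of the same xor side by side. The entity name parity4, the internal signal names, and the 2 ns gate delay are illustrative assumptions and do not come from the slides.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical 4-input example: same function, two groupings.
    entity parity4 is
      port (a, b, c, d : in  std_logic;
            f_chain    : out std_logic;   -- a xor (b xor (c xor d))
            f_tree     : out std_logic);  -- (a xor b) xor (c xor d)
    end entity parity4;

    architecture data_flow of parity4 is
      constant td : time := 2 ns;    -- assumed delay of one xor gate
      signal c1, c2 : std_logic;     -- internal wires of the chain
      signal t1, t2 : std_logic;     -- internal wires of the tree
    begin
      -- Linear chain: three gates in series, so 3*td from input to output.
      c1      <= c xor d  after td;
      c2      <= b xor c1 after td;
      f_chain <= a xor c2 after td;

      -- Balanced tree: t1 and t2 are computed concurrently, so 2*td in total.
      t1     <= a xor b   after td;
      t2     <= c xor d   after td;
      f_tree <= t1 xor t2 after td;
    end architecture data_flow;
    ```

    Because xor is associative, both outputs settle to the same value; under the assumed delay model only the settling time differs (3*td for the chain versus 2*td for the tree).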

  • D&C for Associative Operations

    Let f(x(n-1), .., x(0)) be an associative function. What is the D&C principle involved in the design of an n-bit xor/parity function? Can it also lead automatically to a tree-based ckt?

    [Figure: f(x(n-1), .., x(0)) is broken into the subproblems f(x(n-1), .., x(n/2)) and f(x(n/2-1), .., x(0)); their outputs a and b feed the stitch-up function f(a,b), which is the same as the original function for 2 inputs]

    Using the D&C approach for an associative operation results in the stitch-up function being the same as the original function (not the case for non-assoc. operations), but w/ a constant # of operands (2, if the orig. problem is broken into 2 subproblems).

    If the two subproblems of the D&C approach are balanced (of the same size, or as close to it as possible), then unfolding the D&C results in a balanced operation tree of the type seen earlier for the xor/parity function.

  • VHDL description of a (2**k)-bit parity tree:

        library ieee;
        use ieee.std_logic_1164.all;

        entity parity_tree is  -- a (2**k)-bit parity tree
          generic (k : positive;                 -- n = 2**k is the # of inputs (k >= 1)
                   gate_delay : time := 2 ns);
          port (x : in  std_logic_vector(2**k - 1 downto 0);
                f : out std_logic);
        end entity parity_tree;

        architecture struct of parity_tree is
          type matrix is array (k-1 downto 0, 2**k - 1 downto 0) of std_logic;
          signal wire : matrix;
        begin
          outer_loop: for j in k-1 downto 0 generate
            inner_loop: for i in 0 to 2**j - 1 generate
              first_level: if j = k-1 generate
                xor_gates_level1: entity work.xor_2(behav)  -- direct instantiation
                  generic map (gate_delay)                  -- pass gate delay to xor
                  port map (x(2*i), x(2*i+1), wire(j,i));
              end generate;
              lower_levels: if j < k-1 generate
                xor_gates_lower: entity work.xor_2(behav)
                  generic map (gate_delay)
                  port map (wire(j+1, 2*i), wire(j+1, 2*i + 1), wire(j,i));
              end generate;  -- if generate
            end generate;  -- inner generate for loop
          end generate;  -- outer generate for loop
          -- The source is cut off after "f"; the two lines below are an assumed
          -- completion in which the root of the tree, wire(0,0), drives f.
          f <= wire(0,0);
        end architecture struct;

  • Comparator Circuit Design Using D&C

    A useful property: At any level, the comp. of the MS (most significant) half determines the o/p if its result is > or <; otherwise the comp. of the LS (least significant) half determines the o/p. We can thus break up the problem at any level into MS and LS comparisons &, based on their results, determine which o/p to choose for the higher-level (parent) result.

    [Figure: D&C tree. Root A: Comp A[7..0], B[7..0], formed by the stitch-up of the solns. to A1 and A2. A1: Comp A[7..4], B[7..4]; A2: Comp A[3..0], B[3..0]. A1,1: Comp A[7..6], B[7..6]; A1,2: Comp A[5,4], B[5,4]. A1,1,1: Comp A[7], B[7]; A1,1,2: Comp A[6], B[6]. The leaves are small enough to be designed using a TT (2-bit 2-o/p comparator).]

    Is this associative? Not sure. For a non-associative func., determine its property(ies) that allow determining a correct stitch-up function (requires ingenuity, solid thinking).

  • Comparator Circuit Design Using D&C (contd.)

    [Figure: the D&C tree for Comp A[7..0], B[7..0] (A1: Comp A[7..4], B[7..4]; A2: Comp A[3..0], B[3..0]; A1,1: Comp A[7..6], B[7..6]; A1,2: Comp A[5,4], B[5,4]; A1,1,1: Comp A[7], B[7]; A1,1,2: Comp A[6], B[6]), together with the stitch-up design given as a compact TT]

    Once the D&C tree is formulated, it is easy to get the low-level & stitch-up designs.

  • Comparator Circuit Design Using D&C: Final Design

    [Figure: final 8-bit comparator built from 2-bit comparators at the leaves and log n levels of 2-bit 2:1 Muxes (inputs I0 and I1), connected by intermediate signals my(1) .. my(5)]

    Note the parallelism at work: multiple logic blocks are processing simultaneously.

    Delay(8-bit comp.) = 3*(delay of 2:1 Mux) + delay of 2-bit comp.
    Delay(n-bit comp.) = (log n)*(delay of 2:1 Mux) + delay of 2-bit comp.

    H/W_cost(8-bit comp.) = 7*(H/W_cost(2:1 Mux)) + 8*(H/W_cost(2-bit comp.))
    H/W_cost(n-bit comp.) = (n-1)*(H/W_cost(2:1 Mux)) + n*(H/W_cost(2-bit comp.))
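
    The delay and cost expressions can be checked against the recursive structure of the design. The short derivation below is a sketch assuming n is a power of 2; the symbols D_m, H_m (delay and cost of one 2:1 Mux) and D_c, H_c (delay and cost of one leaf comparator) are shorthand introduced here, not notation from the slides.

    ```latex
    \begin{align*}
    D(1) &= D_c, & D(n) &= D(n/2) + D_m
      &&\Longrightarrow\quad D(n) = (\log_2 n)\,D_m + D_c,\\
    H(1) &= H_c, & H(n) &= 2\,H(n/2) + H_m
      &&\Longrightarrow\quad H(n) = (n-1)\,H_m + n\,H_c.
    \end{align*}
    ```

    For n = 8 this gives D(8) = 3 D_m + D_c and H(8) = 7 H_m + 8 H_c, matching the 8-bit figures above. The two halves contribute only one D(n/2) to the delay because they operate in parallel, while both contribute to the cost.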

  • Comparator Circuit Design Using D&C: Behavioral Description using a High-Level Language

    Recursive description:

        Procedure Compare(A[m, k], B[m, k])
        Begin
          if m-k >= 1 then {
            f[2..1] = Compare(A[m, m-(m-k)/2], B[m, m-(m-k)/2]);
            if f[2] = 0 then return(f[2..1])   /* result has been determined based on MS comp. */
            else return(Compare(A[m-1-(m-k)/2, k], B[m-1-(m-k)/2, k]));
          }
          else {   /* m-k = 0: single-bit comparison problem */
            if A[m] > B[m] then return(1,0)
            else if A[m] < B[m] then return(0,0)
            else return(0,1)
          }
        End

        Main program: Compare(A[7..0], B[7..0]);

    The returned pair is (1,0) for >, (0,0) for <, and (0,1) for =. The test "f[2] = 0" is correct when f[2] is read as the second element of the pair (the "equal" flag), so that f[2] = 0 means the MS halves differ.

    Problem: The design has been sequentialized: perform the MS comparison, look at the results, and only if needed perform the LS comparison, instead of the MS and LS comparisons being performed simultaneously. Thus there is no parallelism! The delay is linear in n, as opposed to log n w/ parallelism.

    This is a limitation of regular programming languages in specifying parallelism. We need a Hardware Description Language (HDL) for specifying parallelism. VHDL & Verilog are such languages.

    [Figure: comparator blocks with start/enable signals whose o/ps feed a 2:1 Mux (inputs I0, I1)]

  • Comparator Circuit Design Using D&C: Behavioral Description using a High-Level Language

    Iterative description (flattening the recursion):

        Procedure Compare(A[n-1, 0], B[n-1, 0])
        Begin
          for i = n-1 downto 0 do {
            if A[i] > B[i] then return(1)
            else if A[i] < B[i] then return(0)
            else if i = 0 and A[0] = B[0] then return(0)
          }
        End

        Main program: Compare(A[7..0], B[7..0]);

    Same problem of sequentialization: each higher-order bit is compared before the next lower-order bit, and so on, leading to a delay that is linear in the # of bits (as opposed to log n with parallelism).

    [Figure: a chain of eight 1-bit comparators with o/ps f(7) .. f(0), the i-th operating on A[i] and B[i], linked from left to right by st/en signals, followed by logic that selects the comparator o/p corresponding to the 1st comparator from the left that has st = 0, producing F]

  • Concurrent Statements & Component Instantiations in VHDL

    Parallelism or concurrency needs to be explicitly specified by an HDL: a synthesis tool will mostly not be able to extract any parallelism from a description (i.e., coding) that does not explicitly expose the parallelism.

    VHDL specifies concurrency using concurrent statements. VHDL specifies iterative and recursive specifications of concurrency using iterative generate statements and conditional generate statements.

    In the simple ckt (shown to the left on the slide), OR gates A and B are supposed to operate in parallel/concurrently. There is no way to specify this in a s/w prog. lang. (or in a VHDL behavioral description).

    VHDL Description:

        entity simple_ckt is
          port (a, b, c, d : in std_logic; z : out std_logic);  -- like procedure input/output variable declarations
        end entity simple_ckt;  -- above are input/output ports or wires

        architecture data_flow of simple_ckt is
          signal x, y : std_logic;  -- declaring internal wires
        begin
          x ...

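
    The idea of two OR gates operating concurrently can be sketched with concurrent signal assignments as below. This is a hypothetical stand-in, not the slide's circuit: the entity name simple_ckt_sketch and, in particular, the final AND gate combining x and y are assumptions made only to give the example an output.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical circuit: two OR gates working in parallel.
    entity simple_ckt_sketch is
      port (a, b, c, d : in  std_logic;
            z          : out std_logic);
    end entity simple_ckt_sketch;

    architecture data_flow of simple_ckt_sketch is
      signal x, y : std_logic;  -- internal wires
    begin
      -- The three statements below are concurrent: their textual order does
      -- not matter, and each re-evaluates whenever one of its inputs changes.
      x <= a or b;    -- OR gate A
      y <= c or d;    -- OR gate B, operating in parallel with gate A
      z <= x and y;   -- ASSUMPTION: the combining gate is not given in the source
    end architecture data_flow;
    ```
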
  • D&C-based Comparator Design Description using VHDL (Generate, Recursion, Concurrency)

        library ieee;
        use ieee.std_logic_1164.all;

        entity tree_comparator is
          generic (n : natural);  -- parameterizes the design size (n is taken to be a power of 2)
          port (A, B : in  std_logic_vector(n-1 downto 0);
                f    : out std_logic_vector(0 to 1));
        end entity tree_comparator;

        architecture struct_recursive of tree_comparator is
          signal f1, f2 : std_logic_vector(0 to 1);
        begin
          simpl_comp: if n = 1 generate
          begin
            Leaf_comp: entity work.one_bit_comp(behav)
              port map (A(n-1), B(n-1), f);
          end generate simpl_comp;

          compound_comp: if n > 1 generate
          begin
            comp1: entity work.tree_comparator(struct_recursive)
              generic map (n/2)
              port map (A(n-1 downto n/2), B(n-1 downto n/2), f1);
            comp2: entity work.tree_comparator(struct_recursive)
              generic map (n/2)
              port map (A(n/2 - 1 downto 0), B(n/2 - 1 downto 0), f2);
            mux_2bit: entity work.mux_two_to_one(behav)
              generic map (2)               -- # of bits
              port map (f1, f2, f1(1), f);  -- f1 & f2 are 2-bit data i/ps,
                                            -- f1(1) is the 1-bit select, f is the 2-bit output
          end generate compound_comp;
        end architecture struct_recursive;

  • D&C-based Comparator Design Description using VHDL (contd.)

    Component Descriptions:
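
    The component descriptions themselves are not present in this copy of the notes. The following is a minimal sketch of what the two leaf components could look like, assuming the 2-bit encoding (1,0) for >, (0,0) for <, and (0,1) for = with f(0) as the "greater" bit and f(1) as the "equal" bit. The port names a, b, i0, i1, sel, y and the generic name width are hypothetical; only the entity and architecture names one_bit_comp(behav) and mux_two_to_one(behav) are taken from the instantiations in tree_comparator.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Leaf comparator for one bit of A and one bit of B.
    entity one_bit_comp is
      port (a, b : in  std_logic;
            f    : out std_logic_vector(0 to 1));
    end entity one_bit_comp;

    architecture behav of one_bit_comp is
    begin
      f(0) <= a and not b;  -- '1' only when a > b
      f(1) <= a xnor b;     -- '1' only when a = b
    end architecture behav;

    library ieee;
    use ieee.std_logic_1164.all;

    -- Stitch-up element: a width-bit 2:1 multiplexer.
    entity mux_two_to_one is
      generic (width : natural := 2);  -- # of bits per data input
      port (i0, i1 : in  std_logic_vector(0 to width-1);
            sel    : in  std_logic;
            y      : out std_logic_vector(0 to width-1));
    end entity mux_two_to_one;

    architecture behav of mux_two_to_one is
    begin
      -- sel = '1' means the MS halves are equal, so the LS result (i1) decides;
      -- otherwise the MS result (i0) is passed through.
      y <= i1 when sel = '1' else i0;
    end architecture behav;
    ```

    With these positional port orders, the instantiation port map (f1, f2, f1(1), f) connects the MS result to i0, the LS result to i1, and the MS "equal" bit to sel.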

  • Summary

    For complex digital design, we need to think of the computation underlying the design in an algorithmic and high-level manner:
      • Is it amenable to the D&C approach (i.e., can it be broken into smaller-sized problems whose outputs can be stitched up)?
      • Are there properties of this computation that can be exploited for a faster, less expensive, modular design?

    The design is then developed in a D&C manner, & the corresponding circuit may be synthesized by describing it compactly using a structural HDL form.

    For an operation/func. x on n operands (a(n-1) x a(n-2) x ... x a(0)), if x is associative, the D&C approach gives an easy stitch-up function, which is x on 2 operands (the o/ps of applying x on each half). As a result, a tree-structured circuit with (log n) delay can be synthesized instead of a linearly-connected circuit with (n) delay.

    If x is non-associative, more ingenuity and determination of the properties of x are needed to determine the break-up of the function and the stitch-up function. The resulting design may or may not be tree-structured.

    A hardware description language with a structural form is useful to describe large circuits with all the designed parallelism, and then have them synthesized automatically.

    VHDL provides special hardware-oriented constructs for the description of hardware that are not available in regular sequential s/w programming languages: especially, concurrency (via data flow or instantiation statements) and circuit-delay specifications.

    VHDL also has constructs that ease the description of regular-patterned circuits (linear arrays, multi-dimensional arrays, regular trees, etc.) of arbitrary size: generate statements and recursion.

  • Structural VHDL allows the designer to represent a system in terms of components and their interconnections. This module discusses the constructs available in VHDL to facilitate structural descriptions of designs.

    Copyright Notice for RASSP Slides (material included w/ explicit acknowledgement in next few slides)

  • Generate Statement --- from RASSP slides

    Structural descriptions of large, but highly regular, structures can be tedious. A VHDL GENERATE statement can be used to include as many concurrent VHDL statements (e.g., component instantiation statements) as needed to describe a regular structure easily. In fact, a GENERATE statement may even include other GENERATE statements for more complex devices. Some common examples include the instantiation and connection of multiple identical components, such as half adders to make up a full adder, or exclusive-or gates to create a parity tree.

  • Generate Statement For Scheme --- from RASSP slides

    VHDL provides two different schemes of the GENERATE statement, the FOR-scheme and the IF-scheme. This slide shows the syntax for the FOR-scheme. The FOR-scheme is reminiscent of a FOR loop used for sequence control in many programming languages. The FOR-scheme generates the included concurrent statements the assigned number of times. In the FOR-scheme, all of the generated concurrent statements must be the same. The loop variable is created in the GENERATE statement and is undefined outside that statement (i.e., it is not a variable or signal visible elsewhere in the architecture).

    The loop variable in this FOR-scheme case is N. The range can be any valid discrete range. After the GENERATE keyword, the concurrent statements to be generated are stated, and the GENERATE statement is closed with END GENERATE.

  • Generate Statement For Scheme Example --- from RASSP slides

    This slide shows an example of the FOR-scheme. The code generates an array of AND gates. In this case, the GENERATE statement has been named G1 and instantiates an array of 8 and_gate components. The PORT MAP statement maps the interfaces of each of the 8 gates to specific elements of the S1, S2, and S3 vectors by using the FOR loop variable as an index.
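
    The slide's code is not reproduced in this copy, so the following is a sketch of the kind of FOR-scheme GENERATE the text describes. The names G1, N, and_gate, S1, S2, and S3 come from the description above; the enclosing entity and_array, the port names in1, in2, out1, the 7 downto 0 range, and the use of direct entity instantiation are assumptions.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- A 2-input AND gate used as the repeated component (port names assumed).
    entity and_gate is
      port (in1, in2 : in  std_logic;
            out1     : out std_logic);
    end entity and_gate;

    architecture behav of and_gate is
    begin
      out1 <= in1 and in2;
    end architecture behav;

    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical wrapper holding the 8-gate array.
    entity and_array is
      port (S1, S2 : in  std_logic_vector(7 downto 0);
            S3     : out std_logic_vector(7 downto 0));
    end entity and_array;

    architecture struct of and_array is
    begin
      -- N exists only inside G1; one and_gate is generated per value of N.
      G1 : for N in 7 downto 0 generate
        and_n : entity work.and_gate(behav)
          port map (in1 => S1(N), in2 => S2(N), out1 => S3(N));
      end generate G1;
    end architecture struct;
    ```

    Elaboration unrolls G1 into eight independent gate instances, each wired to bit N of the three vectors.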

  • Generate Statement If Scheme --- from RASSP slides

    The second form of the GENERATE statement is the IF-scheme. This scheme allows for conditional generation of concurrent statements. One obvious difference between this scheme and the FOR-scheme is that the concurrent statements generated do not all have to be the same. While this IF statement may seem reminiscent of the IF-THEN-ELSE constructs in programming languages, note that the GENERATE IF-scheme does not provide ELSE or ELSIF clauses (this holds for the VHDL-87/93 revisions; VHDL-2008 added ELSIF and ELSE to the IF-scheme). The condition of the IF statement can be any valid Boolean expression.

  • Generate Statement If Scheme Example --- from RASSP slides

    The example here uses the IF-scheme GENERATE statement to make a modification to the and_gate array such that the seventh gate of the array will be an or_gate.

    Another example use of the IF-scheme GENERATE is in the conditional execution of timing checks. Timing checks can be incorporated inside a GENERATE IF-scheme. E.g., the following statement can be used:

        Check_time : IF TimingChecksOn GENERATE

    This allows the Boolean parameter TimingChecksOn to enable timing checks by generating the appropriate concurrent VHDL statements in the description. This parameter can be set in a package or passed as a generic, and it can improve simulation speed by shutting off this computational section.
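
    A sketch of the described modification is given below, since the slide's code is not reproduced in this copy. It assumes that "the seventh gate" refers to index N = 7 of a 7 downto 0 range; the entity name mixed_array, the labels G2, G3, or1, and1, and the port names in1, in2, out1 are likewise hypothetical. Two complementary IF-scheme statements stand in for the missing ELSE clause.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    entity and_gate is
      port (in1, in2 : in  std_logic;
            out1     : out std_logic);
    end entity and_gate;

    architecture behav of and_gate is
    begin
      out1 <= in1 and in2;
    end architecture behav;

    library ieee;
    use ieee.std_logic_1164.all;

    entity or_gate is
      port (in1, in2 : in  std_logic;
            out1     : out std_logic);
    end entity or_gate;

    architecture behav of or_gate is
    begin
      out1 <= in1 or in2;
    end architecture behav;

    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical wrapper: seven AND gates plus one OR gate.
    entity mixed_array is
      port (S1, S2 : in  std_logic_vector(7 downto 0);
            S3     : out std_logic_vector(7 downto 0));
    end entity mixed_array;

    architecture struct of mixed_array is
    begin
      G1 : for N in 7 downto 0 generate
        -- ASSUMPTION: the special position is N = 7.
        G2 : if N = 7 generate
          or1 : entity work.or_gate(behav)
            port map (in1 => S1(N), in2 => S2(N), out1 => S3(N));
        end generate G2;
        -- Complementary condition, since the IF-scheme here has no ELSE.
        G3 : if N < 7 generate
          and1 : entity work.and_gate(behav)
            port map (in1 => S1(N), in2 => S2(N), out1 => S3(N));
        end generate G3;
      end generate G1;
    end architecture struct;
    ```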
