MOBY-DIC final workshop --- Noordwijkerhout (NL), August 23, 2012
MOBY-DIC Final Workshop: Circuit implementations. Marco Storace


Page 1: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

MOBY-DIC Final Workshop
Circuit implementations

Marco Storace

Page 2: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Normal forms

PWA generic (PWAG)
-) more focused on accuracy of the corresponding digital circuit implementations
-) meets the requirements of 'direct' synthesis methods
-) the architecture requires a Finite State Machine

[Figure: example of generic PWA function and corresponding (irregular) domain partition]

Page 3: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Normal forms

PWA Simplicial (PWAS)
-) more focused on speed of the corresponding digital circuit implementations
-) suitable for both 'indirect' and 'direct' synthesis methods

[Figure: example of PWAS function over (x1, x2) and corresponding (simplicial) domain partition]

Page 4: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Normal forms

PWA multi-resolution hyper-rectangular (mPWAR)
-) suitable for both 'indirect' and 'direct' synthesis methods
-) even discontinuous PWA functions
-) the architecture requires a Finite State Machine

[Figure: example of mPWAR function and corresponding domain partition]

Page 5: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Normal forms

PWA single-resolution hyper-rectangular (sPWAR)
-) simpler digital architecture w.r.t. mPWAR, resulting in faster evaluation time
-) even discontinuous PWA functions

[Figure: example of sPWAR function and corresponding domain partition]

Page 6: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Point location problem

All normal forms: PWA functions = sets of affine functions defined over subregions of a given domain.

Computational standpoint: we have to solve 2 problems.

1) Point location problem: find the polytope a given input point belongs to. Chosen normal form → higher or lower computational effort (e.g., a regular partition decreases the required computational effort). Circuit point of view: this problem can be a bottleneck for some normal forms and for large input dimensions.

2) Computation of the affine function defined over a given polytope. This requires just a memory plus products and sums → the problem is trivial from a circuit point of view, but the required memory can become very large for high input dimensions.
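The two problems can be illustrated with a minimal Python sketch. It is not the circuit's method: the point location here is a plain linear scan over the polytopes, and the region record `(H, k, f, g)` (polytope `{x : H x <= k}`, affine law `f'x + g`) is a hypothetical layout chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical region record: (H, k, f, g), meaning the polytope {x : H x <= k}
# and the affine law f'x + g that holds inside it.
Region = Tuple[List[List[float]], List[float], List[float], float]


def locate(x: List[float], regions: List[Region]) -> int:
    """Problem 1: index of the first polytope containing x (linear scan)."""
    for j, (H, k, _, _) in enumerate(regions):
        inside = all(
            sum(h_i * x_i for h_i, x_i in zip(row, x)) <= k_row
            for row, k_row in zip(H, k)
        )
        if inside:
            return j
    raise ValueError("x lies outside the partitioned domain")


def evaluate(x: List[float], regions: List[Region]) -> float:
    """Problem 2: fetch (f_j, g_j) from 'memory', then products and sums."""
    _, _, f, g = regions[locate(x, regions)]
    return sum(f_i * x_i for f_i, x_i in zip(f, x)) + g


# Toy 1-D partition split at x = 0 (values are made up for the example).
regions = [
    ([[1.0], [-1.0]], [0.0, 1.0], [2.0], 0.5),   # -1 <= x <= 0:  2x + 0.5
    ([[-1.0], [1.0]], [0.0, 1.0], [-1.0], 0.5),  #  0 <= x <= 1: -x + 0.5
]
```

The scan costs one pass over all regions, which is exactly the effort that the normal forms and search structures in the following slides try to reduce.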

Page 7: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

PWAG: Some details

Reference (hyperrectangular) domain scaled to:

$S_C = \{x \in \mathbb{R}^n : -1 \le x_i < 1,\ i = 1,\dots,n\}$

PWAG function f defined over $S_C$: $f(x) = f_j' x + g_j$ for any $x \in c_j$

$S_C$ is partitioned into generic polytopic regions $c_j$ ($j = 1,\dots,L_G$), such that

$S_C = \bigcup_{j=1}^{L_G} c_j$

$c_j \cap c_k = \emptyset$ for any $j \ne k$

[Figure: domain partition with a region $c_j$ highlighted]

Page 8: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Point location strategy for PWAG

Many algorithms have been proposed to solve this problem (e.g., see works by T.A. Johansen et al.), but not all of them are suitable for circuit implementation.

A quite efficient algorithm that solves the point location problem and is suitable for circuit implementation is based on a binary search tree.

Tree computed off-line based on the domain partition: each non-leaf node corresponds to a partition edge and each leaf node to one of the LG polytopes.

Tree explored on-line: the search complexity (→ computation time) depends mainly on the maximum depth of the tree, which in turn depends on LG and on the shapes of the polytopes.

Page 9: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Point location strategy for PWAG

Also, the tree exploration requires the evaluation of the partition edges corresponding to non-leaf nodes, in order to continue the search in the tree branch containing the leaf node a given input belongs to.

This kind of data structure is implemented in the circuit through a Finite State Machine.

Main challenges: to minimize the tree depth (→ minimize the time required by the on-line tree exploration) through the off-line optimization. It is also relevant to keep the total number of nodes at a minimum, as this decreases the circuit dimensions.
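The on-line exploration can be sketched in Python as follows. The classes `Node` and `Leaf`, the sign convention (`below` when `h'x <= k`), and the toy tree are assumptions made for illustration; the off-line construction and optimization of the tree are not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    region: int  # index j of the polytope c_j


@dataclass
class Node:
    h: List[float]  # partition edge h'x = k stored at this non-leaf node
    k: float
    below: "Tree"   # branch followed when h'x <= k
    above: "Tree"   # branch followed when h'x > k


Tree = Union[Leaf, Node]


def locate(x: List[float], tree: Tree) -> Tuple[int, int]:
    """Walk from the root to a leaf; return (region index, edges evaluated)."""
    steps = 0
    while isinstance(tree, Node):
        # One edge evaluation per visited non-leaf node (products and sums).
        side = sum(h_i * x_i for h_i, x_i in zip(tree.h, x)) <= tree.k
        tree = tree.below if side else tree.above
        steps += 1
    return tree.region, steps


# Toy 2-D tree: first split on x1 = 0, then (only for x1 > 0) on x2 = 0.
toy_tree = Node([1.0, 0.0], 0.0,
                Leaf(0),
                Node([0.0, 1.0], 0.0, Leaf(1), Leaf(2)))
```

The returned step count is bounded by the maximum tree depth d, which is why d appears in the latency of the PWAG architectures.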

Page 10: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architectures

There are basically 2 architectures:
A) mainly serial (simpler, but slower)
B) mainly parallel (faster, but more complex)

Both of them can have 3 input acquisition methods (with increasing complexity and speed):
-) serial bit-wise (at each clock cycle one bit of all input components is read)
-) serial component-wise (at each clock cycle a whole input component is read)
-) parallel (all components are read together in one clock cycle).

This gives 6 possible combinations, allowing one to choose the best trade-off between speed and size of the circuit.

Page 11: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architectures

[Block diagram of the PWAG architecture, with the following annotations]
-) Binary search tree
-) Contains a memory and either a Multiply and Accumulate (MAC) block (serial architecture) or a bank of multipliers and adders (parallel architecture)
-) Three possible acquisition methods
-) Can add delays to meet the sampling times

Page 12: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Latency times

n: dimension of the input (i.e., number of components of the input vector x)
m: dimension of the output (i.e., number of components of the PWA function f)
d: maximum depth of the binary search tree
b: number of bits used to code the input

| Input method | PWAG(A) serial | PWAG(B) parallel |
|---|---|---|
| bitwise | d(n+2)+m(n+2)+b+2 | 2d+2m+b+3 |
| comp-wise | d(n+2)+m(n+2)+n+2 | 2d+2m+n+3 |
| parallel | d(n+2)+m(n+2)+3 | 2d+2m+4 |
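The six PWAG latency expressions can be folded into one small Python helper. The function name and the factoring into an "acquisition" term (b, n, or 1 clock cycles for bitwise, component-wise, and parallel input) are assumptions made here; they are simply one way of rewriting the table entries.

```python
def pwag_latency_cycles(arch: str, input_method: str,
                        n: int, m: int, d: int, b: int) -> int:
    """Latency in clock cycles for the PWAG architectures, as tabulated above."""
    # Cycles spent reading the input: b (bitwise), n (comp-wise), 1 (parallel).
    acquisition = {"bitwise": b, "comp-wise": n, "parallel": 1}[input_method]
    if arch == "A":  # mainly serial
        return d * (n + 2) + m * (n + 2) + acquisition + 2
    if arch == "B":  # mainly parallel
        return 2 * d + 2 * m + acquisition + 3
    raise ValueError("arch must be 'A' or 'B'")


# Example values (n = 3 inputs, m = 1 output, depth d = 12, b = 12 bits).
serial_parallel_input = pwag_latency_cycles("A", "parallel", 3, 1, 12, 12)
parallel_parallel_input = pwag_latency_cycles("B", "parallel", 3, 1, 12, 12)
```

With these example values the serial architecture needs 68 cycles and the parallel one 30 cycles, which shows how strongly the tree depth d weighs on architecture A.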

Page 13: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

The other normal forms

When d and/or n become too large, the PWAG circuit solution can be either too slow (→ impossible to meet the sampling times) or too complex (→ impossible to implement it in a given FPGA board).

In this case, we can resort to the other normal forms, which can be used to approximate the PWAG function through a PWA controller (PWAS, mPWAR, or sPWAR) that is usually faster and/or simpler.

Page 14: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

PWAS: Some details

Reference (hyperrectangular) domain scaled to:

$S_C = \{x \in \mathbb{R}^n : 0 \le x_i < m_i,\ i = 1,\dots,n\}$

For circuit reasons, we choose $m_i = 2^{p_i} - 1$, with $p_i$ a positive integer. Often $m_i = m_S\ \forall i$.

The domain is partitioned into $\prod_{i=1}^{n} m_i$ hyperrectangles → grid of $\prod_{i=1}^{n} (m_i + 1)$ vertices. Each hyperrectangle is partitioned in turn into n! non-overlapping simplices.

[Figure: simplicial partition, with a simplex and a vertex highlighted]
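As a quick arithmetic check of these counts, the following Python sketch computes the number of hyperrectangles, simplices, and vertices for a given list of subdivisions per axis. The function name is hypothetical; the example values m = (31, 7, 15) are the ones used for the PWAS approximation in the benchmark later in these slides.

```python
from math import factorial, prod
from typing import Sequence, Tuple


def pwas_partition_size(m: Sequence[int]) -> Tuple[int, int, int]:
    """Return (hyperrectangles, simplices, vertices) for m_i subdivisions per axis."""
    n = len(m)
    hyperrectangles = prod(m)                    # product of the m_i
    simplices = hyperrectangles * factorial(n)   # n! simplices per hyperrectangle
    vertices = prod(m_i + 1 for m_i in m)        # one coefficient per vertex
    return hyperrectangles, simplices, vertices


benchmark_counts = pwas_partition_size([31, 7, 15])
```

For the benchmark values this gives 3255 hyperrectangles, 19530 simplices, and 4096 vertices, matching the simplex and coefficient counts quoted in the comparison slides.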

Page 15: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Point location strategy for PWAS

Main idea: the regular partition of the domain makes the point location problem quite a simple task, owing to the Kuhn lemmas.

Drawback: the number of coefficients (i.e., the size of the memory) equals the number of vertices → curse of dimensionality. The effects of this problem can be reduced by adding a pre-scaler block that allows non-uniform simplicial partitions.

Main challenges: to find a uniform or non-uniform simplicial partition that best approximates a given PWAG function, while keeping the total number of vertices at a minimum, as this decreases the circuit dimensions.
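A minimal Python sketch of PWAS evaluation on a uniform unit grid is given below, assuming the standard Kuhn triangulation: the integer parts of the input select the hyperrectangle, and sorting the decimal parts selects the simplex and yields the interpolation weights. The function name and the dictionary `weights` (grid vertex → stored coefficient) are hypothetical, and no pre-scaler is modeled.

```python
import math
from typing import Dict, Sequence, Tuple


def pwas_eval(x: Sequence[float], weights: Dict[Tuple[int, ...], float]) -> float:
    """Evaluate a PWAS function at x by linear interpolation over one simplex."""
    base = [math.floor(x_i) for x_i in x]           # lower corner of the hyperrectangle
    frac = [x_i - b_i for x_i, b_i in zip(x, base)]  # decimal parts

    # Sort the decimal parts in ascending order, then walk from the largest.
    order = sorted(range(len(x)), key=lambda i: frac[i])[::-1]
    d = [frac[i] for i in order]                     # decimal parts, largest first

    # Barycentric weights of the n+1 simplex vertices (they sum to 1).
    mu = [1.0 - d[0]] + [d[k] - d[k + 1] for k in range(len(d) - 1)] + [d[-1]]

    vertex = list(base)
    value = mu[0] * weights[tuple(vertex)]
    for k, i in enumerate(order):
        vertex[i] += 1                               # step along the next axis
        value += mu[k + 1] * weights[tuple(vertex)]
    return value


# Toy coefficients sampled from the affine function 2*x1 + 3*x2 + 1 on a 4x4 grid.
toy_weights = {(i, j): 2.0 * i + 3.0 * j + 1.0 for i in range(4) for j in range(4)}
```

Because the toy coefficients come from an affine function, the interpolation reproduces it exactly: `pwas_eval([1.25, 2.5], toy_weights)` gives 11.0. Only n+1 of the stored coefficients are read per evaluation, but the memory still holds one coefficient per grid vertex.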

Page 16: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architectures

There is no need for a Finite State Machine to solve the point location problem. The main operation required to this end is sorting the decimal parts of the input components in ascending order.

Again, there are basically 2 architectures, A) serial and B) parallel, with the 3 input acquisition methods described in the PWAG case. This gives 6 possible combinations, allowing one to choose the best trade-off between speed and size of the circuit.

The only function of the Output block is to wait for some clock cycles in order to meet the correct sampling times, since the PWAS architectures have latencies depending only on n, m, and b.

Page 17: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architectures

[Block diagram of the PWAS architecture, with the following annotations]
-) Contains a memory and either a Multiply and Accumulate (MAC) block (serial architecture) or a bank of multipliers and adders (parallel architecture)
-) 3 acquisition methods
-) contains a sorter
-) may contain a pre-scaler (→ non-uniform simplicial partition)

Page 18: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Latency times

n: dimension of the input (i.e., number of components of the input vector x)
m: dimension of the output (i.e., number of components of the PWA function f)
b: number of bits used to code the input

| Input method | PWAS(A) serial | PWAS(B) parallel |
|---|---|---|
| bitwise | m(n+3)+b+1 | b+3 |
| comp-wise | m(n+3)+n+1 | n+3 |
| parallel | m(n+3)+2 | 4 |

Page 19: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

PWAR: Some details

Reference domain:

$S_C = \{x \in \mathbb{R}^n : -1 \le x_i < 1,\ i = 1,\dots,n\}$

Single-resolution (sPWAR)
Each coordinate axis is divided into $m_C$ ($= 2^r$ for circuit reasons) subintervals of the same length $2^{(1-r)}$.

Multi-resolution (mPWAR)
r levels of refinement of $S = S^{(0)}$. At the first level $S^{(1)}$ each coordinate axis is divided into 2 identical subintervals of unitary length. Then, only some hypercubes are split and others are not, as opposed to the single-resolution case. The choice of the hypercubes to be further refined depends on the level of detail required for the PWA function in a certain region of the domain. At level $S^{(r)}$ the subintervals' length is $2^{(1-r)}$.
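For the single-resolution case, the regular partition makes point location a direct index computation. The Python sketch below is only an illustration under assumptions: the function name, the per-axis index, and the flattened address layout (axis 0 varying fastest) are all hypothetical, and a fixed-point circuit would presumably obtain the same indices by reading the most significant bits of each input component.

```python
import math
from typing import List, Sequence, Tuple


def spwar_cell(x: Sequence[float], r: int) -> Tuple[List[int], int]:
    """Return (per-axis cell indices, flattened address) for x in [-1, 1)^n."""
    cells_per_axis = 2 ** r          # m_C = 2^r subintervals per axis
    length = 2.0 ** (1 - r)          # subinterval length 2^(1-r)
    indices = []
    for x_i in x:
        if not -1.0 <= x_i < 1.0:
            raise ValueError("input outside the reference domain [-1, 1)")
        indices.append(int(math.floor((x_i + 1.0) / length)))
    # Hypothetical memory layout: axis 0 varies fastest.
    address = sum(idx * cells_per_axis ** i for i, idx in enumerate(indices))
    return indices, address
```

For example, with r = 3 each axis has 8 cells of length 0.25, so the point (0.99, -0.26) falls in cell (7, 2), which maps to address 23 in this layout.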

Page 20: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

PWAR: Some details

[Figure: example mPWAR and sPWAR domain partitions, r = 3]

Page 21: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Point location strategy for mPWAR

Main idea: to build off-line an orthogonal search tree such that the real-time search complexity is logarithmic with respect to the number of regions.

Dimension of the tree: scales with n.

Depth of the tree (→ time required by the on-line exploration): depends on the refinement level r.

Main challenges: to minimize the orthogonal tree depth by tuning the parameter r (→ minimize the time required by the on-line tree exploration). It is also relevant to keep the total number of nodes at a minimum, as this decreases the circuit dimensions.
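A minimal Python sketch of the on-line descent in such a tree follows. The representation is an assumption made for illustration: a non-leaf node is a dictionary keyed by one bit per axis (0 = lower half, 1 = upper half of the current hypercube), and a leaf is simply the integer index of a region. Because no edge coefficients are stored, each step needs only comparisons against midpoints.

```python
from typing import Dict, Sequence, Tuple, Union

# Hypothetical layout: a dict is a refined hypercube, an int is a region index.
OrthoTree = Union[int, Dict[Tuple[int, ...], "OrthoTree"]]


def locate(x: Sequence[float], tree: OrthoTree) -> Tuple[int, int]:
    """Descend the orthogonal tree over [-1, 1)^n; return (region, depth reached)."""
    lower = [-1.0] * len(x)   # lower corner of the current hypercube
    size = 2.0                # side length of the current hypercube
    depth = 0
    node = tree
    while isinstance(node, dict):
        size /= 2.0
        key = []
        for i, x_i in enumerate(x):
            upper_half = x_i >= lower[i] + size
            if upper_half:
                lower[i] += size
            key.append(int(upper_half))
        node = node[tuple(key)]
        depth += 1
    return node, depth


# Toy 2-D tree: four quadrants at level 1, only the quadrant (1, 1) refined again.
toy_tree = {
    (0, 0): 0, (1, 0): 1, (0, 1): 2,
    (1, 1): {(0, 0): 3, (1, 0): 4, (0, 1): 5, (1, 1): 6},
}
```

In the toy tree, a point in an unrefined quadrant is located after one step, while a point in the refined quadrant needs two, so the worst-case number of steps equals the refinement level r.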

Page 22: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architecture

mPWAR: the architecture is very similar to the PWAG case; the only difference is in the Finite State Machine that addresses the memory.

Page 23: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Digital architecture

sPWAR: there is no need for a binary search tree; the architecture is more similar to the PWAS case (also in this case the point location problem is solved by exploiting the regular partition).

Page 24: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

-) PWAS and sPWAR: suitable for small-scale problems (higher speed and simpler digital structure). No FSM required. sPWAR: also discontinuous functions. Both are heavily affected by the "curse of dimensionality" (memory size grows exponentially with input dimension n).
-) PWAG: implements every generic PWA function. Latency and number of parameters to be stored grow very fast for problems with a highly irregular domain partitioning.
-) mPWAR: simpler structure (→ shorter computation times) and great memory saving (no information about the edges). Drawback: dimensions of the FSM, mainly for high-dimension problems requiring a deep level of refinement.

Page 25: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

Latencies refer to the parallel input method.

| Architecture | Latency (# ck cycles) | Memory size | Max # of nodes |
|---|---|---|---|
| PWAG(A) | d(n+2)+m(n+2)+3 | (E+LG)(n+1) | |
| PWAG(B) | 2d+2m+4 | | |
| PWAS(A) | m(n+3)+2 | (mS+1)^n | NO tree |
| PWAS(B) | 4 | (mS+1)^n (n+1) | NO tree |
| mPWAR(A) | r+n+1 | LM(n+1) | |
| mPWAR(B) | r+1 | | |
| sPWAR(A) | n+2 | (mr)^n (n+1) | NO tree |
| sPWAR(B) | 2 | | |

n: number of input variables (n = dim(x))
LG, LM: number of regions (PWAG, mPWAR, resp.)
E: number of edges defining the domain partition (PWAG)
d, r: maximum depth of the (binary, orthogonal) tree (PWAG, mPWAR, resp.)
mS, mr = 2^r: number of subintervals per dimension (PWAS, sPWAR)

Page 26: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

Solution of the MPC problem: PWAG function u*(x)
Approximation with PWAS and PWAR functions u(x)

2 kinds of error computed:
-) maximum error: $\max_x |u^*(x) - u(x)|$
-) relative error: $\sum_j |u^*(x_j) - u(x_j)| / \sum_j |u^*(x_j)|$ (computed on a grid of samples)

Double integrator

| Function | Max. error | Rel. error |
|---|---|---|
| PWAS | 0.45 | 0.022 |
| mPWAR | 0.31 | 0.061 |
| sPWAR | 0.29 | 0.027 |

3D example

| Function | Max. error | Rel. error |
|---|---|---|
| PWAS | 0.75 | 0.065 |
| mPWAR | 0.30 | 0.042 |
| sPWAR | 0.45 | 0.074 |
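The two error measures can be sketched in Python as below. Both are evaluated here on a finite list of samples, which is an assumption: it only approximates the maximum over the whole domain. The function names and the toy 1-D control laws are made up for the example and are not the laws behind the tables.

```python
from typing import Callable, Sequence

Law = Callable[[float], float]


def max_error(u_star: Law, u: Law, samples: Sequence[float]) -> float:
    """Largest absolute deviation |u*(x) - u(x)| over the samples."""
    return max(abs(u_star(x) - u(x)) for x in samples)


def relative_error(u_star: Law, u: Law, samples: Sequence[float]) -> float:
    """Sum of absolute deviations divided by the sum of |u*(x_j)|."""
    numerator = sum(abs(u_star(x) - u(x)) for x in samples)
    denominator = sum(abs(u_star(x)) for x in samples)
    return numerator / denominator


# Toy 1-D example: a saturated law and a crude approximation of it.
def toy_u_star(x: float) -> float:
    return max(-1.0, min(1.0, 2.0 * x))


def toy_u(x: float) -> float:
    return x


toy_samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
```

On the toy samples the maximum error is 0.5 and the relative error is 0.25; the two measures can rank approximations differently, as the tables above also show.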

Page 27: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

Benchmark example: 3D example (Mayne & Rakovic, Int. J. Robust and Nonlinear Control, 2003)
Domain $S = \{-10 \le x_{1,3} \le 10,\ -5 \le x_2 \le 5\}$
Constraints on the control: $-1 \le u \le 1$
Matlab Multi-Parametric tbx → PWAG control law u*(x)
LG = 256 polytopes, E = 249 edges, d = 12

PWAS approximation: mS1 = 31, mS2 = 7, mS3 = 15 (→ 19530 simplices & 4096 coefficients)

mPWAR approximation: r = 5, LM = 1548 regions

sPWAR approximation: mr1 = 32, mr2 = 8, mr3 = 16 (→ 4096 regions & 16384 coefficients)

Page 28: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

Performances on a Xilinx Spartan III FPGA (xc3s200).

Latency: obtained as (# of clock cycles needed to perform the PWA computation) × 50 ns (i.e., the clock period at 20 MHz, a frequency suitable for all architectures after the post-synthesis simulation on the FPGA). Some architectures can work at higher frequencies.

Used b = 12 bits to represent the data in all cases.

Data obtained with older PWAG and PWAS architectures (→ not always consistent with the data reported in previous slides).

Power consumption for all architectures: 60 mW

Page 29: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Comparisons between architectures

| Architecture | Occup. Slices | Latency [µs] | Mem. size [kb] |
|---|---|---|---|
| PWAG | 53% | 3.25 | 24.624 |
| PWAS(A) | 12% | 0.35 | 49.152 |
| PWAS(B) | 12% | 0.15 | 196.608 |
| mPWAR(A) | 79% | 0.45 | 74.304 |
| mPWAR(B) | 78% | 0.3 | 74.304 |
| sPWAR(A) | 2% | 0.25 | 196.608 |
| sPWAR(B) | 2% | 0.1 | 196.608 |

One can choose among the architectures, if any, that fulfill the system constraints (available FPGA board, sampling times, etc.).
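The mPWAR and sPWAR rows of this table can be cross-checked against the formulas given in the earlier slides with a short Python sketch. It rests on assumptions stated here and in the comments: latencies are read in microseconds (clock cycles × 50 ns), each stored coefficient occupies b = 12 bits, and 1 kb is taken as 1000 bits. The PWAG and PWAS rows are left out because the slides state they come from older architectures.

```python
CLOCK_NS = 50    # clock period at 20 MHz
BITS = 12        # b: bits per stored word (assumed equal to the data width)


def latency_us(cycles: int) -> float:
    """Clock cycles converted to microseconds."""
    return cycles * CLOCK_NS / 1000


def memory_kb(words: int) -> float:
    """Stored words converted to kilobits, taking 1 kb = 1000 bits (assumption)."""
    return words * BITS / 1000


# Benchmark parameters from the slides: n = 3 inputs, r = 5, L_M = 1548 regions,
# sPWAR grid of 32 x 8 x 16 regions.
n, r, L_M = 3, 5, 1548
spwar_regions = 32 * 8 * 16

checks = {
    "mPWAR(A)": (latency_us(r + n + 1), memory_kb(L_M * (n + 1))),
    "mPWAR(B)": (latency_us(r + 1), memory_kb(L_M * (n + 1))),
    "sPWAR(A)": (latency_us(n + 2), memory_kb(spwar_regions * (n + 1))),
    "sPWAR(B)": (latency_us(2), memory_kb(spwar_regions * (n + 1))),
}
```

Under these assumptions the computed pairs reproduce the tabulated values, e.g. 0.45 µs and 74.304 kb for mPWAR(A), and 0.1 µs and 196.608 kb for sPWAR(B).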

Page 30: MOBY-DIC Final Workshop Circuit  implementations Marco  Storace

Conclusions

• Many circuit solutions to implement explicit MPC control systems

• Proposed architectures completely reconfigurable and suitable for FPGA implementation (→ customisation of the HW at an attractive price even in low quantities!)

• Architectures particularly attractive for fast, small-size and low-power applications

• For large-scale productions → ASIC solutions!