Moby-Dic Workshop, Noordwijkerhout, August 23, 2012

ASIC Implementation of the PWA Generic Canonical Form

Dpto. Electrónica y Electromagnetismo, Universidad de Sevilla

Instituto de Microelectrónica de Sevilla-CNM-CSIC; [email protected]

MOBY-DIC Project FP7-IST-248858

Antonio J. Acosta

Outline of the presentation

• Introduction: Role of ASICs in control & characteristics of ASICs
• Design of MPC_ASICs: From high-level specifications to silicon
• PWAG Architecture Selection
• Design, Integration and test of a PWAG ASIC
• Test Results

Model-based synthesis

[Figure: design flow. Block labels: Description, Simplification, Tuning / Identification, Verification, Synthesis. Other labels: heuristic knowledge, numerical data, non-linear plant, HW (VHDL) or SW (C, C++, Java). Four plot panels (horizontal axis ticks from -1 to 8, vertical axis ticks from -4 to 14), labeled Simulation and Experiment.]

Embedded Controller

[Figure: implementation options for an embedded controller. Labels: Embedded SW, DSP, FPGA, Digital ASIC, Dedicated HW, external memories, test board, FPGA expansion boards. Axes: Performances (- to +), Flexibility (+ to -), Cost (- to +).]

ASIC Design Process

Phases: conception and specifications; design and validation; ASIC fabrication (final design tape-out); test of final ASIC.

Actors: client, design house, foundry (TSMC), CAD vendors (EDA tools support), test house (testers, DfT). EUROPRACTICE: interface with CAD vendors & foundries.

• Viability analysis
• Alternatives vs. constraints
• Cost analysis: development (human, technical, ...), production (silicon), testing, reliability, ...
• Added values: time-to-market, 1st silicon success, ...
• Flexibility in specifications

Other labels in the diagram: IPs, libraries, low-cost solutions (FPGAs), tech info (devices modeling), cell libraries, final ASIC.

ASIC Design Flow

[Figure: ASIC design flow, from front-end to back-end.
Design levels: architectural design, detailed design, physical design.
Steps: behavioral modeling, behavioral simulation, logic synthesis, optimization, test synthesis, schematics capture, analysis and simulation, place & route, verification and extraction.
Tools and libraries: HDL, HDL simulation tools, logic synthesis tools, optimization tools, cells library, clocking and critical-path tools, DFT tools, DFT library, design capture tools, simulation tools, simulation library, layout tools, extraction tools, physical cells.
Intermediate data: structural HDL, netlist, schematics, test bench, post-synthesis verification, back-annotation, mask file.]

ASIC Design Example

[Figure panels: HDL; area estimation; logical verification; timing and power estimation]

Moby-Dic Methodology for optimal controller

Control and Circuit decisions

• Canonical form (PWAG)
• No. input-output
• Precision
• Control surface
• Configurable architecture
• Parametrizable design
• Programmability issues
• HW requirements & limitations
• …

MOBY-DIC TOOLBOX: HDL code and parameters for selected case studies

Proposed PWAG Architecture

[Figure: block diagram with MEMORY and FSM blocks]

• Modification of the architecture in [OLIV09]
• The binary tree is stored in a memory (TreeMemory)
• The data stored in TreeMemory are addresses into ParamMemory
• Less rigid, more configurable (different trees, on-line computation)
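The two-memory lookup described above can be sketched in Python as follows. This is only an illustrative model, not the ASIC's actual datapath: the heap-style child indexing (children of node i at 2i+1 and 2i+2), the branch convention (value ≤ 0 goes left), the `is_leaf` flag and the use of floats instead of fixed point are all assumptions, and the names `ParamWord` and `pwag_eval` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ParamWord:
    h: Sequence[float]  # coefficients h1..hn of the affine expression
    k: float            # constant term
    is_leaf: bool       # hypothetical flag: the slides do not say how leaves are marked


def affine(p: ParamWord, x: Sequence[float]) -> float:
    """Evaluate h . x + k for one parameter word."""
    return sum(hi * xi for hi, xi in zip(p.h, x)) + p.k


def pwag_eval(tree_mem: List[int], param_mem: List[ParamWord],
              x: Sequence[float]) -> float:
    """Walk the binary tree; each tree word is an address into param_mem."""
    node = 0  # start at the root
    while True:
        p = param_mem[tree_mem[node]]
        value = affine(p, x)
        if p.is_leaf:
            return value  # at a leaf the affine expression is the output
        # Assumed convention: value <= 0 selects the left child.
        node = 2 * node + 1 if value <= 0 else 2 * node + 2


# Toy example: f(x) = |x| with one edge (x <= 0) and two affine pieces.
abs_params = [
    ParamWord([1.0], 0.0, False),  # edge: x <= 0 ?
    ParamWord([-1.0], 0.0, True),  # left piece: f(x) = -x
    ParamWord([1.0], 0.0, True),   # right piece: f(x) = x
]
abs_tree = [0, 1, 2]  # node index -> address in abs_params

print(pwag_eval(abs_tree, abs_params, [-3.0]))  # 3.0
print(pwag_eval(abs_tree, abs_params, [2.0]))   # 2.0
```

Because the tree only stores addresses, several nodes can point to the same parameter word, and a different tree can be loaded without touching the parameters.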

ASIC major specifications

• Canonical form: PWA Generic (PWAG)
• Maximum number of inputs: 4 (configurable 1-4)
• Word length of inputs/parameters: 12 bits
• Word length of output: 26 bits (although the precision is 12 bits)
• Fixed-point arithmetic
• Maximum number of polytopes plus edges: 4096
• Maximum tree depth: 13 (configurable 1-13)

Technology and CAD Tools Selection

• Taiwan Semiconductor Manufacturing Company (TSMC)
• 90 nm, 9 metal layers
• MiniAsic: 1875 x 1875 µm² (2011), 100 samples
• Memories on chip: dual-port RAM memories, access and writing times below 5 ns (worst case)
• CAD tools used: DESIGN ANALYZER (SYNOPSYS); SOC ENCOUNTER, DFWII (CADENCE)

Working Modes: Writing TMEMO

TREE MEMORY (TMEMO): 2^14 = 16384 words of 12 bits

• TMEMO stores all the nodes of the binary tree: 2^14 - 1 = 2^0 + 2^1 + … + 2^13, where 13 is the maximum tree depth
• Word length = 12 enables 2^12 edges plus polytopes
• Two ck cycles for each data writing
• 32768 ck cycles to write the whole memory, about 0.66 ms with a 50 MHz clock
• Layout dimensions: 1060.95 µm x 577.15 µm = 612332.6 µm²
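The TMEMO sizing arithmetic can be checked with a few lines of Python. This is a plain calculation from the figures on the slide (maximum depth 13, two clock cycles per write, 50 MHz clock); the variable names are illustrative only.

```python
MAX_DEPTH = 13          # maximum tree depth
CLOCK_HZ = 50e6         # 50 MHz clock
CYCLES_PER_WRITE = 2    # two ck cycles per written word

# Nodes of a complete binary tree with levels 0..13: 2^0 + 2^1 + ... + 2^13
nodes = sum(2 ** level for level in range(MAX_DEPTH + 1))

words = 2 ** (MAX_DEPTH + 1)              # memory depth, one word more than nodes
write_cycles = words * CYCLES_PER_WRITE   # cycles to fill the whole memory
write_time_s = write_cycles / CLOCK_HZ    # seconds at 50 MHz

print(nodes, words, write_cycles)
print(f"{write_time_s * 1e3:.3f} ms")
```

Filling the whole memory therefore takes 32768 cycles of 20 ns each, which is well under a millisecond.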

Working Modes: Writing PMEMO

PARAMETER MEMORY (PMEMO): 2^12 = 4096 words of 60 bits

• PMEMO stores all the possible edges and polytopes: 2^12 = 4096 edges plus polytopes
• Word length = 60 enables five 12-bit parameters
• Six ck cycles for each data writing
• 24576 ck cycles to write the whole memory, about 0.49 ms with a 50 MHz clock
• Layout dimensions: 1190.51 µm x 569.12 µm = 677543.1 µm²
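As an illustration of how five 12-bit parameters fit in one 60-bit PMEMO word, here is a minimal packing sketch. The field order (first parameter in the most significant bits) and the two's-complement encoding are assumptions, since the slide does not give the word layout; `pack_word` and `unpack_word` are hypothetical helper names.

```python
FIELD_BITS = 12   # each parameter is 12 bits
FIELDS = 5        # five parameters per 60-bit word
MASK = (1 << FIELD_BITS) - 1
SIGN = 1 << (FIELD_BITS - 1)


def pack_word(params):
    """Pack five signed 12-bit integers into one 60-bit word (first field in the MSBs)."""
    if len(params) != FIELDS:
        raise ValueError("expected exactly five parameters")
    word = 0
    for p in params:
        if not -SIGN <= p < SIGN:
            raise ValueError(f"{p} does not fit in {FIELD_BITS} signed bits")
        word = (word << FIELD_BITS) | (p & MASK)  # two's-complement field
    return word


def unpack_word(word):
    """Inverse of pack_word: recover the five signed parameters."""
    params = []
    for i in reversed(range(FIELDS)):
        raw = (word >> (i * FIELD_BITS)) & MASK
        params.append(raw - (1 << FIELD_BITS) if raw & SIGN else raw)
    return params


example = [1, -1, 2047, -2048, 0]
packed = pack_word(example)
print(hex(packed), unpack_word(packed))
```

With 4096 such words and six clock cycles per write, filling the memory takes 4096 x 6 = 24576 cycles, as stated on the slide.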

Working Modes: Normal operation

Arithmetic Unit

• Combinational
• Delay < 4 ns worst case
• Word length conditioning circuit (COND.) for tunable fixed point

[Figure: arithmetic unit datapath. Inputs x1..x4 and parameters h1..h4 and k are 12 bits each; the four products x1·h1, ..., x4·h4 are 24 bits each; the accumulated result is 26 bits. The result is compared with 0 (≤ 0) to give the decision; f(x) = fPWA(x) when a leaf is reached.]
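The bus widths in the datapath (12-bit operands, 24-bit products, 26-bit result) can be checked with a small integer model. This is a sketch under the assumption that inputs and parameters are signed two's-complement integers; it does not model the word length conditioning circuit, and the function names are hypothetical.

```python
IN_BITS = 12    # inputs x_i and parameters h_i, k
PROD_BITS = 24  # each product x_i * h_i
OUT_BITS = 26   # accumulated result


def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed two's-complement number."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def affine_fixed(x, h, k):
    """Integer model of h1*x1 + ... + hn*xn + k with width checks."""
    for v in (*x, *h, k):
        if not fits_signed(v, IN_BITS):
            raise ValueError(f"{v} does not fit in {IN_BITS} bits")
    products = [xi * hi for xi, hi in zip(x, h)]
    assert all(fits_signed(p, PROD_BITS) for p in products)
    total = sum(products) + k
    assert fits_signed(total, OUT_BITS)
    return total


def decision(x, h, k) -> bool:
    """Branch decision at a tree node: is the affine expression <= 0?"""
    return affine_fixed(x, h, k) <= 0


# Largest magnitude case: every operand at -2048, offset at +2047.
worst = affine_fixed([-2048] * 4, [-2048] * 4, 2047)
print(worst, fits_signed(worst, 25), fits_signed(worst, 26))
```

The worst case does not fit in 25 signed bits but does fit in 26, which is consistent with the 26-bit output bus.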

Working Modes: Normal operation

Memory Timing

[Figure: timing of TMEMO and PMEMO. Each memory has port A (write mode) and port B (read mode). Labels: TQ (TMEMO output), ADDRESS PORT B (read mode) of PMEMO, PQ (PMEMO output), clk, !clk.]

• Output ready in only one clock cycle

Input Acquisition

[Figure: inputs X1, X2, X3, X4 with signals clk and valid_in]

• Fully parallel load needs 48 pins
• Parallel load of 12-bit inputs in 4 clock cycles
• If only one/two/three inputs are used, set X2 = X3 = X4 = 0 / X3 = X4 = 0 / X4 = 0
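The input acquisition scheme can be illustrated with a short sketch: one 12-bit word is loaded per clock cycle, and unused inputs are set to zero. The load order (X1 first) and the function name `acquisition_schedule` are assumptions made for illustration.

```python
MAX_INPUTS = 4   # X1..X4
IN_BITS = 12     # bits per input word


def acquisition_schedule(inputs):
    """Return (clock cycle, 12-bit word) pairs; unused inputs are zeroed."""
    if not 1 <= len(inputs) <= MAX_INPUTS:
        raise ValueError("between 1 and 4 inputs are supported")
    for word in inputs:
        if not 0 <= word < (1 << IN_BITS):
            raise ValueError(f"{word} is not a 12-bit word")
    padded = list(inputs) + [0] * (MAX_INPUTS - len(inputs))
    return list(enumerate(padded))  # assumed order: X1 on cycle 0, X4 on cycle 3


# Pins needed if all four inputs were loaded in the same cycle.
pins_fully_parallel = MAX_INPUTS * IN_BITS

print(acquisition_schedule([100, 200]))
print(pins_fully_parallel)
```

Loading one word per cycle trades three extra clock cycles for a 12-bit input bus instead of a 48-bit one.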

Working Modes: Test

[Figure: block diagram with PARAMETER MEMORY, TREE MEMORY, Arith Unit, INPUT, CONTROL UNIT and OUTPUT blocks; signals test and clk; bus widths 14, 12 and 60 bits]

• Parallel load of relevant data (snapshot)
• Serial test output (shifting out the 86-bit register)
• Concurrent with operation mode
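A minimal model of the snapshot-and-shift test output is sketched below. The three bus widths (14, 12 and 60 bits) come from the figure and add up to the 86-bit register; which signal each bus carries, the field order and the MSB-first shift direction are assumptions, and the function names are hypothetical.

```python
WIDTHS = (14, 12, 60)   # bus widths from the figure; 14 + 12 + 60 = 86


def snapshot(tree_addr: int, tree_data: int, param_data: int) -> int:
    """Parallel load: concatenate the three buses into one 86-bit value."""
    reg = 0
    for value, width in zip((tree_addr, tree_data, param_data), WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        reg = (reg << width) | value
    return reg


def shift_out(reg: int, total_bits: int = sum(WIDTHS)):
    """Serial output: yield one bit per clock cycle, MSB first (assumed)."""
    for i in reversed(range(total_bits)):
        yield (reg >> i) & 1


bits = list(shift_out(snapshot(1, 2, 3)))
print(len(bits))
```

Because the snapshot is taken in parallel and shifted out separately, the register can be read while the controller keeps running.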

Layout

[Figure: chip layout showing TMEMO and PMEMO; post-layout simulated]

• I/O ports: 48 pins
• VDD/GND ports: 12 pins
• Package: JLCC68
• Area: 1860 x 1860 µm²
• Active: 1460 x 1460 µm²
• No. cells: 3135
• Memory: 54 KB
• % Memory: 60%

Structure of ASIC

Working modes (selected by P):
• Operation mode: P=00
• Writing TMEMO mode: P=01
• Writing PMEMO mode: P=10
• Test mode: P=11

Features:
• No. inputs: configurable from 1 up to 4
• No. outputs: 1
• Input resolution: 12 bits
• Output resolution: 26 bits
• Parameters resolution: 12 bits
• Maximum No. hyperplanes plus polytopes: 4096
• Depth of binary search tree: configurable from 1 up to 13
• Integration technology: 90 nm, 2.5V-1.2V, 9 metal layers, TSMC (Taiwan Semiconductor Manufacturing Company)
• TMEMO: 16384 x 12 = 24 KB
• PMEMO: 4096 x 60 = 30 KB
• Size: 1860 x 1860 µm²
• Package: JLCC68
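The memory sizes quoted above follow directly from the word counts and word lengths, taking 1 KB as 1024 bytes. A short check (the helper name `size_kb` is illustrative):

```python
def size_kb(words: int, bits_per_word: int) -> float:
    """Memory size in KB (1 KB = 1024 bytes)."""
    return words * bits_per_word / 8 / 1024


tmemo_kb = size_kb(16384, 12)   # tree memory
pmemo_kb = size_kb(4096, 60)    # parameter memory
total_kb = tmemo_kb + pmemo_kb

print(tmemo_kb, pmemo_kb, total_kb)
```

The total matches the 54 KB of on-chip memory reported for the layout.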

Test Setup of ASIC

• Power supply: HP E3630A
• Logic analyzer: Agilent 16823A
• Oscilloscope: Agilent DSO6104A
• Experiment controlled with Matlab

Test process of ASIC: Go/no go test

• Go/no go test: post-synthesis simulation vs. experimental data

Test process of ASIC: Examples

• Double integrator, ACC and DC-DC in open-loop fashion
• Memories loaded with the parameters and trees obtained from the Moby-Dic toolbox
• Expected and obtained results were compared in Matlab

Example            No. inputs  Memory contents: tree  Memory contents: parameters  Tree depth  No. patterns
Double Integrator  2           191                    163                          8           625
ACC                4           2009                   209                          12          625
DC-DC              4           572                    265                          9           625

[Figure: Output surface (ASIC)]

Characterization of ASIC

• 20 packaged samples, allowing statistical analysis
• Automatic flow
• Measurements @ DC, @ 50 MHz, @ fmax
• To discard bad samples
• For specific conditions

Summary of results

Example            Fmax (MHz)   Power @DC (mW)  Power @50MHz (mW)  Power @fmax (mW)
ACC                96.7 ± 1.7   1.6 ± 0.2       20.1 ± 0.5         37.0 ± 1.8
DC-DC              97.5 ± 1.0   1.6 ± 0.2       22.8 ± 0.6         41.9 ± 1.2
Double_integrator  107.5        1.6             18.1               38.1

• 100% effectiveness with very small process variations
• Double integrator (2 inputs) reaches a higher frequency and consumes less power at 50 MHz than ACC and DC-DC (4 inputs)
• Static power is dominated by leakage in memories

Analysis of costs & performance

       Latency (ck cycles)  Throughput           Multipliers  Memory
FPGA   n+(n+2)d             [Tck(n+(n+2)d)]^-1   1            Nr(n+1)
ASIC   n+2+2d               [Tck(n+2d)]^-1       n            Nr(n+1)+2^(d+1)

n: No. dimensions (inputs)
d: depth of the tree
Nr: No. regions (edges + polytopes)
Tck: clock period
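The latency and throughput expressions in the table can be evaluated directly. The sketch below simply transcribes them as given; the function names are illustrative. With n = 4 inputs and d = 12 (the ACC example), the ASIC expression gives 30 clock cycles, i.e. 120 to 240 ns for the 4-8 ns clock period reported from post-layout simulation.

```python
def fpga_latency_cycles(n: int, d: int) -> int:
    """FPGA latency in clock cycles: n + (n + 2) d."""
    return n + (n + 2) * d


def asic_latency_cycles(n: int, d: int) -> int:
    """ASIC latency in clock cycles: n + 2 + 2d."""
    return n + 2 + 2 * d


def fpga_throughput_hz(n: int, d: int, tck_s: float) -> float:
    """FPGA throughput as given in the table: 1 / (Tck (n + (n + 2) d))."""
    return 1.0 / (tck_s * (n + (n + 2) * d))


def asic_throughput_hz(n: int, d: int, tck_s: float) -> float:
    """ASIC throughput as given in the table: 1 / (Tck (n + 2d))."""
    return 1.0 / (tck_s * (n + 2 * d))


n, d = 4, 12  # ACC example: 4 inputs, tree depth 12
cycles = asic_latency_cycles(n, d)
for tck_ns in (4, 8):
    print(f"Tck = {tck_ns} ns: ASIC latency = {cycles * tck_ns} ns")
print("FPGA latency:", fpga_latency_cycles(n, d), "cycles")
```

For the same n and d, the per-level cost is (n + 2) cycles on the FPGA and 2 cycles on the ASIC, which is where the n multipliers of the ASIC pay off.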

Analysis of cost & performance for case studies (CSs)

ACC [OLIV11]
Architecture   HW resources  Latency     Multipliers  Memory
FPGA PWAG      87% occup.    2.6 µs      1            3.3 KB
FPGA PWAS-S    31% occup.    390 ns      1            11.5 KB
FPGA PWAS-P    95% occup.    67 ns       5            57.6 KB
ASIC PWAG      3.5 mm²       120-240 ns  4            54 KB (27.3 KB used)

DC-DC
Architecture   HW resources  Latency     Multipliers  Memory
FPGA PWAG-S    11% occup.    3.35 µs     1            0.59 KB
FPGA PWAG-P    10% occup.    1.15 µs     4            0.59 KB
ASIC PWAG      3.5 mm²       72-144 ns   4            54 KB (2.58 KB used)

Post-layout simulation of the ASIC gives a clock cycle of 4-8 ns.

References

[OLIV09] A. Oliveri, T. Poggi, M. Storace, "Circuit implementation of piecewise-affine functions based on a binary search tree," European Conference on Circuit Theory and Design (ECCTD'09), pp. 145-148, Antalya, Turkey, August 2009.

[OLIV11] A. Oliveri, G.J.L. Naus, M. Storace, W.P.M.H. Heemels, "Low-complexity approximations of PWA functions: a case study on Adaptive Cruise Control," European Conference on Circuit Theory and Design (ECCTD'11), pp. 694-697, Linköping, Sweden, August 2011.

Conclusions and Remarks

• ASIC to cover different case studies
• VLSI is not a simple translation from FPGA
• Configuration and programmability are provided -> IP hard block
• Extensive usage of the toolbox to get surface parameters and fully functional simulation
• ASIC outperforms FPGA in speed (x10) and power (÷10)