View
32
Download
1
Category
Preview:
DESCRIPTION
Impact of Interconnect Architecture on VPSAs (Via-Programmed Structured ASICs). Usman Ahmed Guy Lemieux Steve Wilton System-on-Chip Lab University of British Columbia. What is a Structured ASIC?. An FPGA without re programmable interconnect Interconnect is mask-programmed. - PowerPoint PPT Presentation
Citation preview
Impact of Interconnect Architecture
on VPSAs (Via-Programmed Structured ASICs)
Usman Ahmed Guy Lemieux Steve Wilton
System-on-Chip LabUniversity of British Columbia
2
What is a Structured ASIC?
• An FPGA without reprogrammable interconnect– Interconnect is mask-programmed
InterconnectLayers
Transistors
3
What is a Structured ASIC?
• An FPGA without reprogrammable interconnect– Interconnect is mask-programmed
• Two types
InterconnectLayers
Transistors
Via ProgrammableVPSA
This Talk
4
What is a Structured ASIC?
• An FPGA without reprogrammable interconnect– Interconnect is mask-programmed
• Two types
InterconnectLayers
Transistors
Metal ProgrammableMPSA
Via ProgrammableVPSA
This Talk FPT 2009
Key Messages
1. Structured ASICswill be the key technologyof the future.
5
Key Messages
1. Structured ASICswill be the key technologyof the future.
Because the key issues thatmake structured ASICs attractivehave not been solved.
They are growing more prominent.
6
Key Messages
2. Interconnect matters.
1. Structured ASICswill be the key technologyof the future.
Because the key issues thatmake structured ASICs attractivehave not been solved.
They are growing more prominent.
7MPSAs have better performance,
VPSAs are cheaper.
8
Motivation for Structured ASICs
• Enormous NRE + Design costlimit access to advanced process
• Many ICs are still manufactured with old process technologies– New processes (90nm and below)
49% of TSMC revenue in 2009 Q1
• FPGAs not suitable for low power, consumer-oriented,hand-held devices
• SAs reduce mask cost + risk
• SAs must reduce design effort
• SAs make advanced processes profitable
• SAs lower power, lower costthan FPGAs
9
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
10
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
11
VPSA Die-Cost
• Cost is more important than die area• Primary cost components
– Die Area– Number of configurable layers (New for structured ASICs)
• Secondary cost components– Die Yield– Wafer and processing cost– Volume requirements
12
VPSA Cost Model
Costdie = Cbase +
Ccustom +
Cproto
13
VPSA Cost Model
Costdie = Cbase +
Ccustom +
Cproto
Cost of the masks for the base (common portion)
Cost of fabricating the base portion ++
14
VPSA Cost Model
Costdie = Cbase +
Ccustom +
Cproto
Cost of the masks for the base (common portion)
Cost of fabricating the base portion ++
Cost of the remaining masks Cost of fabricating the remaining portion ++
15
VPSA Cost Model
Costdie = Cbase +
Ccustom +
Cproto
Cost of the masks for the base (common portion)
Cost of fabricating the base portion ++
Cost of the remaining masks Cost of fabricating the remaining portion ++
Similar to Ccustom, but depends on the number of spins
16
VPSA Cost Model
Costdie = Cbase +
Ccustom +
Cproto
• Die Area and Yield: Ngdpw
• Configurable layers: Nvl
• Fixed layers: Nfm , Nfm , Nfm , Nfm
• Mask/wafer cost: Csml,Csmu,Cwpm, Nmprl
• Volume requirements: Vtot, Vc
Ccustom
Cbase
Cproto
Primary Cost Variables
Constantsl m v u
gdpw
swfmfmwpm
tot
fmfmfmsmfmsm
N
CNNC
V
NNNCNC vmuvmull)(
gdpw
fmfmfmfmvlwpm
c
fmvlsm
N
NNNNNC
V
NNCuvmvvu)(
swfmfmfmfmvlfmwpmfmvlsmc
s CNNNNNNCNNCV
Nuvmvlvu
)(
1
17
VPSA Cost Model
3210 KK
NK
NNK
Costgdpw
vlgdpw
die
ConfigurabilityDie Area
1 … 5 via layers4503 (10mm2 die) … 450 (100mm2 die)
18
VPSA Cost Model
3210 KK
NK
NNK
Costgdpw
vlgdpw
die
$4800 $440 $0.74
$1.08
• Key Assumptions– 45nm Maskset cost: $2.5M
– Total volume: 2M
– Per-customer volume: 100k
– No. of spins: 2
ConfigurabilityDie Area
1 … 5 via layers4503 (10mm2 die) … 450 (100mm2 die)
19
VPSA Cost Model
• At constant cost, area can be traded for number of customizable layers
VPSA
40
80
120
160
200
Cor
e A
rea
(mm
)2
2 3 4 5
Nvl
1
$ 8
$12
$18
$25
$35
20
VPSA Cost Model
• At constant cost, area can be traded for number of customizable layers
VPSA
40
80
120
160
200
Cor
e A
rea
(mm
)2
2 3 4 5
Nvl
1
$ 8
$12
$18
$25
$35
11 mm2/layer
21
VPSA Cost Model
• At constant cost, area can be traded for number of customizable layers
VPSA
40
80
120
160
200
Cor
e A
rea
(mm
)2
2 3 4 5
Nvl
1
$ 8
$12
$18
$25
$35
MPSA
2 3 4 5 6
40
80
120
160
200
NrlC
ore
Are
a (m
m )2
$ 8
$12
$18
$25
$35
11 mm2/layer 30 mm2/layer
22
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
23
Metrics
• Cost– Detailed cost model (just presented)
• Area– Placement grid size after whitespace insertion
• Determined by CAD flow
• Delay and Power– Please see paper
24
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
25
CAD FlowInput Circuit
Insert Whitespace
No
Detailed Routing
Global Routing
Placement
Tech. Mapping
Any Congestion?
Any Congestion?
Yes
No
Yes
Area/Delay/Power/Cost Estimate
T-VPACK
Std. Cell Placer: CAPO
Std. Cell Router: FGR
Custom Router -Pathfinder based
Interconnect Architecture
26
CAD FlowInput Circuit
Insert Whitespace
No
Detailed Routing
Global Routing
Placement
Tech. Mapping
Any Congestion?
Any Congestion?
Yes
No
Yes
Area/Delay/Power/Cost Estimate
T-VPACK
Std. Cell Placer: CAPO
Std. Cell Router: FGR
Custom Router -Pathfinder based
Interconnect Architecture
- Simulated annealing too slow- Uniform whitespace distribution- Logic cells snapped to grid
27
CAD FlowInput Circuit
Insert Whitespace
No
Detailed Routing
Global Routing
Placement
Tech. Mapping
Any Congestion?
Any Congestion?
Yes
No
Yes
Area/Delay/Power/Cost Estimate
T-VPACK
Std. Cell Placer: CAPO
Std. Cell Router: FGR
Custom Router -Pathfinder based
Interconnect Architecture
- Increase Placement Grid Size
28
CAD FlowInput Circuit
Insert Whitespace
No
Detailed Routing
Global Routing
Placement
Tech. Mapping
Any Congestion?
Any Congestion?
Yes
No
Yes
Area/Delay/Power/Cost Estimate
T-VPACK
Std. Cell Placer: CAPO
Std. Cell Router: FGR
Custom Router -Pathfinder based
Interconnect Architecture
- Routing graph for only single basic tile- Expand wavefront only along the global route
29
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
30
Routing Fabrics
Crossover Fabric
All wires same length!
31
Routing Fabrics
Crossover Fabric
All wires same length!
32
Routing Fabrics
Crossover Fabric
All wires same length!
33
Routing Fabrics
Jumper Fabric
Long wires OK!
34
Routing Fabrics
Jumper Fabric
Long wires OK!
35
Routing Fabrics
Jumper Fabric
Long wires OK!
36
Routing Fabric Comparison
Layer2i+2 (i=0, 1, …)Layer2i+1 (i=0, 1, …)
These points coincideLong Segment that spans two blocks
Jumper is not required at this point
Layer2i+2 (i=0, 1, …)Layer2i+1 (i=0, 1, …)
Crossover Fabric Jumper Fabric
• Single via to extend
• All wires same: length-1
• Two vias to extend
• Short segments: 1 blocks
• Long segments: 4 blocks, staggered
• Two variants– Jumper20: 20% Long segments– Jumper40: 40% Long segments
37
Logic Block Model
• Characteristics of logic block– Physical dimensions
(in wire pitches)
– Pin locations
• Do not need low-levellayout details
38
Parameterize Logic Block
• Cover wide search space for logic blocks
• Vary layout density– Dense: Determined by # pins (small layout area)
– Sparse: Determined by Standard Cell implementation
• Vary logic capacity– Sweep number of inputs and outputs
• 2-input, 1-output logic blocks (shown here)
• 16-input, 8-output logic blocks (also in paper)
– Use logic clustering (T-VPack) as tech-mapper
39
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
40
Area, Cost Trends
• Experimental results
– MCNC benchmarks• Geometric mean over 19 large circuits
– Logic block density• Dense, medium, and sparse
– Logic block capacity• From 2-input, 1-output to 16-input, 8-outputs
• Only 2-input, 1-output results shown here
41
Area and Die-Cost Trends
31 20
5
15
10
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Dense Logic BlockDense Logic Block
31 2
0.5
0
1
1.5
2
Cor
e A
rea Jumper40
Jumper20Crossover
MPSA
Nvl MPSAs:(Nrl – 1)
Area Cost
42
Area and Die-Cost Trends
31 20
5
15
10
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Dense Logic BlockDense Logic Block
31 2
0.5
0
1
1.5
2
Cor
e A
rea Jumper40
Jumper20Crossover
MPSA
Nvl MPSAs:(Nrl – 1)Area
Cost
MPSA < Crossover < Jumper
Nvl = 1 more area,needs whitespace to route
43
Area and Die-Cost Trends
31 20
5
15
10
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Dense Logic BlockDense Logic Block
31 2
0.5
0
1
1.5
2
Cor
e A
rea Jumper40
Jumper20Crossover
MPSA
Nvl MPSAs:(Nrl – 1)Area Cost
MPSAs: more layers higher cost
VPSAs: more layers lower cost
MPSA < Crossover < Jumper
Nvl = 1 more area,needs whitespace to route
44
Area and Die-Cost Trends
31 20
5
15
10
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Dense Logic BlockDense Logic Block
31 2
0.5
0
1
1.5
2
Cor
e A
rea Jumper40
Jumper20Crossover
MPSA
Nvl MPSAs:(Nrl – 1)Area Cost
MPSAs: more layers higher cost
VPSAs: more layers lower cost
MPSA < Crossover < Jumper
Nvl = 1 more area,needs whitespace to route
45
Area and Die-Cost Trends
• Sparse layout is better! ???– Less whitespace needed
• Need to study whitespace allocation
31 2
0.5
0
1
1.5
2
Cor
e A
rea Jumper40
Jumper20Crossover
MPSA
Nvl MPSAs:(Nrl – 1)
Sparse Logic Block
31 20
5
15
10
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Sparse Logic Block
Area Cost
Delay and Power Trends
Key results (in paper):
MPSA is significantly better than VPSA
47
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions
48
Conclusions• Trends for VPSAs
– Die-cost more important than die-area
– MPSAs better in Area, Delay, and Power
– VPSAs better in Cost
– Interconnect Matters• Performance varies with different routing fabrics
• Even significant variation among VPSA structures
• Ongoing research– Interconnect architectures
– Whitespace insertion algorithm
49
Limitations
• CAD framework available onlinehttp://groups.google.com/group/sasic-pr
• This is early work … need improvements!– Whitespace insertion
– Buffer insertion
– Delay/Power of logic blocks
– Power/clock network area overhead
– SRAM-configurable logic blocks
Key Message
2. Interconnect matters.
1. Structured ASICswill be the key technologyof the future.
Because the key issues thatmake structured ASICs attractivehave not been solved.
They are growing more prominent.
MPSAs have better performance,
VPSAs are cheaper.50
51
52
CAD Framework Available
Power and Delay Trends
54
Metrics
• Area– Determined from placement grid size
• Delay– Average net delay (Elmore model)
• Register locations unknown; critical path delay calculation is difficult
• CAD flow is not timing driven
• Power– Total metal + via capacitance
55
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, delay, power, cost trends
• Cost model sensitivity
• Conclusions
56
Power TrendsIn
terc
onne
ct P
ower
1.5
2
1
1 2 3
Jumper40Jumper20Crossover
Nvl
Dense Logic Block
57
Power Trends
• Significant range for different routing fabrics• More custom via layers → Lower Power
– Especially for dense layouts
Inte
rcon
nect
Pow
er
1.5
2
1
1 2 3
Jumper40Jumper20Crossover
Nvl
Dense Logic Block
1 2 3
1.5
2
1
Nvl
Sparse Logic Block
Jumper40Jumper20Crossover
58
Power Trends
• Re-Normalized to MPSAs• VPSAs use more power
– 2x (sparse) to 6x (dense) more than MPSAs
12 3
2
3
5
6
4
1
VP
SA
Inte
rco
nnec
t P
ow
er
MP
SA
Int
erc
onne
ct P
ower
Jumper40Jumper20Crossover
Nvl MPSAs:(Nrl – 1)
Dense Logic Block
12 3
2
3
5
6
4
1
Jumper40Jumper20Crossover
Nvl MPSAs:(Nrl – 1)
Sparse Logic Block
59
Delay TrendsIn
terc
onne
ct D
elay
1.5
2
1
1 2 3Nvl
Dense Logic Block
Crossover
60
Delay TrendsIn
terc
onne
ct D
elay
1.5
2
1
1 2 3Nvl
Dense Logic Block
CrossoverJumper40
61
Delay TrendsIn
terc
onne
ct D
elay
1.5
2
1
1 2 3Nvl
Dense Logic Block
Crossover
Jumper20Jumper40
62
Delay TrendsIn
terc
onne
ct D
elay
1.5
2
1
1 2 3Nvl
Dense Logic Block
Crossover
Jumper20Jumper40
1.5
2
1
1 2 3Nvl
Sparse Logic Block
Jumper40Jumper20Crossover
• Significant range for different fabrics– Delay improves with more custom via layers
• Jumper Fabric: Long segments improve delay(but higher power)
63
• Re-Normalized to MPSAs• VPSA delay up to 20x worse
Delay Trends
Jumper40Jumper20Crossover
5
10
15
20
VP
SA
Inte
rcon
nect
Del
ay
MP
SA
Inte
rcon
nect
Del
ay
1 2 3Nvl
MPSAs:(Nrl – 1)
Dense Logic Block
5
10
15
20
1 2 3Nvl
MPSAs:(Nrl – 1)
Sparse Logic Block
Jumper40Jumper20Crossover
64
Delay Trends
Jumper40Jumper20Crossover
5
10
15
20
VP
SA
Inte
rcon
nect
Del
ay
MP
SA
Inte
rcon
nect
Del
ay
1 2 3Nvl
MPSAs:(Nrl – 1)
Dense Logic Block
5
10
15
20
1 2 3Nvl
MPSAs:(Nrl – 1)
Sparse Logic Block
Jumper40Jumper20Crossover
Why isVPSA delayworse than
MPSA delay?• Re-Normalized to MPSAs• VPSA delay up to 20x worse
65
Delay Trends
Jumper40Jumper20Crossover
5
10
15
20
VP
SA
Inte
rcon
nect
Del
ay
MP
SA
Inte
rcon
nect
Del
ay
1 2 3Nvl
MPSAs:(Nrl – 1)
Dense Logic Block
5
15
25
1 2 3
VP
SA
Tot
al N
o. o
f Via
s
MP
SA
Tot
al N
o. o
f Via
s
35
1 2 3
2
3
4
5
6
VP
SA
Tot
al W
irele
ngth
MP
SA
Tot
al W
irele
ngth
Nvl MPSAs:(Nrl – 1)
• Re-Normalized to MPSAs• VPSA delay up to 20x worse
more vias
morewirelength
66
Delay Trends
Jumper40Jumper20Crossover
5
10
15
20
VP
SA
Inte
rcon
nect
Del
ay
MP
SA
Inte
rcon
nect
Del
ay
1 2 3Nvl
MPSAs:(Nrl – 1)
Dense Logic Block
5
15
25
1 2 3
VP
SA
Tot
al N
o. o
f Via
s
MP
SA
Tot
al N
o. o
f Via
s
35
1 2 3
2
3
4
5
6
VP
SA
Tot
al W
irele
ngth
MP
SA
Tot
al W
irele
ngth
Nvl MPSAs:(Nrl – 1)
more vias
morewirelength• Re-Normalized to MPSAs
• VPSA delay up to 20x worse
67
Delay Trends
Jumper40Jumper20Crossover
5
10
15
20
VP
SA
Inte
rcon
nect
Del
ay
MP
SA
Inte
rcon
nect
Del
ay
1 2 3Nvl
MPSAs:(Nrl – 1)
Dense Logic Block
5
15
25
1 2 3
VP
SA
Tot
al N
o. o
f Via
s
MP
SA
Tot
al N
o. o
f Via
s
35
1 2 3
2
3
4
5
6
VP
SA
Tot
al W
irele
ngth
MP
SA
Tot
al W
irele
ngth
Nvl MPSAs:(Nrl – 1)
more vias
morewirelength
• Re-Normalized to MPSAs• VPSA delay up to 20x worse
68
Cost Model Sensitivity
70
Talk Outline
• Cost model
• Experimental methodology– Metrics
– CAD flow
– Architecture modeling
• Area, delay, power, cost trends
• Cost model sensitivity
• Conclusions
71
Cost Model Sensitivity
• How sensitive is the die-cost to various factors?• Primary factors
– Die area– Number of customizable layers
• Secondary factors– Maskset cost– Volume requirements– Number of fixed lower masks
72
Cost Model Sensitivity
• VPSAs less sensitive to maskset cost
– Sensitivity to Maskset Cost
Cos
t ($)
531
10
15
20
25
2
Maskset Cost = $2.5M
Jumper40Jumper20Crossover
MPSA
5
10
15
20
25
31 2
Maskset Cost = $5M
Jumper40Jumper20Crossover
MPSA
Cos
t ($)
Nvl MPSAs:(Nrl – 1)
Nvl MPSAs:(Nrl – 1)
73
Cost Model Sensitivity
• VPSA cost increases more rapidly than MPSAs– Large area of VPSAs
– Sensitivity to Number of Fixed Lower Masks ( )lfm
N
Cos
t ($)
531
10
15
20
25
2
Jumper40Jumper20Crossover
MPSA
Nfm (Fixed Lower Masks) = 18 Nfm (Fixed Lower Masks) = 36
31 2
Jumper40Jumper20Crossover
MPSA
Cos
t ($)
5
10
15
20
25
Nvl MPSAs:(Nrl – 1)
Nvl MPSAs:(Nrl – 1)
74
Cost Model Sensitivity
• VPSAs less sensitive to customer volume than MPSAs
– Sensitivity to Per Customer Volume (Vc)
Cos
t ($)
531
10
15
20
25
2
Jumper40Jumper20Crossover
MPSA
Vc (Per Customer Volume) = 100k Vc (Per Customer Volume) = 50k
31 2
Jumper40Jumper20Crossover
MPSA
5
10
15
20
25
Nvl MPSAs:(Nrl – 1)
Nvl MPSAs:(Nrl – 1)
75
Recommended