Development of tools supporting FPGA reconfigurable hardware
EX-VPR: Architecture Exploration/Modification
Outline
FPGA characteristics and Design Issues
Design Methodologies
Heterogeneous FPGA Platforms
Temperature-aware mapping
Software-supported design flow
Conclusions
MEANDER Design Framework – VLSI Design and Testing Center – Democritus University of Thrace
Interconnect delay matters
Energy Break-down for FPGAs
The energy dissipation of the routing structure is the dominant component of the total energy [*]
[*] K. Leijten-Nowak et al., “An FPGA Architecture with Enhanced Datapath Functionality”, pp. 195-204, FPGA’03
General view of FPGA wire segments and Switch Boxes
From DeHon and Wawrzynek
Statistical approach to connections in SBs
[Figure: SB connection pattern usage. Y-axis: 0% to 40%, with a 70% annotation; x-axis: Pattern (┼ ┌ ┐ └ ┘ │ ─ ┴ ┤ ├ ┬)]
Horizontal and vertical connections:
• are the most frequently used
• minimize the number of bends of the final routing
Source: Gang Wang, et al., “Statistical Analysis and Design of HARP Routing Pattern FPGAs,” IEEE Trans. on CAD, 2006.
Channel Utilization across the FPGA device
[Figure: normalized number of tracks (0.00 to 1.00) versus position across the device: Bottom Edge, Middle, Top Edge]
Source: V. Betz, J. Rose and A. Marquardt, “Architecture and CAD for Deep-Submicron FPGAs”, Kluwer Academic Publishers, 1999
Outline
FPGA characteristics and Design Issues
Design Methodologies
Heterogeneous FPGA Platforms
Temperature-aware mapping
Software-supported design flow
Conclusions
Proposed Methodology
1st Step: Determine conventional FPGA requirements
2nd Step: Exploration for different Segments
3rd Step: Exploration for different SBs
Connectivity Requirements of conventional FPGAs
Normalized values from ALL MCNC benchmarks
The actually-used hardware resources provide an irregular picture.
Ideally, a different interconnection architecture would be used at each (x,y) point of the FPGA.
To approximate this, we propose a piecewise-homogeneous FPGA architecture consisting of a few regions.
Detailed info in FPL05, FPGA06, RAW06
Power Consumption
• Normalized values from ALL MCNC benchmarks
• Determination and Visualization of hotspots
Proposed Methodology
1st Step: Determine conventional FPGA requirements
Visualization of spatial data extracted from conventional FPGA
• Place and Route application into conventional FPGA
• Extract info for placement, routing, power dissipation, etc.
• Graphical representation of conventional FPGA requirements
• Hardware requirements (connectivity, delay, energy, etc.)
2nd Step: Exploration for different Segments
Determination of optimal Segment Length
• Exploration for a number of different Segment Lengths
• Visualize the results from the exploration
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Select the optimal Segment Length
2nd Step (Placement and Routing Procedure)
• Place application into FPGA
• Visualize the results from the placement
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Re-place the application based on this criterion
• Route the application with the “thermal-aware” placement
3rd Step: Exploration for different SBs
Determine the optimal SB combination
• Exploration for a number of different SB combinations
• Visualize the results from the exploration
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Select the optimal SB combination
• Determine the optimal ratio among SBs
Outcomes: Thermal-aware application mapping; Optimal Interconnection Network
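The selection step of the methodology above (define a criterion such as EDP, then pick the best candidate) can be sketched as follows. This is a minimal illustration, assuming the exploration yields one (delay, energy) pair per candidate segment length. The candidate names and their values are hypothetical placeholders, not results from the slides, and `select_by_edp` is an illustrative name, not an EX-VPR function.

```python
from typing import Dict, Tuple


def energy_delay_product(delay_s: float, energy_j: float) -> float:
    """Energy-delay product (EDP); lower is better."""
    return delay_s * energy_j


def select_by_edp(results: Dict[str, Tuple[float, float]]) -> str:
    """Return the candidate whose (delay, energy) pair has the smallest EDP."""
    return min(results, key=lambda name: energy_delay_product(*results[name]))


# Hypothetical exploration results: segment length -> (delay in s, energy in J).
exploration = {
    "L1": (12.0e-8, 6.0e-9),
    "L2": (10.0e-8, 6.5e-9),
    "L4": (11.0e-8, 7.5e-9),
}

best = select_by_edp(exploration)
print("Selected segment length:", best)
```

Swapping the key function for delay, energy, or area alone would give the other selection criteria listed on the slide.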
Connectivity requirements for two regions
[Figure: normalized connectivity requirements (0.00 to 1.00) after projection, grouped into two ranges: 0.00-0.50 and 0.50-1.00]
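One simple way to obtain two regions like those above is to threshold the normalized connectivity map at 0.50, matching the 0.00-0.50 and 0.50-1.00 ranges on the slide. The sketch below assumes the map is a plain 2-D list of normalized values. The sample values and the name `split_into_regions` are hypothetical, and the slides do not state that the regions were derived exactly this way.

```python
from typing import List


def split_into_regions(conn_map: List[List[float]],
                       threshold: float = 0.5) -> List[List[int]]:
    """Label each (x, y) location: 0 for low demand, 1 for demand >= threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in conn_map]


# Hypothetical normalized connectivity requirements on a 4x4 array.
connectivity = [
    [0.10, 0.20, 0.25, 0.10],
    [0.20, 0.70, 0.80, 0.30],
    [0.15, 0.90, 1.00, 0.20],
    [0.05, 0.30, 0.35, 0.10],
]

regions = split_into_regions(connectivity)
for row in regions:
    print(row)
```

Each label could then be mapped to its own segment length and switch box choice, which is the idea behind a piecewise-homogeneous array.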
P&R for MAC32 DSP Application (45x45 FPGA)
Homogeneous FPGA (conventional approach)
MCNC Benchmark: cm138a
Switch box: Subset
Heterogeneous FPGA Architecture
[Figure labels: Subset Switch Box; Universal Switch Box]
ALTERA STRATIX-II …..extension
Lx & Switch Box 1 Ly & Switch Box 2
Outline
FPGA market, Current trends
Design Methodologies
Heterogeneous FPGA Platforms
Temperature-aware mapping
Software-supported design flow
Conclusions
Why is temperature critical?
Shortened interconnect and device lifetime, reduced package reliability
Increased interconnect resistivity:
• worse power-grid IR drops
• ↑ interconnect RC delays
↑↑ leakage power due to the exponential increase of sub-threshold current with T
↓ carrier mobility, hence slower devices
Proposed Temperature-aware Placement Methodology
1st Step: Determine conventional FPGA requirements
Visualization of spatial data extracted from conventional FPGA
• Place and Route application into conventional FPGA
• Extract info for placement, routing, power dissipation, etc.
• Graphical representation of conventional FPGA requirements
• Hardware requirements (connectivity, delay, energy, etc.)
2nd Step: Exploration for different Segments
Determination of optimal Segment Length
• Exploration for a number of different Segment Lengths
• Visualize the results from the exploration
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Select the optimal Segment Length
2nd Step (Placement and Routing Procedure)
• Place application into FPGA
• Visualize the results from the placement
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Re-place the application based on this criterion
• Route the application with the “thermal-aware” placement
3rd Step: Exploration for different SBs
Determine the optimal SB combination
• Exploration for a number of different SB combinations
• Visualize the results from the exploration
• Define a selection criterion (EDP, Delay, Energy, Area, etc.)
• Select the optimal SB combination
• Determine the optimal ratio among SBs
Outcomes: Thermal-aware application mapping; Optimal Interconnection Network
Temperature-Aware Mapping
Problem Formulation: Given a certain FPGA device with a specific power budget, find an appropriate mapping of an application which:
• redistributes the power budget over the whole FPGA device in a more “balanced” way
• reduces the number and amplitude of the power consumption peaks (hotspots)
No impact on: (i) maximum operating frequency of the device, (ii) total energy/power consumption, (iii) required area
It can be used as a power and temperature management strategy
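The two goals in the problem formulation (fewer hotspots, lower peaks, same total budget) can be checked on a normalized power map with a few lines of code. The sketch below is illustrative only: the 0.7 threshold follows the “hotspots” (>0.7) convention used on the Power 3-D Maps slide, while the sample maps and the name `hotspot_stats` are hypothetical.

```python
from typing import List, Tuple


def hotspot_stats(power_map: List[List[float]],
                  threshold: float = 0.7) -> Tuple[int, float]:
    """Return (number of tiles above threshold, peak value) of a normalized map."""
    values = [v for row in power_map for v in row]
    return sum(1 for v in values if v > threshold), max(values)


def total_power(power_map: List[List[float]]) -> float:
    """Sum of all tile values, used to check the budget is unchanged."""
    return sum(sum(row) for row in power_map)


# Hypothetical normalized power maps with the same total power budget.
initial = [[0.1, 0.2, 0.1], [0.2, 1.0, 0.9], [0.1, 0.8, 0.2]]
balanced = [[0.4, 0.4, 0.4], [0.4, 0.4, 0.4], [0.4, 0.4, 0.4]]

print("initial :", hotspot_stats(initial), round(total_power(initial), 3))
print("balanced:", hotspot_stats(balanced), round(total_power(balanced), 3))
```

A mapping that satisfies the formulation lowers both numbers returned by `hotspot_stats` while leaving `total_power` essentially unchanged.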
Temperature Model
Divide the FPGA device into M×N distinct tiles
Use the thermal RC circuit [A] with a formula in which Rij is the transfer thermal resistance, Pij is the power consumption of Tileij located at (i,j), and Tij is the temperature at that tile of the FPGA
Conclusion: the steady-state temperature of each location across the silicon die is a function of the power consumption of all the on-chip heat sources
[A] A. Krum, “Thermal management”, The CRC Handbook of Thermal Engineering, pp. 2.1-2.92, CRC Press, Boca Raton, FL, 2000
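The conclusion above, that each tile's steady-state temperature depends on the power of all on-chip heat sources, can be illustrated with a linear superposition model. This is a sketch under stated assumptions and not necessarily the formula used on the slide: the self resistance `r_self`, the distance-based falloff `r_decay`, and the ambient temperature are hypothetical values chosen only to make the example concrete.

```python
from typing import List


def steady_state_temperature(power: List[List[float]],
                             r_self: float = 2.0,
                             r_decay: float = 0.5,
                             t_ambient: float = 25.0) -> List[List[float]]:
    """Temperature of each tile as ambient plus a resistance-weighted sum of all tile powers.

    The transfer resistance between two tiles is assumed (hypothetically) to
    fall off geometrically with their Manhattan distance.
    """
    rows, cols = len(power), len(power[0])
    temps = [[t_ambient] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(rows):
                for l in range(cols):
                    distance = abs(i - k) + abs(j - l)
                    resistance = r_self * (r_decay ** distance)
                    temps[i][j] += resistance * power[k][l]
    return temps


# Hypothetical 3x3 device with a single heat source in the centre tile.
power = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
for row in steady_state_temperature(power):
    print(row)
```

Even tiles that dissipate no power heat up here, which is why moving logic away from a hotspot also cools its neighbours.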
Temperature-Aware Mapping: Basic Concept
[Figure: initial mapping (VPR-based) versus proposed mapping (EX-VPR), with unused CLBs marked]
A. DeHon, “Balancing interconnect and computation in a reconfigurable computing array (or, why you don’t really want 100% LUT utilization)”, ACM Int. Symp. on FPGA, 1999
Temperature-Aware P&R Procedure
[Figure: initial mapping (VPR-based) versus proposed mapping (EX-VPR), with unused CLBs marked]
A. DeHon, “Balancing interconnect and computation in a reconfigurable computing array (or, why you don’t really want 100% LUT utilization)”, ACM Int. Symp. on FPGA, 1999
Temperature variation map of spla benchmark
• Compute-bound application
• Mapped onto the smallest square FPGA device: it needs a 61×61 FPGA with 3690 CLBs and 62 I/O pads
• Identical hardware resources
[Figure: normalized temperature over the FPGA plane; initial mapping (VPR-based) versus proposed mapping (EX-VPR)]
Temperature variation of bigkey benchmark
• I/O-bound application
• Encryption application
• Mapped onto a 54×54 island-style FPGA with 1707 CLBs and 426 I/O pads
• Identical hardware resources
[Figure: temperature maps, color scale LOW to HIGH; initial mapping (VPR-based) versus proposed mapping (EX-VPR)]
Average temperature variation across the FPGA over the 20 biggest MCNC benchmarks
[Figure: normalized temperature over the FPGA plane; initial mapping (VPR-based) versus proposed mapping (EX-VPR)]
Average temperature variation
[Figure: (%) area with specific temperature values]
33% reduction
Lower temperature “hotspots”
Power 3-D Maps
[Figure: conventional mapping (VPR-based) versus proposed mapping (EX-VPR)]
Power reduction up to 33% at the “hotspots” (>0.7)
Average values over the 20 biggest MCNC benchmarks
VPR-based vs temperature-aware mapping
Units: Delay ×10^-8 s; Power ×10^-3 W; Leakage ×10^-3 W; Energy ×10^-9 J
Benchmark | VPR-based: Delay | Power | Leakage | Energy | Temperature-aware: Delay | Power | Leakage | Energy
alu4 | 9.77 | 59.66 | 11.50 | 5.83 | 8.69 | 67.03 | 11.30 | 5.82
apex2 | 9.37 | 72.01 | 16.08 | 6.75 | 9.65 | 70.53 | 16.01 | 6.81
apex4 | 8.86 | 47.04 | 11.77 | 4.17 | 9.10 | 46.46 | 11.54 | 4.23
bigkey | 6.03 | 131.90 | 16.29 | 7.95 | 9.71 | 85.17 | 16.70 | 8.27
clma | 10.1 | 201.01 | 23.1 | 75.6 | 9.23 | 198.43 | 23.45 | 79.9
des | 7.76 | 135.51 | 24.77 | 10.5 | 11.1 | 102.08 | 24.74 | 11.4
diffeq | 6.15 | 57.09 | 10.61 | 3.51 | 6.32 | 55.31 | 10.75 | 3.50
dsip | 7.99 | 97.87 | 18.91 | 7.82 | 10.9 | 73.64 | 19.43 | 8.01
elliptic | 10.9 | 113.31 | 32.36 | 12.4 | 11.7 | 108.70 | 32.46 | 12.8
ex1010 | 18.3 | 88.53 | 34.93 | 16.2 | 18.3 | 88.86 | 34.86 | 16.3
ex5p | 9.26 | 46.59 | 10.36 | 4.32 | 7.17 | 57.15 | 10.35 | 4.10
frisc | 16.1 | 75.56 | 37.54 | 12.1 | 13.4 | 83.69 | 37.50 | 11.2
misex3 | 11.7 | 47.20 | 11.22 | 5.55 | 8.34 | 62.90 | 11.05 | 5.25
pdc | 20.4 | 109.56 | 58.16 | 22.3 | 18.7 | 114.33 | 58.41 | 21.4
s298 | 13.4 | 51.20 | 11.51 | 6.88 | 13.4 | 52.02 | 11.35 | 6.95
s38417 | 9.85 | 232.95 | 40.74 | 22.9 | 9.79 | 235.68 | 40.46 | 23.1
s38584 | 7.45 | 226.02 | 35.4 | 43.2 | 7.85 | 215.03 | 34.91 | 35.6
seq | 9.52 | 66.43 | 14.65 | 6.32 | 8.17 | 74.44 | 14.66 | 6.08
spla | 15.6 | 83.56 | 37.78 | 13.1 | 18.0 | 77.48 | 38.08 | 13.9
tseng | 5.55 | 57.32 | 6.09 | 3.18 | 5.53 | 57.87 | 6.03 | 3.20
Average | 10.7 | 100.01 | 23.19 | 14.5 | 10.8 | 96.34 | 23.20 | 14.4
Average gains (%): Delay -0.41; Power 3.68; Leakage -0.05; Energy 1.04
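The gain figures in the table can be read as percentage reductions relative to the VPR-based mapping, with negative values meaning an increase. The sketch below shows that arithmetic on three power entries copied from the table. The helper name `relative_gain` is hypothetical, and the interpretation of the gains row as percentages is an assumption based on the power averages.

```python
def relative_gain(baseline: float, proposed: float) -> float:
    """Percentage reduction of `proposed` relative to `baseline` (negative = increase)."""
    return (baseline - proposed) / baseline * 100.0


# Power values (x10^-3 W) from the table: (VPR-based, temperature-aware).
power_mw = {
    "bigkey": (131.90, 85.17),
    "des": (135.51, 102.08),
    "alu4": (59.66, 67.03),
}

for name, (vpr, temp_aware) in power_mw.items():
    print(f"{name}: {relative_gain(vpr, temp_aware):+.2f}%")
```

Per-benchmark results vary in sign, which is consistent with the small average gains reported in the last row.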
Outline
FPGA market, Current trends
Design Methodologies
Heterogeneous FPGA Platforms
Temperature-aware mapping
Software-supported design flow
Conclusions
Development of tools supporting implementation on FPGA
Complete Design Flow: input VHDL, output bitstream
The only complete design flow in academia based on open-source tools and running on Linux
Typical Features
• C/C++ language
• Input format: RT VHDL, Structural VHDL, EDIF, BLIF
• Output: Configuration Stream
• Technology Independent
• Portability (e.g. x486, SPARC)
• Run on a local machine or through the Internet/Intranet
Qualitative comparison
Tools compared: AMDREL, XILINX, Univ. TORONTO, ALLIANCE
Data Input Format: AMDREL: VHDL/VERILOG; XILINX: VHDL/VERILOG; Univ. TORONTO: BLIF; ALLIANCE: VHDL
OS: AMDREL: LINUX; XILINX: SOLARIS/WINDOWS/LINUX; Univ. TORONTO: SOLARIS; ALLIANCE: LINUX
Features compared (per-tool marks not preserved in this text): Synthesizer, Format Translation, Architecture Description, Architecture Exploration/Modification, Place & Route, Bitstream Generation, Back annotation, Power Estimation, Area Estimation, GUI, User Manual
Qualitative comparison between VPR and EX-VPR
Supported Switch Boxes (SBs): VPR: Subset, Wilton, Universal; EX-VPR: Subset, Wilton, Universal, User-specified Switch Box
Other features compared (per-tool marks not preserved in this text): Placement, Routing, Multiple switch boxes, Multiple Segments, Thermal/Temperature Analysis, Insertion of IP core, Power Estimation, Timing info (sec), Silicon Area estimation (um2), Application-specific FPGA design
Graphical User Interface (GUI)
Available at http://vlsi.ee.duth.gr/amdrel
Conclusions
Design of heterogeneous FPGA architectures
• Significant gains in performance and power
Development of a temperature-aware placement algorithm
• “Balanced” thermal distribution across the FPGA
• Reduction of maximal temperature values in hotspots by 33% on average
Software environment for FPGA architecture exploration
Literature
K. Siozios, et al., “An Integrated Framework for Architecture Level Exploration of Reconfigurable Platform”, 15th Int. Conf. FPL 2005, pp. 658-661, 26-28 Aug. 2005
K. Leijten-Nowak and Jef L. van Meerbergen, “An FPGA Architecture with Enhanced Datapath Functionality”, FPGA’03, California, USA, pp. 195-204, Feb. 2003
V. Betz, J. Rose and A. Marquardt, “Architecture and CAD for Deep-Submicron FPGAs”, Kluwer Academic Publishers, 1999
http://vlsi.ee.duth.gr/amdrel
http://www.xilinx.com/products/silicon-solutions/fpgas/virtex/virtex4/overview
Guy Lemieux and David Lewis, “Design of Interconnection Networks for Programmable Logic”, Kluwer Academic Publishers, 2004
K. Poon, A. Yan, S. Wilton, “A Flexible Power Model for FPGAs”, in Proc. of 15th Int. Conf. on Field Programmable Logic and Applications, pp. 312–321, 2002
A. DeHon, “Balancing interconnect and computation in a reconfigurable computing array (or, why you don’t really want 100% LUT utilization)”, in Proc. of Int. Symp. on Field Programmable Gate Arrays, pp. 69-78, 1999
K. Siozios, et al., “A Novel Methodology for Designing High-Performance and Low-Power FPGA Interconnection Targeting DSP Applications”, in Proc. of IEEE Int. Symp. on Circuits and Systems, 21-24 May 2006
K. Siozios, K. Tatas, D. Soudris and A. Thanailakis, “Platform-based FPGA Architecture: Designing High-Performance and Low-Power Routing Structure for Realizing DSP Applications”, accepted for presentation in RAW 2006, 13th Reconfigurable Architectures Workshop, Rhodes, Greece, April 25-26, 2006
A. Singh, G. Parthasarathy and M. Marek-Sadowska, “Efficient Circuit Clustering for Area and Power Reduction in FPGAs”, ACM TODAES, Vol. 7, No. 4, pp. 643–663, Oct. 2002
Wei Huang, et al., “Hotspot: A Compact Thermal Modeling Methodology for Early-Stage VLSI Design”, IEEE Trans. on VLSI Systems, Vol. 14, No. 5, pp. 501-513, May 2006
S. Vassiliadis and D. Soudris, “Fine- and Coarse-Grain Reconfigurable Computing”, Springer, 2007
Gang Wang, Satish Sivaswamy, Cristinel Ababei, Kia Bazargan, Ryan Kastner, and Eli Bozorgzadeh, “Statistical Analysis and Design of HARP Routing Pattern FPGAs”, Trans. CAD, 2006
More info:
AMDREL website: http://vlsi.ee.duth.gr/amdrel
Email: [email protected] – [email protected]