Bachelor of Applied Science Thesis Defense An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays Kevan Thompson Computer Engineering School of Engineering Science, SFU


Page 1: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Bachelor of Applied Science Thesis Defense

An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Kevan Thompson

Computer Engineering

School of Engineering Science, SFU

Page 2: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Overview

Introduction

Background

Methodology

Results

Conclusions and Future Work

Page 3: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Introduction

[Figure: Size of Xilinx FPGAs. x-axis: Year (2000 to 2012); y-axis: Number of Logic Cells (0 to 2,500,000); labelled points: Virtex-2P, Virtex-4, Virtex-5, Virtex-6, Virtex-7.]

Page 4: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

ASIC Vs FPGA

ASIC: Completely Custom Design

Large Initial Investment

Need to carefully design interconnect between nodes

FPGA:

Reconfigurable

Low cost for small volume runs

Wires already placed on the FPGA

Page 5: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Objective

Improvements in the Xilinx tools that have significantly affected the performance of NoCs on FPGAs

Improvements in NoC performance on FPGAs that are possible using manual PAR

The Star and Fully Connected topologies do not fit into current models

Page 6: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

NoC Terminology

Ring

Star

Mesh

Fully Connected

•Topology

•Node

•Degree

•Average Node Degree (AND)
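
The terms above can be made concrete with a small sketch. It assumes that a node's degree is the number of undirected links attached to it and that the Average Node Degree is the mean of those degrees; the thesis may count links differently (for example per direction), so treat the helper names and the numbers as illustrative only.

```python
def ring_edges(n):
    """Ring: node i links to node i + 1, and the last node links back to node 0."""
    return [(i, (i + 1) % n) for i in range(n)]


def star_edges(n):
    """Star: node 0 is the hub and every other node links only to it."""
    return [(0, i) for i in range(1, n)]


def mesh_edges(rows, cols):
    """2D mesh without wraparound: each node links to its right and lower neighbours."""
    edges = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                edges.append((node, node + 1))
            if r + 1 < rows:
                edges.append((node, node + cols))
    return edges


def fully_connected_edges(n):
    """Fully connected: one link between every pair of nodes."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]


def average_node_degree(num_nodes, edges):
    """Each undirected link adds one to the degree of both of its endpoints."""
    return 2 * len(edges) / num_nodes


if __name__ == "__main__":
    print("ring  8:", average_node_degree(8, ring_edges(8)))
    print("star  8:", average_node_degree(8, star_edges(8)))
    print("mesh 16:", average_node_degree(16, mesh_edges(4, 4)))
    print("full  8:", average_node_degree(8, fully_connected_edges(8)))
```

Under this counting, a ring always has an AND of 2, while a fully connected network of n nodes has an AND of n - 1, which is why its wiring demand grows so quickly with network size.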

Page 7: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Previous Work on NoCs on FPGAs

For Xilinx FPGAs:

Page 8: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Methodology

[Diagram: node block diagram with the labels Multiplier Node, Input FSL, Output FSL, and Network Switch.]

8-bit multiplier node

Two Fast Simplex Links (FSLs)

Network topology communication switch

FSLs: 16-word-deep queues, 24-bit width

Multiplier uses 981 flip-flops and 653 LUTs

FPGA: Xilinx Virtex-5 xc5vlx330
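
As a rough behavioural illustration of the queue parameters listed above, the sketch below models one link as a bounded FIFO that is 16 words deep and 24 bits wide. The class name `FslModel` and its methods are hypothetical; this is not the Xilinx FSL interface and it does not model clocking or handshake signals.

```python
from collections import deque


class FslModel:
    """Toy model of a unidirectional link: a bounded first-in, first-out queue."""

    def __init__(self, depth=16, width_bits=24):
        self.depth = depth
        self.max_word = (1 << width_bits) - 1
        self._fifo = deque()

    def full(self):
        return len(self._fifo) >= self.depth

    def empty(self):
        return not self._fifo

    def write(self, word):
        """Queue one word; return False (and drop nothing) if the queue is full."""
        if not 0 <= word <= self.max_word:
            raise ValueError("word does not fit in the link width")
        if self.full():
            return False
        self._fifo.append(word)
        return True

    def read(self):
        """Return the oldest word, or None if the queue is empty."""
        return self._fifo.popleft() if self._fifo else None


if __name__ == "__main__":
    link = FslModel()
    accepted = [link.write(i) for i in range(17)]
    print(accepted.count(True), "words accepted,", accepted.count(False), "refused")
    print("first word out:", link.read())
```

A writer that sees `write` return False has to stall until the reader drains a word, which is the back-pressure behaviour a fixed-depth queue imposes on a node.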

Page 9: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Results

10.1 Tools Vs 12.1 Tools

•Star, Ring, and Fully Connected Networks

Predicted Vs Measured Results

•Star and Fully Connected Networks

Manual Implementation

•Ring, Star, and Mesh Networks

Page 10: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

10.1 Tools VS 12.1 Tools for Star Networks

[Figure: 10.1 Tools Vs 12.1 Tools for Star Networks. x-axis: Number of Nodes (8, 16, 32, 48, 64); y-axis: Maximum Frequency (MHz), 0 to 250; series: 10.1 Tools, 12.1 Tools.]

Page 11: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

10.1 Tools VS 12.1 Tools for Ring Networks

[Figure: 10.1 Tools Vs 12.1 Tools for Ring Networks. x-axis: Number of Nodes (8, 16, 32, 48, 64); y-axis: Maximum Frequency (MHz), 192 to 210; series: 10.1 Tools, 12.1 Tools.]

Page 12: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

10.1 Tools VS 12.1 Tools for Fully Connected Networks

[Figure: 10.1 Tools Vs 12.1 Tools for Fully Connected Networks. x-axis: Number of Nodes (8, 16, 24, 32, 40, 48); y-axis: Maximum Frequency (MHz), 0 to 200; series: 10.1 Tools, 12.1 Tools.]

Page 13: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Percent Improvement of 12.1 Tools Over 10.1 Tools

[Figure: Percent Improvement of 12.1 Over 10.1 Tools. x-axis: Number of Nodes (8, 16, 32, 48, 64); y-axis: Percent Increase (%), 0.0% to 80.0%; series: Star, Ring, Fully Connected.]
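
The slide does not state how the percent increase was computed. A minimal sketch, assuming it is the gain in maximum frequency relative to the 10.1 result, could look like this; the function name and the sample frequencies are made up for illustration and are not measurements from the thesis.

```python
def percent_increase(baseline_mhz, improved_mhz):
    """Relative gain of improved_mhz over baseline_mhz, in percent."""
    if baseline_mhz <= 0:
        raise ValueError("baseline frequency must be positive")
    return (improved_mhz - baseline_mhz) / baseline_mhz * 100.0


if __name__ == "__main__":
    # Hypothetical frequencies, not values read from the chart.
    print(percent_increase(100.0, 150.0))
```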

Page 14: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Star Networks

[Figure: Predicted Vs Measured Results. x-axis: Number of Nodes (8, 16, 32, 48, 64, 80, 96); y-axis: Maximum Frequency (MHz), 0 to 250; series: Measured Results, Predicted Results.]

y = −0.3090x + 203.8
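
The line shown on this slide can be evaluated directly. The sketch below assumes that x is the number of nodes and y is the maximum frequency in MHz, which matches the chart axes but is not stated next to the equation; the function name is hypothetical.

```python
SLOPE_MHZ_PER_NODE = -0.3090
INTERCEPT_MHZ = 203.8


def line_frequency_mhz(num_nodes):
    """Evaluate y = -0.3090x + 203.8 with x taken as the node count (assumption)."""
    return SLOPE_MHZ_PER_NODE * num_nodes + INTERCEPT_MHZ


if __name__ == "__main__":
    for nodes in (8, 16, 32, 48, 64, 80, 96):
        print(f"{nodes:3d} nodes: {line_frequency_mhz(nodes):.1f} MHz")
```

With these coefficients, each additional node lowers the value of the line by about 0.31 MHz.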

Page 15: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Results

Page 16: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Results for Adjusted Model

[Figure: Adjusted Predicted Vs Measured Result. x-axis: Number of Nodes (8, 16, 32, 48, 64, 80, 96); y-axis: Maximum Frequency (MHz), 0 to 250; series: Measured Results, Predicted Results.]

Page 17: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Comparison of Models

[Figure: Percent Difference Between Predicted and Measured Results. x-axis: Number of Nodes (8, 16, 32, 48, 64, 80, 96); y-axis: Percent Difference (%), -20 to 60; series: Original Model, Adjusted Model.]

Page 18: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Prediction of Adjusted Model for Random Networks

[Figure: Percent Error for Random Networks. x-axis: Average Node Degree (2 to 10); y-axis: Percent Error (%), -70 to 0; series: Random_16, Random_32, Random_48.]

Page 19: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Fully Connected Networks

[Figure: Predicted Vs Measured Results for Fully Connected Networks. x-axis: Number of Nodes (8, 16, 24, 32, 40); y-axis: Maximum Frequency (MHz), -300 to 300; series: Measured, Predicted.]

Page 20: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Results

[Figure: Interpolated Results for Fully Connected Networks. x-axis: Number of Nodes (5 to 55); y-axis: Maximum Frequency (MHz), 0 to 200.]

Page 21: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

CAD Tool Synthesis Steps

1. Behavioural-level Synthesis [14]: HDL is parsed for recognizable constructs

2. Technology Mapping [15]: Constructs are mapped to the specific FPGA's technology

3. Placement [16]: Components of the design are placed on the FPGA using Simulated Annealing

4. Routing [16]: Wires are connected between the components using an algorithm called Pathfinder
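
To illustrate the placement step, here is a toy simulated-annealing placer. It is a sketch under stated assumptions, not the Xilinx placer: the cost function (total Manhattan wire length), the swap move, the cooling schedule, and all names are chosen only for illustration.

```python
import math
import random


def wirelength(placement, nets):
    """Total Manhattan distance between the two blocks of every net."""
    total = 0
    for a, b in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += abs(xa - xb) + abs(ya - yb)
    return total


def anneal_placement(num_blocks, nets, grid_width, seed=0,
                     start_temp=10.0, cooling=0.95, moves_per_temp=200):
    """Return (best_placement, initial_cost, best_cost) for a toy grid placement."""
    rng = random.Random(seed)
    # Random initial placement: one block per grid site.
    sites = [(i % grid_width, i // grid_width) for i in range(num_blocks)]
    rng.shuffle(sites)
    placement = dict(enumerate(sites))
    cost = initial_cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost
    temp = start_temp
    while temp > 0.01:
        for _ in range(moves_per_temp):
            # Propose swapping the sites of two randomly chosen blocks.
            a, b = rng.sample(range(num_blocks), 2)
            placement[a], placement[b] = placement[b], placement[a]
            new_cost = wirelength(placement, nets)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature falls.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            else:
                # Rejected: undo the swap.
                placement[a], placement[b] = placement[b], placement[a]
        temp *= cooling
    return best, initial_cost, best_cost


if __name__ == "__main__":
    # A 16-node ring placed on a 4 x 4 grid.
    ring = [(i, (i + 1) % 16) for i in range(16)]
    _, before, after = anneal_placement(16, ring, grid_width=4)
    print("wire length before:", before, "after:", after)
```

Accepting some cost-increasing swaps early on is what lets the search escape poor local arrangements; as the temperature falls, the process behaves more and more like greedy improvement.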

Page 22: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Automatic PAR of a 96 node Ring Network

Page 23: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Manual PAR of a 96 Node Ring Network

Page 24: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Ring Network Pre and Post PlanAhead Results

[Figure: Ring Pre and Post PlanAhead Results. x-axis: Number of Nodes (8, 16, 32, 48, 64, 80, 96, 128); y-axis: Maximum Frequency (MHz), 0 to 250; series: Pre-PlanAhead, Post-PlanAhead.]

Page 25: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Star Network Pre and Post PlanAhead Results

[Figure: Star Pre and Post PlanAhead Results. x-axis: Number of Nodes (8, 16, 32, 48, 64, 80, 96); y-axis: Maximum Frequency (MHz), 0 to 250; series: Pre-PlanAhead, Post-PlanAhead.]

Page 26: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Mesh Network Pre and Post PlanAhead Results

[Figure: Mesh Pre and Post PlanAhead Results. x-axis: Number of Nodes (8, 16, 32, 48); y-axis: Maximum Frequency (MHz), 175 to 210; series: Pre-PlanAhead, Post-PlanAhead.]

Page 27: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Conclusions

Xilinx 12.1 Tools offer significant improvements in the PAR of NoCs on FPGAs

The analytical model proposed by Lee et al. [1] does accurately predict the performance of Star and Fully Connected Networks

Using manual PAR it is possible to improve the performance of NoCs on FPGAs

Page 28: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Future Work

Compare the performance of the Xilinx 10.1 tool suite and the Xilinx 12.1 tool suite for link widths of 16 and 32 bits

Build Star and Fully Connected networks with link widths of 16 and 32 bits

Create manual implementations for Torus and Hypercube topologies

Page 29: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Acknowledgements

Dr. Lesley Shannon

Dr. Ash Parameswaran

Michael Sjoerdsma

Viewers Like you!

Page 30: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

References

[1] J. Lee. “An Analytical Model Describing the Performance of Application-Specific Networks-on-Chip on Field-Programmable Gate Arrays”, M.A.Sc. thesis, Simon Fraser University, Canada, 2007.

[2] Xilinx. “Virtex-II Pro and Virtex-II Pro X Platform FPGAs: Complete Data Sheet”. 2010. Available: http://www.xilinx.com/support/documentation/data_sheets/ds083.pdf

[3] Xilinx. “Virtex-4 Family Overview”. 2010. Available: http://www.xilinx.com/support/documentation/data_sheets/ds112.pdf

[4] Xilinx. “Virtex-5 Family Overview”. 2010. Available: http://www.xilinx.com/support/documentation/data_sheets/ds100.pdf

[5] Xilinx. “Virtex-6 Family Overview”. 2010. Available: http://www.xilinx.com/support/documentation/data_sheets/ds150.pdf

[6] Xilinx. “Virtex-7 Product Table”. 2010. Available: http://www.xilinx.com/publications/prod_mktg/Virtex7-Product-Table.pdf

[7] Xilinx. “What's New in Xilinx ISE Design Suite 12”. 2010. Available: http://www.xilinx.com/support/documentation/sw_manuals/xilinx12_1/whatsnew.htm#121

Page 31: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

References Cont…

[8] Cisco Systems Inc. “Fiber Distributed Data Interface”. 2010. Available: http://docwiki.cisco.com/wiki/Fiber_Distributed_Data_Interface

[9] Cisco Systems Inc. “Token Ring/IEEE 802.5”. 2010. Available: http://docwiki.cisco.com/wiki/Token_Ring/IEEE_802.5

[10] Cisco Systems Inc. “Ethernet Technologies”. 2010. Available: http://docwiki.cisco.com/wiki/Ethernet_Technologies

[11] Kompics. “Distributed System Launcher”. 2010. Available: http://kompics.sics.se/trac/wiki/DistributedSystemLauncher

[12] T. Kranenburg, R. van Leuken. “MB-LITE: A robust, light-weight soft-core implementation of the MicroBlaze architecture”, DATE, France, 2010.

[13] K. Eguro, S. Hauck, A. Sharma. “Architecture-Adaptive Range Limit Windowing for Simulated Annealing FPGA Placement”, DAC, United States, 2005.

[14] G. Grewal, M. O’Cleirigh, M. Wineberg. “An Evolutionary Approach to Behavioral-Level Synthesis”, CEC, Australia, 2003.

Page 32: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

References Cont…

[15] C. Legl, B. Wurth, K. Eckl. “A Boolean Approach to Performance-Directed Technology Mapping for LUT-Based FPGA Designs”, DAC, United States, 1996.

[16] S. Chin, S. Wilton. “An Analytical Model Relating FPGA Architecture and Place and Route Runtime”, FPL, Czech Republic, 2009.

[17] R. Gindin, I. Cidon, I. Keidar. “NoC-Based FPGA: Architecture and Routing”, NOCS, United States, 2007.

Page 33: An Analysis of Network-on-Chip Implementations on Field Programmable Gate Arrays

Questions?