Module R R RRR R RRRRR RR R R R R Efficient Link Capacity and QoS Design for Wormhole...

Preview:

Citation preview

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

Module

R

R

R

Efficient Link Capacity and QoS Design for Wormhole

Network-on-Chip

Zvika Guz, Isask’har Walter, Evgeny Bolotin, Israel Cidon, Ran Ginosar and

Avinoam Kolodny

Technion, Israel Institute of Technology

DATE’06 NoC Capacity Allocation 2

Problem Essence

How much capacity [bits/sec] should be assigned to each link? - All flows must meet delay requirements - Minimize total resources

R

R

R R R

R

RR R R R

R R

R

R

R

R

DATE’06 NoC Capacity Allocation 3

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 4

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model- capacity allocation algorithm

Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 5

IP1

Inte

rfac

e

IP2

Wormhole Switching

Interface

Suits on chip interconnect Small number of buffers Low latency Virtual Channels

- interleaving packets on the same link

DATE’06 NoC Capacity Allocation 6

Wormhole Switching

Suits on chip interconnect Small number of buffers Low latency Virtual Channels

- interleaving packets on the same link

IP1

Inte

rfac

e

Interface

IP3IP2Interface

DATE’06 NoC Capacity Allocation 7

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 8

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

ModuleModule

Module

Module

NoC Design Flow

Define inter-module traffic

Place modules

Allocate link capacities

Verify QoS and cost

R

R

R R R

R

RR R R R

R RR

R

R

R

R

R

RR

R

R

R

R

R

R

R R

R R

R

R

R

R

R

RR

R

R

DATE’06 NoC Capacity Allocation 9

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

NoC Design Flow

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

Define inter-module traffic

Place modules

Allocate link capacities

Verify QoS and cost

Too low capacity results in poor QoS Too high capacity wastes power/area

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 10

Capacity Allocation Problem

Simulation takes too long a simulation based solution is not scalable

If no simulations are used:- How to extract flows’ delays? - How to reassign capacity?

Our solution:- Analytical model to forecast QoS- Capacity allocation algorithm that exploit the model

DATE’06 NoC Capacity Allocation 11

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 12

Delay Analysis

s1

d2

s2

d1

R

R

R R R

R

RR R R R

R R

R

R

R

R

Approximate per-flow latencies Given:

- Network topology- Link capacities- Communication demands

DATE’06 NoC Capacity Allocation 13

Because they assume:- Symmetrical communication demands - No virtual channels- Identical link capacity!

Generally, they calculate the delay of an“average flow”- A per-flow analysis is needed

Why Previous Models Do Not Apply?

DATE’06 NoC Capacity Allocation 14

IP1

Inte

rfac

e

IP2Interface

Wormhole Delay Analysis

The delivery resembles a pipeline pass

Packet transmission can be divided into two separated phases:- Path acquisition- Packet delivery

We focus on packet delivery phase

DATE’06 NoC Capacity Allocation 15

IP1

Inte

rfac

e

IP2Interface

Packet delivery time is dominated by the slowest link- Transmission rate- Link sharing

Packet Delivery Time

Low-capacity link

DATE’06 NoC Capacity Allocation 16

IP1

Inte

rfac

e

Interface Interface

IP2

Packet Delivery Time

Packet delivery time is dominated by the slowest link- Transmission rate- Link sharing

IP3

DATE’06 NoC Capacity Allocation 17

Analysis Basics

Determines the flow’s effective bandwidth Per link

Account for interleaving

tt

DATE’06 NoC Capacity Allocation 18

- mean time to deliver a flit of flow i over link j [sec] - capacity of link j [bits per sec] - flit length [bits/flit] - total flit injection rate of all flows sharing link j

except for flow i [flits/sec]

Single Hop Flow, no Sharing

1

1ij

jl

tC

t

ijt

jC

ij

l

DATE’06 NoC Capacity Allocation 19

- mean time to deliver a flit of flow i over link j [sec] - capacity of link j [bits per sec] - flit length [bits/flit] - total flit injection rate of all flows sharing link j

except for flow i [flits/sec]

1

1ij i

j jl

tC

ijt

jC

ij

l

Bandwidth used by

other flows on link j

Single Hop Flow, with Sharing

tt

DATE’06 NoC Capacity Allocation 20

The Convoy Effect

Consider inter-link dependencies - Wormhole backpressure - Traffic jams down the road

1

1ij i

j jl

tC

| ( , )ij

i ii i k kj j i

k k k

l tt t

C dist j k

Link Load

Account for all subsequent hops Basic delay

weighted by distance

DATE’06 NoC Capacity Allocation 21

Weakest link dominates packet delivery time

Total Packet Transmission Time

- mean packet latency for flow i [sec]iT

max( | )i i i ijT m t j

Packet size[flits/packet]

Account for weakest link

=

- mean packet latency for flow i [sec]

DATE’06 NoC Capacity Allocation 22

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 23

Greedy, iterative algorithm

Capacity Allocation Algorithm

For each src-dst pair: Use delay model to identify most sensitive link

Increase its capacity Repeat until delay requirements are met

DATE’06 NoC Capacity Allocation 24

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 25

Capacity Allocation – Example#1

Before optimization

After optimization00

01

02

03

10

11

12

13

20

21

22

23

30

31

32

33

Total capacity re

duced by

7%

Uniform traffic with identical requirements Uniform allocation: 74.4Gbit/sec Capacity allocation algorithm: 69Gbit/sec

DATE’06 26

After optimization

Before optimization

00

01

02

03

10

11

12

13

20

21

22

23

Capacity Allocation – Example#2

A SoC-like system- Heterogeneous traffic demands and delay requirements

Uniform allocation: 41.8Gbit/sec

Total capacity re

duced by

30%

Capacity allocation algorithm: 28.7Gbit/sec

DATE’06 NoC Capacity Allocation 27

Outline

Wormhole based NoC The problem of link capacity allocation Solution:

- Wormhole delay model

- Capacity allocation algorithm Design Examples Summary

Module

Module

Module

Module

Module

Module

Module

Module

Module Module Module

Module

Module

Module

R

R

R R R

R

RR R R R

R R

R

R

R

R

Module

DATE’06 NoC Capacity Allocation 28

Summary

SoCs need non uniform link capacities- Capacity allocation

Wormhole delay analysis- Heterogeneous link capacities - Heterogeneous communication demands- Multiple VCs

Greedy allocation algorithm Design examples

- NoC cost considerably reduced

DATE’06 NoC Capacity Allocation 29

Questions?

M odule

M odule M odule

M odule M odule

M odule M odule

M odule

M odule

M odule

M odule

M odule

QNoCResearch

GroupGroup

ResearchQNoC

DATE’06 NoC Capacity Allocation 30

Backup

DATE’06 NoC Capacity Allocation 31

Grid topology Packet-switched Wormhole switching Fixed path XY routing Heterogeneous link capacities Quality-of-Service

QNoC Architecture

Module

Module

Module

Module

Module

Module

Module

Module

Module

Module

ModuleModule Module Module Module

ModuleModule Module Module Module

ModuleModule Module Module Module

R

R

R

R

R R

R

R

R

RR R R R

RR R R R

RR R R R

R

Router Link

E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, “QoS Architecture and Design Process for Cost-Effective Network on Chip”, Journal of Systems Architecture, 2004

DATE’06 NoC Capacity Allocation 32

Analysis and Simulation vs. Load

No

rmal

ized

Del

ay

Utilization

Analytical model was validated using simulations- Different link capacities- Different communication

demands

Analysis Validation

DATE’06 NoC Capacity Allocation 33

Slack Elimination

Packet Delay Slack

Sla

ck]%

[

Flow

Recommended