
Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks

Authors: Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally

Stanford University

Presenter: Han Liu, University of California, San Diego


Background

• NoCs are becoming huge
  – Hundreds of cores on a single die
• Current designs use input-queued routers
  – Input buffer resources become significant
• Input buffer sharing is attractive in NoCs
  – Pros: improves area and power efficiency
  – Cons: facilitates the spread of congestion

04/29/13


Overview

• Adaptive Backpressure mitigates performance degradation by avoiding unproductive use of buffer space in the presence of congestion

• Avoid downsides of buffer sharing while maintaining benefits in benign case


Motivation

• Assumption: buffers are good
  – More flexible routing
  – Let traffic wait closer to the destination
• Is this always true?
  – Energy, area efficiency
  – Implementation difficulty


Train Example

[Figure: a train route from San Diego (source) to Boston (destination), with Denver as an intermediate buffer. Caption: "Buffers are good".]

Motivation

• Static vs. dynamic buffer management

[Figure: a buffer divided between VC1 and VC2 under static and under dynamic management, with a region labeled "Wasted buffer".]
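The contrast between the two schemes can be sketched in a few lines of Python. This is only an illustration under assumed numbers (a 16-slot buffer, two VCs, and a made-up demand of 14 and 2 flits); the function names are hypothetical and not taken from the slides.

```python
def accepted_static(demand, total_slots):
    """Flits buffered per VC when each VC owns an equal, fixed partition."""
    per_vc = total_slots // len(demand)
    return [min(d, per_vc) for d in demand]


def accepted_dynamic(demand, total_slots):
    """Flits buffered per VC when all VCs draw from one shared pool.

    Slots are handed out one at a time, round-robin, to VCs that still
    have flits waiting for a slot.
    """
    accepted = [0] * len(demand)
    free = total_slots
    progress = True
    while free > 0 and progress:
        progress = False
        for vc, d in enumerate(demand):
            if free > 0 and accepted[vc] < d:
                accepted[vc] += 1
                free -= 1
                progress = True
    return accepted


demand = [14, 2]                      # assumed: one busy VC, one nearly idle VC
static = accepted_static(demand, 16)
dynamic = accepted_dynamic(demand, 16)
print("static :", static, "unused slots:", 16 - sum(static))
print("dynamic:", dynamic, "unused slots:", 16 - sum(dynamic))
```

With these assumed numbers the static split leaves six slots idle in the nearly empty VC's partition while the busy VC is short of space; the shared pool lets the busy VC use them.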


Dynamic Buffer Management

• Buffer space is an expensive resource in NoCs
  – 30-35% of network power (MIT RAW, UT TRIPS)
• Dynamic management increases utilization by sharing buffer space among multiple VCs
  – Optimizes use of expensive buffer resources
  – Decreases the incremental cost of VCs
⇒ Improved area and power efficiency
⇒ 25% more throughput or 34% less power [Nicopoulos’06]


Sharing

• Pros
  – Economical
  – Efficient
• Cons
  – Inconvenient
  – Troublesome


Border Example

[Figure: highways HWY 5 and HWY 805 at the Mexico/US border.]

Buffer Monopolization

• Blocked flits from a congested VC accumulate in the buffer
⇒ Effective buffer size is reduced for other VCs
⇒ Performance degradation (latency / throughput)
⇒ Congestion spreads across VCs (flows / apps / VMs / …)

[Figure: shared buffer occupancy for VC 0 and VC 1.]

Adaptive Backpressure

Goal:
• Avoid unproductive use of buffer space in dynamic buffer management
• But allow sharing when beneficial

Approach:
• Match arrival and departure rate for each VC by regulating credit availability (backpressure)
• Derive quota from credit round-trip times

Buffer Monopolization

[Figure: shared buffer occupancy for VC 0 and VC 1.]

• Want a way to regulate the otherwise unlimited supply of credits to the congested VC 1
  – Gives VC 0 more credits and buffer space
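A toy simulation can make the monopolization effect concrete. Everything in it is an assumption for illustration: a 16-slot shared buffer, an upstream router that offers one flit per cycle alternating between the two VCs, a VC 0 that drains one flit per cycle, and a VC 1 that is fully blocked. The fixed `quota_vc1` cap is a stand-in; Adaptive Backpressure derives its quota from credit round-trip times rather than using a constant.

```python
def simulate(cycles, total_slots, quota_vc1=None):
    """Occupancy of a shared input buffer with two VCs (toy model).

    Returns the final occupancy [VC 0, VC 1] and the number of VC 0
    flits that were forwarded downstream.
    """
    occ = [0, 0]
    delivered_vc0 = 0
    for t in range(cycles):
        # VC 0 forwards one buffered flit per cycle; VC 1 is blocked.
        if occ[0] > 0:
            occ[0] -= 1
            delivered_vc0 += 1
        # Upstream offers one flit, alternating between the VCs.
        vc = t % 2
        has_slot = sum(occ) < total_slots
        within_quota = vc == 0 or quota_vc1 is None or occ[1] < quota_vc1
        if has_slot and within_quota:
            occ[vc] += 1
    return occ, delivered_vc0


print(simulate(100, 16))               # unrestricted sharing
print(simulate(100, 16, quota_vc1=1))  # blocked VC limited to one slot
```

In this model, unrestricted sharing ends with the blocked VC holding all 16 slots and VC 0 starved after 16 flits, whereas capping the blocked VC keeps VC 0 flowing for the whole run.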


Quota Motivation (1)

[Figure: timelines of flits and credits exchanged between Router 0 and Router 1, marking the credit round-trip time Tcrt,0, a credit stall, and the resulting idle cycle.]

• Without congestion, full throughput requires Tcrt,0 credits
• Insufficient credit supply causes idle cycles downstream
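The relationship between credit count and throughput can be checked with a small cycle-level sketch. The model is an assumption made for illustration: sending a flit consumes one credit, and that credit comes back exactly `t_crt` cycles later with no congestion. The function name and the example values (a round trip of 8 cycles) are hypothetical.

```python
from collections import deque


def link_utilization(credits, t_crt, cycles):
    """Fraction of cycles in which the upstream router can send a flit."""
    returns = deque()   # cycles at which outstanding credits come back
    sent = 0
    for now in range(cycles):
        # Collect every credit that has completed its round trip.
        while returns and returns[0] <= now:
            returns.popleft()
            credits += 1
        # Send one flit if a credit is available; otherwise credit-stall.
        if credits > 0:
            credits -= 1
            returns.append(now + t_crt)
            sent += 1
    return sent / cycles


print(link_utilization(8, 8, 80))   # as many credits as round-trip cycles
print(link_utilization(6, 8, 80))   # two credits short
```

With 8 credits the link is busy every cycle; with 6 credits the sender stalls for 2 of every 8 cycles, which shows up downstream as idle cycles.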


Quota Motivation (2)

[Figure: two timelines between Router 0 and Router 1 with congestion stalls, queuing stalls, and credit stalls marked, and an interval labeled Tcrt,0 + Tstall.]

• Congestion stall causes unproductive buffer occupancy (excess flits)
• Matching stalls avoids unproductive buffer occupancy (excess drained)

Quota Algorithm

• VC’s quota value = Throughput * RTTmin
  – Throughput of the upstream router is hard to measure
  ⇒ Compute quota values based on the observed RTT of individual credits

Quota Heuristic

• Track credit RTT for each output VC
• RTT = RTTmin ⇒ set quota to RTTmin
  – No downstream congestion
  ⇒ Allow one flit in each cycle of the RTT interval
• RTT > RTTmin ⇒ subtract the difference from RTTmin
  – Each congestion and queuing stall adds to the RTT
  ⇒ Allow one credit stall per downstream stall

Quota Equation

• Q = max(Tcrt,base - (Tcrt,obs - Tcrt,base), 1) = max(2 * Tcrt,base - Tcrt,obs, 1)
  – When Tcrt,obs is large, Q is small
  – Qmin = 1 in order to guarantee that quota values can continue to be updated
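The quota equation translates directly into code. The sketch below assumes round-trip times are measured in whole cycles; the function and argument names, and the base round trip of 8 cycles in the example, are illustrative choices rather than values from the slides.

```python
def quota(t_crt_base, t_crt_obs):
    """Quota for one VC: Q = max(2 * Tcrt,base - Tcrt,obs, 1)."""
    return max(2 * t_crt_base - t_crt_obs, 1)


# Assumed base round-trip time of 8 cycles.
for observed in (8, 11, 15, 20):
    print(observed, "->", quota(8, observed))
```

An uncongested round trip (8 cycles) keeps the full quota of 8, three stall cycles reduce it to 5, and a round trip of 15 cycles or more pins it at the floor of 1.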


Implementation

• Network design determines RTTmin for each link
• Track RTT for a single in-flight credit per VC
• Update quota value upon return
• Switch allocator masks all VCs that exceed their quota
⇒ Simple extension to existing flow control logic
⇒ No additional signaling required
⇒ < 5% overhead for 16x64b buffer with 4 VCs
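One way the per-VC bookkeeping could look is sketched below. This is not the authors' hardware design: the class, its method names, the assumption that credits for a VC return in the order their flits were sent, and the choice to mask a VC once its outstanding flits reach the quota are all illustrative.

```python
class OutputVC:
    """Per-output-VC state for a quota-based credit mask (illustrative)."""

    def __init__(self, rtt_min):
        self.rtt_min = rtt_min
        self.quota = rtt_min        # start optimistic: no congestion seen yet
        self.in_flight = 0          # flits sent whose credits are still out
        self.timed_sent_at = None   # send time of the one credit being timed
        self.credits_ahead = 0      # credits due back before the timed one

    def masked(self):
        """True if the switch allocator should skip this VC."""
        return self.in_flight >= self.quota

    def send_flit(self, now):
        # Time at most one credit per VC; start a measurement if idle.
        if self.timed_sent_at is None:
            self.timed_sent_at = now
            self.credits_ahead = self.in_flight
        self.in_flight += 1

    def credit_returned(self, now):
        self.in_flight -= 1
        if self.timed_sent_at is None:
            return
        if self.credits_ahead > 0:
            # This credit belongs to a flit sent before the timed one.
            self.credits_ahead -= 1
            return
        # The timed credit is back: update the quota from its RTT.
        rtt_obs = now - self.timed_sent_at
        self.quota = max(2 * self.rtt_min - rtt_obs, 1)
        self.timed_sent_at = None


vc = OutputVC(rtt_min=8)
vc.send_flit(0)
vc.credit_returned(8)       # uncongested round trip
quota_uncongested = vc.quota
vc.send_flit(10)
vc.send_flit(11)
vc.credit_returned(23)      # timed credit took 13 cycles instead of 8
quota_congested = vc.quota
print(quota_uncongested, quota_congested, vc.masked())
```

Only a timestamp, a small counter, and a comparison are needed per VC in this sketch, which is in the spirit of the slide's claim that the scheme is a simple extension of existing credit logic.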


Evaluation Methodology

• BookSim 2.0
• 8x8 2D mesh, 64-bit channels, DOR (dimension-order routing)
• 16-slot input buffers, 4 VCs
• Combined VC and switch allocation
• Synthetic traffic and application benchmarks
• Compare Adaptive Backpressure (ABP) to unrestricted sharing


Network Stability (1)

• For adversarial traffic, throughput in a mesh is unstable at high load
  – Traffic merging causes starvation
  – Tree saturation causes widespread congestion
• ABP improves stability
  – Throttles sources that inject at a very high rate
  – Efficient buffer use reduces tree saturation
⇒ Faster recovery from transient congestion

Network Stability (2)

[Figure: tornado traffic; annotated "6.3x".]

Network Stability (3)

[Figure: foreground traffic at 50% injection rate; annotated "3.3x" and "-13% saturation rate".]

Performance Isolation (1)

• Inject two classes of traffic into the network
  – Shared buffer space, separate VCs
⇒ Sharing causes interference between classes (leads to latency problems)
• ABP reduces interference
  – Contains the effects of congestion within a class
⇒ Better isolation between workloads, VMs, …


Performance Isolation (2)

[Figure: uniform random foreground traffic, with panels for hotspot background traffic and uniform random background traffic; annotated "-33%" and "-38%".]

Performance Isolation (3)

[Figure: 50% uniform random background traffic; annotated "-31%", with a "w/o background" reference.]

Application Performance

[Figure: 12.5% injection rate for streaming traffic; annotated "-31%", with a "w/o background" reference.]

Conclusions

• Sharing improves buffer utilization, but can lead to undesired interference effects

• Adaptive Backpressure regulates credit flow to avoid unproductive use of shared buffer space

• Mitigates performance degradation in presence of adversarial traffic

• But maintains key benefits of buffer sharing under benign conditions


THE END

Thank you for your attention!

Questions?