Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks

Authors: Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally
Stanford University

Presenter: Han Liu
University of California, San Diego



Background

• NoCs are becoming huge
  – Hundreds of cores on a single die
• Current designs use input-queued routers
  – Input buffer resources become significant
• Input buffer sharing is attractive in NoCs
  – Pros: improves area and power efficiency
  – Cons: facilitates the spread of congestion

04/29/13


Overview

• Adaptive Backpressure mitigates performance degradation by avoiding unproductive use of buffer space in the presence of congestion

• Avoid downsides of buffer sharing while maintaining benefits in benign case


Motivation

• Assumption: buffers are good
  – More flexible routing
  – Helps traffic waiting closer to the destination
• Is this always true?
  – Energy, area efficiency
  – Implementation difficulty


Train Example

[Figure: train analogy. San Diego (source), Denver (buffer), Boston (destination). Caption: buffers are good.]


Motivation

• Static buffer vs Dynamic buffer management

[Figure: buffer allocation between VC1 and VC2 under static and dynamic management. Labels: Static, Dynamic, Wasted buffer.]
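The difference between the two schemes can be sketched with a toy model. This is only an illustration under stated assumptions: the function `accepted_flits`, the 16-slot buffer and the two-VC setup are hypothetical, and nothing drains during the burst.

```python
def accepted_flits(arrivals, total_slots, num_vcs, shared):
    """Count how many arriving flits a buffer can hold when nothing drains.

    arrivals: sequence of VC indices, one entry per arriving flit.
    shared=False: each VC owns total_slots // num_vcs slots (static).
    shared=True:  all VCs draw from one pool of total_slots (dynamic).
    """
    per_vc_limit = total_slots if shared else total_slots // num_vcs
    occupancy = [0] * num_vcs
    accepted = 0
    for vc in arrivals:
        # A flit needs room in its VC's allowance and in the buffer overall.
        if occupancy[vc] < per_vc_limit and sum(occupancy) < total_slots:
            occupancy[vc] += 1
            accepted += 1
    return accepted


# Only VC 0 is active; VC 1 is idle, so its static share sits unused.
burst = [0] * 12
static_count = accepted_flits(burst, total_slots=16, num_vcs=2, shared=False)
dynamic_count = accepted_flits(burst, total_slots=16, num_vcs=2, shared=True)
print(static_count, dynamic_count)  # prints "8 12"
```

With static partitioning the idle VC's eight slots are wasted, while the shared pool absorbs the whole burst.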


Dynamic Buffer Management

• Buffer space is an expensive resource in NoCs
  – 30-35% of network power (MIT RAW, UT TRIPS)
• Dynamic management increases utilization by sharing buffer space among multiple VCs
  – Optimizes use of expensive buffer resources
  – Decreases incremental cost of VCs
⇒ Improved area and power efficiency
⇒ 25% more throughput or 34% less power [Nicopoulos’06]


Sharing

• Pros
  – Economic
  – Efficient
• Cons
  – Inconvenient
  – Trouble


Border Example

[Figure: border-crossing analogy. Labels: HWY5, HWY805, Mexico, US.]


Buffer Monopolization

• Blocked flits from a congested VC accumulate in the buffer
⇒ Effective buffer size reduced for other VCs
⇒ Performance degradation (latency / throughput)
⇒ Congestion spreads across VCs (flows / apps / VMs / …)

[Figure: shared buffer occupancy for VC 0 and VC 1.]
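A toy simulation can make the monopolization effect concrete. It is a sketch under simplifying assumptions, not the evaluated router model: `simulate_shared_buffer` is a hypothetical helper, VC 1 is assumed to be permanently blocked downstream, and arrivals simply alternate between the two VCs.

```python
def simulate_shared_buffer(cycles, slots=16):
    """Two VCs share one buffer; VC 1 is blocked downstream, VC 0 drains.

    Each cycle one flit arrives, alternating between VC 0 and VC 1, and is
    accepted only if a slot is free. VC 0 then forwards one flit if it holds
    any. Returns (occupancy per VC, flits delivered by VC 0).
    """
    occupancy = [0, 0]
    delivered_vc0 = 0
    for cycle in range(cycles):
        vc = cycle % 2
        if sum(occupancy) < slots:
            occupancy[vc] += 1      # unrestricted sharing: any free slot will do
        if occupancy[0] > 0:
            occupancy[0] -= 1       # VC 0 is never blocked downstream
            delivered_vc0 += 1
    return occupancy, delivered_vc0


occupancy, delivered = simulate_shared_buffer(cycles=100)
print(occupancy, delivered)  # prints "[0, 16] 16"
```

Once the blocked VC holds all 16 slots, the healthy VC can no longer accept flits at all, even though its own downstream path is free.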


Adaptive Backpressure

Goal:
• Avoid unproductive use of buffer space in dynamic buffer management
• But allow sharing when beneficial

Approach:
• Match arrival and departure rate for each VC by regulating credit availability (backpressure)
• Derive quota from credit round trip times


Buffer Monopolization

[Figure: shared buffer occupancy for VC 0 and VC 1.]

• Need a way to limit the otherwise unrestricted credit supply to the congested VC 1
  – Gives VC 0 more credits and buffer space


Quota Motivation (1)

[Figure: two credit timing diagrams between Router 0 and Router 1 (time axis), with the credit round trip time Tcrt,0 marked.]

• Without congestion, full throughput requires Tcrt,0 credits
• Insufficient credit supply causes a credit stall, which leaves an idle cycle downstream
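The relation between credit count and throughput can be checked with a small credit-loop model. This is a hedged sketch: `link_throughput` and its parameters are hypothetical names, the downstream router is assumed to be uncongested, and a credit spent in cycle t is assumed to be usable again in cycle t + t_crt.

```python
from collections import deque


def link_throughput(credits, t_crt, cycles=1000):
    """Flits sent per cycle on one VC with an uncongested downstream router."""
    available = credits
    returning = deque()  # cycles at which in-flight credits come back
    sent = 0
    for cycle in range(cycles):
        # Collect every credit whose round trip has completed.
        while returning and returning[0] <= cycle:
            returning.popleft()
            available += 1
        if available > 0:
            available -= 1
            returning.append(cycle + t_crt)
            sent += 1
        # Otherwise this is a credit stall and the link idles for a cycle.
    return sent / cycles


print(link_throughput(credits=8, t_crt=8))  # prints 1.0
print(link_throughput(credits=4, t_crt=8))  # prints 0.5
```

With as many credits as round-trip cycles the link never idles; with half as many, every second window of the round trip is spent in credit stalls.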


Quota Motivation (2)

[Figure: two credit timing diagrams between Router 0 and Router 1 (time axis). Labels: congestion stall, credit stall, queuing stall, excess flits, excess drained, Tcrt,0 + Tstall.]

• Congestion stall causes unproductive buffer occupancy
• Matching stalls avoids unproductive buffer occupancy


Quota Algorithm

• VC’s quota value = Throughput * RTTmin
  – Throughput of upstream router is hard to measure
⇒ Compute quota values based on observed RTT for individual credits


Quota Heuristic

• Track credit RTT for each output VC
• RTT = RTTmin ⇒ set quota to RTTmin
  – No downstream congestion
  ⇒ Allow one flit in each cycle of RTT interval
• RTT > RTTmin ⇒ subtract difference from RTTmin
  – Each congestion and queuing stall adds to RTT
  ⇒ Allow one credit stall per downstream stall


Quota Equation

• Q = max(Tcrt,base - (Tcrt,obs - Tcrt,base), 1)
    = max(2 * Tcrt,base - Tcrt,obs, 1)
  – When Tcrt,obs is large, Q is small
  – Qmin = 1 guarantees that quota values can continue to be updated
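The quota equation translates directly into a one-line function. The sketch below is illustrative: the function name `quota` and the example base round trip time of 8 cycles are assumptions, not values from the slides.

```python
def quota(t_crt_base, t_crt_obs):
    """Quota for one VC from the base and observed credit round trip times.

    Every cycle of extra round trip time removes one credit from the quota,
    and the floor of 1 keeps a credit circulating so the RTT stays observable.
    """
    return max(2 * t_crt_base - t_crt_obs, 1)


base = 8  # assumed base credit round trip time in cycles
for observed in (8, 11, 20):
    print(observed, quota(base, observed))
# prints:
# 8 8
# 11 5
# 20 1
```

An uncongested VC keeps a quota equal to the base round trip time, three stall cycles cost three credits, and a heavily congested VC is throttled down to a single credit.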


Implementation

• Network design determines RTTmin for each link
• Track RTT for a single in-flight credit per VC
• Update quota value upon return
• Switch allocator masks all VCs that exceed quota
⇒ Simple extension to existing flow control logic
⇒ No additional signaling required
⇒ < 5% overhead for 16x64b buffer with 4 VCs
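The bookkeeping listed above could be modeled in software roughly as follows. This is a behavioral sketch, not the authors' hardware: the class `OutputVcQuota`, its method names, the rule "mask when credits in use reach the quota", and the assumption that credits for one VC return in the order the flits were sent are all illustrative choices.

```python
class OutputVcQuota:
    """Per-output-VC state for a quota-based credit throttle (illustrative)."""

    def __init__(self, rtt_min):
        self.rtt_min = rtt_min
        self.quota = rtt_min          # start as if there were no congestion
        self.in_use = 0               # credits currently held downstream
        self.tracked_since = None     # send cycle of the single timed flit
        self.returns_until_tracked = 0

    def can_send(self):
        # The switch allocator would mask this VC whenever this is False.
        return self.in_use < self.quota

    def on_send(self, cycle):
        self.in_use += 1
        if self.tracked_since is None:
            # Time only one in-flight credit at a time. Assuming in-order
            # return, it comes back after all credits already outstanding.
            self.tracked_since = cycle
            self.returns_until_tracked = self.in_use

    def on_credit_return(self, cycle):
        self.in_use -= 1
        if self.tracked_since is None:
            return
        self.returns_until_tracked -= 1
        if self.returns_until_tracked == 0:
            rtt_obs = cycle - self.tracked_since
            self.quota = max(2 * self.rtt_min - rtt_obs, 1)
            self.tracked_since = None


vc = OutputVcQuota(rtt_min=8)
vc.on_send(cycle=0)
vc.on_credit_return(cycle=11)   # observed RTT of 11 cycles, 3 above minimum
print(vc.quota, vc.can_send())  # prints "5 True"
```

Keeping one timestamp and one small counter per VC matches the spirit of tracking a single in-flight credit, which is why the state per VC stays small in this sketch.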


Evaluation Methodology

• BookSim 2.0
• 8x8 2D mesh, 64-bit channels, dimension-order routing (DOR)
• 16-slot input buffers, 4 VCs
• Combined VC and switch allocation
• Synthetic traffic and application benchmarks
• Compare ABP to unrestricted sharing


Network Stability (1)

• For adversarial traffic, throughput in a mesh is unstable at high load
  – Traffic merging causes starvation
  – Tree saturation causes widespread congestion
• ABP improves stability
  – Throttles sources that inject at a very high rate
  – Efficient buffer use reduces tree saturation
⇒ Faster recovery from transient congestion


Network Stability (2)

[Figure: tornado traffic. Annotation: 6.3x.]


Network Stability (3)

[Figure: foreground traffic at 50% injection rate. Annotations: 3.3x, -13% saturation rate.]


Performance Isolation (1)

• Inject two classes of traffic into the network
  – Shared buffer space, separate VCs
⇒ Sharing causes interference between classes (leads to latency problems)
• ABP reduces interference
  – Contains effects of congestion within a class
⇒ Better isolation between workloads, VMs, …


Performance Isolation (2)

[Figure: uniform random foreground traffic. Panels: hotspot background traffic, uniform random background traffic. Annotations: -33%, -38%.]


Performance Isolation (3)

[Figure: 50% uniform random background traffic. Annotations: -31%, w/o background.]


Application Performance

[Figure: 12.5% injection rate for streaming traffic. Annotations: -31%, w/o background.]


Conclusions

• Sharing improves buffer utilization, but can lead to undesired interference effects

• Adaptive Backpressure regulates credit flow to avoid unproductive use of shared buffer space

• Mitigates performance degradation in presence of adversarial traffic

• But maintains key benefits of buffer sharing under benign conditions


THE END

Thank you for your attention!

Questions?