26
1 Advanced Computer Networks 263‐3501‐00 Flow Control Patrick Stuedi, Qin Yin, Timothy Roscoe Spring Semester 2015 © Oriana Riva, Department of Computer Science | ETH Zürich

Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

1

Advanced Computer Networks263‐3501‐00

Flow Control

Patrick Stuedi, Qin Yin, Timothy Roscoe

Spring Semester 2015

© Oriana Riva, Department of Computer Science | ETH Zürich

Page 2: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

2

Today

Flow Control Store-and-forward, cut-through, wormhole Head-of-line blocking

Infiniband

Lossless Ethernet

Page 3: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

3

Flow Control Basics

Page 4: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

4

Where to best put flow control

So far we have discussed the TCP/IP/Ethernet stack TCP flow-control: avoid receiver buffer overflow TCP congestion-control: avoid switch buffer overflow

TCP's congestion-control is reactive First loose packets, then adjust data rate

Is reactive congestion-control a good choice at 1 Gbit/s data rate? 10 Gbit/s data rate? 100 Gbit/s data rate?

Page 5: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

5

Supercomputer interconnect technologies (2003-2014)

Infiniband Low latency, high bandwidth

interconnect technology Layers similar to TCP/IP: physical,

link, routing, transport Key differentiating factors:

link-level flow control, kernel-bypass, new network semantics

Infiniband FDR10G Ethernet    Infiniband QDR

Infiniband         Gbit/s Ethernet

Infiniband DDR

Page 6: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

6

Supercomputer interconnect technologies (2003-2014)

Infiniband Low latency, high bandwidth

interconnect technology Layers similar to TCP/IP: physical,

link, routing, transport Key differentiating factors:

link-level flow control, kernel-bypass, new network semantics

Infiniband FDR10G Ethernet    Infiniband QDR

Infiniband         Gbit/s Ethernet

Infiniband DDR

todayweek 8

Page 7: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

7

Flow control

Determines how a network's resources are allocated to packets traversing the network

Flow-control network resources: Channel bandwidth Buffer capacity Control state (e.g., current input/output association)

Goal: Allocate resources in an efficient manner to achieve a high fraction of

the network's ideal bandwidth Deliver packets with low, predictable latency

Buffer capacityChannel bandwidth

Control state

Page 8: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

8

Flow control mechanisms

Bufferless: Drop or misroute packets if outgoing channel at switch is blocked

Circuit switching: Only headers are buffered and reserve path Pause header forwarding if path is not available

Buffered: Decouples channel allocation in time

red packet cannot be forward on requested channel because channel is occupied by the blue packet 

route red packet on uppper channel instead and re­route it later

Page 9: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

9

Flow control mechanisms

Bufferless: Drop or misroute packets if outgoing channel at switch is blocked

Circuit switching: Only headers are buffered and reserve path Pause header forwarding if path is not available

Buffered: Decouples channel allocation in time

red packet cannot be forward on requested channel because channel is occupied by the blue packet 

route red packet on uppper channel instead and re­route it later

will be discussed in week 12:

scheduling

today

Page 10: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

10

Flits

Packets are MTU-sized Typically several 1000 bytes Switch buffering and forwarding often works at a smaller

granularity (several bytes)

Flit – FLow control unIT Packets are divided into flits by the hardware Typically no extra headers

Head flitBody flit

Tail flit

Packet

Page 11: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

11

Time-space diagram:bufferless flow control

H B B B T

N

H B B B T

H B B B

N

H B

N

H B B B T

H B B B T

H B B B T

A

A

A

A

F

R

F

R

F

R

F

R

0

1

2

3

channel

reverse channel forward channel

head flit body flitACK

tail flit

NACK

0 1 2 3 4 6 7 8 9 10 11 12 13 14 15 16 175

cycle

cannot acquire channel 3, dropping packet

Page 12: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

12

Bufferless flow control using timeouts

H B B B T0

1

2

3channel

0 1 2 3 4 6 7 8 9 10 11 12 13 14 15 16 175

cycle

H B B B T

H B B B T

H B B B T

H B B B T

H B B B T

H B B B T

A

A

A

Packet is unable to acquire channel 3 in cycle 3 and is dropped

Preceeding channels continue to transmit the packet until the

tail flit is received

Eventually a timeout triggers retransmission

A

18 19 20 21

timeout

Page 13: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

13

Pros/cons of bufferless flow control

Simple

Inefficient: Valuable bandwidth used for packet that are dropped

later Misrouting does not drop packets, but may lead to

instability (packet may never reach destination)

Page 14: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

14

Buffered flow control: overview

Store-and-forward Channel and buffer allocation on a per packet-basis Receives full packets into buffer and forwards them after they have

been received High latency (each switch waits for full packet)

Cut-through Channel and buffer allocation on a per packet-basis Forwards packet as soon as first (header) flit arrives and outgoing

resources are available Low latency but still blocks the channel for the duration of a whole

packet transmission

Wormhole Buffers are allocated on a per-flit basis Low latency and efficient buffer usage

Page 15: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

15

Store-and-forward flow control

H B B B T0

1

2

3channel

0 1 2 3 4 6 7 8 9 10 11 12 13 14 15 16 175

cycle

H B B B T

H B B B T

Packet must acquire full packet buffer at next hop, plus channel

bandwidth before forwarding

The entire packet is transmitted over the channel before proceeding to

the next channel

High latency!

18 19 20 21

H B B B T

Page 16: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

16

Cut-through flow control

H B B B T0

1

2

3channel

0 1 2 3 4 6 7 8 9 105

cycle

H B B B T

H B B B T

Packet must acquire full packet buffer at next hop, and channel

bandwidth before forwarding

But: each flit of the packet is forwarded as soon as it is received

H B B B T

H B B B T0

1

2

3channel

0 1 2 3 4 6 7 8 9 105

cycle

H B B B T

H B B B T

H B B B T

with contentionwithout contention

Page 17: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

17

Wormhole routing

Just like cut-through, but with buffers allocated to flits (not to packets)

A head must acquire three resources before forwarding Channel control state at switch One flit buffer at next switch One flit of channel bandwidth

Consumes much less buffer space than cut-through flow control But does not improve channel utilization: once first flit has

acquired control state, no other packet can use it

Page 18: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

18

Head-of-line (HoL) blocking in Wormhole switching

Intended routing of packets: A is going from Node-1 to Node-4 B is going from Node-0 to Node-5

Situation: B is block at Node-3 Cannot acquire flit buffer at Node-5

HoL blocking: A cannot progress to Node-4 even though all channels are idle No intermixing of channels possible in Wormhole B is locking channel from Node-0 to Node-3

Page 19: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

19

Virtual channels

Each switch has multiple virtual channels per physical channel

Each virtual channel contains Buffer space Channel state (e.g, state of outgoing channel,

input/output association, etc.)

A head must acquire three resources before forwarding A virtual channel on the next switch (including buffer

space) Channel bandwidth at the switch

Page 20: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

20

Virtual channel flow control

By having multiple virtual channels per physical channel, and not waste the resource when one packet is idle

Example: A can proceed even though B is blocked at Node-3

Page 21: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

21

Buffer management

How to communicate the availability of buffers between switches?

Common types used today: Credit-based

- Switch keeps count of number of free flit buffers per switch (credits)

- Counter decreased when sending at upstream switch- Stop sending when counter reaches zero- Downstream switch sends back signal to increment

credit counter when buffer is freed (forwarding) One/off

- Downstream switches send “on” or “off” flag to start and stop incoming flit stream

Page 22: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

22

Credit-based flow control

Node 1 Node 2

t1

t2

t3

t4

t5

tim

e

Flit

Flit

Flit

Credit

Credit

Credit

Process

Process

credit

roundtr

ip d

ela

y t

crt

Need enough buffers to hide round trip time: With only one flit buffer: throughput = Lf / tcrt Lf: flit length

With F buffers: throughput = F*Lf / tcrt F: #flits

Page 23: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

23

On/off flow control

Node 1 Node 2

t1

t5

t9

tim

e

credit

roundtr

ip d

ela

y t

rt

T2: Switch sends “off” if its free buffer count falls below a limit Foff

T6: Switch send “on” if its free buffer count rises above Fon b: bandwidth

Need to prevent additional flits from overflowing: Foff >= trt * b / Lf Lf: flit length

Flit

s

Flit

s

Flit

sOn

Of

process

process

t2

t3

t4

t6

t7

t8

Page 24: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

24

Infiniband

Interconnect technology designed for high-bandwidth, low-latency Bandwidth: Up to 100Gbit/s (EDR, Mellanox ConnectX-4) RTT latencies: 1-2 us

Layered architecture Physical layer Link layer: credit-based flow control, virtual lanes Network layer Transport: reliable/unreliable, connection/datagram

ConnectX­4

Page 25: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

25

Converged Enhanced Ethernet (CEE)

Also known as “Lossless Ethernet” Combines a set of optional Ethernet standards into one

umbrella

Main pieces: Priority Flow Control (PFC):

- PFC Pause/Start flow control for individual priority classes Quantized Congestion Notification (QCN):

- Detect congestion- Switches send congestion notification message to end host- Reduces the need to use the hammer (PFC Pause)

Page 26: Advanced Computer Networks - Systems Group · 7 Flow control Determines how a network's resources are allocated to packets traversing the network Flow-control network resources: Channel

26

Next time (after Easter)

End host optimizations User-level networking RDMA