Multipath TCP


Multipath TCP

Costin Raiciu
University Politehnica of Bucharest

Christoph Paasch
Université catholique de Louvain

Joint work with: Mark Handley, Damon Wischik, University College London; Olivier Bonaventure, Sébastien Barré, Université catholique de Louvain; and many, many others


Networks are becoming multipath

Mobile devices have multiple wireless connections

Datacenters have redundant topologies

Servers are multi-homed

How do we use these networks? TCP.

Used by most applications, TCP offers byte-oriented reliable delivery and adjusts its load to network conditions.

[Labovitz et al. – Internet Inter-Domain Traffic – Sigcomm 2010]

TCP is single path

A TCP connection:
– uses a single path in the network, regardless of network topology
– is tied to the source and destination addresses of the endpoints

The mismatch between network and transport creates problems.

Poor Performance for Mobile Users

A mobile user connected via a 3G cell tower offloads to WiFi: all ongoing TCP connections die.

Collisions in datacenters

[Al-Fares et al. – A Scalable, Commodity Data Center Network Architecture – Sigcomm 2008]

Single-path TCP collisions reduce throughput

[Raiciu et al. – Sigcomm 2011]

Multipath TCP

Multipath TCP (MPTCP) is an evolution of TCP that can effectively use multiple paths within a single transport connection

• Supports unmodified applications

• Works over today’s networks

• Standardized at the IETF (almost there)

Multipath TCP components

– Connection setup
– Sending data over multiple paths
– Encoding control information
– Dealing with (many) middleboxes
– Congestion control

[Raiciu et. al – NSDI 2012]

[Wischik et. al – NSDI 2011]


MPTCP Connection Management

SYN, MP_CAPABLE X
SYN/ACK, MP_CAPABLE Y
→ Subflow 1 established, with its own CWND, Snd.SEQNO and Rcv.SEQNO; the connection is identified as FLOW Y

SYN, JOIN Y
SYN/ACK, JOIN X
→ Subflow 2 established, with its own CWND, Snd.SEQNO and Rcv.SEQNO, joined to FLOW Y
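For the JOIN step, the receiver has to map a new SYN onto an existing connection. A minimal sketch of the token derivation used for this (per RFC 6824, the token is the most significant 32 bits of the SHA-1 hash of the peer's key announced in MP_CAPABLE; the key values below are made up for illustration):

```python
import hashlib
import struct

def mptcp_token(key: bytes) -> int:
    """Connection token: most significant 32 bits of SHA-1(key),
    used by MP_JOIN to identify the existing connection (RFC 6824)."""
    digest = hashlib.sha1(key).digest()
    return struct.unpack(">I", digest[:4])[0]

# MP_CAPABLE handshake: each host announces a 64-bit key.
key_a = bytes.fromhex("0011223344556677")   # hypothetical key X
key_b = bytes.fromhex("8899aabbccddeeff")   # hypothetical key Y

# A later SYN + MP_JOIN carries the token of the peer's key, so the
# receiver can attach the new subflow to the right connection.
token_b = mptcp_token(key_b)
```

The token is a deterministic function of the key, so both ends compute the same lookup value without exchanging extra state on the new subflow.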

TCP Packet Header

(Bit 0–15 | Bit 16–31; 20 bytes fixed, plus 0–40 bytes of options)

Source Port | Destination Port
Sequence Number
Acknowledgment Number
Header Length | Reserved | Code bits | Receive Window
Checksum | Urgent Pointer
Options (0–40 bytes)
Data


Sequence Numbers

Packets go over multiple paths.
– Need sequence numbers to put them back in sequence.
– Need sequence numbers to infer loss on a single path.

Options:
– One sequence space shared across all paths?
– One sequence space per path, plus an extra one to put data back in the correct order at the receiver?

Sequence Numbers

• One sequence space per path is preferable.
– Loss inference is more reliable.
– Some firewalls/proxies expect to see all the sequence numbers on a path.

• The outer TCP header holds the subflow sequence numbers.
– Where do we put the data sequence numbers?

MPTCP Packet Header

The same 20-byte TCP header, but the Sequence Number and Acknowledgment Number fields now carry per-subflow sequence numbers. The data sequence number and the Data ACK travel in the Options field (0–40 bytes).

MPTCP Operation

– A segment goes out on subflow 1 with subflow sequence number 1000 and data sequence number (DSEQ) 10000 carried in the options.
– A second segment goes out on subflow 2 with subflow sequence number 5000 and DSEQ 11000.
– The receiver acknowledges on subflow 1 with ACK 2000, carrying Data ACK 11000 for the connection-level stream.
– If the segment on subflow 2 is lost, the same data (DSEQ 11000) can be resent on subflow 1, where it gets that subflow's next sequence number (2000): subflow and data sequence spaces are independent.

Each subflow keeps its own CWND, Snd.SEQNO and Rcv.SEQNO; together they form FLOW Y.
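The receiver-side consequence of the two sequence spaces can be sketched in a few lines: each subflow delivers segments in its own order, and the connection level reassembles the byte stream by DSEQ alone. A toy model (payloads shortened to readable tokens; structures are illustrative, not the kernel's):

```python
# Each segment carries: (subflow_id, subflow_seq, dseq, payload).
# Subflow sequence numbers detect loss per path; DSEQ orders the stream.

def reassemble(segments):
    """Return the connection-level byte stream in data-sequence order."""
    in_order = sorted(segments, key=lambda s: s[2])  # order by DSEQ only
    return b"".join(payload for (_, _, _, payload) in in_order)

# Numbers from the example above; the DSEQ-11000 segment arrived first.
segments = [
    (2, 5000, 11000, b"<bytes 11000-11999>"),  # on subflow 2
    (1, 1000, 10000, b"<bytes 10000-10999>"),  # on subflow 1
]
stream = reassemble(segments)
```

Because reassembly ignores which subflow carried a segment, retransmitting DSEQ 11000 on a different subflow slots into the stream exactly as the original would have.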

Multipath TCP Congestion Control

Packet switching ‘pools’ circuits. Multipath ‘pools’ links.
TCP controls how a link is shared. How should a pool be shared?

(Figure: two circuits vs. a link; two separate links vs. a pool of links)

Design goal 1: Multipath TCP should be fair to regular TCP at shared bottlenecks

To be fair, Multipath TCP should take as much capacity as TCP at a bottleneck link, no matter how many subflows it is using.

(Figure: a Multipath TCP flow with two subflows competing with regular TCP at one bottleneck)

Design goal 2: MPTCP should use efficient paths

Three 12Mb/s links; each flow has a choice of a 1-hop and a 2-hop path. How should it split its traffic?

If each flow split its traffic 1:1, each flow would get 8Mb/s.

If each flow split its traffic 2:1, each flow would get 9Mb/s.

If each flow split its traffic 4:1, each flow would get 10Mb/s.

If each flow split its traffic ∞:1 (all traffic on its 1-hop path), each flow would get 12Mb/s.
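The split-ratio arithmetic above can be checked directly. In the symmetric triangle, each link carries r/(r+1) of its own flow plus 1/(r+1) from each of the two 2-hop paths that cross it, so per-flow throughput x satisfies x·(r+2)/(r+1) = 12. A short sketch:

```python
from fractions import Fraction

def per_flow_throughput(r, capacity=12):
    """Per-flow throughput for a symmetric r:1 split between the 1-hop
    and 2-hop paths: the link constraint x * (r+2)/(r+1) = capacity."""
    r = Fraction(r)
    return capacity * (r + 1) / (r + 2)

assert per_flow_throughput(1) == 8    # 1:1 split
assert per_flow_throughput(2) == 9    # 2:1 split
assert per_flow_throughput(4) == 10   # 4:1 split
# As r grows, the split approaches all-1-hop and throughput tends to 12.
```

This reproduces the 8, 9, 10, 12Mb/s progression of the slides: efficiency improves monotonically as traffic moves off the 2-hop paths.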

Design goal 3: MPTCP should get at least as much as TCP on the best path

Design goal 2 says to send all your traffic on the least congested path, in this case 3G. But the 3G path has a high RTT, so it would give low throughput.

WiFi path: high loss, small RTT
3G path: low loss, high RTT

How does TCP congestion control work?

Maintain a congestion window w.
– Increase w by 1/w for each ACK.
– Decrease w by w/2 for each drop.
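The two rules above are the whole AIMD loop; as a sketch (function names are ours, not from any TCP implementation):

```python
# Plain TCP AIMD, per ACK and per drop, on a window measured in packets.

def aimd_ack(w):
    return w + 1.0 / w      # additive increase: ~1 packet per RTT overall

def aimd_drop(w):
    return w / 2            # multiplicative decrease on loss

w = 10.0
w = aimd_ack(w)             # grows slightly
w = aimd_drop(w)            # halves on a drop
```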

How does MPTCP congestion control work?

Maintain a congestion window w_r, one window for each path, where r ∊ R ranges over the set of available paths.
– Increase w_r for each ACK on path r, by α / Σ_r w_r (the coupled increase gives Goal 2; the choice of α gives Goals 1 & 3).
– Decrease w_r for each drop on path r, by w_r/2.
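A toy version of this coupled update (a simplification of the linked-increases rule; the constant `a` stands in for the α computed from RTTs in the real algorithm, and function names are ours):

```python
# Each path r keeps its own window w[r]. ACKs grow w[r] by a / sum(w),
# so paths that drop often (fewer ACKs, more halvings) end up carrying
# less traffic: congestion balancing falls out of the coupling.

def on_ack(w, r, a=1.0):
    w[r] += a / sum(w)          # coupled increase across all subflows

def on_drop(w, r):
    w[r] = max(w[r] / 2, 1.0)   # standard multiplicative decrease

w = [10.0, 10.0]                # two subflows
on_ack(w, 0)                    # w[0] grows by 1/20
on_drop(w, 1)                   # w[1] halves
```

Note the contrast with running independent TCPs per path: there, each window would grow by 1/w_r, and the aggregate would be unfairly aggressive at a shared bottleneck.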

Applications of Multipath TCP

At a multihomed web server, MPTCP tries to share the ‘pooled access capacity’ fairly.

Two 100Mb/s access links; one serves 2 TCP flows, the other 4 TCP flows.
– 0 MPTCPs: 2 TCPs @ 50Mb/s, 4 TCPs @ 25Mb/s.
– 1 MPTCP: 2 TCPs @ 33Mb/s, 1 MPTCP @ 33Mb/s, 4 TCPs @ 25Mb/s.
– 2 MPTCPs: 2 TCPs @ 25Mb/s, 2 MPTCPs @ 25Mb/s, 4 TCPs @ 25Mb/s. The total capacity, 200Mb/s, is shared out evenly between all 8 flows.
– 3 MPTCPs: all 9 flows get 22Mb/s.
– 4 MPTCPs: all 10 flows get 20Mb/s.

It’s as if they were all sharing a single 200Mb/s link. The two links can be said to form a 200Mb/s pool.

We confirmed in experiments that MPTCP nearly manages to pool the capacity of the two access links. Setup: two 100Mb/s access links, 10ms delay; 5 TCPs on one link, 15 TCPs on the other, then first 0 and later 10 MPTCPs (first 20 flows, then 30).

(Figure: throughput per flow [Mb/s] over time [min])

MPTCP makes a collection of links behave like a single large pool of capacity: if the total capacity is C, and there are n flows, each flow gets throughput C/n.
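The C/n claim is easy to check against the numbers in the walkthrough once enough MPTCP flows bridge both links:

```python
# Pooled-capacity arithmetic: two 100Mb/s links behave like one 200Mb/s pool.

def per_flow(pool_capacity, n_flows):
    return pool_capacity / n_flows

POOL = 100 + 100
assert per_flow(POOL, 8) == 25    # 2 TCPs + 2 MPTCPs + 4 TCPs
assert per_flow(POOL, 10) == 20   # 2 TCPs + 4 MPTCPs + 4 TCPs
```

With only one MPTCP flow the pool is only partially formed (33/33/25 in the slides); from two MPTCPs on, the even C/n split holds.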

Multipath TCP can pool datacenter networks

Instead of using one path for each flow, use many random paths. Don’t worry about collisions; just don’t send (much) traffic on colliding paths.

Multipath TCP in data centers

MPTCP better utilizes the FatTree network

MPTCP on EC2

• Amazon EC2: infrastructure as a service
– We can borrow virtual machines by the hour
– These run in Amazon data centers worldwide
– We can boot our own kernel

• A few availability zones have multipath topologies
– 2–8 paths available between hosts not on the same machine or in the same rack
– Available via ECMP

Amazon EC2 Experiment

• 40 medium CPU instances running MPTCP
• For 12 hours, we sequentially ran all-to-all iperf cycling through:
– TCP
– MPTCP (2 and 4 subflows)

MPTCP improves performance on EC2

(Figure: per-pair flow throughputs; ‘Same Rack’ pairs labeled)

Implementing Multipath TCP in the Linux Kernel

Linux Kernel MPTCP
– About 10,000 lines of code in the Linux kernel
– Initially started by Sébastien Barré
– Now three people actively working on Linux Kernel MPTCP: Christoph Paasch, Fabien Duchêne, Gregory Detal

Freely available at http://mptcp.info.ucl.ac.be

MPTCP session creation
– The application creates a regular TCP socket
– The kernel creates the meta-socket

MPTCP creating new subflows
– The kernel handles the different MPTCP subflows
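The out-of-tree kernel described here keeps MPTCP fully transparent: applications open ordinary TCP sockets and the kernel does the rest. For comparison, upstream Linux (5.6 and later) instead made MPTCP opt-in via a dedicated protocol number. A hedged sketch of that modern path (the fallback logic is illustrative, not part of any API):

```python
import socket

# IPPROTO_MPTCP is 262 in the Linux uapi; Python exposes it where available.
IPPROTO_MPTCP = getattr(socket, "IPPROTO_MPTCP", 262)

def mptcp_socket():
    """Try an MPTCP socket, falling back to plain TCP if the running
    kernel lacks MPTCP support (the socket() call raises OSError)."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM, IPPROTO_MPTCP)
    except OSError:
        return socket.socket(socket.AF_INET, socket.SOCK_STREAM)

s = mptcp_socket()
s.close()
```

Either way, the application still sees one ordinary stream socket; subflow management stays inside the kernel.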

MPTCP Performance with apache

100 simultaneous HTTP requests, 100,000 in total


MPTCP on multicore architectures

Flow-to-core affinity steers all packets from one TCP flow to the same core. MPTCP has lots of L1/L2 cache misses because the individual subflows are steered to different CPU cores.

Solution: send all packets from the same MPTCP session to the same CPU core. Based on the Receive Flow Steering implementation in Linux (author: Tom Herbert, from Google).

Multipath TCP on Mobile Devices

(Figures: measurements of TCP over WiFi/3G vs. MPTCP over WiFi/3G)

WiFi to 3G handover with Multipath TCP

A mobile node may lose its WiFi connection. Regular TCP will break!

Some applications can recover from a broken TCP connection (e.g. via the HTTP Range header). Thanks to the REMOVE_ADDR option, MPTCP handles this without the need for application support.
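The handover logic reduces to: when one subflow's address goes away, reschedule anything not yet data-ACKed onto a surviving subflow. A toy sketch (names and structures are illustrative, not the kernel's):

```python
def handover(unacked, subflows, dead):
    """Move every un-data-ACKed segment off the dead subflow onto a
    remaining one; connection-level DSEQs are untouched."""
    alive = [s for s in subflows if s != dead]
    assert alive, "no path left for the connection"
    for seg in unacked:
        if seg["subflow"] == dead:
            seg["subflow"] = alive[0]   # retransmit on a surviving path
    return unacked

unacked = [{"dseq": 10000, "subflow": "wifi"},
           {"dseq": 11000, "subflow": "3g"}]
handover(unacked, ["wifi", "3g"], dead="wifi")
```

Because the data sequence space is shared across subflows, the retransmitted bytes slot straight into the receiver's stream; the application never notices the path change.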


Related Work

Multipath TCP has been proposed many times before
– First by Huitema (1995); also CMT, pTCP, M-TCP, …

You can solve mobility differently
– At a different layer: Mobile IP, HTTP Range
– At the transport layer: Migrate TCP, SCTP

You can deal with datacenter collisions differently
– Hedera (OpenFlow + centralized scheduling)

Multipath topologies need multipath transport. Multipath TCP can be used by unchanged applications over today’s networks.

MPTCP moves traffic away from congestion, making a collection of links behave like a single pooled resource.

Backup Slides

Packet-level ECMP in datacenters

How does MPTCP congestion control work?

Maintain a congestion window w_r, one window for each path, where r ∊ R ranges over the set of available paths.
– Increase w_r for each ACK on path r, by α / Σ_r w_r.
– Decrease w_r for each drop on path r, by w_r/2.

Design goals 1 & 3: at any potential bottleneck S that path r might be in, look at the best that a single-path TCP could get, and compare to what I’m getting.

Design goal 2: we want to shift traffic away from congestion. To achieve this, we increase windows in proportion to their size.
