Tackling Parallelization Challenges of Kernel Network Stack for Container Overlay Networks

Jiaxin Lei*, Kun Suo+, Hui Lu*, Jia Rao+

* SUNY at Binghamton, + University of Texas at Arlington

Containers Are Widely Adopted by Industry

• OS-level virtualization
• Lightweight
• Higher consolidation density
• Lower operational cost


Overlay Networks Are the Technique for Container Connectivity

• Typical overlay network solutions: Docker Overlay, Flannel, Calico, Weave
• They are generally built upon a tunneling approach, such as the VxLAN protocol.

[Figure: VxLAN encapsulation. The original L2 frame (Inner Ethernet Header | Inner IP Header | Payload) is wrapped into a VxLAN encapsulated packet: Outer Ethernet Header | Outer IP Header | Outer UDP Header | VxLAN Header | Inner Ethernet Header | Inner IP Header | Payload | FCS. The VxLAN header carries Flags, Reserved bits, and the 24-bit VNI (VxLAN Network Identifier).]
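
The VxLAN header shown above is only 8 bytes (flags, reserved bits, a 24-bit VNI, and a trailing reserved byte), yet every overlay packet must be wrapped in it on send and unwrapped on receive in software. The following is a minimal user-space sketch of that header layout; the struct and helper names are illustrative and are not taken from the kernel.

/*
 * Minimal sketch of the 8-byte VxLAN header described above.
 * Illustrative user-space code, not kernel code.
 */
#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>   /* htonl / ntohl */

struct vxlan_hdr {
    uint32_t vx_flags;   /* bit 0x08000000 (host order): "VNI valid" flag */
    uint32_t vx_vni;     /* upper 24 bits: VNI; lowest 8 bits: reserved */
};

/* Build a VxLAN header for a given 24-bit VxLAN Network Identifier. */
static struct vxlan_hdr vxlan_encap(uint32_t vni)
{
    struct vxlan_hdr h;
    h.vx_flags = htonl(0x08000000);            /* I flag set, reserved bits zero */
    h.vx_vni   = htonl((vni & 0xFFFFFF) << 8); /* VNI in the upper 24 bits */
    return h;
}

/* Extract the VNI from a received header (the decapsulation side). */
static uint32_t vxlan_decap_vni(const struct vxlan_hdr *h)
{
    return (ntohl(h->vx_vni) >> 8) & 0xFFFFFF;
}

int main(void)
{
    struct vxlan_hdr h = vxlan_encap(42);      /* overlay segment 42 */
    printf("VNI on the wire: %u\n", vxlan_decap_vni(&h));
    return 0;
}

The 24-bit VNI is what allows many isolated overlay segments to share the same physical network.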

Network Packet Processing Path

• Prolonged network packet processing path
• Additional overhead from virtual devices

[Figure: Receiving-side path, native case. NIC -> IRQ -> SoftIRQ -> Network Stack (kernel space) -> Container Applications (user space).]

[Figure: Receiving-side path, overlay case. NIC -> IRQ -> SoftIRQ -> VxLAN (packet decapsulation) -> SoftIRQ -> Network Stack -> vBridge -> Veth -> SoftIRQ -> Network Stack (kernel space) -> Container Applications (user space).]
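
On the overlay path, every received packet pays for this longer journey: the outer Ethernet, IP, UDP, and VxLAN headers must be stripped in SoftIRQ context before the inner frame re-enters the network stack through the virtual bridge and veth pair. Below is a minimal user-space sketch of that decapsulation step, assuming IPv4 outer headers without options; the function name is ours, not the kernel's.

/*
 * Sketch of the per-packet decapsulation work on the overlay receive path:
 * skip the outer Ethernet, IPv4, UDP, and VxLAN headers to recover the
 * original (inner) L2 frame. Assumes IPv4 with no IP options.
 */
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN    14   /* Ethernet header */
#define IPV4_HLEN   20   /* outer IPv4 header, no options (assumed) */
#define UDP_HLEN    8    /* outer UDP header */
#define VXLAN_HLEN  8    /* VxLAN header (flags, reserved, VNI) */

/* Return a pointer to the inner Ethernet frame, or NULL if the buffer is
 * too short to contain all the outer headers. */
static const uint8_t *vxlan_inner_frame(const uint8_t *pkt, size_t len,
                                        size_t *inner_len)
{
    const size_t outer = ETH_HLEN + IPV4_HLEN + UDP_HLEN + VXLAN_HLEN;

    if (len <= outer)
        return NULL;

    *inner_len = len - outer;   /* inner Ethernet header + IP + payload */
    return pkt + outer;         /* this frame re-enters the stack again  */
}

int main(void)
{
    uint8_t frame[128] = {0};   /* pretend this is a received VxLAN packet */
    size_t inner_len;
    const uint8_t *inner = vxlan_inner_frame(frame, sizeof(frame), &inner_len);
    return inner ? 0 : 1;
}

After this step, the recovered inner frame traverses the network stack a second time, which is where the additional SoftIRQ stages in the figure come from.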

Existing Optimizations for Packet Processing

• Existing mechanisms: IRQ coalescing, multi-queue NICs, GRO (Generic Receive Offload), and RPS (Receive Packet Steering).

[Figure: The overlay receiving-side path from the previous slide, annotated with where these optimizations apply: IRQ coalescing, multi-queue, GRO, and RPS.]
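
Of these mechanisms, RPS is the one intended to spread receive processing across cores, but it steers packets per flow rather than per packet: all packets that hash to the same connection land on the same CPU. The sketch below illustrates the idea in user space with an arbitrary hash; it is not the kernel's implementation (the kernel uses its own flow hash, often the NIC-provided RSS hash, and a configurable CPU mask).

/*
 * Sketch of the idea behind RPS (Receive Packet Steering): hash a flow's
 * 4-tuple and use the hash to pick the CPU that runs SoftIRQ processing
 * for that flow. Illustrative only.
 */
#include <stdint.h>
#include <stdio.h>

/* Toy mixing of the 4-tuple into a single 32-bit flow hash. */
static uint32_t flow_hash(uint32_t saddr, uint32_t daddr,
                          uint16_t sport, uint16_t dport)
{
    uint32_t h = saddr;
    h ^= daddr * 0x9e3779b1u;
    h ^= ((uint32_t)sport << 16 | dport) * 0x85ebca6bu;
    h ^= h >> 16;
    return h;
}

/* All packets of one flow map to the same CPU, so a single flow never uses
 * more than one core for its SoftIRQ work -- the limitation visible in the
 * single-flow results later in the talk. */
static unsigned pick_cpu(uint32_t hash, unsigned nr_cpus)
{
    return hash % nr_cpus;
}

int main(void)
{
    unsigned cpu = pick_cpu(flow_hash(0x0a000001, 0x0a000002, 45678, 5201), 10);
    printf("flow steered to CPU %u\n", cpu);
    return 0;
}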

Experimental Settings

Hardware: CPU 2.2 GHz, 10 cores | Memory 64 GB | NIC 40 Gbps, multi-queue
Software: Throughput measured with iPerf3 | CPU usage measured with mpstat

[Figure: Three test configurations between Host1 and Host2.
S1 - Native: iPerf3 runs on the host and sends directly through eth0.
S2 - Linux Overlay: iPerf3 runs on the host; traffic is tunneled through a VxLAN device on top of eth0.
S3 - Docker Overlay: iPerf3 runs inside a container; traffic traverses Veth0, vBridge, and VxLAN before reaching eth0.]

Single Flow Performance

[Figure: TCP and UDP throughput (Gb/s) under the three cases.
TCP: S1 (Native) 23, S2 (Linux Overlay) 6.5, S3 (Docker Overlay) 6.4.
UDP: S1 9.3, S2 4.7, S3 3.9.]

• TCP throughput in the Docker Overlay case drops 72% compared with the native case.
• UDP throughput drops 58%.

Single Flow Performance

[Figure: Total CPU usage (User, System, SoftIRQ) under the three cases for TCP; both overlay cases reach ~5%.]

• Packet processing overhead fully saturates one CPU core in the two overlay cases.
• Current solutions can't scale single-flow performance.

* 5% total CPU usage indicates that one CPU core is fully saturated.

Multiple Flows Performance

[Figure: TCP throughput (Gb/s) versus the number of iPerf connection pairs (1, 2, 3, 4, 5, 6, ..., 10, 20, 40, 80) for Native, Linux Overlay, and Docker Overlay.]

• The native case quickly reaches ~37 Gbps under TCP with only 2 pairs.
• In the two overlay cases, TCP throughput grows slowly.

[Figure: Total CPU usage (User, System, SoftIRQ) versus the number of iPerf connection pairs for Native, Linux Overlay, and Docker Overlay; at comparable throughput the overlays use roughly 2.5x the CPU.]

• Under the same throughput (e.g., 40 Gbps), overlay networks consume much more CPU (around 2.5 times) than the native case.
• The poor scalability is largely due to the inefficient interplay of the many packet processing tasks.

Small Packet Performance

[Figure: Packets per second versus iPerf packet size (64 B to 8 KB) for Native, Linux Overlay, and Docker Overlay.]

• Docker Overlay achieves as low as 50% of the native packet processing rate.

Interrupt Number with Varying Packet Sizes

[Figure: IRQ/s and SoftIRQ/s versus iPerf packet size (64 B to 8 KB) for Native, Linux Overlay, and Docker Overlay; left panel: interrupt numbers for TCP, right panel: interrupt numbers for UDP.]

• The IRQ count increases dramatically in the Docker Overlay UDP case, roughly 10x that of the TCP case.
• In the Docker Overlay case, the SoftIRQ count is about 3x the IRQ count.
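
One simple way to sample receive-side SoftIRQ counters like those plotted above is to read the NET_RX row of /proc/softirqs; reading it twice over an interval yields SoftIRQ/s, and /proc/interrupts can be sampled the same way for IRQ/s. The sketch below is an illustrative measurement helper, not the tooling used in the talk.

/*
 * Read /proc/softirqs, find the NET_RX row, and sum the per-CPU counters.
 * Sampling this twice over an interval gives a receive SoftIRQ rate.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static unsigned long long net_rx_softirqs(void)
{
    FILE *f = fopen("/proc/softirqs", "r");
    char line[4096];
    unsigned long long total = 0;

    if (!f)
        return 0;

    while (fgets(line, sizeof(line), f)) {
        char *p = strstr(line, "NET_RX:");
        if (!p)
            continue;
        /* Sum the per-CPU columns that follow the "NET_RX:" label. */
        for (p += strlen("NET_RX:"); *p; ) {
            char *end;
            unsigned long long v = strtoull(p, &end, 10);
            if (end == p)
                break;
            total += v;
            p = end;
        }
        break;
    }
    fclose(f);
    return total;
}

int main(void)
{
    printf("cumulative NET_RX softirqs: %llu\n", net_rx_softirqs());
    return 0;
}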

Insights and Conclusions

• The kernel does not provide per-packet-level parallelization.
• The kernel does not efficiently handle the various packet processing tasks.
• The bottlenecks become more severe for small packets.

Thinking about future work:

• Is it feasible to provide packet-level parallelization for a single network flow?
• How can the kernel better isolate multiple flows, especially to efficiently utilize shared hardware resources?
• Can packets be further coalesced along an optimized network path to reduce interrupts and context switches?

Thank you!