Performance Implications of Packet Filtering with Linux eBPF
Dominik Scholz, Daniel Raumer, Paul Emmerich, Alexander Kurtz, Krzysztof Lesiak and Georg Carle
Chair of Network Architectures and Services, Department of Informatics
Technical University of Munich
History of Extended Berkeley Packet Filter (eBPF)
Recent Hot Topic
• 1992: BPF developed for UNIX
  • Packet filtering, e.g. tcpdump
• 2014: eBPF introduced into Linux Kernel
  • Network monitoring
  • Network traffic manipulation
  • Non-networking purposes
    • Tracing
    • Security auditing
    • ...
• Since then:
  • Continuous (performance) improvements [1]
  • “super powers have finally come to Linux” [2]
  • Offloading support, e.g. Netronome SmartNIC [3]
  • At Host Dataplane Acceleration Tutorial @ SIGCOMM 2018 [4]
[1] A thorough introduction to eBPF, https://lwn.net/Articles/740157/
[2] Brendan Gregg, BPF: Tracing and More, https://www.youtube.com/watch?v=JRFNIKUROPE
[3] NetroNews, August 2018, http://hosted.verticalresponse.com/183413/a79f667b58/1413119999/c60e793082/
[4] ACM SIGCOMM 2018 Morning Tutorial on Host Dataplane Acceleration (HDA), https://conferences.sigcomm.org/sigcomm/2018/tutorial-hda.html
D. Scholz, D. Raumer, P. Emmerich, A. Kurtz, K. Lesiak, G. Carle — Performance Implications of Packet Filtering with Linux eBPF 2
Outline
Extended Berkeley Packet Filter
Use Case: Packet Filtering
Case Study I: eXpress Data Path (XDP)
Case Study II: Socket-attached Filtering
Conclusion
Extended Berkeley Packet Filter (eBPF)
What is it?
• User space program
• Run in virtual machine in kernel space (“sandboxed”)
• Dynamically interpreted (default) or compiled just-in-time (JIT)
Limitations
• Static verification
  • No backward jumps (loops)
  • Maximum of 4096 instructions
  → Cannot compromise/block kernel
• Data access: key-value stores (maps)
  • Memory region set up before program is loaded
  • Key size, value type, max. number of entries predetermined
  → Secure data access between user space and kernel space
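The map constraints listed above (key size, value size and maximum number of entries all fixed before the program is loaded) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the names `toy_map`, `toy_map_lookup` and `toy_map_update` and the chosen sizes are hypothetical, and real eBPF maps are created and accessed through the kernel's own interfaces.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space model of a fixed-size map; NOT the kernel API. */
#define KEY_SIZE    2   /* bytes per key, fixed up front (e.g. a port)      */
#define VALUE_SIZE  8   /* bytes per value, fixed up front (e.g. a counter) */
#define MAX_ENTRIES 4   /* capacity, fixed up front                         */

struct toy_map {
    uint8_t keys[MAX_ENTRIES][KEY_SIZE];
    uint8_t values[MAX_ENTRIES][VALUE_SIZE];
    int     used[MAX_ENTRIES];
};

/* Return a pointer to the stored value, or NULL if the key is absent. */
void *toy_map_lookup(struct toy_map *m, const void *key)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (m->used[i] && memcmp(m->keys[i], key, KEY_SIZE) == 0)
            return m->values[i];
    return NULL;
}

/* Insert or overwrite an entry; returns 0 on success, -1 if the map is full. */
int toy_map_update(struct toy_map *m, const void *key, const void *value)
{
    void *slot = toy_map_lookup(m, key);
    if (slot) {
        memcpy(slot, value, VALUE_SIZE);
        return 0;
    }
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (!m->used[i]) {
            memcpy(m->keys[i], key, KEY_SIZE);
            memcpy(m->values[i], value, VALUE_SIZE);
            m->used[i] = 1;
            return 0;
        }
    }
    return -1; /* no growth: the memory region was sized in advance */
}
```

Because every size is known before use, a lookup can never touch memory outside the preallocated region, which is the property the slide refers to as secure data access.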
Study: Packet Filtering
Layers of Packet Filters
[Figure: layer diagram. Layers: NIC (hardware level), Driver (network level), OS (system level), Apps (application level). Components: HW-filter, DMA, poll routines, NAPI, network stack, transport protocol, applications.]
• Application level: Traffic addressed for a specific application
  → Application “knows best”, high penalty for dropping packets
• System level: Hooks into packet processing of network stack
  • e.g. iptables or nftables
  → Requires root access, system-specific knowledge
• Network level: Before network stack processing
  → Dropping packets with low overhead in software
• Hardware level: Hardware offloading and filtering
  • Dedicated platforms based on FPGAs or SmartNICs
  → High performance, ideal for coarse filtering (DoS)
Use Case: Packet Filtering
Common Scenario – State of the Art
[Figure: the layer diagram from "Layers of Packet Filters", with a centralized firewall (e.g. iptables, nftables) added.]
Use Case: Packet Filtering
Performance Baseline
[Figure: Maximum packet rate. x-axis: Number of Rules [#] (16 to 512); y-axis: Processed Packets [Mpps]; series: nftables, iptables.]
[Figure: Latency distribution at 0.03 Mpps. x-axis: Latency [µs]; y-axis: Relative Probability [%]; panels: nftables, iptables, No Firewall.]
Performance sufficient for today’s applications?
Limitations: Centralized, complex ruleset, requiring root access
Use Case: Packet Filtering (using Commodity Hardware)
Possibilities with eBPF
[Figure: the layer diagram with XDP and per-application firewalls (Application FWs) added.]
eXpress Data Path
• First line of defense
• Coarse but efficient filtering
→ Protection against DoS attacks
Case Study I: eXpress Data Path (XDP)
Overview
[Figure: XDP overview. Source: https://www.iovisor.org/technology/xdp]
Case Study I: eXpress Data Path (XDP)
Measurement Setup
[Figure: load generator (LoadGen) connected to the host running XDP.]
• XDP program:
  • Drop if port is blacklisted
  • Otherwise, forward to outgoing interface
  → Excludes network stack
• Load generator:
  • MoonGen [7]
  • Generates n UDP flows
• Just-in-time compiler enabled
• Traffic pinned to single core
• Hyper-threading and Turbo Boost disabled
[7] https://github.com/emmericp/MoonGen
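The decision logic of the XDP program described above (drop if the port is blacklisted, otherwise forward) can be sketched as a plain C function over a raw packet buffer. This is a user-space model, not the program used in the measurements: the names `filter_packet`, `port_is_blacklisted`, `VERDICT_DROP` and `VERDICT_FORWARD` are hypothetical, and it assumes that the UDP destination port of an IPv4 packet is the one being checked and that anything it cannot parse is forwarded. The explicit length checks before every header access mirror the kind of bounds checking a statically verified program needs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts; a real XDP program returns kernel-defined action codes. */
enum verdict { VERDICT_DROP, VERDICT_FORWARD };

#define ETH_HDR_LEN     14
#define ETHERTYPE_IPV4  0x0800
#define IP_PROTO_UDP    17
#define IPV4_MIN_HDR    20
#define UDP_HDR_LEN     8

/* Read a 16-bit big-endian (network byte order) field. */
uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Linear search of the blacklist; returns 1 if the port is listed. */
int port_is_blacklisted(uint16_t port, const uint16_t *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (list[i] == port)
            return 1;
    return 0;
}

/* Assumption: only IPv4/UDP packets are filtered, by destination port. */
enum verdict filter_packet(const uint8_t *data, size_t len,
                           const uint16_t *blacklist, size_t n)
{
    if (len < ETH_HDR_LEN + IPV4_MIN_HDR)
        return VERDICT_FORWARD;                 /* too short to parse */
    if (read_be16(data + 12) != ETHERTYPE_IPV4)
        return VERDICT_FORWARD;                 /* not IPv4 */

    const uint8_t *ip = data + ETH_HDR_LEN;
    size_t ip_len = len - ETH_HDR_LEN;
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;    /* IPv4 header length in bytes */
    if (ihl < IPV4_MIN_HDR || ip_len < ihl + UDP_HDR_LEN)
        return VERDICT_FORWARD;                 /* malformed or truncated */
    if (ip[9] != IP_PROTO_UDP)
        return VERDICT_FORWARD;                 /* not UDP */

    uint16_t dst_port = read_be16(ip + ihl + 2);
    return port_is_blacklisted(dst_port, blacklist, n)
               ? VERDICT_DROP : VERDICT_FORWARD;
}
```

The blacklist is passed in as an array here to keep the sketch self-contained; in an eBPF program such state would typically live in a map.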
Case Study I: eXpress Data Path (XDP)
Performance Baseline
[Figure: Packet filtering performance. x-axis: Offered Rate [Mpps]; y-axis: Processed Packets [Mpps]; series: 10GbE line-rate, 90% dropped, 50% dropped, 10% dropped.]
[Figure: Whitebox measurement – Profiling. x-axis: Offered Rate [Mpps]; y-axis: CPU load [%]; series: idle, kernel, ixgbe, bpf helper, bpf prog.]
• Drop everything: 10 Mpps
• Drop x percent: 6.4 Mpps to 7.2 Mpps
• Kernel < 5% CPU time
→ 5 to 10 times performance increase compared to in-kernel filtering
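For orientation, the "10GbE line-rate" reference in the plot can be related to a packet rate with a short calculation. The sketch below assumes minimum-size 64-byte Ethernet frames, which the slides do not state; the function name `line_rate_pps` is hypothetical. Under that assumption a 10 Gbit/s link carries roughly 14.88 Mpps, which fits an offered-rate axis that ends at 14 Mpps.

```c
/* Packets per second on an Ethernet link for a given frame size.
 * Each frame occupies frame_bytes plus 20 bytes of on-wire overhead:
 * 7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap. */
double line_rate_pps(double link_bits_per_s, unsigned frame_bytes)
{
    const unsigned overhead_bytes = 20;
    return link_bits_per_s / (8.0 * (frame_bytes + overhead_bytes));
}
```

Calling `line_rate_pps(10e9, 64)` divides 10^10 bit/s by 672 bit per frame slot.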
Case Study I: eXpress Data Path (XDP)
Latency
[Figure: Latency percentiles (90% passing traffic). x-axis: Offered Rate [Mpps]; y-axis: Latency [µs], log scale; series: 50th, 90th, 99th, 99.9th, 99.99th percentile.]
[Figure: Selected histograms at 0.4 Mpps, 3.9 Mpps and 6.25 Mpps. x-axis: Latency [µs]; y-axis: Relative Probability [%]; an inset marks an outlier between 852 µs and 854 µs.]
Comparison
• iptables: 55 µs (median), 110 µs (99.99th percentile)
• nftables: 82 µs (median), 154 µs (99.99th percentile)
Use Case: Packet Filtering (using Commodity Hardware)
Possibilities with eBPF
[Figure: the layer diagram with socket-attached eBPF (Socket eBPF) and per-application firewalls (Application FWs) added.]
Application Firewall
• Socket attached filtering
• Application (developer) can extract desired traffic
→ Fine-grained and flexible filtering
Case Study II: Socket Attached Filtering
Application Firewall
Steps
• Write filter in C
• Add eBPF virtual machine to socket
• Run filter in virtual machine
→ Drop or forward to application
Technical Implementation
• BPF Compiler Collection
• systemd socket activation
→ Added command-line interface and wrapper functions [8]
→ E.g. port-knocking simple to implement
What have we gained?
• Application developer knows best
• Rule-set shipped with application
• Independent of system administrator
• Small rule-set per application/socket
• Not interfering with rule-set of another socket
→ Optimized for application
→ Reduced complexity, less error-prone
[8] https://github.com/AlexanderKurtz/alfwrapper
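To make the port-knocking example mentioned above concrete, the following is a minimal sketch of the state machine such a filter would need. It is a user-space model and not taken from the wrapper in [8]: the names `knock_state` and `knock_observe`, the sequence length and the port numbers are hypothetical, and it tracks a single client only, whereas a per-socket filter would presumably keep one such state per source address in a map.

```c
#include <stddef.h>
#include <stdint.h>

#define KNOCK_LEN 3  /* hypothetical length of the secret knock sequence */

struct knock_state {
    uint16_t sequence[KNOCK_LEN]; /* ports that must be hit in this order */
    size_t   progress;            /* number of correct knocks so far      */
    int      open;                /* 1 once the full sequence was seen    */
};

/* Feed one observed destination port.
 * Returns 1 if traffic is allowed after this packet, 0 otherwise. */
int knock_observe(struct knock_state *s, uint16_t port)
{
    if (s->open)
        return 1;
    if (port == s->sequence[s->progress]) {
        s->progress++;
        if (s->progress == KNOCK_LEN)
            s->open = 1;
    } else {
        /* Wrong knock: start over, but count it if it is the first port. */
        s->progress = (port == s->sequence[0]) ? 1 : 0;
    }
    return s->open;
}
```

A client that knocks on the three configured ports in order is let through; any wrong port in between resets its progress.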
Case Study II: Socket Attached Filtering
Measurement Setup
Differences
• Application interested in throughput
• (Stateful) applications more difficult to benchmark
Solution
• iperf as load generator
• Benchmark using loopback device
[Figure: Baseline transmission speeds. x-axis: MTU [kB]; y-axis: Throughput [GB/s]; series: Baseline.]
Case Study II: Socket Attached Filtering
Performance
Interface filtering
• Simple operation
• Whitelisting: Matching rule at x-th position
[Figure: x-axis: x-th Rule Matching Traffic [#] (0 to 100); y-axis: Rel. Throughput [%]; series: 65536 MTU, 9000 MTU, 1500 MTU, 1280 MTU.]
Subnet filtering
• Complex operation
• Limits number of rules
[Figure: x-axis: x-th Rule Matching Traffic [#] (0 to 20); y-axis: Rel. Throughput [%]; series: 65536 MTU, 9000 MTU, 1500 MTU, 1280 MTU.]
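The whitelisting scheme measured above, where the rule that matches the traffic sits at the x-th position, can be sketched as a linear scan over subnet rules. The slides do not show the rule format, so this is a hypothetical user-space model: `subnet_rule`, `subnet_contains` and `first_match` are illustrative names, and addresses are assumed to be IPv4 in host byte order. It shows why the work per packet grows with x: every rule before the matching one costs one mask-and-compare.

```c
#include <stddef.h>
#include <stdint.h>

struct subnet_rule {
    uint32_t network;    /* IPv4 network address, host byte order */
    unsigned prefix_len; /* prefix length, assumed to be 0..32    */
};

/* Returns 1 if addr lies inside the rule's subnet. */
int subnet_contains(const struct subnet_rule *r, uint32_t addr)
{
    /* A shift by 32 would be undefined, so handle the /0 case separately. */
    uint32_t mask = (r->prefix_len == 0)
                        ? 0u
                        : 0xFFFFFFFFu << (32 - r->prefix_len);
    return (addr & mask) == (r->network & mask);
}

/* Whitelist lookup: returns the 1-based position of the first matching
 * rule, or 0 if no rule matches (the packet would be dropped). */
size_t first_match(const struct subnet_rule *rules, size_t n, uint32_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (subnet_contains(&rules[i], addr))
            return i + 1;
    return 0;
}
```

Since the verifier described earlier rejects backward jumps, a loop like the one in `first_match` would have to be unrolled in an actual eBPF filter, so the number of rules also counts against the instruction limit.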
Conclusion
eBPF allows breaking up traditional packet filtering
• Adds flexibility
• High performance
• But: limitations
→ Many applications can benefit from eBPF
Future developments
• Improvements for eBPF just-in-time compiler
• Hardware accelerators and offloading capabilities
• P4 programming language
[Figure: the layer diagram with XDP, Socket eBPF and per-application firewalls (Application FWs) added.]