Cilium: Networking & Security for Containers with BPF & XDP
Docker Distributed Systems Summit
Thomas Graf
The Network becomes the Application bus
We have to deal with networks that ...
○ contain millions of endpoints
○ are noisy (nMpps)
○ are insecure with multiple tenants
○ operate unreliably
○ are constantly evolving with respect to protocols
Cilium Architecture
What is BPF?
BPF Code Generation at Container Startup
● Generate networking code at container startup
○ Tailored to each individual container
○ Leads to minimal code required
⇒ faster
⇒ smaller attack surface (unikernel like)
● Majority of configuration (IP, MAC, ports, ...) becomes
constant, so the compiler can optimize heavily
● Regeneration at runtime without breaking connections
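The idea of turning per-container configuration into compile-time constants can be sketched as follows. This is only an illustration, not Cilium's actual generator: the function name `render_container_header` and the macro names are hypothetical. It renders a C header with one container's IP, MAC, and ports as `#define` constants, which a compiler could then fold into that container's datapath program.

```python
import ipaddress


def render_container_header(ipv6, mac, allowed_ports):
    """Render a C header that bakes one container's config into constants.

    All macro names below are hypothetical and chosen for illustration.
    """
    addr = ipaddress.IPv6Address(ipv6)  # raises ValueError on a bad address
    mac_bytes = [int(part, 16) for part in mac.split(":")]
    if len(mac_bytes) != 6 or any(not 0 <= b <= 255 for b in mac_bytes):
        raise ValueError(f"invalid MAC address: {mac}")

    lines = [
        "/* Generated per container; macro names are illustrative only. */",
        "#define CONTAINER_IPV6 { "
        + ", ".join(f"0x{b:02x}" for b in addr.packed) + " }",
        "#define CONTAINER_MAC { "
        + ", ".join(f"0x{b:02x}" for b in mac_bytes) + " }",
        f"#define ALLOWED_PORT_COUNT {len(allowed_ports)}",
        "#define ALLOWED_PORTS { "
        + ", ".join(str(p) for p in sorted(allowed_ports)) + " }",
    ]
    return "\n".join(lines) + "\n"


# Regenerating the header with new values models "regeneration at runtime".
header = render_container_header("f00d::1", "02:42:ac:11:00:02", [443, 80])
print(header)
```

Because every value is a constant in the generated header, checks such as port matching need no runtime configuration lookup in the compiled program.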
Make all tasks globally addressable on the Internet
● Global IPv6 addresses
○ No NAT!
○ Native IPv4/NAT46 + NAT for compat
● Host scope address allocator
○ Lockless allocation
● Task mobility
○ ILA
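A host scope allocator might look roughly like the sketch below. It assumes that "lockless" means each host owns its own IPv6 prefix and can therefore hand out addresses without coordinating with any other host; the slides do not spell this out. The class name `HostScopeAllocator` and the example prefixes are hypothetical.

```python
import ipaddress


class HostScopeAllocator:
    """Hands out addresses from a prefix that is owned by a single host.

    Assumption: since no other host allocates from this prefix, no
    cluster-wide lock or shared state is needed.
    """

    def __init__(self, host_prefix):
        self.network = ipaddress.IPv6Network(host_prefix)
        self.next_offset = 1   # skip the all-zero address of the prefix
        self.released = []     # addresses returned by stopped tasks

    def allocate(self):
        if self.released:
            return self.released.pop()
        if self.next_offset >= self.network.num_addresses:
            raise RuntimeError("host prefix exhausted")
        addr = self.network[self.next_offset]
        self.next_offset += 1
        return addr

    def release(self, addr):
        if addr not in self.network:
            raise ValueError(f"{addr} does not belong to {self.network}")
        self.released.append(addr)


# Two hosts with disjoint (hypothetical) prefixes never collide.
host_a = HostScopeAllocator("f00d:0:0:1::/64")
host_b = HostScopeAllocator("f00d:0:0:2::/64")
task_a = host_a.allocate()
task_b = host_b.allocate()
print(task_a, task_b)
```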
Scaling Policy Specification
● How to specify policy for millions of endpoints?
● Decouple policy specification from addressing
○ IP+port ACLs are unsuitable for containers
○ Policy specification based on container labels
[Diagram: Frontend (FE), load balancer (LB), and Backend (BE) containers]
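Decoupling policy from addressing can be illustrated with a small sketch. The rule format here is an assumption made for the example and is not Cilium's policy language: a policy is a set of allowed (source label, destination label) pairs, and an endpoint's IP address never enters the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    ip: str                # may change whenever the container is rescheduled
    labels: frozenset      # stable identity used by the policy


# Hypothetical policy: endpoints labeled "frontend" may talk to "backend".
ALLOWED = {("frontend", "backend")}


def is_allowed(src, dst, allowed=ALLOWED):
    """Return True if any (source label, destination label) pair is allowed."""
    return any((s, d) in allowed for s in src.labels for d in dst.labels)


fe = Endpoint("f00d::1", frozenset({"frontend"}))
be = Endpoint("f00d::2", frozenset({"backend"}))
# The same container after a restart on another host, with a new address.
fe_moved = Endpoint("f00d::abcd", frozenset({"frontend"}))

print(is_allowed(fe, be), is_allowed(fe_moved, be), is_allowed(be, fe))
```

The rule set stays the same size no matter how many frontend or backend containers exist, which is the point of specifying policy on labels rather than on IP+port ACLs.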
[Diagram: the same Frontend (FE), load balancer (LB), and Backend (BE) containers, now also labeled Prod or QA, with "requires" relations drawn between labels]
Scaling Policy Enforcement
● Distributed fixed cost policy enforcement
○ Per-CPU BPF-map hashtable
[Diagram: FE, BE, and LB containers labeled Prod or QA, shown with label IDs 10, 11, 12, 13, 14, 15, 16]
Cluster Wide Label ID Table: This ID is carried in the network packet and used to reconstruct the label context at the receiving host.
Policy enforcement cost is reduced to a single hashtable lookup regardless of complexity.
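The label ID table and the single-lookup enforcement step can be modeled as below. This is a simplified user-space sketch, not the BPF datapath: a Python dict stands in for the per-CPU BPF hashtable, and the rule in `rule_allows` is a made-up example. The policy is evaluated once, when the map is built, so the per-packet work is one lookup keyed by the label ID carried in the packet.

```python
class LabelIDTable:
    """Cluster-wide mapping between label sets and small numeric IDs."""

    def __init__(self, first_id=10):
        self._id_by_labels = {}
        self._labels_by_id = {}
        self._next_id = first_id

    def get_id(self, labels):
        key = frozenset(labels)
        if key not in self._id_by_labels:
            self._id_by_labels[key] = self._next_id
            self._labels_by_id[self._next_id] = key
            self._next_id += 1
        return self._id_by_labels[key]

    def labels_for(self, label_id):
        return self._labels_by_id[label_id]


ENVIRONMENTS = frozenset({"Prod", "QA"})


def rule_allows(src_labels, dst_labels):
    """Hypothetical rule: only FE may connect, and only within one environment."""
    same_env = (src_labels & ENVIRONMENTS) == (dst_labels & ENVIRONMENTS)
    return "FE" in src_labels and same_env


def build_policy_map(table, all_ids, dst_id):
    """Evaluate the (possibly complex) policy once, ahead of packet processing."""
    dst_labels = table.labels_for(dst_id)
    return {src_id: True for src_id in all_ids
            if rule_allows(table.labels_for(src_id), dst_labels)}


def enforce(policy_map, src_id):
    """Per-packet check: one hashtable lookup on the ID carried in the packet."""
    return src_id in policy_map


table = LabelIDTable()
fe_prod = table.get_id({"FE", "Prod"})
fe_qa = table.get_id({"FE", "QA"})
be_prod = table.get_id({"BE", "Prod"})
be_qa = table.get_id({"BE", "QA"})

be_prod_map = build_policy_map(table, [fe_prod, fe_qa, be_prod, be_qa], be_prod)
print(enforce(be_prod_map, fe_prod), enforce(be_prod_map, fe_qa))
```

However complicated `rule_allows` becomes, `enforce` stays a single lookup, which mirrors the fixed-cost claim on the slide.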
Extensibility & Safety in the Kernel
● Decouple datapath functionality from kernel version
○ Support new protocols
○ Add arbitrary statistics
○ Safety guaranteed by Verifier
● All at runtime for already running containers
Scaling the Delivery of Cat Pictures
● Distributed L3/L4 LB w/ DSR
● Like IPVS but completely programmable
● LB for N-S, E-W & Intra-node
[Diagram: ECMP spreading traffic across LB instances in front of FE and BE containers; flows labeled "Small HTTP GET" and "Large Cat Pictures/Videos"]
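The general technique of L3/L4 load balancing with direct server return (DSR) can be sketched as follows. This models the concept with plain dictionaries and is not Cilium's implementation; the field names, the `deliver_to` key, and the addresses are assumptions for the example. The load balancer picks a backend by hashing the flow and leaves the service address untouched, and the backend sends the large reply straight to the client with the service address as its source.

```python
import hashlib


def pick_backend(flow, backends):
    """Deterministically map a flow tuple to one backend."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def load_balance(request, backends):
    """Forward the small request to a backend without rewriting src or dst."""
    flow = (request["src"], request["sport"],
            request["dst"], request["dport"], request["proto"])
    return {**request, "deliver_to": pick_backend(flow, backends)}


def backend_reply(delivered, payload_len):
    """Build the reply on the backend; it bypasses the load balancer."""
    return {
        "src": delivered["dst"],         # reply appears to come from the service
        "sport": delivered["dport"],
        "dst": delivered["src"],
        "dport": delivered["sport"],
        "proto": delivered["proto"],
        "deliver_to": delivered["src"],  # sent directly to the client
        "payload_len": payload_len,
    }


BACKENDS = ["f00d::b1", "f00d::b2", "f00d::b3"]   # hypothetical backends
request = {"src": "2001:db8::c", "sport": 40000,
           "dst": "f00d::1000", "dport": 80, "proto": "tcp"}

delivered = load_balance(request, BACKENDS)
reply = backend_reply(delivered, payload_len=5_000_000)
print(delivered["deliver_to"], reply["src"], reply["deliver_to"])
```

Only the small request crosses the load balancer, so its capacity is not consumed by the large responses.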
Performance
Demo
Q&A
Start hacking on BPF for containers:
https://github.com/cilium/cilium
Slack: cilium.slack.com
Twitter: @tgraf__
Thank You
Building Blocks
● L3 forwarding (IPv6 & IPv4)
● Host connectivity
● Encapsulation (VXLAN/Geneve/GRE)
● ICMPv6 & ICMP generation
● NDisc & ARP responder
● Access Control
● Port mapping
● Connection tracking
● L3/L4 Load balancer w/ DSR
● Statistics
● Events (perf ring buffer)
● Debugging framework
● NAT46