Low-Latency, Deterministic Networking with Standard Linux ... · Coming soon... 0 5 10 15 20 25...

Preview:

Citation preview

Low-Latency, Deterministic Networking with StandardLinux using XDP Sockets

Bjorn Topel, bjorn.topel@intel.com, @bjorntopelMagnus Karlsson, magnus.karlsson@intel.com

Embedded Linux Conference Europe, Lyon, 2019

Coming soon...

0

5

10

15

20

25

rxdrop txonly l2fwd

Mpps,

64

B

AF_XDPAF_PACKETv3

16.70

1.57

21.79

1.24

12.68

1.01

Results have bee estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized forperformance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information

and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

ELCE 2019 2

Legal Disclaimer

• Intel technologies may require enabled hardware, specific software, or services activation. Check with your system manufacturer or retailer.

• No computer system can be absolutely secure.

• Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration willaffect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more completeinformation about performance and benchmark results, visit www.intel.com/benchmarks.

• Cost reduction scenarios described are intended as examples of how a given Intel- based product, in the specified circumstances andconfigurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

• All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel productspecifications and roadmaps.

• No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

• Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced website and confirm whether referenced data are accurate.

• Intel, the Intel logo, and other Intel product and solution names in this presentation are trademarks of Intel.

• Other names and brands may be claimed as the property of others.

• c©2019 Intel Corporation.

ELCE 2019 3

$ whoami

Bjorn Topel

@bjorntopelbjorn.topel@intel.com

Linux kernel networking hacker @ IntelAF XDP and RISC-V BPF JIT maintainer

ELCE 2019 4

XDP sockets?

ELCE 2019 5

The big picture

Network hardware

Linu

x ke

rnel

BPF

map

s

XDP

Network stack

Device driver

Queueing and forwarding TC BPF

TCP/UDP

AF_INET

Use

rspa

ce Control plane

AF_

XD

P

AF_

PACK

ET

Virtu

al d

evic

es

Applications

IP layer

Build sk_buff

VMs and containers

Drop

Licensed under a Creative Commons Attribution-ShareAlike 4.0 International License cba.

ELCE 2019 6

Good ol’ sockets

/* Cooked INET sockets */

fd = socket(PF_INET, SOCK_DGRAM, 0);

bind(fd, addr);

for (;;)

receive_packet(fd, buff);

send_packet(fd, buff);

/* Raw XDP sockets */

fd = socket(PF_XDP, SOCK_RAW, 0);

/* mmap/posix_memalign/malloc */

pktbuffs = alloc_buffs();

setsockopt(fd, SOL_XDP,

XDP_MEM_REG, pktbuffs);

setsockopt(fd, SOL_XDP,

XDP_{RX,TX,FILL,COMPLETE}_RING,

ring_size);

/* map kernel rings */

{rx,tx,f,c}_ring = mmap(..., fd, ...);

bind(fd, {"eth0", qid});

for (;;)

read_process_send_packets(fd);

ELCE 2019 7

Descriptors

$ grep -A 5 Rx/Tx\ descriptor include/uapi/linux/if_xdp.h

/* Rx/Tx descriptor */

struct xdp_desc {

__u64 addr;

__u32 len;

__u32 options;

};

ELCE 2019 8

Rings, rings, and more rings

Rxstruct xdp desc

consumer (user)

producer (kernel)

4entries

Txstruct xdp desc

consumer (kernel)

producer (user)

Fillu64

consumer (kernel)

producer (user)

Completionu64

consumer (user)

producer (kernel)

ELCE 2019 9

Syscalls

ELCE 2019 10

Pitfalls

ELCE 2019 11

Docs and Samples

Docs: Documentation/networking/af xdp.rst

Samples: samples/bpf/xdpsock user.c

libbpf: tools/lib/bpf/*

ELCE 2019 12

Your Driver Here

ELCE 2019 13

Softirqs, NAPI, and SPSC rings

ELCE 2019 14

XDP modes

ELCE 2019 15

AF XDP zero-copy driver support

/* from include/linux/netdevice.h */

enum bpf_netdev_command {

...

XDP_SETUP_XSK_UMEM,

};

struct netdev_bpf {

enum bpf_netdev_command command;

union {

...

/* XDP_SETUP_XSK_UMEM */

struct {

struct xdp_umem *umem;

u16 queue_id;

} xsk;

};

};

...

int (*ndo_bpf)(struct net_device *dev, struct netdev_bpf *bpf);

int (*ndo_xdp_xmit)(struct net_device *dev, int n, struct xdp_frame **xdp,

u32 flags);

int (*ndo_xsk_wakeup)(struct net_device *dev, u32 queue_id, u32 flags);

ELCE 2019 16

Code

net/xdp/* kernel/bpf/xskmap.c

drivers/net/ethernet/intel/i40e/*

drivers/net/ethernet/intel/ice/*

drivers/net/ethernet/intel/ixgbe/*

drivers/net/ethernet/mellanox/mlx5/*

...soon Broadcom

ELCE 2019 17

Test setup

Linux pre-5.5 (bpf-next) non-preemptive ‘‘mitigations=on’’

Intel Xeon Gold 6154 CPU @ 3.00GHz (Skylake)

Intel XL710 40GbE (i40e) NIC

1 Rx HW queue, 1 Tx HW queue

“two cores”: kernel and userland processing on different cores

“one core”: kernel and userland processing on same core

IXIA packet load generator

Latency is end-to-end, measured at load generator

ELCE 2019 18

AF PACKETv3 vs AF XDP, throughput, two cores

0

5

10

15

20

25

rxdrop txonly l2fwd

Mpps,

64

B

AF_XDPAF_PACKETv3

16.70

1.57

21.79

1.24

12.68

1.01

Results have bee estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized forperformance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information

and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

ELCE 2019 19

AF PACKETv3 vs AF XDP, throughput, one core

0

2

4

6

8

10

12

14

16

rxdrop txonly l2fwd

Mpps,

64

B

AF_XDPAF_PACKETv3

11.03

0.82

14.24

1.24

8.73

0.009

Results have bee estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized forperformance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information

and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

ELCE 2019 20

AF PACKETv3 vs AF XDP, e2e latency, normal

0

2000

4000

6000

8000

10000

12000

14000

16000

AF_PACKETv3 AF_XDP

late

ncy

us

maxavgmin

16019

7694

553 99 12 6

Results have bee estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized forperformance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information

and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

ELCE 2019 21

AF PACKETv3 vs AF XDP, e2e latency, fire hose

0

200

400

600

800

1000

1200

AF_PACKETv3 AF_XDP

late

ncy

ms

maxavgmin

1161.66

871.59

74.532.14 0.14 0.05

Results have bee estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized forperformance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information

and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

ELCE 2019 22

Thanks!

• Ilias Apalodimas

• Daniel Borkmann

• Jesper Dangaard Brouer

• Maciej Fijalkowski

• Andy Gospodarek

• Toke Høiland-Jørgensen

• Jakub Kicinski

• Kevin Laatz

• Jonathan Lemon

• Ciara Loftus

• Ilya Maximets

• Maxim Mikityanskiy

• David S. Miller

• Bruce Richardson

• Sridhar Samudrala

• Alexei Starovoitov

• All the companies/people hackingXDP!

...and Linus Torvalds for releasing his kernel to the public!

ELCE 2019 23

Recommended