

VyOS VXLAN and Linux Device Driver

2014/11/2

VyOS users meeting #2

Ryo Nakamura


A talk about VXLAN in VyOS and Linux device drivers

Virtual eXtensible LAN

• An Ethernet over IP overlay. RFC7348.

– Ethernet frame is encapsulated in IP + UDP + VXLAN headers.

– VXLAN header contains a 24-bit Virtual Network Identifier (VNI) field. 2^24 L2 segments can be multiplexed in one VXLAN overlay network domain.

– Unicast traffic is encapsulated in IP Unicast.

– BUM traffic is encapsulated in IP Multicast.

• Multicast based VTEP learning is described in the RFC, Sec 4.

– Many vendors propose and use their original control planes.

– Of course, I know that Multicast is difficult in actual environments, but they don’t have INTEROPERABILITY :(
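To make the 24-bit VNI field concrete, here is a minimal sketch that packs and unpacks the 8-byte VXLAN header as laid out in RFC 7348 (8 flag bits, 24 reserved bits, 24-bit VNI, 8 reserved bits). The function names are made up for this illustration and do not come from any library.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag in RFC 7348: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI (hypothetical helper)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(0)
print(header.hex())  # 0800000000000000
print(2 ** 24)       # 16777216 segments fit in a 24-bit VNI
```

The outer IP and UDP headers are left out here; only the VXLAN header itself is modelled.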


Multicast based VTEP learning

[Figure: four VTEPs (A, B, C, D) and Nodes 1 to 4. Node 1 sends an ARP request for Node 4.]

– Encapsulated packet: Outer IP Src A, Outer IP Dst M, Src MAC : 1, Dst MAC : FF

– Callout: Node 1 is in VTEP A !!

Multicast based VTEP learning

[Figure: same topology. Node 4 sends an ARP reply to Node 1.]

– Encapsulated packet: Outer IP Src D, Outer IP Dst A, Src MAC : 4, Dst MAC : 1

– Callouts: Node 4 is in VTEP D !! Node 1 is in VTEP A !!
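The learning step in these two slides can be sketched as a toy model. This is not the kernel's implementation: the Vtep class and its fields are hypothetical, "M" stands for the multicast group, and "FF" for the broadcast MAC, as in the figures.

```python
class Vtep:
    """Toy VTEP that learns inner source MAC -> remote VTEP address."""

    def __init__(self, address: str, group: str):
        self.address = address
        self.group = group  # multicast group used for BUM traffic
        self.fdb = {}       # inner MAC -> outer IP of the remote VTEP

    def encapsulate(self, src_mac: str, dst_mac: str) -> dict:
        # Known unicast goes to the learned VTEP, everything else to the group.
        outer_dst = self.fdb.get(dst_mac, self.group)
        return {"outer_src": self.address, "outer_dst": outer_dst,
                "src_mac": src_mac, "dst_mac": dst_mac}

    def receive(self, packet: dict) -> None:
        # Learn where the inner source MAC lives from the outer source IP.
        self.fdb[packet["src_mac"]] = packet["outer_src"]


vteps = {name: Vtep(name, group="M") for name in "ABCD"}

# Node 1 (behind A) sends an ARP request: broadcast, so it goes to group M.
request = vteps["A"].encapsulate(src_mac="1", dst_mac="FF")
for name in "BCD":
    vteps[name].receive(request)

# Node 4 (behind D) replies: D already knows that Node 1 is behind A.
reply = vteps["D"].encapsulate(src_mac="4", dst_mac="1")
vteps["A"].receive(reply)

print(request["outer_dst"], reply["outer_dst"])  # M A
print(vteps["A"].fdb)                            # {'4': 'D'}
```

After the exchange both A and D can send to each other with IP unicast; only the first frame needed the multicast group.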

VyOS VXLAN support

• 2014/9/20, merged.


Linux kernel version issue

• The Linux VXLAN driver first appeared in kernel 3.7

– 2012/9/24, the first patch was posted to netdev.

– I was really looking forward to Vyatta Core with kernel 3.7 or later.

• Kernel version of VyOS Helium is 3.13.11 !!

– HooooooooOOOO!!! WrrrrryyyyyyYYYYYYYYYY !!!!!!!!

– Hydrogen is kernel 3.3


VyOS VXLAN CLI

• Under the interfaces section

– set interfaces vxlan vxlan0

– set interfaces vxlan vxlan0 group 239.0.0.1

– set interfaces vxlan vxlan0 vni 0

– and basic interface operations

• IPv4/v6 routing

• bridge-group

• policy

interfaces {
    vxlan vxlan0 {
        group 239.0.0.1
        vni 0
    }
}


Operation example

interfaces {
    vxlan vxlan0 {
        address 172.16.0.1/24
        group 239.0.0.10
        ip {
            ospf {
                cost 10
            }
        }
        vni 0
    }
}

protocols {
    ospf {
        area 0 {
            network 172.16.0.0/24
        }
    }
}


Operation example

vyos@vyos:~$ show interfaces vxlan vxlan0
vxlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default
    link/ether b2:74:c9:fa:1d:fd brd ff:ff:ff:ff:ff:ff
    inet 172.16.0.1/24 brd 172.16.0.255 scope global vxlan0
       valid_lft forever preferred_lft forever
    inet6 fe80::b074:c9ff:fefa:1dfd/64 scope link
       valid_lft forever preferred_lft forever

    RX:  bytes    packets     errors    dropped    overrun      mcast
             0          0          0          0          0          0
    TX:  bytes    packets     errors    dropped    carrier collisions
          2446         25          0          0          0          0


Operation example

vyos@vyos:~$ show ip ospf interface vxlan0
vxlan0 is up
  ifindex 3, MTU 1500 bytes, BW 0 Kbit <UP,BROADCAST,RUNNING,MULTICAST>
  Internet Address 172.16.0.1/24, Broadcast 172.16.0.255, Area 0.0.0.0
  MTU mismatch detection:enabled
  Router ID 10.10.20.189, Network Type BROADCAST, Cost: 10
  Transmit Delay is 1 sec, State DR, Priority 1
  Designated Router (ID) 10.10.20.189, Interface Address 172.16.0.1
  No backup designated router on this network
  Multicast group memberships: OSPFAllRouters OSPFDesignatedRouters
  Timer intervals configured, Hello 10s, Dead 40s, Wait 40s, Retransmit 5
    Hello due in 7.900s
  Neighbor Count is 0, Adjacent neighbor count is 0


node.def

• VXLAN interface name

– The number in an interface name can differ from the VNI. But I think that is really confusing :(

val_help: <vxlanN>; VXLAN interface name
syntax:expression: pattern $VAR(@) "vxlan[0-9]+$"


node.def (cont’d)

• REQUIRED

– A vxlan overlay network is identified by its VNI.

– A Multicast Group Address is required to encapsulate BUM Traffic in IP Multicast. A Group Address can be reused for other VNIs.

commit:expression: $VAR(./group/) != ""; \
    "Must configure vxlan group for $VAR(@)"
commit:expression: $VAR(./vni/) != ""; \
    "Must configure vxlan vni for $VAR(@)"
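As a rough illustration of what the name pattern and the two commit checks enforce, here is a sketch in Python. validate_vxlan and the dict-shaped config are hypothetical. The error strings are copied from the template. Anchoring the pattern at the start of the name (via re.match) is an assumption about how the template pattern is applied.

```python
import re

NAME_PATTERN = re.compile(r"vxlan[0-9]+$")  # same pattern as the syntax check


def validate_vxlan(name: str, config: dict) -> list:
    """Return error messages for a vxlan interface config (empty if valid)."""
    errors = []
    if not NAME_PATTERN.match(name):
        errors.append(f"{name} is not a valid vxlanN interface name")
    if not config.get("group"):
        errors.append(f"Must configure vxlan group for {name}")
    if config.get("vni") in (None, ""):  # note: vni 0 is a valid value
        errors.append(f"Must configure vxlan vni for {name}")
    return errors


print(validate_vxlan("vxlan0", {"group": "239.0.0.1", "vni": 0}))  # []
print(validate_vxlan("vxlan0", {"vni": 0}))
```

The second call reports the missing group, mirroring the first commit:expression.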


node.def (cont’d)

• create interface

VXLAN_VNI="id $VAR(./vni/@)"
VXLAN_GROUP="group $VAR(./group/@)"
VXLAN_TTL="ttl 16"                      # skimped work...

if [ ! $VAR(./link/) == "" ]; then
    VXLAN_DEV="dev $VAR(./link/@)"      # underlay device
fi

# And, execute iproute2
ip link add name $VAR(@) type vxlan \
    $VXLAN_VNI $VXLAN_GROUP $VXLAN_TTL $VXLAN_DEV
ip link set $VAR(@) up

touch /tmp/vxlan-$VAR(@)-create
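For readers who find the template syntax hard to follow, this sketch builds the same iproute2 command lines from a plain dict. It only prints them and does not run anything. build_create_commands and the example underlay device eth0 are assumptions made for this illustration.

```python
import shlex


def build_create_commands(name: str, config: dict) -> list:
    """Return the iproute2 argv lists that the creation template would run."""
    add = ["ip", "link", "add", "name", name, "type", "vxlan",
           "id", str(config["vni"]),
           "group", config["group"],
           "ttl", "16"]              # hard-coded, as in the template
    if config.get("link"):           # optional underlay device
        add += ["dev", config["link"]]
    return [add, ["ip", "link", "set", name, "up"]]


example = {"vni": 0, "group": "239.0.0.1", "link": "eth0"}  # eth0 is hypothetical
for argv in build_create_commands("vxlan0", example):
    print(shlex.join(argv))
```

Keeping the arguments as a list until the last moment avoids quoting problems if the commands are later passed to a process launcher.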


Change vni or group of existing vxlan interfaces

• Sorry, it is not supported.

• Changing group or vni requires deleting and re-creating the vxlan interface.


VXLAN in Linux

• ip link add type vxlan

– Pseudo ethernet interface : vxlanX

– Interfaces are connected to each vxlan overlay network corresponding to a VNI (vxlan_dev and FDB / VNI)

– Namespace is supported

[Figure: inside the Linux kernel, vxlan0 and vxlan1 (struct net_device, each with its own FDB) sit on top of a kernel udp socket. Labels: udp_sk(sk)->encap_rcv = vxlan_udp_encap_recv, netif_rx(skb), iptunnel_xmit().]


How to specify attributes

• ip link add type vxlan id 0 group X

– Netlink API : An API to communicate with the kernel

– NETLINK_ROUTE, NETLINK_NETFILTER and more

[Figure: a userland application talks to the Linux kernel (interface, routing table, Netfilter) through a Netlink socket, socket(AF_NETLINK, SOCK_RAW, netlink_family), exchanging struct nlmsghdr, rtattr, etc.]
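To show what a struct nlmsghdr looks like on the wire, here is a small sketch that packs one with Python's struct module. It does not open a socket. The constants are the values I know from the Linux uapi headers, so check them against your own headers, and pack_nlmsg is a hypothetical helper. A real RTM_NEWLINK request would carry an ifinfomsg and attributes as payload, which is left empty here.

```python
import struct

# struct nlmsghdr: u32 len, u16 type, u16 flags, u32 seq, u32 pid (host order)
NLMSGHDR = struct.Struct("=IHHII")

# Values from the Linux uapi headers (netlink.h, rtnetlink.h).
RTM_NEWLINK = 16
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400


def pack_nlmsg(msg_type: int, flags: int, payload: bytes, seq: int = 1) -> bytes:
    """Prefix a payload with a netlink message header (pid 0 = kernel assigns)."""
    total = NLMSGHDR.size + len(payload)
    return NLMSGHDR.pack(total, msg_type, flags, seq, 0) + payload


flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL
message = pack_nlmsg(RTM_NEWLINK, flags, payload=b"")
length, msg_type, msg_flags, seq, pid = NLMSGHDR.unpack(message)
print(length, msg_type, hex(msg_flags))  # 16 16 0x605
```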


How to specify attributes (cont’d)

• ip link add type vxlan id 0 group X

– RTNETLINK : routing socket

• An RTM_NEWLINK message is sent with attributes related to VXLAN (see man ip-link)

int do_iplink(int argc, char **argv)
{
        if (argc > 0) {
                if (iplink_have_newlink()) {
                        if (matches(*argv, "add") == 0)
                                return iplink_modify(RTM_NEWLINK,
                                                     NLM_F_CREATE|NLM_F_EXCL,
                                                     argc-1, argv+1);

The iproute2 package is a good textbook for Netlink !!


Attributes of vxlan interface

• id : Virtual Network Identifier

• dev : Underlay device (in VyOS, link)

• group : Multicast group address

• remote : A unicast IP address of a VTEP for BUM traffic

• local : Source IP address of encapsulated packet

• ttl : TTL of encapsulated packet

• port : Source port range of encapsulated packet

But these attributes can only be specified when the pseudo interface is created !!
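These attributes travel to the kernel as netlink attributes (struct rtattr: a 16-bit length, a 16-bit type, then the value, padded to 4 bytes). The sketch below encodes id, group and ttl that way. The IFLA_VXLAN_* numbers are written from memory of linux/if_link.h and should be treated as assumptions to verify. pack_rtattr is a hypothetical helper, and the nesting under IFLA_LINKINFO that a real request needs is omitted.

```python
import socket
import struct

# Attribute type numbers as I recall them from linux/if_link.h (assumption).
IFLA_VXLAN_ID = 1
IFLA_VXLAN_GROUP = 2
IFLA_VXLAN_TTL = 5


def pack_rtattr(attr_type: int, value: bytes) -> bytes:
    """Encode one struct rtattr: u16 len, u16 type, value, padded to 4 bytes."""
    length = 4 + len(value)           # the length field excludes the padding
    padding = b"\0" * (-length % 4)
    return struct.pack("=HH", length, attr_type) + value + padding


attrs = b"".join([
    pack_rtattr(IFLA_VXLAN_ID, struct.pack("=I", 0)),              # id 0
    pack_rtattr(IFLA_VXLAN_GROUP, socket.inet_aton("239.0.0.1")),  # group
    pack_rtattr(IFLA_VXLAN_TTL, struct.pack("=B", 16)),            # ttl 16
])
print(len(attrs))  # 24
```

Each attribute takes 8 bytes here: the one-byte ttl value is padded up to the next 4-byte boundary.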


How to specify attributes (cont’d)

• VXLAN driver kernel-source/drivers/net/vxlan.c

– RTM messages are processed by rtnl_link_ops

static struct rtnl_link_ops vxlan_link_ops __read_mostly = {
        .kind           = "vxlan",
        .maxtype        = IFLA_VXLAN_MAX,
        .policy         = vxlan_policy,
        .priv_size      = sizeof(struct vxlan_dev),
        .setup          = vxlan_setup,
        .validate       = vxlan_validate,
        .newlink        = vxlan_newlink,
        .dellink        = vxlan_dellink,
        .get_size       = vxlan_get_size,
        .fill_info      = vxlan_fill_info,
};

vxlan_newlink() is called when RTM_NEWLINK is received


vxlan_newlink ()

• The code cannot be pasted here... too long...

1. Parse attributes

2. Set up parsed parameters to vxlan_dev

3. register_netdevice
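Since the real function is too long to show, here is a toy Python model of the three steps and of the ops-table dispatch. It is not the kernel code. The dict-based ops table, the devices registry and rtm_newlink are invented for this sketch, and returning -EOPNOTSUPP for an existing device when no changelink handler exists is my understanding of the rtnetlink core, stated here as an assumption. Request flags such as NLM_F_EXCL are ignored.

```python
import errno

devices = {}  # registered pseudo interfaces, keyed by name


def vxlan_newlink(name: str, attrs: dict) -> int:
    """Toy version of the three steps: parse, fill the device, register."""
    vni = int(attrs["id"])                      # 1. parse attributes
    dev = {"vni": vni,                          # 2. set up the device state
           "group": attrs.get("group"),
           "ttl": int(attrs.get("ttl", 0))}
    devices[name] = dev                         # 3. register the device
    return 0


# Only the callback relevant here. There is no "changelink" entry,
# matching the ops table shown in the slides.
link_ops = {"vxlan": {"newlink": vxlan_newlink}}


def rtm_newlink(kind: str, name: str, attrs: dict) -> int:
    """Dispatch an RTM_NEWLINK-like request through the ops table."""
    ops = link_ops[kind]
    if name not in devices:
        return ops["newlink"](name, attrs)
    if "changelink" not in ops:
        return -errno.EOPNOTSUPP                # existing device, no handler
    return ops["changelink"](name, attrs)


print(rtm_newlink("vxlan", "vxlan0", {"id": "0", "group": "239.0.0.1"}))
print(rtm_newlink("vxlan", "vxlan0", {"id": "1"}) == -errno.EOPNOTSUPP)
```

The second request tries to change the VNI of an existing interface and is refused, which is the behaviour the later slides complain about.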


And, you can see vxlan0

asano2:/home/upa % ifconfig vxlan0
vxlan0    Link encap:Ethernet  HWaddr 02:0a:1e:ad:7f:31
          inet6 addr: fe80::a:1eff:fead:7f31/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:690 (690.0 B)

asano2:/home/upa % ip -d link show dev vxlan0
9: vxlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default
    link/ether 02:0a:1e:ad:7f:31 brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 0 group 239.0.0.1 srcport 32768 61000 dstport 8472 ageing 300

asano2:/home/upa % bridge fdb show dev vxlan0
00:00:00:00:00:00 dst 239.0.0.1 self permanent
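If you want to check these parameters from a script, the "vxlan ..." detail line of ip -d link show can be split into key/value pairs. The parser below is a minimal sketch that only handles the keyword-value form seen in the output above (with srcport taking two values). It is not a general parser for iproute2 output, and bare flags such as nolearning would confuse it.

```python
def parse_vxlan_details(line: str) -> dict:
    """Parse the 'vxlan ...' detail line printed by `ip -d link show`."""
    tokens = line.split()
    if not tokens or tokens[0] != "vxlan":
        raise ValueError("not a vxlan detail line")
    details = {}
    i = 1
    while i < len(tokens):
        key = tokens[i]
        if key == "srcport":            # takes two values: low and high port
            details[key] = (int(tokens[i + 1]), int(tokens[i + 2]))
            i += 3
        elif i + 1 < len(tokens):       # plain "keyword value" pair
            details[key] = tokens[i + 1]
            i += 2
        else:
            break                       # trailing keyword without a value
    return details


line = "vxlan id 0 group 239.0.0.1 srcport 32768 61000 dstport 8472 ageing 300"
details = parse_vxlan_details(line)
print(details["id"], details["group"], details["dstport"])  # 0 239.0.0.1 8472
```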


As a result

• vxlan parameters cannot be changed after the pseudo interface is created.

• Do you have any good ideas ?

– I have only one idea.

• Use Generic Netlink like the l2tp driver

• Generic Netlink is a mechanism to add user-defined netlink families dynamically.

• It requires patches to the vxlan driver and iproute2...


Future work ?

• Change destination port ?

– Default is 8472 (OTV). 4789 is assigned for VXLAN by IANA

– It can be changed through module_param. But it requires rmmod/insmod when the port is changed. Of course, all pseudo interfaces are removed...

• Support “remote” attribute

– Easy. Is it needed for the community ?


Overlay is the Only Way!!

Thanks!
