28
L3 + VXLAN Made Practical

L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

Embed Size (px)

Citation preview

Page 1: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

L3 + VXLAN Made Practical

Page 2: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Who We AreNolan Leake

Cofounder, CTOCumulus Networks

Chet BurgessVice President, EngineeringMetacloud

Page 3: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

Today, most non-SDN controller based OpenStack deployments use L2 networks.

Page 4: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Traditional Enterprise Network Design

L3

L2

Access

Aggregation

Core

VRRP VRRP

STP STP

ECMP

Page 5: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

What’s wrong with L2?

• Aggregation tier must be highly available/redundant

• Aggregate/Core scalability– MAC/ARP table limits, VLAN exhaustion, East-West

choke points

• Wasted capacity (STP blocking ports)• Proprietary protocols/extensions– MLAG, vPC, etc

Page 6: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

How do we make it better?

Page 7: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

L3: A better design• IP Fabrics Are Ubiquitous

– Proven at scale (The Internet, massive datacenter clusters)

• Simple Feature Set– no alphabet soup of L2 protocols

• Scalable L2/L3 Boundary• ECMP – Equal Cost Multi-Path

– Each link is active at all times– Maximize link utilization– Predictable latency– Better failure handling

Page 8: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

L3: A better design

LEAF

SPINE

Page 9: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

Pure L3 is great for maximizing connectivity, but what about segregation of projects?

Page 10: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN• IETF Draft Standard– http://www.ietf.org/id/draft-mahalingam-dutt-dcops-vxlan

-09.txt

• A type of network overlay technology that encapsulates L2 frames as UDP packets

Page 11: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN

Page 12: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN• VNI – VXLAN Network Identifier– 24 bit number (16M+ unique identifiers)– Part of the VXLAN Header– Similar to VLAN ID– Limits broadcast domain

• VTEP – VXLAN Tunnel End Point– Originator and/or terminator of VXLAN tunnel for a

specific VNI– Outer DIP/Outer SIP

Page 13: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN• Sending a packet– ARP table is checked for IP/MAC/Interface mapping– L2 FDB is checked to determine IP of destination VTEP for

destination MAC on source VTEP

Page 14: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN• Sending a packet– Packet is encapsulated for destination VTEP with

configured VNI and sent to destination– Destination VTEP un-encapsulates the packet and the inner

packet is then processed by the receiver

Page 15: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

How do VTEPs handle BUM (Broadcast, Unknown Unicast, Multicast)?

Page 16: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

BUM• All BUM type packets (ex. ARP, DHCP, multicast) are

flooded to all VTEPs associated with the same VNI.• Flooding can be handled 2 ways– Packets are sent to a multicast address that all VTEPs are

subscribers of– Packets are sent to a central service node that then floods

the packets to all VTEPs found in its local DB for the matching VNI

Page 17: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN: Virtual eXtensible LAN• Well supported in most modern Linux Distros– Linux Kernel 3.10+

• Linux uses UDP port 8472 instead of IANA issued 4789

– iproute2 3.7+

• Configured using ip link command

Page 18: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

How do we use this with OpenStack?

Page 19: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

nova-network• Clients needed L3+VXLAN for their existing nova-

network based big data deployments (hadoop).– Neutron already supports VXLAN and should work with L3

as well (we didn’t have time to test it).

• Full VXLAN support in nova-network• Unicast VXLAN service node for BUM flooding

Page 20: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXLAN Service Node• Unicast service for BUM flooding• Eliminates the need for multicast• Python based• 2 Components– VXSND – VXLAN Service Node Daemon– VXRD – VXLAN Registration Daemon

• Will be open sourced in the near future.

Page 21: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXSND• Listens for VXLAN BUM packets from VTEPs• Learns VTEP and VNI endpoints from BUM packets• Relays BUM packets to all known VTEPs for given VNI• Supports registration/replication from other VXSND

daemons or VXRD

Page 22: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

VXRD• Monitors local interfaces on hypervisors• Sends VTEP+VNI registration packet to VXSND node

for all local VTEPs.

Page 23: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Software Gateway• We’re still getting in/out of the VXLAN network using

a software gateway– Lower performance– Extra servers

• All nova-net (or neutron’s l3agent) is doing is configuring VXLANs, bridges and iptables NAT.– What if we had a hardware switch that could accelerate

these standard Linux network features with an ASIC?

Page 24: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Cumulus Linux• Cumulus Linux– Linux Distribution for HW switches (Debian based)– Hardware accelerated Linux kernel forwarding using ASICs– Just like a Linux server with 32 40G NICs, but ~100x faster

• Standard Linux Tools– Ifconfig, ip route, iptables, brctl, dnsmasq, etc

Page 25: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Demo

Page 26: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Next Steps (nova-network VXLAN)• nova-network– Blueprint to add VXLAN support to nova-network Juno

coming soon.

• VXSND/VXRD– Update VXRD to monitor netlink for VTEP add/delete– Improve concurrency and scalability of VXSND– Support for tiered replication (TOR, spine, etc)– Goal is to open source the product before Paris summit.

Page 27: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Next Steps (nova-network on Switches)• Hack: ASIC can’t route in/out of VXLAN tunnel– Next gin ASICs can– Worked around by looping a cable between two ports

• Packets take a second trip through the switch

• Hack: Cumulus Linux doesn’t support NAT– I hacked in just enough NAT support for floating IPs =)

• Limitation: ASIC can only NAT 512 IPs. /23– Next gen ASICs will likely have larger tables

Page 28: L3 + VXLAN Made Practical. OpenStack Summit Spring 2014 Who We Are Nolan Leake Cofounder, CTO Cumulus Networks Chet Burgess Vice President, Engineering

OpenStack Summit Spring 2014

Q&A