August 2016
Open Ethernet: an open-source approach to modern network design
Alexander Petrovskiy, System Engineer
© 2016 Mellanox Technologies 2
Paradigm shift example in the history of IT…
From the mainframe era…
• Closed architecture
• Proprietary interfaces
• Proprietary software
• Vertical “scale-up” design
…to the PC world
• Open, extendable architecture
• Standard interfaces
• Large variety of software
• Horizontal “scale-out” design
* PDP-11/70 minicomputer
* Racks of OCP servers at Facebook
Paradigm shift nowadays
Wall Street IT
• Scale-up
• Centralized architecture
• Legacy management tools
• Proprietary software
• Virtualized
• Hardware-defined
Web-scale IT
• Clos fabric
• Scale-out
• Distributed architecture
• Automation, DevOps
• Open source software
• Hyperconverged
• Software-defined
“By 2017, Web-scale IT will be an
architectural approach found
operating in 50 percent of
global enterprises…”
Gartner, March 5, 2014
Open Ethernet initiative = open source in networking
Integrated approach: closed platform
- Vendor lock-in
- Expensive
- Slow upgrade cycle
Disaggregated approach: ability to choose
- The best hardware
- The best software
- Rapid deployment and upgrades
- ONIE, SDK API, SAI, SONiC
[Diagram: apps running on an operating system running on hardware]
Open Ethernet - The key components
Open HW platform and open APIs
1. Open HW for networking devices
- Choose your OS, switch HW platform and apps
- Open Compute Project
2. Standards-based, open API
- OCP ONIE, SAI
- Linux Switchdev
3. Open interfaces for management and orchestration
- REST API
- Chef, Puppet, Ansible
Standards-based protocols and open protocol implementations
1. Routing: Quagga, BIRD
2. Switching: OpenMLAG
3. SDN: OpenFlow, overlays (VXLAN)
Open Ethernet initiative to speed up innovation
+NOSs
Spectrum 100G Ethernet ASIC – HW platform for Open Ethernet
Leader in performance
• Non-blocking 6.4Tb/s switching
• Sub-300ns port-to-port latency
• Zero packet loss
• RDMA over Converged Ethernet
Cloud-scale
• Virtual network at scale
• Bandwidth optimization
• Flexible SDN pipeline
Features
• 32 ports of 100 / 56 / 40GbE
• 64 ports of 50 / 25 / 10GbE
• L2, L3 and ACL-based forwarding
• Overlay support (VXLAN, NVGRE, Geneve) and MPLS
• QoS and congestion control
• Dynamically shared buffer
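The 6.4Tb/s figure is consistent with the port counts if both directions of each full-duplex port are counted. That counting convention is an assumption here; the slide does not state it. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def switching_capacity_tbps(ports: int, port_speed_gbps: float,
                            full_duplex: bool = True) -> float:
    """Aggregate capacity in Tb/s for a set of identical ports.

    Assumption: capacity counts both directions of each port when
    full_duplex is True (a common datasheet convention).
    """
    directions = 2 if full_duplex else 1
    return ports * port_speed_gbps * directions / 1000.0


if __name__ == "__main__":
    # Both port configurations listed on the slide give the same total.
    for ports, speed in [(32, 100), (64, 50)]:
        total = switching_capacity_tbps(ports, speed)
        print(f"{ports} x {speed}GbE -> {total:.1f} Tb/s")
```

Both the 32 x 100GbE and the 64 x 50GbE configurations work out to the same aggregate under this convention.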
Mellanox Spectrum ASIC – the best in the industry
[Charts: Broadcom vs. Spectrum, plotted against packet size (bytes)]
Latency: ultra-low latency for any traffic type, at all packet sizes
Fairness: fully shared buffer for fair bandwidth allocation
Zero packet loss: true wire-speed performance for all packet sizes
www.zeropacketloss.com
www.Mellanox.com/tolly
• The Tolly report uncovers fundamental differences in architecture and performance between Mellanox Spectrum and Broadcom Tomahawk
• The majority of networking vendors build their switches on Broadcom: Cisco, Juniper, Extreme, Arista, Huawei and many others
The ability to choose a network OS on Spectrum is a reality
6 Spectrum switches:
• Cumulus Linux
• Mellanox OS
• Microsoft ACS (SONiC)
• HP OpenSwitch (future)
• MetaSwitch (future)
• BaiduOS (future)
OCP Summit March 2016 – Live Demo
Open Compute Project
• Open HW
• Switch Abstraction Interfaces
• Any NOS
• Open Network Linux
• Unified Platform Interfaces
• Testing and Benchmarking Suites
• Open Network Install Environment
The choice of building blocks via standard interfaces
Ecosystem of vendors and consumers
Mix and match at any level
Choose any component of IT architecture at any level (HW, SW, platform):
• Server
• Storage
• Networking
• Rack
• Management
• Data Center physical environment
ONIE – Openness of Operating System
ONIE – Open Network Install Environment
• Boot loader + Linux kernel + BusyBox
• Similar approach to OS installation on a server
Choice - any Net-OS on any hardware
ONIE Operation
1. First power-up (or upgrade)
2. ONIE boots on the local CPU
3. Connect to the management network
4. Download the NetOS installer via FTP or TFTP
5. Execute the NetOS installer
6. Install the NetOS
7. Boot from the NetOS
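The ONIE boot flow can be sketched as a small function that walks the steps in order. This is only an illustration of the sequence on the slide, not ONIE code: `first_boot`, the `fetch_installer` callback and the "stay in ONIE" fallback are all hypothetical names and assumptions.

```python
from typing import Callable, List, Optional, Sequence


def first_boot(fetch_installer: Callable[[str], Optional[bytes]],
               protocols: Sequence[str] = ("ftp", "tftp")) -> List[str]:
    """Return the log of steps taken on first power-up.

    fetch_installer(protocol) is a hypothetical callback that returns the
    installer image, or None if nothing was found over that protocol.
    """
    log = ["ONIE boots on the local CPU",
           "connect to the management network"]
    for proto in protocols:
        image = fetch_installer(proto)
        if image is not None:
            log.append(f"downloaded NetOS installer via {proto}")
            log += ["execute NetOS installer",
                    "install NetOS",
                    "boot from NetOS"]
            return log
    # Assumption: with no installer available the switch remains in ONIE.
    log.append("no installer found; staying in ONIE")
    return log


if __name__ == "__main__":
    # Simulate a network where only TFTP serves an installer.
    steps = first_boot(lambda proto: b"image" if proto == "tftp" else None)
    for number, step in enumerate(steps, start=1):
        print(number, step)
```

The point of the design is that the installer, not ONIE, knows how to lay down a particular NetOS, which is what lets any NetOS run on any ONIE-capable switch.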
What is SAI? User-space ASIC abstraction
OCP spec to abstract switch ASIC management interfaces
• Shim layer over the SDK API
• Vendor-independent control and management
- Init, startup/shutdown, port, L2, L3, ACL, QoS
- Vendor-specific features and extensions
• Developed by the OCP community
• Leaders: Microsoft, Dell, Facebook, Broadcom, Intel, Mellanox
• Provides SW interoperability with any HW that supports SAI
• Can be used to develop your own NOS
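The "shim layer over the SDK API" idea can be illustrated with a toy model. The real SAI is a C API defined by the OCP spec; everything below (class names, method names, the fake SDK calls) is hypothetical and only shows the shape of the abstraction: NOS code talks to a vendor-independent interface, and a per-vendor shim translates to SDK calls.

```python
from abc import ABC, abstractmethod
from typing import List


class SwitchAbstraction(ABC):
    """Hypothetical vendor-independent interface (not the real SAI API)."""

    @abstractmethod
    def create_port(self, port_id: int, speed_gbps: int) -> None: ...

    @abstractmethod
    def add_route(self, prefix: str, next_hop: str) -> None: ...


class FakeVendorSdk:
    """Stand-in for a vendor SDK; it only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def sdk_port_init(self, lane: int, mbps: int) -> None:
        self.calls.append(f"port_init lane={lane} mbps={mbps}")

    def sdk_uc_route_set(self, prefix: str, next_hop: str) -> None:
        self.calls.append(f"uc_route_set {prefix} via {next_hop}")


class FakeVendorShim(SwitchAbstraction):
    """Shim: maps the abstract calls onto one vendor's SDK conventions."""

    def __init__(self, sdk: FakeVendorSdk) -> None:
        self.sdk = sdk

    def create_port(self, port_id: int, speed_gbps: int) -> None:
        self.sdk.sdk_port_init(port_id, speed_gbps * 1000)

    def add_route(self, prefix: str, next_hop: str) -> None:
        self.sdk.sdk_uc_route_set(prefix, next_hop)


def bring_up(switch: SwitchAbstraction) -> None:
    """NOS-side code: depends only on the abstract interface."""
    switch.create_port(1, 100)
    switch.add_route("10.0.0.0/24", "192.168.1.1")


if __name__ == "__main__":
    sdk = FakeVendorSdk()
    bring_up(FakeVendorShim(sdk))
    print("\n".join(sdk.calls))
```

Swapping in a shim for a different ASIC would leave `bring_up` unchanged, which is the interoperability property the slide describes.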
Mellanox involvement
• Co-authoring the spec since its inception
• SAI is part of the Spectrum SDK
• SAI-based NOSs:
- OpenSwitch
- SONiC
What is Switchdev? In-kernel ASIC abstraction
User space
• User applications
• 3rd party applications / NOS (Quagga, OpenFlow, etc.)
• iproute2 utilities (tc, bridge, ip, etc.)
Kernel
• Linux network stack
• Linux network drivers
• mlxsw (Mellanox switch drivers)
Hardware
• Mellanox Spectrum ASIC
Switchdev architecture
Linux kernel driver model that allows offloading the network datapath to HW (ASIC)
Developed by the Linux community (netdev), part of the Linux kernel
Fully open solution: Mellanox contributed the Spectrum driver to Switchdev and continues to develop it
Uses standard Linux netdev APIs
• Open-source HW offload API of the Linux network stack
• Allows installing a generic Linux distro on a Spectrum switch using SAI and achieving wire-rate networking performance for L2/L3
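Because Switchdev exposes switch ports as ordinary Linux netdevs, a bridge can be built with standard iproute2 commands. The sketch below only assembles and prints the commands (dry run by default); the helper names are illustrative, and the port names sw1p1 and sw1p2 are assumed to exist on the target system.

```python
import subprocess
from typing import List


def bridge_commands(bridge: str, ports: List[str]) -> List[List[str]]:
    """Build iproute2 commands that create a bridge and enslave ports."""
    cmds = [["ip", "link", "add", "name", bridge, "type", "bridge"]]
    for port in ports:
        cmds.append(["ip", "link", "set", "dev", port, "master", bridge])
        cmds.append(["ip", "link", "set", "dev", port, "up"])
    cmds.append(["ip", "link", "set", "dev", bridge, "up"])
    return cmds


def apply(cmds: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or run them (requires root) if dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(bridge_commands("br0", ["sw1p1", "sw1p2"]))
```

Nothing in these commands is switch-specific; on a Switchdev-enabled system the driver is what offloads the resulting forwarding to the ASIC.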
[Diagram: Switchdev architecture]
User space: bridge (L2), tc (Traffic Control), ip (L3)
Kernel: Switchdev infrastructure with FDB, flow and FIB APIs (operations and notifications); port netdevs sw1p1, sw1p2 … sw1pN; driver modules mlxsw_spectrum, mlxsw_core, mlxsw_pci
Hardware: Spectrum system ports p1, p2 … pN
Switchdev example (Linux is running on Spectrum)
panther # brctl addbr br0
panther # brctl addif br0 eth29
panther # brctl addif br0 eth28
host-2 # iperf -s -i1
host-1 # iperf -c 192.168.1.1 -i1 -P 8
------------------------------------------------------------
Client connecting to 192.168.1.1, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 9] local 192.168.1.2 port 38848 connected with 192.168.1.1 port 5001
...
[ 7] local 192.168.1.2 port 38844 connected with 192.168.1.1 port 5001
[ 10] local 192.168.1.2 port 38846 connected with 192.168.1.1 port 5001
[ ID] Interval Transfer Bandwidth
[ 9] 1.0- 2.0 sec 1.25 GBytes 10.7 Gbits/sec
[ 3] 1.0- 2.0 sec 1.28 GBytes 11.0 Gbits/sec
…
[ 7] 1.0- 2.0 sec 1.44 GBytes 12.4 Gbits/sec
[ 10] 1.0- 2.0 sec 1.36 GBytes 11.7 Gbits/sec
[SUM] 1.0- 2.0 sec 10.5 GBytes 90.6 Gbits/sec
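The Transfer and Bandwidth columns can be cross-checked, assuming this iperf version reports bytes in binary units (1 GByte = 2^30 bytes) and bits in decimal units (1 Gbit = 10^9 bits). A minimal sketch of the conversion, with an illustrative function name:

```python
GIBIBYTE = 2 ** 30  # assumption: iperf "GBytes" are binary gigabytes


def gbytes_to_gbits_per_sec(gbytes: float, seconds: float = 1.0) -> float:
    """Convert a transfer in binary GBytes over an interval to Gbits/sec."""
    return gbytes * GIBIBYTE * 8 / seconds / 1e9


if __name__ == "__main__":
    # Per-stream transfers shown in the 1.0-2.0 sec interval.
    for gbytes in (1.25, 1.28, 1.44, 1.36):
        rate = gbytes_to_gbits_per_sec(gbytes)
        print(f"{gbytes} GBytes -> {rate:.1f} Gbits/sec")
```

Under that assumption the four visible per-stream lines reproduce the reported rates. The [SUM] line does not reproduce exactly from its rounded 10.5 GBytes figure, since the Transfer column is rounded before display; 90.6 Gbits/sec over eight streams is an average of about 11.3 Gbits/sec per stream.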
Industry’s First Open Source MLAG
Open MLAG enhances SDN
• First common MLAG implementation proposal
• Hardware agnostic
• Enabling MLAG interoperability
MLAG
• Synchronization of switches for creation of LAG entities that span ports of both switches
• A host (or switch) connected to the pair transparently runs a standard LAG
• Load balancing, faster failure recovery
SONiC
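The load-balancing and failure-recovery behavior of a LAG spanning two switches can be sketched with a simple flow hash. This is a toy model: real adapters and ASICs use their own hash functions and fields, and the link names and the CRC32-based hash here are assumptions for illustration only.

```python
import zlib
from typing import List, Tuple

# A flow is identified by (source IP, destination IP, source port, destination port).
Flow = Tuple[str, str, int, int]


def pick_member(flow: Flow, active_links: List[str]) -> str:
    """Pick a LAG member for a flow by hashing its identifying fields.

    The same flow always maps to the same link while the set of active
    links is unchanged, which keeps packets of one flow in order.
    """
    if not active_links:
        raise RuntimeError("no active LAG members")
    key = "|".join(str(field) for field in flow).encode()
    return active_links[zlib.crc32(key) % len(active_links)]


if __name__ == "__main__":
    links = ["switch-a:port1", "switch-b:port1"]  # one uplink per MLAG peer
    flow = ("192.168.1.2", "192.168.1.1", 38848, 5001)
    print("both peers up:", pick_member(flow, links))
    # If switch A fails, the flow is rehashed onto the surviving link.
    print("switch A down:", pick_member(flow, ["switch-b:port1"]))
```

From the host's point of view this is an ordinary LAG; the two switches synchronize state between themselves so that either can carry any flow.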
Mellanox OCP adapters
• The only Multi-Host Ethernet adapter for the Facebook Yosemite server
• Industry-first 25/50Gb Ethernet adapter for the Facebook Leopard server
• Industry-first 50Gb Ethernet adapter for the OpenPOWER Rackspace Barreleye server
[Images: Yosemite, Leopard and Barreleye adapters; labels: 50GbE Multihost NIC, 50GbE NIC, 25GbE NIC]
Facebook Yosemite: disaggregated Multi-Host server
Facebook Yosemite - Quad Core Multi-host server platform
[Diagram labels]
• 100Gb/s copper cable
• ConnectX-4 OCP 2.0 100Gb/s Multi-Core Adapter
• Multi-Host PCI-Express host connections
• 100GbE QSFP
• Yosemite sled with quad CPU cards
Mellanox contributions to Open Ethernet and Open Networking
OCP
• Open SDK on GitHub
• OCP switch hardware
• OCP adapters
Open MLAG
• Contributed to GitHub
SAI, Switchdev
• Switch Abstraction Interface
• Enables MSFT ACS (SONiC)
• Mellanox Switchdev driver in upstream
ONIE
• All Spectrum switches have ONIE
Any NOS
• MLNX-OS
• Cumulus
• OPS, SONiC (future)
Summary
1. Choose your software/hardware components
2. Choose the most effective hardware
Thank you!
Q&A