25
Progress on SDN, OVS and dynamic circuits Ramiro Voicu California Institute of Technology LHCOPN-LHCONE meeting Amsterdam, October 28-29, 2015

Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Progress on SDN, OVS and dynamic circuits

Ramiro VoicuCalifornia Institute of Technology

LHCOPN-LHCONE meetingAmsterdam, October 28-29, 2015

Page 2: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

LHC Collisions at 13 TeVImage source: CMS

Page 3: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity, even at the start of LHC Run2

WLCG Dashboard Transfer Throughput

CMS

ATLAS

ALICE

LHCb

June 2015 October 2015

15GB/s 20GB/s

Page 4: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

PhEDEx and Dynamic CircuitsUsingdynamiccircuitsinPhEDEx allowsformoredeterministicworkflows,usefulforco-schedulingCPUwithdatamovementIntegratingcircuitawarenessintotheFileDownload agent:

• Applicationisbackendagnostic;NomodificationstoPhEDEx DB• AllcontrollogicisintheFileDownload agent• TransparentforallotherPhEDEx instances

PhEDEx throughput on a shared path

(with 5 Gbps of UDP cross traffic)

Seamless switchoverNo interruption of service

sandy01-amswood1-ams

T2_ANSE_Amsterdam

sandy01-gva hermes2

T2_ANSE_Geneva

HighspeedWANcircuit

Sharedpath

PhEDEx throughput on a dedicated path

1h moving average

1h moving average

FDTsustainedrates:~1500MB/sAverageover24hrs:~1360MB/s

Phed

exRa

teM

B/s

Page 5: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Integrating NSI Circuits with PhEDEx: Control Logic

1. PhEDEx requests a circuit between sites A and B; waits for confirmation

2. Wrapper gets a vector of source and destination IPs of all servers involved in the transfer, via an SRM plugin

3. Wrapper passes this information to the OF controller4. PhEDEx receives the confirmation of the circuit, informs the OF

controller that a circuit has been established between the two sites5. OF controller adds routing information in the OF switches that direct

all traffic on the subnet to the circuit

Page 6: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Advanced Network Services for Experiments (ANSE) Integrating PhEDEx with Dynamic Circuits for CMS

• BuildingontheAutoGOLEFabricofOpenExchangePointsandNSIstandardvirtualcircuitsOSCARS+NSIcircuitsareusedtocreateWANpathswithreservedbandwidthacrosstheAutoGOLE fabric

• OFflow-matchingisdoneonspecificsubnetstorouteonlythedesiredtraffic

• Openflow alsocanbeusedtoselectpathsoutsidethefabric

Lapatadescu, Wildish, Mughal, Bunn, Legrand, Newman

ANSE: 1st real life application integration of NSI and the AutoGOLE fabric with PhEDEx for CMS

Page 7: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Open vSwitch OVS

q “Open vSwitch is a production quality, multilayer virtual switch”

q OpenFlow protocol support (1.3)q Kernel and user space forwarding enginesq Runs on all major Linux distributions used in HEPq NIC bondingq Fine grained QoS supportq Ingress qdisc, HFSC, HTBq Used in a variety of hardware platforms (e.g. Pica8) and

software appliances like Mininetq Interoperates with OpenStack

q OVN (Open Virtual Network): Virtualized network “implemented on top of a tunnel-based (VXLAN, NVGRE, IPsec., etc) overlay network”

Page 8: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Using OVS for end-host orchestrationIntegrating PhEDEx with Dynamic Circuits for CMS

Standard OpenFlow (or OVSDB) protocol for end-hostnetwork orchestration (no need for custom SB protocol)Simple procedure to migrate to OVS on the end-host. SDN controller not required in the initial deployment phaseHost type (storage, compute) dynamically discovered using OF identification string

Use SDN controller to create an overlay networkfrom circuit end-point (Border Router) to the storage

Page 9: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

• OVS2.3.1withstockRH6.xkernel

• OVSbridgedinterfaceachievedsameperformanceashardware(10Gbps)

SDN QoS: Traffic Shaping withOpen vSwitch (OVS)

• egressrate-limit• BasedonLinuxkernel:

• HTB(HierarchicalTokenBucket)

• HFSC(HierarchicalFair-ServiceCurve)

Page 10: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

• OVS2.4withstockkernel• NSIcircuitCaltech->UMICH(~60ms)

• Verystableupto7.5Gbps• Fairlygoodshapingabove• 8Gbps (smallinstabilities)

Traffic Shaping withOpen vSwitch (OVS) WAN tests over NSI

Page 11: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

OVS benefitsq Standard OpenFlow (and/or OVSDB) end-host

orchestration q QoS SDN orchestration in non-OpenFlow clustersq OVS works with stock SL/CentOS/RH 6.x kernel used in

HEP; works out-of-the-box on SL7/CC7q OVS bridged interface achieved the same performance as

the hardware (10Gbps)q No CPU overhead when OVS does traffic shaping on the

physical portq Traffic shaping (egress) of outgoing flows may help

performance in such cases when the upstream switch (or ToR) has smaller buffers

https://indico.cern.ch/event/376098/contribution/24/material/slides/1.pdf

Page 12: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

ODL controlling OVS via OVSDB and OF

ODL Controller

q OVSDB – Open vSwitchDatabase Management

q RFC7047q Support as SB protocol

in major SDN controllersq Used to create the virtual

bridgesq Virtual bridges can use

standard OF to speak with the controller

q Normal routing if the controller is down

Page 13: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

OpenFlow topology discovery with non-OpenFlow islands

ODL Controller

Non OF switch

q Same topology used for QoS tests

q OVS instances on the two hosts controlled by a standard ODL(BE)

q Non-OpenFlow switch in the middle

Topology seen by the controller is missing the link between the OVS instances

Page 14: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

OpenFlow topology discovery with non-OpenFlow islands

ODL Controller

Non OF switch

Use the same EtherType0x88cc (LLDP)

Instead of bridge-filteredmulticast MAC (01:80:C2:00:00:0E) use a normal multicast mac address (01:23:00:00:00:01)

Unofficially known as OpenFlow Discovery ProtocolOFDP

Page 15: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

OpenFlow islands over WAN & NSI circuit

ODL @ Caltech

OF island 1

OF island 3(Umich)

OF island 2

NSI Circuit Caltech - UMICH

DTN/FDTServer@UMICH

DTN/FDTServer@Caltech

A. Mughal, I.Legrand

Page 16: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

q Simplified topology management using a “single” controller

q The distributed SDN controller is a redundant setup for HA. The controllers share the same view of the network (support for redundancy embedded in ODL and ONOS)

q OVS can be used to identify the flows right at the edgesq Can also be used for QoS right at the edgeq Standard (and widely supported) protocols and software

components for controlq Seamless topology discovery even with Non-OF devices and/or

NSI(L2) circuits in the middle

Possible architecture with a “single” controller

NSI Circuit

Page 17: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

100G DTN TESTS

Page 18: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

DTN: 100G network tests

NetworkTopology:Serversusedintests:Sc-100G-1andSc100G-2

q Two Identical Haswell Serversq E5-2697 v3q X10DRi

Motherboardq 128GB DDR4

RAMq Mellanox VPI NICsq Qlogic NICsq CentOS 7.1q Dell Z9100 100GE

Switch q 100G CR4 Cables

from Elpeus for switch connections

q 100G CR4 Cables from Mellanox for back to back connections

Page 19: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

100G TCP tests

2 TCP StreamsStable 94Gbps

1 TCP StreamsStable 68Gbps

Line rate (100G) with 4+ TCP Streams

Client #numactl --physcpubind=20--localalloc java-jarfdt.jar-c1.1.1.2–nettest -P1-p7000Server #numactl --physcpubind=20--localalloc java-jarfdt.jar-p7000

CPU: Single Core - 100% utilization

Page 20: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Pacific Research Platform (DTN Test)

Disk to Disk Transfers (actual files on disk)

UCSD à Caltech = Avg 36Gbps, Peaks of 40GbpsUCLA à Caltech = Avg 35Gbps, Peaks of 36GbpsUCSC à Caltech = Avg 10Gbps

Caltech– FDTServer=4xPCIe Gen3NVMEDrivesUCSD– FIONA=Zpool (16xSSDDrives,Intel530,120G)UCLA– FIONA=Zpool (16xSSDDrives,Intel530,120G)

Several issues (mostly Disk I/O related):NVME Drives:

• Zpool performance Tuning, poor performance compared to individual drives

• Software RAID, sync hangs, system crashes

Page 21: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Dual DTN Test – Disk performance

Peaks of5.7 GBytes/s

Reading from disk at both UCSD and UCLAThe disk pools at each of the - Zpool of 16 Intel 530 SSD drivesReceiver side at Caltech8 Intel D3700 NVME drives were used (4 each for UCSD and UCLA)Traffic peaked at 5.7GB/sec

At this rate with TCP, CPU cores were fully utilized.

Several issues (mostly Disk I/O related):NVME Drives:

• Zpool performance Tuning, poor performance compared to individual drives• Software RAID, sync hangs, system occasional hangs

Page 22: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

THANK YOU!

QUESTIONS?

Page 23: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

EXTRA SLIDES

Page 24: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

SDN Multipath OpenDaylightDemonstrations at Supercomputing 2014

q 100 Gbit links between Brocade and Extreme switches at Caltech, iCAIR and Vanderbilt booths

q 40 Gbit links from many booth hosts to switches

q Single ODL/Multipath Controller operating in “reactive” mode

§ For matching packets: Controller writes flow rules into switches, with a variety of path selection strategies

§ Unmatched packets “punted” to Controller by switch

Demonstrated:• Successful, high speed, flow path calculation,

selection and writing• OF switch support from vendors• Resilience against changing net topologies [At

layer 1 or 2]• Monitoring and Control

SC14Demo:Caltech/iCAIR/VanderbiltOFlinks

Page 25: Progress on SDN, OVS and dynamic circuits › event › 401680 › contributions › ... · 2018-11-19 · Complex Workflow: the Flow Patterns Have Increased in Scale and Complexity,

Focused Technical Workshop Demo 2015:SDN-Driven Multipath Circuits

- Hardened OESS and OSCARS installations at Caltech, Umich, AmLight- Updated Dell switch firmware to operate stably with OpenFlow

A. MughalJ. Bezerra

- Dynamic circuit paths under SDN control - Prelude to the ANSE architecture: SDN load-balanced, moderated flows

Caltech, Michigan, FIU, ANSP and Rio, with Network Partners:Internet2, CENIC, Merit, FLR, AmLight, ANSP and Rio in Brazil