Transcript
Page 1: SDN Controller Challenges

SDN Controller Challenges

Page 2: SDN Controller Challenges

The Story Thus Far

• SDN: centralize the network’s control plane
– The controller is effectively the brain of the network
– Controller determines what to do and tells switches how to do it.

Page 3: SDN Controller Challenges

The Story Thus Far

Page 4: SDN Controller Challenges

The Story Thus Far

Something Happened!!!!

Page 5: SDN Controller Challenges

The Story Thus Far

Let’s Ask the Brain!!!!

Page 6: SDN Controller Challenges

The Story Thus Far

Think about what happened… Maybe come up with a solution

Page 7: SDN Controller Challenges

The Story Thus Far

• Controller runs control function
• Control function creates switch state
– F(global network state) → switch state
– Global network state can be a graph of the network

Tell the network what to do
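A minimal sketch of a control function in the sense above, assuming the global network state is an adjacency dict and the switch state is a next-hop table toward one destination. The names `control_function` and `topology` are illustrative, not from the slides.

```python
from collections import deque

def control_function(graph, dst):
    """Map a global network view (adjacency dict) to per-switch next hops toward dst."""
    next_hop = {}
    visited = {dst}
    queue = deque([dst])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                next_hop[neighbor] = node  # neighbor reaches dst via node
                queue.append(neighbor)
    return next_hop

# Hypothetical four-switch topology
topology = {
    "s1": ["s2", "s3"],
    "s2": ["s1", "s4"],
    "s3": ["s1", "s4"],
    "s4": ["s2", "s3"],
}
state = control_function(topology, "s4")
```

Each entry of `state` is what the controller would then push to one switch.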

Page 8: SDN Controller Challenges

Challenges with Centralization

• Single point of failure
– Fault tolerance
• Performance bottleneck
– Scalability
– Efficiency (switch-controller latency)
• Single point for security violations

Page 9: SDN Controller Challenges

Motivation for Distributed Controllers

• Wide-area network
– Wide distribution of switches: from USA to Australia.
– High latency between one controller and all switches
• Application + network growth
– Higher CPU load for controller
– More memory for storing FIB entries and calculations
• High availability

Page 10: SDN Controller Challenges

Class Outline

• Fault Tolerance
– Google’s B4 paper
• Controller Scalability
– Ways to scale the controller
– Distributed controllers: mesh versus hierarchy
– Implications of controller placement

Page 11: SDN Controller Challenges

Fault Tolerance

Page 12: SDN Controller Challenges

Google’s B4 Network

• Provides connectivity between DC sites
• Uses SDN to control edge switches
• Goal: high utilization of links
• Insight: fine-grained control over edge and network can lead to higher utilization
• Distributed controllers
– One set of controllers for each data center (site)

Page 13: SDN Controller Challenges

Google’s B4 Network

• Provides connectivity between DC sites
• Uses SDN to control edge switches
• Goal: high utilization of links
• Distributed controllers
– One set of controllers for each data center (site)

Page 14: SDN Controller Challenges

Fault Tolerance in B4

• Each site runs a set of controllers
• Paxos is run between controllers in a site to determine the master

Page 15: SDN Controller Challenges

Quick Overview of Paxos

• Given N controllers
– 1 acts as leader, and N-1 as workers
– All N controllers maintain the same state
• Switches interact with the leader
• A change doesn’t happen until the group agrees (in Paxos, a majority of the group)
• Failure of primary
– The remaining N-1 work together to elect a new leader

[Diagram labels: Network Events; Propagate State changes]
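A toy sketch of the behaviour listed above: one leader, replicated state, changes that commit only with a majority, and re-election on leader failure. It is not Paxos itself (no proposal numbers, no prepare/accept rounds). The class `ControllerGroup` and the lowest-id election rule are assumptions made for illustration.

```python
class ControllerGroup:
    """Toy replica group: one leader, majority-gated updates (not full Paxos)."""

    def __init__(self, ids):
        self.alive = {cid: True for cid in ids}
        self.state = {cid: [] for cid in ids}
        self.leader = min(ids)

    def live(self):
        return [cid for cid, up in self.alive.items() if up]

    def quorum(self):
        # Majority of the full group, counting failed members
        return len(self.alive) // 2 + 1

    def elect(self):
        # Hypothetical rule: the lowest live id becomes leader
        live = self.live()
        self.leader = min(live) if live else None

    def fail(self, cid):
        self.alive[cid] = False

    def apply(self, event):
        """Commit event on all live replicas only if a majority is live."""
        if self.leader is None or not self.alive[self.leader]:
            self.elect()
        live = self.live()
        if len(live) < self.quorum():
            return False
        for cid in live:
            self.state[cid].append(event)
        return True

group = ControllerGroup([1, 2, 3])
first = group.apply("port-up")      # all three replicas live
group.fail(1)                       # leader fails
second = group.apply("link-down")   # 2 of 3 live: still a majority
group.fail(2)
third = group.apply("port-down")    # 1 of 3 live: no majority
```

The last call shows the cost noted on the next slide: with too few live replicas the group stops accepting changes rather than risk diverging state.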

Page 16: SDN Controller Challenges

Pros-Cons of Paxos

• Pros
– Well understood and studied; gives good FT
– Many implementations in the wild
– E.g. ZooKeeper
• Cons
– Time to recover
– Impacts throughput of the entire system

Page 17: SDN Controller Challenges

Controller Scalability

Page 18: SDN Controller Challenges

What limits a controller’s scalability?

• Number of control messages from switches
– Depends on the application logic
• E.g. MicroTE/Hedera periodically query all switches for stats
• Reactive controller, evaluated in NOX, requires each switch to send messages for a new flow
– Packet-in (if reactive apps)
– Flow stats, flow timeouts
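A back-of-the-envelope sketch of how these message sources add up. The function name and all numbers are hypothetical, and the model assumes one packet-in per new flow per switch plus one stats request per switch per polling interval.

```python
def control_message_rate(num_switches, new_flows_per_switch_per_s, stats_poll_interval_s):
    """Estimate control messages per second arriving at one controller."""
    packet_ins = num_switches * new_flows_per_switch_per_s
    stats_requests = num_switches / stats_poll_interval_s
    return packet_ins + stats_requests

# Hypothetical network: 100 switches, 50 new flows/s each, stats polled every 5 s
rate = control_message_rate(100, 50, 5)
```

Under these assumptions the reactive packet-in traffic dominates the periodic polling.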

Page 19: SDN Controller Challenges

What limits a controller’s scalability?

• Application processing overhead
• The controller runs a bunch of applications
– Similar to: a server running a set of programs
– CPU/memory constraints limit how the apps run

Page 20: SDN Controller Challenges

What limits a controller’s scalability?

• Distance between controller and the switches

[Diagram: Controller 1 running Hedera, L3, and FW]

Page 21: SDN Controller Challenges

How to Scale the Controller

• Obvious: add more controllers.
• BUT: how about the applications?
– Synchronization/concurrency problems.
• Who controls which switch?
• Who reacts to which events?

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW, with question marks between them; labeled "Stats + Install OF entries"]

Page 22: SDN Controller Challenges

Medium Sized Networks

• Assumption:
– Controller can’t store all forwarding table entries in memory
– But can process all events and run all apps
• Each controller
– Gets the same network events and runs the same apps, so produces the same output
– But stores the output for only a fraction of the network and configures only that fraction

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW; labeled "Stats + Install OF entries"]
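A minimal sketch of this scheme, assuming the fraction each controller keeps is chosen by hashing the switch id. The slides do not say how the fraction is picked; `owner` and `my_share` are hypothetical names.

```python
import zlib

def owner(switch_id, num_controllers):
    # crc32 is stable across runs, unlike Python's built-in hash() for strings
    return zlib.crc32(switch_id.encode()) % num_controllers

def my_share(full_output, controller_index, num_controllers):
    """Keep only the entries for switches this controller owns."""
    return {sw: rules for sw, rules in full_output.items()
            if owner(sw, num_controllers) == controller_index}

# Every controller computes the same full output from the same events...
full_output = {f"s{i}": [f"rule-for-s{i}"] for i in range(1, 9)}
# ...but each stores and installs only its own slice.
shares = [my_share(full_output, i, 3) for i in range(3)]
```

Because every controller applies the same deterministic rule, the slices cover all switches without any coordination between controllers.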

Page 23: SDN Controller Challenges

Medium Sized Networks: HyperFlow

• Each controller
– Pushes state to each other controller
– Each controller thinks it’s the only one in the network

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW; labeled "Stats + Install OF entries" and "Pub-sub system"]
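A toy sketch of state propagation over publish-subscribe, so that every controller ends up holding the full view. The in-memory `EventBus` is a stand-in chosen for illustration and is not HyperFlow's actual mechanism; all names are hypothetical.

```python
class EventBus:
    """In-memory stand-in for a distributed publish-subscribe channel."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, event):
        for handler in self.subscribers:
            handler(event)

class Controller:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.view = set()  # links this controller believes exist
        bus.subscribe(self.on_event)

    def local_event(self, link):
        # Publish instead of updating the view directly, so every
        # controller (including this one) applies the same event.
        self.bus.publish(("link-up", link))

    def on_event(self, event):
        kind, link = event
        if kind == "link-up":
            self.view.add(link)

bus = EventBus()
c1 = Controller("c1", bus)
c2 = Controller("c2", bus)
c1.local_event(("s1", "s2"))  # seen by c1's switches
c2.local_event(("s3", "s4"))  # seen by c2's switches
```

Applications on `c1` and `c2` can then act on `view` as if their controller were the only one in the network.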

Page 24: SDN Controller Challenges

Large Sized Networks

• Assumptions
– Each controller can’t store all the FIB entries
– Each controller can’t run the entire application or handle events
• Need to partition the application
– But how?

Page 25: SDN Controller Challenges

Application partition 1

• Approach 1: each controller runs a specific application
– How do you resolve conflicts in FW entries?
– Apps can conflict in the rules they install

[Diagram: Controller 1 runs Hedera, Controller 2 runs L3, Controller N runs FW]
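A small sketch of how rules from different apps can conflict, assuming a rule is a dict with a match and an action. The rule format and field names are hypothetical, and the check only detects conflicts; it does not resolve them.

```python
def overlaps(match_a, match_b):
    """Two matches overlap unless a shared field requires different values."""
    return all(match_b[f] == v for f, v in match_a.items() if f in match_b)

def find_conflicts(rules):
    """Return pairs of apps whose rules can hit the same packet with different actions."""
    conflicts = []
    for i, a in enumerate(rules):
        for b in rules[i + 1:]:
            if (a["app"] != b["app"] and a["action"] != b["action"]
                    and overlaps(a["match"], b["match"])):
                conflicts.append((a["app"], b["app"]))
    return conflicts

rules = [
    {"app": "L3", "match": {"ip_dst": "10.0.0.5"}, "action": "output:2"},
    {"app": "FW", "match": {"ip_dst": "10.0.0.5", "tcp_dst": 23}, "action": "drop"},
    {"app": "Hedera", "match": {"ip_dst": "10.0.0.9"}, "action": "output:3"},
]
conflicts = find_conflicts(rules)
```

Here the L3 rule would forward a telnet packet that the FW rule wants to drop, which is the kind of cross-controller conflict this approach has to settle.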

Page 26: SDN Controller Challenges

Application partition 2

• Approach 2: all controllers run the same application but for a subset of devices
– Results in a distributed mesh control plane

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW; labeled "Abstract Network view"]

Page 27: SDN Controller Challenges

Application Partition 2

• Abstract view exchanged with each other
– Abstract view reduces the network information used by each controller

[Diagram: the real network next to Controller 2’s view of the network; Controller 2 runs Hedera, L3, and FW, and its view includes the abstractions provided by Controller 1 and Controller N]
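One possible way to build such an abstraction, sketched under the assumption that a domain is collapsed into a single logical node that exposes only its border links. The slides do not specify the abstraction; `abstract_view` is a hypothetical helper.

```python
def abstract_view(links, domain):
    """Collapse `domain` (a set of switches) into one logical node; keep only border links."""
    name = "domain(" + ",".join(sorted(domain)) + ")"
    border = set()
    for a, b in links:
        if a in domain and b not in domain:
            border.add((name, b))
        elif b in domain and a not in domain:
            border.add((name, a))
    return border

# Hypothetical chain of five switches; the first three belong to Controller 1
links = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s4", "s5")]
view = abstract_view(links, {"s1", "s2", "s3"})
```

Other controllers then see one node and one link instead of three switches and their internal links.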

Page 28: SDN Controller Challenges

ONIX to the SDN Programmer

• Controllers synchronize through a DB or DHT
– So each app needs synchronization code.
– How do you deal with concurrency?
• How to synchronize between domains?
• How many domains? Or controllers?
• How many switches in a domain?

Page 29: SDN Controller Challenges

Application partition 3

• Approach 3: divide the application into local and global parts
– Results in a hierarchical control plane
• Global Controller and Local Controllers
– Applications that do not need network-wide state
• Can be run locally without communicating with other controllers

Page 30: SDN Controller Challenges

Are Hierarchical Controllers Feasible?

• Examples of local applications:
– Link discovery, learning switch, local policies
• Examples of local portions of a global algorithm
– Data center traffic engineering
• Elephant flow detection (Hedera)
• Predictability detection (MicroTE)
• Local apps/controllers have other benefits
– High parallelism
– Can be run closer to the devices

Page 31: SDN Controller Challenges

Kandoo: Hierarchical controllers

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW, with a Global Controller running Hedera]

• 2 levels of controllers: global and local
– Local applications are embarrassingly parallel
– Local shields global from network events

Page 32: SDN Controller Challenges

Kandoo: Hierarchical controllers

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW, with a Global Controller running Hedera]

• Local Controllers: run local apps
– Return an abstract view to the global controller
– Reduce the number of events sent to global and reduce the size of the network seen by the global controller

Page 33: SDN Controller Challenges

Kandoo: Hierarchical controllers

[Diagram: Controllers 1, 2, … N, each running Hedera, L3, and FW, with a Global Controller running Hedera]

• Global Controllers
– Run global apps, i.e. apps that need network-wide state

Page 34: SDN Controller Challenges

Hedera Reminder

• Goal: reduce network contention
• Insight: contention happens when elephants share paths.
• Solution:
– Detect elephant flows
– Place elephant flows on different paths

Page 35: SDN Controller Challenges

Implementing Hedera in Onix

[Diagram: Controller 1 and Controller 2 each run Hedera (detection + placement); each is labeled with "Stats" and "Flow Table entries"; the link between them is labeled "Exchange TM + detection"]

• 2 levels of controllers: global and local
– Local applications are embarrassingly parallel
– Local shields global from network events

Page 36: SDN Controller Challenges

Implementing Hedera in Kandoo

[Diagram: Controllers 1, 2, … N each run elephant detection; the Global Controller runs Hedera global placement; labels: "Stats", "Inform of elephant flows", "Install new flow table entries"]

• Local Controllers: get stats from networks + elephant detection
• Global Controller: decide flow placement + flow installation
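A minimal sketch of this split, assuming local controllers filter flow stats against a byte threshold and the global controller assigns each reported elephant to the least-loaded path. The threshold, names, and least-loaded rule are illustrative stand-ins, not Hedera's actual placement algorithm or Kandoo's API.

```python
ELEPHANT_BYTES = 1_000_000  # hypothetical threshold

def detect_elephants(flow_stats):
    """Local controller: keep only flows at or above the threshold."""
    return [flow for flow, nbytes in flow_stats.items() if nbytes >= ELEPHANT_BYTES]

def place_elephants(elephants, paths):
    """Global controller: put each elephant on the currently least-loaded path."""
    load = {p: 0 for p in paths}
    placement = {}
    for flow in elephants:
        path = min(paths, key=lambda p: load[p])
        placement[flow] = path
        load[path] += 1
    return placement

# Stats seen by two local controllers (hypothetical byte counts)
local_stats = [
    {"f1": 5_000_000, "f2": 2_000},
    {"f3": 9_000_000, "f4": 40_000},
]
# Only elephants are reported upward; mice never reach the global controller
reported = [f for stats in local_stats for f in detect_elephants(stats)]
placement = place_elephants(reported, ["path-A", "path-B"])
```

The global controller handles two events instead of four, which is the shielding effect described for local controllers.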

Page 37: SDN Controller Challenges

Implementing B4 in a Kandoo-like Architecture

[Diagram: Site Controllers 1, 2, … N each run elephant detection; the Global Controller runs the TE + BW allocator with a TE DB; labels: "Stats + Install OF entries", "Inform of Flow demands", "Install TE Ops"]

• Local Controllers: get stats from networks + determine demand
• Global Controller: calculate paths for traffic

Page 38: SDN Controller Challenges

Kandoo to the SDN Programmer

• Think of what is local and what is global
– When apps are written, annotate with local flag
• Kandoo will automatically place local apps
– And place global apps
• Kandoo restricts messages between global and local controllers
– You can’t send OF-style messages
– Must send Kandoo-style messages
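A sketch of what annotating apps with a local flag could look like. The `app` decorator and `place` function are hypothetical and are not Kandoo's real programming interface.

```python
def app(local):
    """Hypothetical annotation: mark an application class as local or global."""
    def mark(cls):
        cls.is_local = local
        return cls
    return mark

@app(local=True)
class ElephantDetection:
    """Needs only per-switch stats, so it can run on local controllers."""

@app(local=False)
class FlowPlacement:
    """Needs network-wide state, so it must run on the global controller."""

def place(apps):
    """Decide where each app runs based only on its flag."""
    return {a.__name__: ("local controllers" if a.is_local else "global controller")
            for a in apps}

deployment = place([ElephantDetection, FlowPlacement])
```

The programmer states only the flag; where each app runs follows from it.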

Page 39: SDN Controller Challenges

Summary

• Centralization provides simplicity at the cost of reliability and scalability
• Replication can improve reliability and scalability
• For reliability, Paxos is an option
• For scalability, divide and conquer
– Partition the applications
• Kandoo: local apps and global apps
– Partition the network
• Onix: each controller controls a subset of switches (domain)

