
Openflow for Cloud Scalability


Page 1: Openflow for Cloud Scalability

Openflow Enabled Cloud Scalability

DaoliCloud Company

Beijing & Shanghai, China

www.daolicloud.com

wenbo dot mao at daolicloud dot com

Presentation at

China Future Network

Innovation & Development Forum

and

Global SDN Open Networking Conference

December 8-9, 2014

Nanjing, China

Page 2: Openflow for Cloud Scalability

DaoliCloud Company All Rights Reserved © 2011-2014 2

Abstract

Challenge: The recent success of Docker containers signals the arrival of a new era: the number of CPUs is exploding 10 to 100 fold, and cloud networking is already in a new movement of scalability upgrades.

Question: To scale UP or OUT? I.e., to UPgrade or OUTgrade?

Answer, from DaoliCloud's practice: Better to scale OUT, and Openflow can help.

[Figure: Forward plane: a unicast cable for entities linking scale-OUT clouds. Control plane of a tenant: plug a-b, unplug a-c, plug b-c, performed at the VEBs. VEB = Virtual Ether Bridge, where SDN programming is done.]

Page 3: Openflow for Cloud Scalability


When an overlay network needs to scale

• MPLS, VPN, VXLAN, (NV)GRE, IPsec, LISP, STT, Geneve, …: these well-known solutions all share the wrapping depicted below

• What about DHCP, ARP, tenant isolation, firewalls, etc.? Can such negotiations and discussions be wrapped so as to scale OUT?

• Unfortunately not, and unlikely in any scale-OUT sense. Let us analyze why, using, e.g., Google's Kubernetes networking: wrap within a Docker host for tenant isolation; wrap trans-Docker-host traffic for L3 connection; wrap servers to avoid MAC population at the ToR switch; wrap trans-Openstack traffic, … well, if this final wrapping is possible at all under independent orchestration …

• See "Technical Backup Material" for a more detailed technical analysis

[Figure: Payload encapsulation nullifies all network functions, e.g., NAT, and blocks all outside visibility for containers. The encapsulation header label is control-plane info placed in the forward plane, carried behind the underlay packet headers (L4/L3/L2).]
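The wrapping described above can be made concrete with a toy sketch (our own illustration, not wire-accurate VXLAN code): the tenant's entire L2 frame becomes opaque payload behind fresh underlay L2/L3/L4 headers, and the control-plane label (here a VNI-style tenant segment ID) rides in the forward plane with every packet.

```python
# Toy sketch of payload encapsulation: wrap a tenant frame in underlay
# headers plus an 8-byte label. Header contents are placeholders; only
# the sizes matter for the point being made.
import struct

VXLAN_FLAGS = 0x08  # "valid VNI" flag bit, as in RFC 7348

def encapsulate(inner_frame: bytes, vni: int,
                outer_eth: bytes, outer_ip: bytes, outer_udp: bytes) -> bytes:
    """Wrap a tenant L2 frame in underlay headers plus a label header."""
    # 1 byte of flags, 3 reserved bytes, then the 24-bit VNI in the top
    # three bytes of a 4-byte field: 8 bytes of label in total.
    label = struct.pack("!BxxxI", VXLAN_FLAGS, vni << 8)
    return outer_eth + outer_ip + outer_udp + label + inner_frame

inner = b"\x00" * 1500                       # a full-size tenant frame
outer = encapsulate(inner, vni=42,
                    outer_eth=b"\x00" * 14,  # placeholder Ethernet header
                    outer_ip=b"\x00" * 20,   # placeholder IPv4 header
                    outer_udp=b"\x00" * 8)   # placeholder UDP header
print(len(outer) - len(inner))               # 50 bytes of per-packet overhead
```

Every packet of every flow carries those extra bytes, and every hop must parse past them: this is the per-packet cost the rest of the deck argues against.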

Page 4: Openflow for Cloud Scalability

Cloud scale UP vs. scale OUT

One cloud, one Openstack? Managing thousands of servers? Like a "Tower of Babel"? We have tried to build one. Unfortunately, the mission proved too hard, if not impossible!

Better: "One Cloud, Two Openstacks" (一云两治, "one cloud, two governances"). A cloud built out of patched Openstacks has unbounded scale, yet with good service stability, a shrunk bug-fix zone, and grey-degree release & integration continuity.

[Figure: OpenStack, OpenStack, …, OpenStack: each is independently orchestrated, and each has a moderate size.]

Page 5: Openflow for Cloud Scalability

Any worldwide distributed entity is mapped to a "Physically Associated Address" (PAA), e.g., PAA = (MACs, IPs, ContextTag). The L4 port is a very good candidate for the ContextTag.

Important property of a PAA: within a flow's lifetime, a PAA maps uniquely to a worldwide entity. That is why the forward plane has unicast cables between any pair of entities, with no need for encapsulation!

[Figure: Forward plane: a unicast cable for entities in independent clouds. Control plane of a tenant: plug a-b, unplug a-c, plug b-c, performed at the VEBs. VEB = Virtual Ether Bridge, where SDN programming is distributed. TSC = Tenant SDN Controller.]

Scale-OUT knowhow: Openflow coding overlay. The role of the Openflow control plane is to agree upon the mapping coding between VEBs.
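The PAA property can be sketched in a few lines (a minimal illustration of ours, not DaoliCloud's actual code): within a flow's lifetime, a (MAC, IP, ContextTag) triple maps uniquely to one worldwide entity, so a VEB can deliver by direct lookup with no encapsulation.

```python
# Minimal sketch of PAA-based delivery at a VEB. The L4 port plays the
# role of the ContextTag, as the slide suggests.
from typing import NamedTuple

class PAA(NamedTuple):
    mac: str          # MAC visible at this VEB
    ip: str           # IP visible at this VEB
    context_tag: int  # e.g., an L4 port agreed via the control plane

class VEB:
    """Virtual Ether Bridge: resolves PAAs to entities per flow."""
    def __init__(self) -> None:
        self.flows: dict[PAA, str] = {}

    def plug(self, paa: PAA, entity: str) -> None:
        # The control plane "plugs a unicast cable": one PAA, one entity.
        if self.flows.get(paa, entity) != entity:
            raise ValueError("PAA already in use within a live flow")
        self.flows[paa] = entity

    def deliver(self, paa: PAA) -> str:
        # Per-flow lookup; no per-packet label to parse.
        return self.flows[paa]

veb = VEB()
veb.plug(PAA("02:00:00:00:00:01", "10.0.0.5", 40001), "tenant-a/vm-1")
print(veb.deliver(PAA("02:00:00:00:00:01", "10.0.0.5", 40001)))
```

The uniqueness check in `plug` is what makes the "unicast cable" safe: while a flow is alive, no second entity may claim its PAA.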

Page 6: Openflow for Cloud Scalability


Non-encapsulation technology to patch clouds together

• A novel and useful improvement to the Openflow standard

• An L2/L3/L4 header metadata mapping, coding and replacing technology (compare the figure below with that in Slide 3)

• The random mappings are non-secret; an SDN controller can help agree the mappings that connect separate L2s while seeing neither intranet's info

• That is how the notion of the Tenant SDN Controller (TSC) arises: a TSC working with independent clouds connects the distributed nodes within them for a tenant!

• Minus encapsulation, all other virtues of Openflow are kept, e.g., efficient per-flow checking of the routing table in the VEB fastpath, instead of inefficient per-frame checking of an underlay label (the yellow part in Slide 3)

• Extremely efficient: header metadata replacement, operated in nest, eliminates MAC population at an exponential reduction rate! Also no packet enlargement, no fragmentation, and no broadcast (handled via the TSC), …

[Figure: Overlay/underlay packet headers, with L2/L3/L4 mapping, coding & replacing.]
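The mapping-and-replacing idea above can be sketched as follows (a hedged illustration: field names and values are ours, not a real Openflow API). Instead of adding an outer header, the sending VEB rewrites the packet's own L2/L3/L4 fields to controller-agreed underlay values, and the receiving VEB rewrites them back; the packet never grows.

```python
# Sketch of non-encapsulation header rewriting between two VEBs.
Headers = dict  # keys like "src_mac", "src_ip", "l4_port"

def make_rewriters(mapping: dict):
    """mapping: overlay field values -> underlay field values (non-secret)."""
    inverse = {v: k for k, v in mapping.items()}

    def egress(pkt: Headers) -> Headers:    # at the sending VEB
        return {f: mapping.get(v, v) for f, v in pkt.items()}

    def ingress(pkt: Headers) -> Headers:   # at the receiving VEB
        return {f: inverse.get(v, v) for f, v in pkt.items()}

    return egress, ingress

# The TSC agrees this mapping with both VEBs, seeing neither intranet fully:
egress, ingress = make_rewriters({
    "10.0.0.5": "203.0.113.7",                  # tenant IP -> underlay IP
    "02:aa:00:00:00:01": "52:54:00:00:00:09",   # tenant MAC -> edge MAC
})

pkt = {"src_ip": "10.0.0.5", "src_mac": "02:aa:00:00:00:01", "l4_port": 40001}
wire = egress(pkt)           # same number of fields: nothing was added
print(ingress(wire) == pkt)  # the mapping is invertible
```

In a real deployment these rewrites would be installed as Openflow set-field actions in the VEB fastpath; the dictionary here only demonstrates the invertible-mapping property.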

Page 7: Openflow for Cloud Scalability

We use Openstack + Docker for their hopeful potential as future standards. Openstack can of course be replaced with, e.g., vRealize, Kubernetes, Contrail, Azure, BlueMix, ACI, CloudStack, …, or a mixture of these cloud orchestrators.


[Figure: A single-sign-on web portal + Tenant SDN Controller (TSC) sits above independent Openstacks at Beijing, Shanghai, N. Virginia and Ireland, each fronted by CDN, LVS, Horizon, …]

The top box is a false "Openstack" that does no resource provisioning; it is a single-sign-on web portal plus a Tenant SDN Controller (TSC). It runs very fast, since there is no event queuing, no file write locking, no CoW databases, and no negotiation among many resource-provisioning modules.

Each true Openstack below it is a completely independent cloud orchestration domain.

Application: patching independently orchestrated, desirably small clouds together for unbounded scalability and service stability.

Come and see the demo @ Booth 4

Page 8: Openflow for Cloud Scalability


Long-term value of inter-cloud patching

Our work on patching together independently orchestrated Openstacks was originally motivated by stable service operation and maintenance, a shrunk zone to ease debugging (Openstack is a well-known code "tar pit"), and a grey degree of Openstack+Docker release integration (new versions come out very fast). We have succeeded at all of these very well. However, our practice has convinced us of more …

If clouds remain in today's status quo, with each provider encapsulating its own connectivity without interoperability, then provider lock-in is inevitable. That is obviously not good for users; in fact, a non-scalable cloud is not good for the provider either.

Openflow-enabled, non-encapsulation-based inter-cloud connectivity and interoperability hence provide very important value to all.

Page 9: Openflow for Cloud Scalability


Conclusion

Openflow-enabled non-encapsulation overlay networking (DaoliCloud's Network Virtualization Infrastructure, NVI, and Tenant SDN Controller, TSC, technologies) provides a practical solution to cloud network virtualization: it eliminates the physical boundaries between moderately sized, easily operated and maintained clouds, and hence gives the cloud unbounded scalability, arbitrary elasticity, ease of service maintenance, release continuity, and other desirable properties.

It is our belief that a hopeful future inter-cloud interoperability standard should avoid encapsulation protocols when scaling OUT.

Page 10: Openflow for Cloud Scalability

Sign up for a free trial account now at

www.daolicloud.com

Page 11: Openflow for Cloud Scalability


Technical Backup Material: Inherent problems in cloud networking

The following cloud networking problems are already bad enough at the scale of hypervisor-based CPU virtualization; the explosive scale of container-based CPUs will only worsen matters.

MAC address explosion: One rack of servers at current CPU density can host tens of thousands of containers. With conventional flood-and-learn MAC population, a ToR switch must hold several multiples of that number of MACs, since a cloud should be larger than one rack. Moreover, can a ToR so populated with MACs work efficiently, and at an affordable cost?

L2 broadcast control: ARP broadcast is the only practical way to construct a physical L2 plug-&-play. However, broadcast has a prohibitively high cost; building a very large physical L2 is certainly looking for trouble. The next slide discusses current technologies for L2 broadcast control, and their irrelevance to large-scale cloud networking.
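Back-of-the-envelope arithmetic makes the MAC-explosion claim concrete. The figures below are assumed by us for illustration (they are not from the slide): 40 servers per rack, a few hundred containers per server, each container with its own MAC.

```python
# Rough MAC-table pressure under assumed (illustrative) numbers.
servers_per_rack = 40
containers_per_server = 500   # plausible at current CPU density
racks_per_cloud = 10          # "a cloud should be larger than one rack"

macs_per_rack = servers_per_rack * containers_per_server
macs_at_tor = macs_per_rack * racks_per_cloud  # flood-&-learn spreads them all
print(macs_per_rack, macs_at_tor)              # 20000 200000
```

Twenty thousand MACs per rack already lands in the "tens of thousands" the text describes; two hundred thousand at a ToR comfortably exceeds the MAC-table capacity of many commodity switches.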

Page 12: Openflow for Cloud Scalability


Technical Backup Material: Analysis of current cloud networking technologies

The encapsulation protocols in Slide 3 can connect separate L2s through L3 tunnels.

Key issue: They are peer-connection protocols: the SDN controller must see both L2s' intranets in order to orchestrate a connection. That is why they are also known as "large L2" protocols. Enlarging intranets hopelessly kills scalability for cloud services. Also killed en route is cloud service interoperability.

Technical assessment:

1. To avoid MAC explosion and control L2 broadcast, encapsulate per server/host; to isolate tenants, encapsulate per tenant; to patch clouds together for truly large scalability, encapsulate further per IDC. In general, to connect n instances, O(n^2) encapsulations are needed.

2. IP connectivity is carefully architected as connectionless flows, so that the forward plane conducts only per-flow checking for routing. This very important architecture is nullified by encapsulation into per-packet label checking (the yellow header in Slide 3); that is why encapsulation is inefficient.

3. Encapsulation enlarges packets beyond the MTU (Maximum Transmission Unit), and hence causes fragmentation/reassembly at additional cost.
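Points 1 and 3 admit a small worked example, with figures assumed by us for illustration. For point 1, a full mesh between n instances needs one tunnel per unordered pair, i.e., O(n^2). For point 3, a VXLAN-style wrap adds roughly 50 bytes (outer Ethernet 14 + IPv4 20 + UDP 8 + label 8), pushing a full-size inner frame past a 1500-byte MTU.

```python
# Worked example for the O(n^2) tunnel count and the MTU overrun.
def tunnels(n: int) -> int:
    return n * (n - 1) // 2   # one tunnel per unordered pair of instances

OVERHEAD = 14 + 20 + 8 + 8    # bytes added by a VXLAN-style encapsulation
MTU = 1500

inner = 1500                  # a full-size tenant frame
outer = inner + OVERHEAD
print(tunnels(100))           # 4950 tunnels for just 100 instances
print(outer > MTU)            # the wrapped frame no longer fits: fragment it
```

Either the underlay must be re-engineered for jumbo frames, or every full-size frame pays the fragmentation/reassembly cost: exactly the "additional cost" point 3 describes.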