Sharing the Datacenter Network - Seawall
Alan Shieh (Cornell University), Srikanth Kandula, Albert Greenberg, Changhoon Kim, Bikas Saha (Microsoft Research, Azure, Bing)
Presented by WANG Ting
Ability to multiplex is a key driver for the datacenter business
- Diverse applications, jobs, and tenants share common infrastructure
- The de-facto way to share the network is congestion control at flow granularity (TCP)
Problem: Performance interference
Monopolize the shared resource:
- Use many TCP flows
- Use more aggressive variants of TCP
- Do not react to congestion (UDP)
Denial-of-service attack on a VM or rack:
- Place a malicious VM on the same machine (rack) as the victim
- Flood traffic to that VM
(Figure: normal traffic vs. a malicious or selfish tenant)
Problem: Hard to achieve cluster objectives
Even with well-behaved applications, there is no good way to:
- Allocate disjoint resources coherently: a Reduce slot != a Map slot due to differing numbers of flows
- Adapt the allocation as needed: boost a task that is holding back its job due to congestion
Goal: Decouple network allocation from the application's traffic profile
- Datacenters give us the freedom to do this
Requirements
- Provide a simple, flexible service interface for tenants
  - Support any protocol or traffic pattern
  - Need not specify bandwidth requirements
- Scale to datacenter workloads
  - O(10^5) VMs and tasks, O(10^4) tenants
  - O(10^5) new tasks per minute, O(10^3) deployments per day
- Use the network efficiently (e.g., work conserving)
- Operate with commodity network devices
Existing mechanisms are insufficient
- In-network queuing and rate limiting (rate limit < x Mbps): not scalable; slow and cumbersome to reconfigure switches
- End-host rate limits (< x Mbps): do not provide end-to-end protection; wasteful in the common case
- Reservations: hard to specify; overhead; wasteful in the common case
Basic ideas in Seawall
- Leverage congestion control loops to adapt the network allocation
  - Utilizes the network efficiently
  - Can control allocations based on policy
  - Needs no central coordination
- Implemented in the hypervisor to enforce policy
  - Isolated from tenant code
  - Avoids the scalability, churn, and reconfiguration limitations of hardware
Weights: Simple, flexible service model
- Every VM is associated with a weight; Seawall allocates bandwidth share in proportion to weight
- Example small VM: CPU = 1 core, Memory = 1 GB, Network weight = 1
- Weights enable high-level policies:
  - Performance isolation
  - Differentiated provisioning models
  - Increase the priority of stragglers
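As a toy illustration (not code from the talk), weighted proportional sharing of a single link can be sketched as weight divided by the sum of active weights, times the link capacity:

```python
def weighted_shares(capacity_mbps, weights):
    """Split a link's capacity among active VMs in proportion to their weights.

    weights: dict mapping VM name -> network weight (illustrative names).
    """
    total = sum(weights.values())
    return {vm: capacity_mbps * w / total for vm, w in weights.items()}

# Two VMs with equal weight each get half of a 1 Gb/s link.
shares = weighted_shares(1000, {"vm_a": 1, "vm_b": 1})
```

A VM with weight 3 sharing the link with a weight-1 VM would get 750 Mbps under this model, regardless of how many flows or destinations either VM uses.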
Components of Seawall
To control the network usage of endpoints:
- Shims on the forwarding paths at the sender and receiver (in the hypervisor)
- One tunnel per VM <source, destination> pair
- Periodic congestion feedback (% lost, ECN marked, ...), once every 50 ms
- A rate controller adapts the allowed rate on each tunnel
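A minimal sketch of the receiver-to-sender feedback described above; the field names and thresholds are illustrative assumptions, not the paper's wire format:

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    """Per-tunnel congestion report sent by the receiver shim every 50 ms."""
    tunnel_id: int
    bytes_received: int
    loss_fraction: float   # fraction of packets lost in the interval
    ecn_fraction: float    # fraction of packets ECN-marked in the interval

def is_congested(fb, loss_thresh=0.01, ecn_thresh=0.05):
    """Sender-side check: treat the tunnel's path as congested if either
    the loss rate or the ECN-mark rate is high (thresholds are made up)."""
    return fb.loss_fraction > loss_thresh or fb.ecn_fraction > ecn_thresh
```

The sender's rate controller would consume one such report per tunnel per interval and feed the congestion bit into its control loop.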
Path-oriented congestion control is not enough
- TCP (path-oriented congestion control): with equal weights, a sender that opens more tunnels captures a larger share of the link (figure: 75% vs. 25%); the effective share increases with the number of tunnels
- Seawall (link-oriented congestion control): the split stays at 50% / 50%; no change in effective weight
Seawall = Link-oriented congestion control
- Builds on standard congestion control loops (AIMD, CUBIC, DCTCP, MulTCP, MPAT, ...), run in rate-limit mode
- Extends the congestion control loops to accept a weight parameter
- Allocates bandwidth according to per-link weighted fair share
- Works on commodity hardware
Will show that the combination achieves our goal.
For every source VM:
1. Run a separate distributed control loop instance (e.g., AIMD) for every active link, to generate a per-link rate limit
2. Convert the per-link rate limits to per-tunnel rate limits (greedy + exponential smoothing)
(Figure: two weight-1 VMs converging to a 50% / 50% split of the bottleneck link)
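A hedged sketch of the two steps above. The constants and the demand-proportional conversion are assumptions for illustration; the paper's actual control law and smoothing details differ:

```python
def aimd_step(rate_mbps, weight, congested, incr=1.0, decr=0.5):
    """Weighted AIMD in rate-limit mode: additive increase scaled by the
    VM's weight, multiplicative decrease on congestion (constants made up)."""
    if congested:
        return rate_mbps * decr
    return rate_mbps + incr * weight

def per_tunnel_limits(link_rate_mbps, tunnel_demands):
    """Greedily divide one link's rate limit among the tunnels crossing it,
    here in proportion to each tunnel's recent demand (illustrative policy)."""
    total = sum(tunnel_demands.values()) or 1.0
    return {t: link_rate_mbps * d / total for t, d in tunnel_demands.items()}
```

Because the loop's increase step is scaled by weight, two competing loops converge to rates in proportion to their weights, independent of how many tunnels each VM opens.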
Achieving the link-oriented control loop
1. How to map paths to links? It is easy to get the topology in the datacenter; changes are rare and easy to disseminate
2. How to obtain link-level congestion feedback? Such feedback requires switch modifications that are not yet available, so use path-congestion feedback instead (e.g., ECN, losses)
Implementation
- Userspace rate controller
- Kernel datapath shim (NDIS filter)
- Prototype runs on the Microsoft Hyper-V root partition and native Windows
Achieving line-rate performance
How to add a congestion control header to packets?
- Naïve approach: use encapsulation, but this poses problems: more code in the shim; breaks hardware optimizations that depend on the header format
- Bit-stealing: reuse redundant/predictable parts of existing headers (other protocols might need paravirtualization)
(Figure: reused header fields. The IP ID field and the TCP timestamp option (kind 0x08, length 0x0a, TSval, TSecr) carry the Seawall sequence number and packet count; constant and unused bits are stolen.)
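An illustrative sketch of the bit-stealing idea: pack a small sequence number into the 16-bit IP ID field and recover the full value at the receiver. This encoding is an assumption for illustration, not the paper's actual layout:

```python
def steal_ip_id(seq):
    """Pack the low 16 bits of a Seawall-style sequence number into
    the IP ID field (which is 16 bits wide)."""
    return seq & 0xFFFF

def recover_seq(ip_id, last_seq):
    """Reconstruct the full sequence number from the 16-bit value,
    assuming fewer than 2**16 packets arrive between observations."""
    base = last_seq & ~0xFFFF
    candidate = base | ip_id
    if candidate < last_seq:  # the 16-bit counter wrapped around
        candidate += 1 << 16
    return candidate
```

The same idea extends to the TCP timestamp option, whose TSval/TSecr fields offer more predictable bits to reuse.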
Evaluation
1. Evaluate performance
2. Examine protection in the presence of malicious nodes
Testbed: Xeon L5520 2.26 GHz (4-core Nehalem), 1 Gb/s access links; IaaS model: entities = VMs
Performance
- Minimal overhead beyond a null NDIS filter (metrics = CPU, memory, throughput)
- Measured at the sender
Protection against DoS/selfish traffic
Strategy: UDP flood (red) vs. TCP (blue). Equal weights, so the ideal share is 50/50.
The UDP flood is contained (figure values: 1000 Mbps and 1.5 Mbps without protection; about 430 Mbps with Seawall).
Protection against DoS/selfish traffic
Strategy: open many TCP connections.
With Seawall, the attacker sees little increase in share with the number of flows.
Protection against DoS/selfish traffic
Strategy: open connections to many destinations.
With Seawall, the allocation sees little change with the number of destinations.
Related work
- (Datacenter) transport protocols: DCTCP, ICTCP, XCP, CUBIC
- Network sharing systems: SecondNet, Gatekeeper, CloudPolice
- NIC- and switch-based allocation mechanisms: WFQ, DRR, MPLS, VLANs
- Industry efforts to improve network/vswitch integration
- Congestion Manager
Conclusion
- Shared datacenter networks are vulnerable to selfish, compromised, and malicious tenants
- Seawall uses hypervisor rate limiters + an end-to-end rate controller to provide performance isolation while achieving high performance and efficient network utilization
- We develop link-oriented congestion control: use parameterized control loops and compose congestion feedback from many destinations
Thank You!