VirtualKnotter: Online Virtual Machine Shuffling for Congestion Resolving in Virtualized Datacenter
Xitao Wen, Kai Chen, Yan Chen, Yongqiang Liu, Yong Xia, Chengchen Hu

Datacenter as Infrastructure

Congestion in Datacenter

[Figure: tree topology with Core, Aggregation and Edge layers over Pod 0 to Pod 3; oversubscription labels 10:1~100:1 and 2:1~10:1]

• Packet loss!
• Queuing delay!
• Degrading throughput!

• Congestion in the Wild
• General Approaches
• Problem Formulation
• Main Design
• Evaluation

Spatial Pattern

• Unbalanced utilization
  – Hotspot: hot links account for <10% of core links [IMC10]
  – Spatially unbalanced utilization

[Figure axes: Sender, Receiver]

Temporal Pattern

• Long congestion event
  – Lasts for 10s of minutes
  – Individual event has clear spatial pattern

[Figure axis: Core Link Index]

Traffic Stability

• Bursty at a fine granularity
  – Not predictable at 10s or 100s of milliseconds [IMC10][SIGCOMM09]
• Predictable at timescale of 10s of minutes
  – 40% to 70% of pairwise traffic can be expected to be stable
  – 90%+ predictable traffic aggregated at core links

• Congestion in the Wild
• General Approaches
• Problem Formulation
• Main Design
• Evaluation

General Approaches

• Network Layer
  – Increase network bandwidth
    • Fat-tree, BCube, OSA…
    • Expensive
    • Requires upgrading the entire DC network
  – Optimize flow routing
    • Hedera, MicroTE
    • Not scalable
    • Requires hardware support
    • Depends on rich path diversity
• Application Layer
  – Optimize VM placement
    • Scalable
    • Lightweight deployment
    • Suitable for existing over-subscribed networks

Background on Virtualized DC

• Virtualization Layer
• VM Live Migration
  – Keep continuous service while migrating
  – 1.1x – 1.4x VM memory transfer

[Figure: Server, VM, DC Network; callout: Major Cost!]
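
As a rough illustration of the 1.1x – 1.4x figure, the sketch below estimates how much traffic one live migration would put on the network. The function name and the 4 GB example VM are assumptions made here for illustration, not values from the slides.

```python
def migration_traffic_range_gb(vm_memory_gb, low_factor=1.1, high_factor=1.4):
    """Estimate the traffic of one live migration as a (low, high) range in GB.

    The 1.1x and 1.4x factors come from the slide; treating them as simple
    multipliers on the VM's memory size is an assumption of this sketch.
    """
    if vm_memory_gb < 0:
        raise ValueError("VM memory size must be non-negative")
    return vm_memory_gb * low_factor, vm_memory_gb * high_factor


# Hypothetical example: a VM with 4 GB of memory.
low_gb, high_gb = migration_traffic_range_gb(4.0)
print(f"Estimated migration traffic: {low_gb:.1f} GB to {high_gb:.1f} GB")
```

Under these assumptions, moving a VM only pays off if the traffic it takes off the congested links exceeds this transfer.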

Optimize VM Placement

• Offload traffic from congested link

[Figure legend: active VM, idle VM]

• Congestion in the Wild
• General Approaches
• Problem Formulation
• Main Design
• Evaluation

Design Goal

• Objective: mitigate congestion
  – Maximum link utilization (MLU)
• Constraint: controllable migration traffic (i.e. moving VM)
  – Less than reduced traffic
• Constraint: reasonable runtime overhead
  – Far less than target timescale (10s of mins)
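
To make the MLU objective concrete, here is a minimal sketch of how it could be computed for a given VM placement. The data layout (dictionaries for placement, traffic, routes and link capacities) and the toy network are assumptions made for illustration; the slides do not specify them.

```python
from collections import defaultdict


def max_link_utilization(placement, traffic, routes, capacity):
    """Return the MLU: the largest load/capacity ratio over all links.

    placement: VM -> server
    traffic:   (src VM, dst VM) -> traffic rate
    routes:    (src server, dst server) -> list of links on the path
    capacity:  link -> capacity, in the same unit as the traffic rates
    """
    load = defaultdict(float)
    for (src_vm, dst_vm), rate in traffic.items():
        src, dst = placement[src_vm], placement[dst_vm]
        if src == dst:
            continue  # co-located VMs put no load on the network
        for link in routes[(src, dst)]:
            load[link] += rate
    return max((load[link] / cap for link, cap in capacity.items()), default=0.0)


# Hypothetical toy network: s1 and s2 share a rack, s3 sits behind a core link.
capacity = {"rackA": 20.0, "core": 10.0}
routes = {("s1", "s2"): ["rackA"], ("s1", "s3"): ["core"], ("s2", "s3"): ["core"]}
traffic = {("a", "b"): 6.0}

mlu_before = max_link_utilization({"a": "s1", "b": "s3"}, traffic, routes, capacity)
mlu_after = max_link_utilization({"a": "s1", "b": "s2"}, traffic, routes, capacity)
print(mlu_before, mlu_after)  # 0.6 0.3
```

Moving VM b next to VM a shifts the flow from the core link to the less loaded rack link, which is the kind of improvement the objective rewards.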

Problem Statement

• Input
  – Topology and routing of physical servers
  – Traffic matrix among VMs
  – Current placement
• Variable & Output
  – Optimized placement
• NP-hardness
  – Proof: reduced from Quadratic Bottleneck Assignment Problem

Related Work

• Optimize VM placement
  – Server consolidation [SOSP’07]
  – Fault tolerance [ICS’07]
  – Network scalability [INFOCOM’10]

• Congestion in the Wild
• General Approaches
• Problem Formulation
• Main Design
• Evaluation

Inspiration

Stretch the tie violently, making it loose and less tangled.

Solve each tie gently, by carefully reeving the end out of the tie.

Two-step Algorithm

• Input: Topology & Routing, Traffic Matrix, Current VM Placement
• Step 1: Multiway Θ-Kernighan-Lin
  – Fast and greedy
  – Search for localizing overall traffic
  – May get stuck in a local minimum
• Step 2: Simulated Annealing
  – Fine-grained and randomized
  – Search for mitigating traffic on the most congested links
  – Helps avoid local minimum
• Output: Optimized VM placement

Multiway Θ-Kernighan-Lin (KL)

• Top-down graph cut improvement

• Introduce Θ to limit # of moves

• O(n² log n)
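
The sketch below illustrates the idea of a Kernighan-Lin pass whose number of swaps is capped by Θ, on a single two-way cut. It is not the authors' multiway algorithm: the function names, the data layout and the toy traffic values are assumptions, and the gain computation is deliberately naive, so it does not reach the O(n² log n) bound quoted above.

```python
from itertools import product


def cut_weight(w, part_a, part_b):
    """Total traffic crossing the cut between the two groups."""
    return sum(w.get(frozenset((a, b)), 0.0) for a, b in product(part_a, part_b))


def theta_kl_pass(w, part_a, part_b, theta):
    """One KL-style pass on a two-way cut, limited to at most `theta` swaps."""
    a_side, b_side = set(part_a), set(part_b)

    def weight(u, v):
        return w.get(frozenset((u, v)), 0.0)

    def d_value(v, own, other):
        # Traffic to the other side minus traffic kept inside v's own side.
        return sum(weight(v, u) for u in other) - sum(weight(v, u) for u in own if u != v)

    locked, swaps, gains = set(), [], []
    for _ in range(min(theta, len(a_side), len(b_side))):
        best = None
        for a in sorted(a_side - locked):
            for b in sorted(b_side - locked):
                gain = d_value(a, a_side, b_side) + d_value(b, b_side, a_side) - 2 * weight(a, b)
                if best is None or gain > best[0]:
                    best = (gain, a, b)
        if best is None:
            break
        gain, a, b = best
        a_side.remove(a); b_side.remove(b)
        a_side.add(b); b_side.add(a)
        locked.update((a, b))  # a swapped VM is not moved again in this pass
        swaps.append((a, b)); gains.append(gain)

    # Keep only the prefix of swaps with the best cumulative gain; undo the rest.
    best_k, best_total, running = 0, 0.0, 0.0
    for k, g in enumerate(gains, start=1):
        running += g
        if running > best_total:
            best_k, best_total = k, running
    for a, b in reversed(swaps[best_k:]):
        a_side.remove(b); b_side.remove(a)
        a_side.add(a); b_side.add(b)
    return a_side, b_side


# Hypothetical traffic between four VMs (symmetric weights).
w = {frozenset(("v1", "v3")): 10.0, frozenset(("v2", "v4")): 10.0, frozenset(("v1", "v2")): 1.0}
new_a, new_b = theta_kl_pass(w, {"v1", "v2"}, {"v3", "v4"}, theta=1)
print(cut_weight(w, {"v1", "v2"}, {"v3", "v4"}), cut_weight(w, new_a, new_b))
```

Each swap moves two VMs, so Θ directly bounds the migration traffic that a pass can cause.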

Simulated Annealing Searching (SA)

• Randomized global searching
• Terminates when a satisfactory solution is obtained, or when the predefined max depth is reached

[Figure labels: MLU=.60, MLU=.53]
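
A minimal sketch of such a search is shown below, with both termination rules from the slide. The swap move, the cooling schedule, the parameter names and the toy cost function are assumptions chosen for illustration; the slides do not give these details.

```python
import math
import random


def anneal_placement(placement, cost_fn, target, max_steps, temp=1.0, cooling=0.95, seed=0):
    """Randomized search over VM placements by swapping the servers of two VMs.

    Stops when the best cost reaches `target` or after `max_steps` steps.
    """
    rng = random.Random(seed)
    current = dict(placement)
    current_cost = cost_fn(current)
    best, best_cost = dict(current), current_cost
    vms = sorted(current)
    for _ in range(max_steps):
        if best_cost <= target:
            break  # satisfactory solution found
        a, b = rng.sample(vms, 2)
        candidate = dict(current)
        candidate[a], candidate[b] = current[b], current[a]
        delta = cost_fn(candidate) - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        temp = max(temp * cooling, 1e-9)
    return best, best_cost


# Hypothetical toy cost: traffic between VMs on different servers,
# divided by the capacity (10.0) of the one link that connects them.
pair_traffic = {("a", "b"): 6.0, ("c", "d"): 6.0}


def toy_cost(placement):
    crossing = sum(r for (u, v), r in pair_traffic.items() if placement[u] != placement[v])
    return crossing / 10.0


start = {"a": "s1", "b": "s2", "c": "s1", "d": "s2"}
best, best_cost = anneal_placement(start, toy_cost, target=0.0, max_steps=200)
print(toy_cost(start), best_cost)
```

Swapping two VMs keeps the number of VMs per server unchanged, which is one simple way to stay within server capacity during the search.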

• Congestion in the Wild
• General Approaches
• Problem Formulation
• Main Design
• Evaluation

Methodology

• Baseline Algorithm
  – Clustering-based algorithm
  – Pro: best-known static optimality
  – Con: high runtime and migration overhead
• Metrics
  – MLU reduction without migration overhead
  – Overhead
    • Migration traffic
    • Runtime overhead
  – Simulation results

MLU Reduction without Overhead

VirtualKnotter demonstrates static performance similar to that of Clustering.

Migration Traffic

VirtualKnotter incurs significantly less migration traffic than Clustering.

Runtime Overhead

VirtualKnotter demonstrates reasonable runtime overhead.

Simulation Results

53% less congestion

Altogether, VirtualKnotter achieves a significant gain in congestion resolving.

Conclusions

• Collaborative VM migration can substantially resolve long-term congestion in DC

• Trade-off between optimality and migration traffic is essential to harvest the benefit

DC networking projects of Northwestern LIST: http://list.cs.northwestern.edu/dcn

Thank you!

Backup

General Approaches

Approach                Cost   Hardware Support   Scalability   Other Dependency
Increase Bandwidth      High   Yes                Varies
Optimize Routing        Low    Yes                Low           Rich path diversity
Optimize VM Placement   Low    No                 High          VM deployment

Problem Statement

• Objective
  – Minimize Maximum Link Utilization (MLU)
  – “Cool down the hottest spot”
• Constraints
  – Migration traffic
  – Server hardware capacity
  – Inseparable VM
• NP-hardness
  – Proof: reduced from Quadratic Bottleneck Assignment Problem
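
One way to write this statement formally is sketched below. All symbols are notation assumed here for illustration and are not taken from the slides: π maps each VM to a server in the set S, π₀ is the current placement, T_ij is the traffic between VMs i and j, R(s, t) is the set of links on the route between servers s and t, c_l is the capacity of link l, m_i is the migration cost of VM i, B is the migration traffic budget, and r_i and C_s are the resource demand of VM i and the capacity of server s.

```latex
\begin{aligned}
\min_{\pi}\quad & \max_{l \in L}\; \frac{1}{c_l} \sum_{i \neq j} T_{ij}\,
    \mathbf{1}\!\left[\, l \in R\big(\pi(i), \pi(j)\big) \right]
    && \text{(minimize MLU)} \\
\text{s.t.}\quad & \sum_{i} m_i\, \mathbf{1}\!\left[\pi(i) \neq \pi_0(i)\right] \le B
    && \text{(migration traffic)} \\
& \sum_{i:\, \pi(i) = s} r_i \le C_s \quad \forall s \in S
    && \text{(server hardware capacity)} \\
& \pi(i) \in S \quad \forall i
    && \text{(each VM sits on exactly one server)}
\end{aligned}
```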

Observation Summary

• Unbalanced jam (spatial)
• Long-term congestion (temporal)
• Predictable at 10s of minutes scale (stability)

Two-step Algorithm

Multiway Θ-Kernighan-Lin Algorithm (KL)
• Fast search for approximation

Simulated Annealing Searching (SA)
• Fine search for better solution