Scalable Rule Management for Data Centers · ToR1 ToR2 S1 S2 S3 S4 S5 S6 ToR3 VM2 VM4 P4 P2. Introduction Motivation Design Evaluation Traffic-aware refinement Overhead greedy approach

Scalable Rule Management for

Data Centers

Masoud Moshref, Minlan Yu,

Abhishek Sharma, Ramesh Govindan

4/3/2013

Motivation Motivation Design EvaluationIntroduction

Introduction: Definitions

2

Datacenters use rules to implement management policiesDatacenters use rules to implement management policies

• Access control• Rate limiting• Traffic measurement • Traffic engineering



3

Datacenters use rules to implement management policiesDatacenters use rules to implement management policiesDatacenters use rules to implement management policies

An action on a set of ranges on flow fieldsAn action on a set of ranges on flow fields

Examples:

• Deny

• Accept

• Enqueue

Flow fields examples:

• Src IP / Dst IP

• Protocol

• Src Port / Dst Port



4

Datacenters use rules to implement management policies

R2

R1

Src IP R1: Accept

� SrcIP: 12.0.0.0/7

� DstIP: 10.0.0.0/8

An action on a set of ranges on flow fields

Dst

IP

R2: Deny

� SrcIP: 12.0.0.0/8

� DstIP: 8.0.0.0/6


Current practice

5

Rules are saved on predefined fixed machines

On hypervisors On switches


Machines have limited resources

6

Top-of-Rack switch

Network Interface Card

Software switches on servers


Future datacenters will have many fine-grained rules

7

Regulating VM pair communication

• Access control (CloudPolice)

• Bandwidth allocation (Seawall)

Per flow decision

• Flow measurement for

traffic engineering (MicroTE, Hedera)

VLAN per server

• Traffic management (NetLord, Spain)

1B – 20B rules

10M – 100M rules

1M rules

Introduction ArchitectureMotivation Design Evaluation

Rule location trade-off (resource vs. bandwidth usage)

8

Storing rules at hypervisor incurs CPU overhead

R0


Rule location trade-off (resource vs. bandwidth usage)

9

Storing rules at hypervisor incurs CPU overheadMove the rule to ToR switch and forward traffic

R0


Rule location trade-off: Offload to servers

10

R1


Challenges: Concrete example

11

Src IP

Dst

IP

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

R0

R1 R5

R6

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

VM0

VM1

VM2

VM3

VM4

VM5

VM6

VM7


Challenges: Overlapping rules

12

R0

R1 R5

R6

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

VM2

VM6

R0

R1

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

Source Placement: Saving rules on the source machine means

minimum overhead

Src IP

Dst

IP

R0

R1

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0


Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

Challenges: Overlapping rules

13

R0

R1

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

VM2

VM6

R4

R0

R1

R2

R3

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

If Source Placement is not feasible

Src IP

Dst

IP


Heterogeneous devices

Challenges

14

Respect resource constraints

Minimize traffic overhead

Traffic changesRule changesVM Migration

Preserve the semantics of overlapping rules

Handle Dynamics

Introduction DesignMotivation Evaluation

Contribution: vCRIB, a Virtual Cloud Rule Information Base

15

vCRIB

Proactive rule placement abstraction layer

Optimize traffic given resource constraints & changes

Rules

R1 R2

R3R4 R3


vCRIB design

16

Source Partitioning with Replication

Topology &Routing

Rules

Partitions

Overlapping

Rules

Minimum Traffic Feasible Placement


R0

R 1

R5

R7

R8

R2

R3 R6R

4

Partitioning with cutting

17

P1

Smaller partitions have more flexibility

Cutting causes rule inflation

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

R0

R 1

R8

R3

R4 R

0

R5

R 1

R7

R2

R3R6


Partitioning with replication

18

R0

R3

R1

A7

R6R

0

R 1

R8

R3

R4 R0

R1

R5R

2

R3

R0

R1R

5R

2

R3

R7

R6

Introduce the concept of

similarity to mitigate inflation

�� , �� ∩ �� R0, R1, R3 � 3

P1 (5 rules) P3 (5 rules)P2(5 rules)

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

00 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

P1 ∪ P2(7 rules)


Per-source partitions

19

R0

R1 R5

R6

R2

R3

R4

0 1 2 3 4 5 6 7

1

2

3

4

5

6

7

0

• Limited resource for forwarding

• No need for replication to

approximate source-placement

• Closer partitions are more similar

Src IP

Dst

IP


vCRIB design: Placement

20


Topology &Routing

Rules

Partitions

PlacementT11

T21 T22

T23

T32

T33

Traffic Overhead

Resource

Constraints



vCRIB design: Placement

21


Topology &Routing

Rules

Partitions


Feasible Placement

Traffic Overhead

Resource

Constraints

Traffic Overhead

Resource-Aware Placement

Traffic-Aware Refinement



FFDS (First Fit Decreasing Similarity)

22

1. Put a random partition on an empty device

2. Add the most similar partitions to the initial partition

until the device is full

Find the lower bound for optimal solution for rules

Prove the algorithm is a 2-approximation of the lower

bound


vCRIB design: Heterogeneous resources

23


Topology &Routing

Rules

Partitions



Feasible Placement

Resource

Heterogeneity

Resource Usage

FunctionTraffic-Aware Refinement


vCRIB design: Traffic-Aware Refinement

24



Partitions

Feasible Placement


Resource Usage

FunctionTraffic Overhead

Topology &Routing

Rules



Traffic-aware refinement

� Overhead greedy approach

1. Pick maximum overhead partition

2. Put it where minimizes the overhead and maintains feasibility

25

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

VM2 VM4

P2

P4

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

VM2 VM4

P2P4


Traffic-aware refinement

� Overhead greedy approach

1. Pick maximum overhead partition

2. Put it where minimizes the overhead and maintains feasibility

�Problem: Local minima

� Our approach: Benefit greedy

26

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

VM2

VM4

P2

P4

Agg1 Agg2

ToR1 ToR2

S1 S2 S3 S4 S5 S6

ToR3

VM2

VM4

P2

P4


vCRIB design: Dynamics

27



Rule/VM Dynamics

Partitions

Feasible Placement


Resource Usage

Function

Major Traffic Changes

Dynamics

Topology &Routing

Rules



vCRIB design

28



Rule/VM Dynamics

Partitions

Feasible Placement


Resource Usage

FunctionTraffic Overhead

Major Traffic Changes

Dynamics

Topology &Routing

Rules

Resource

Constraints

Resource

Heterogeneity

Overlapping

Rules


EvaluationIntroduction Motivation Design

Evaluation

� Comparing vCRIB vs. Source-Placement

� Parameter sensitivity analysis

� Rules in partitions

� Traffic locality

� VMs per server

� Different memory sizes

� Where is the traffic overhead added?

� Traffic-aware refinement for online scenarios

� Heterogeneous resource constraints

� Switch-only scenarios

29


Simulation setup

� 1k servers with 20 VMs per server in a Fat-tree network

� 200k rules generated by ClassBench and random action

� IPs are assigned in two ways:

� Random

� Range

� Flows

� Size follows long-tail distribution

� Local traffic matrix (0.5 same rack, 0.3 same pod, 0.2 interpod)

30

0 1 2 3 4 5 6 70 1 2 3 4 5 6 7


Comparing vCRIB vs. Source-Placement

31

4k_0 4k_4k 4k_6k0

0.05

0.1

0.15

0.2

0.25

0.3

Server memory_Switch memory

Tra

ffic

over

he

ad ra

tio

RangeRandom

Maximum Load is 5K Capacity is 4K

Range is better as similar partitions are from the same source

Random: Average load is 4.2K

Adding more resources helps vCRIB reduce traffic overhead

vCRIB finds low traffic feasible solution


Parameter sensitivity analysis: Rules in partitions

32

Total space

Defined by maximum load on a server

Source placement


0 1000 20000

1000

2000

Average Partition Size

Ave

rag

e S

imila

rity


33

Total space

vCRIB

Source placement


0 1000 20000

1000

2000

Average Partition Size

Ave

rag

e S

imila

rity

Total space

vCRIB


34

Lower traffic overhead for smaller partitions and

more similar ones

vCRIB<10% Traffic

Source placement

A

A


Future work

Conclusion

Conclusion and future work

35

vCRIB allows operators and users to specify rules, and

manages their placement in a way that respects

resource constraints and minimizes traffic overhead.

• Support reactive placement by adding the

controller in the loop

• Break a partition for large number of rules per VM

• Test for other rulesets

Scalable Rule Management for

Data Centers

Masoud Moshref, Minlan Yu,

Abhishek Sharma, Ramesh Govindan

4/3/2013

Documents

Scalable Rule Management for Data Centers · ToR1 ToR2 S1 S2 S3 S4 S5 S6 ToR3 VM2 VM4 P4 P2. Introduction Motivation Design Evaluation Traffic-aware refinement Overhead greedy approach