14
OvS Lookup Optimization Using Two-Layer Table Lookup Intel Labs Sameh Gobriel and Charlie Tai

OvSLookup Optimization - Open vSwitchopenvswitch.org/support/ovscon2016/8/1050-gobriel.pdf · 5 Bloom Filter –Background • With certain false positive rate, bloom filter is used

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

OvS Lookup Optimization Using Two-Layer Table Lookup

Intel LabsSameh Gobriel and Charlie Tai

OvS De Facto Virtual Switch for NFV Environments

2

• General purpose processors with Cache/memory hierarchy can support much larger flow tables.

• Multicores architecture provide a scalable competitive flow classification performance.

• Network appliances use purpose-built H/W & ASICs (e.g., TCAM) for flow classification

• Cost & power consumption are limiting factors to support large number of flows

Monolithic Purpose-built Boxes

NFV

Networking VMs on Standard Servers

3

Open vSwitch Flow Lookup

1. Set of disjoint sub-table with no priority

2. Rule is only inserted into one sub-table (lookup terminates after first match)

3. Lookup is done by sequentially search each sub-table until a match is found

Fig. Vtunes OVS flow lookup process (bypass EMC). Test case: 20 sub-tables, each has 100 rules.

OvS Flow Classification is a bottleneck

4

Two Layer Table Lookup Abstraction for MFC

L Lookups 1 lookup + 1st Level Indirection Overhead

5

Bloom Filter – Background

• With certain false positive rate, bloom filter is used to check if a variable (x,y,z) is a member. Member means the variable has been inserted already.

• We can use bloom filter to check if a flow is inside a sub-table or not, before searching the sub-table.

1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0

X

Hash1

Hash 2

Y

Hash1Hash 2

Lookup Z Lookup Y

6

Bloom Filter – Lookup Scheme

• Before searching into sub-table, we use bloom filter to check if the masked key (sub-key) is a member of the sub-table or not.

Mask 0

Sub-table

Mask 1

Sub-table

Mask 2

Sub-table

Mask 0

Sub-table

Mask 1

Sub-table

Mask 2

Sub-table

bloomfilter0 bloomfilter1 bloomfilter2By using BF

Cycles Breakdown Assuming L-subtable traversal

7

Using Bloom FilterUsing Current Scheme

Hit Cycles = 395 + L * 150

Operation Cycles Repetition

I/O 210 1

Miniflow Extract 103 1

Hash For Submask 97 L

Subtable Sig. Cmp 53 L

Full Key Comparison 82 1

Operation Cycles Repetition

I/O 210 1

Miniflow Extract 103 1

Hash For Bloom Filter 88 L

Bloom Filter Lookup 30 L

Check Subtable Sig.Cmp

53 1

Full Key Comparison 82 1

Hit Cycles = 448 + L * 118

8

Vector Bloom Filter – Lookup Scheme

• Vector Bloom Filter (or vBF) hashes and stores unmasked full keys (like EMC).

• vBF Filter for each sub-table store encountered full keys corresponding to rules in sub-tables

• A new flow always misses vBF (similar to EMC) but can hit a rule in the sub-table.

Mask 0

Sub-table

Mask 1

Sub-table

Mask 2

Sub-table

vBF0 vBF1 vBF2

By using VBF

Mask 0

Sub-table

Mask 1

Sub-table

Mask 2

Sub-table

bloomfilter0 bloomfilter1 bloomfilter2

Full Unmasked Key (512B)

Vector Bloom Filter – Cost Analysis

9

Using vBF

Using Original Scheme

Hit Cycles = 395 + L * 150

Operation Cycles Repetition

I/O 210 1

Miniflow Extract 103 1

Hash For Submask 97 L

Subtable Sig. Cmp 53 L

Full Key Comparison 82 1

Hit Cycles = 607 + L * 30

Operation Cycles Repetition

I/O 210 1

Miniflow Extract 103 1

Hash For XBloom (full Key)

159 1

Bloom Filter Lookup 30 L

Check Subtable Sig. Cmp 53 1

Full Key Comparison 82 1

Packet header

11110000Flow mask

1010 xxxx0011 xxxx1011 xxxx

rules

11100000

101x xxxx001x xxxx101x xxxx

11000000

10xx xxxx00xx xxxx10xx xxxx

vbf vbf vbf1 2 3

Lookup Cycles Based on Model

10

0

500

1000

1500

2000

2500

3000

3500

4000

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Cyc

les

Average Number of Subtables Traversed

OVS BF vBF

~2X

Results and performance figures are for an experimental prototype and is work in progress. The result reflect specific components on a particular test, in specific systems and should not be generalized for actual products. Differences in hardware, software, or configuration will affect actual performance. Results are generated using a model based on processing cycles of Intel Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz using OvS 2.6.0 with 20 sub-table and uniform random traffic.

Counting Bloom Filters to Handle Deletion

Start with an m bit array, filled with 0s.

Hash each item xj in k times. If Hi(xj) = a, add 1 to B[a].

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0B

0 3 0 0 1 0 2 0 0 3 2 1 0 2 1 0B

To delete xj decrement the corresponding counters.

0 2 0 0 0 0 2 0 0 3 2 1 0 1 1 0B

Can obtain a corresponding Bloom filter by reducing to 0/1.

0 1 0 0 0 0 1 0 0 1 1 1 0 1 1 0B

4 bits/counter Probability of Overflow = 6.78 E-17

12

Vector Bloom Filter – Results

• vBF provides significant lookup performance gain when compared with native OvS.

• Gain increases with increasing number of sub-tables.

Fig. 20 subtables, and each su-btable contains various numbers of rules. Note that we disabled EMC for showing the benefits.

1000000

1500000

2000000

2500000

3000000

3500000

40000001 9

25

49

81

12

1

16

9

22

5

28

9

36

1

44

1

52

9

62

5

72

9

84

1

96

1

10

89

12

25

13

69

15

21

16

81

18

49

20

25

22

09

24

01

flows per table

Throughput Comparison

Th

rou

gh

pu

t (P

PS

/Co

re)

vBF

OvS(optimized)

Results and performance figures are for an experimental prototype and is work in progress. The result reflect specific components on a particular test, in specific systems and should not be generalized for actual products. Differences in hardware, software, or configuration will affect actual performance. Results are generated using uniform random traffic with 20 sub-tables running on Intel Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz and using OvS 2.6.0..

~2X~1.6X

13

Conclusion

• Flow Lookup is a performance bottleneck for OvS, especially with increasing number of flows and sub-tables.

• Two layer table architecture optimizes flow lookup in OvS and avoids the sequential search of the sub-tables.

• Vector Bloom Filter (vBF) uses bloom filters as the first layer and can significantly improves lookup performance for OvS.

• Future Work:

• Investigate other technologies to use as the first layer of indirection.

• Realistic traffic pattern and workload

Questions?

[email protected]@intel.com