Memory-Efficient and Scalable Virtual Routers Using FPGA



Author: Hoang Le, Thilan Ganegedara and Viktor K. Prasanna
Publisher: ACM/SIGDA FPGA '11
Presenter: Zi-Yang Ou
Date: 2011/10/12

About FPGA’11


Introduction

Network virtualization is a technique for consolidating multiple networking devices onto a single hardware platform, so as to make efficient use of networking resources. It provides an abstraction of the network functionality away from the underlying physical network.

Figure: separated vs. merged virtual routers.

Contributions

- A simple merging algorithm that makes the total required memory sensitive to the total number of virtual prefixes rather than to the number of routing tables.
- A tree-based architecture for IP lookup in virtual routers that achieves high throughput and supports quick updates.
- Use of external SRAMs to support large virtual routing tables of up to 16M virtual prefixes.
- A scalable design with linear storage complexity and resource requirements.

Set-Bounded Leaf-Pushing Algorithm

To use tree search algorithms, the given set of prefixes must first be processed to eliminate the overlap between prefixes. This elimination process results in a set of disjoint prefixes; leaf pushing achieves it in three steps:

1. Build a trie from the given routing table.
2. Grow the trie to a full tree.
3. Push all the prefixes to the leaf nodes.

The prefix tables expand about 1.6x after leaf pushing, and about 90% of the prefixes are already leaf prefixes.
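A minimal software sketch of those three steps, assuming a binary trie over prefix bit-strings (class and function names and the dict-based input are illustrative, not taken from the paper):

```python
from typing import Dict, List, Optional, Tuple

class TrieNode:
    def __init__(self) -> None:
        self.children: List[Optional["TrieNode"]] = [None, None]
        self.nhi: Optional[int] = None   # next-hop info if a prefix ends here

def build_trie(prefixes: Dict[str, int]) -> TrieNode:
    """Step 1: build a binary trie from {prefix bit-string: next-hop info}."""
    root = TrieNode()
    for bits, nhi in prefixes.items():
        node = root
        for b in bits:
            i = int(b)
            if node.children[i] is None:
                node.children[i] = TrieNode()
            node = node.children[i]
        node.nhi = nhi
    return root

def leaf_push(node: TrieNode, inherited: Optional[int] = None) -> None:
    """Steps 2-3: grow the trie to a full tree and push every prefix to the leaves."""
    nhi = node.nhi if node.nhi is not None else inherited
    if node.children == [None, None]:        # leaf: it keeps the pushed prefix
        node.nhi = nhi
        return
    for i in (0, 1):                         # internal node: make it full ...
        if node.children[i] is None:
            node.children[i] = TrieNode()
    node.nhi = None                          # ... and push its prefix downwards
    for child in node.children:
        leaf_push(child, nhi)

def leaf_prefixes(node: TrieNode, bits: str = "") -> List[Tuple[str, Optional[int]]]:
    """Collect the resulting disjoint (leaf) prefixes after leaf_push()."""
    if node.children == [None, None]:
        return [(bits, node.nhi)]
    out: List[Tuple[str, Optional[int]]] = []
    for i, child in enumerate(node.children):
        out += leaf_prefixes(child, bits + str(i))
    return out
```

For example, leaf-pushing the two overlapping prefixes {"0": 1, "01": 2} yields the disjoint set {"00": 1, "01": 2, "1": None}.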

Set-Bounded Leaf-Pushing Algorithm

There are three steps involved in the algorithm:

1. Move the leaves of the trie into Set 1.
2. Trim the leaf-removed trie.
3. Leaf-push the resulting trie and move its leaves into Set 2.

With l the fraction of leaf prefixes (about 0.9) and k the leaf-pushing expansion factor (about 1.6):

N_S1 = lN
N_S2 = kN(1 − l)
N' = N_S1 + N_S2 = lN + kN(1 − l) = N(k + l − kl)

For l = 0.9 and k = 1.6, N' = 1.06N.
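Plugging in the measured values quoted above (about 90% leaf prefixes, about 1.6x expansion), a quick numeric check of the set sizes against plain leaf pushing:

```python
def set_bounded_sizes(n: int, l: float = 0.9, k: float = 1.6):
    """Set 1 keeps the existing leaves; only the remaining (1 - l)*N prefixes
    are leaf-pushed (expanding by k) into Set 2."""
    n_s1 = l * n              # N_S1 = l * N
    n_s2 = k * n * (1 - l)    # N_S2 = k * N * (1 - l)
    return n_s1, n_s2

s1, s2 = set_bounded_sizes(1_000_000)
print(round((s1 + s2) / 1_000_000, 2))   # 1.06, versus 1.6 for plain leaf pushing
```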

2-3 Tree


Merging Algorithm
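The merging slides themselves are figure-only in this transcript. As a rough software analogue (an assumption about the entry layout, not the authors' exact procedure), merging the per-router disjoint prefix sets can be viewed as a k-way merge of (VID, prefix, length, NHI) entries into one ordered sequence, over which a single shared search tree (the paper uses a 2-3 tree) is built:

```python
import heapq
from typing import List, Tuple

# Assumed entry layout for this sketch: (vid, prefix, length, nhi),
# with `prefix` left-aligned in the address width.
Entry = Tuple[int, int, int, int]

def merge_virtual_tables(tables: List[List[Entry]]) -> List[Entry]:
    """k-way merge of the per-router disjoint prefix sets into one list
    ordered by (vid, prefix). Each input list must already be sorted;
    the merged list can then be loaded into a single shared search tree
    (the paper uses a 2-3 tree)."""
    return list(heapq.merge(*tables))   # default tuple order: vid first, then prefix
```

This matches the contribution stated earlier: total memory scales with the total number of virtual prefixes rather than with the number of routing tables.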

IP Lookup Algorithm for Virtual Router

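As a software analogue of the lookup (again an assumption; the paper performs the search in a pipelined 2-3 tree on the FPGA), a destination address tagged with a virtual router ID can be resolved by a predecessor search over the merged, ordered entries, since each router's prefixes are disjoint after leaf pushing:

```python
import bisect
from typing import List, Optional, Tuple

# Same assumed entry layout as the merging sketch: (vid, prefix, length, nhi).
Entry = Tuple[int, int, int, int]

def lookup(merged: List[Entry], vid: int, addr: int, width: int = 32) -> Optional[int]:
    """Return the next-hop info of the unique disjoint prefix of router `vid`
    that covers `addr`, or None. Because each router's prefixes are disjoint,
    at most one entry can match, so a single predecessor search suffices."""
    keys = [(e[0], e[1]) for e in merged]            # (vid, prefix) search keys
    i = bisect.bisect_right(keys, (vid, addr)) - 1   # last key <= (vid, addr)
    if i < 0:
        return None
    evid, eprefix, elen, enhi = merged[i]
    shift = width - elen
    if evid == vid and (addr >> shift) == (eprefix >> shift):
        return enhi
    return None
```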

Memory Requirement


Here |S1| and |S2| are the sizes of Set 1 and Set 2, L the prefix (address) width, L_P the prefix-length field, L_VID the virtual router ID field, L_NHI the next-hop information field, and L_Ptr1, L_Ptr2 the pointer fields for the entries of Set 1 and Set 2.

M1 = |S1|(L + L_P + L_VID + L_NHI + 2L_Ptr1)
M2 = |S2|(L + L_P + L_VID + L_NHI + 2L_Ptr2)
M = M1 + M2 = (|S1| + |S2|)(L + L_P + L_VID + L_NHI) + 2|S1|L_Ptr1 + 2|S2|L_Ptr2

With L_VID = log m for m virtual routers:

M = (|S1| + |S2|)(L + L_P + log m + L_NHI) + |S1|L_Ptr1 + |S2|L_Ptr2

Since |S1| + |S2| = 1.06N, the storage complexity is O(N log m).
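A small helper that evaluates the first form of M for given field widths (parameter names are illustrative):

```python
def memory_bits(s1: int, s2: int, L: int, Lp: int, Lvid: int, Lnhi: int,
                Lptr1: int, Lptr2: int) -> int:
    """Total memory M = M1 + M2 in bits, following
    M_i = |S_i| * (L + L_P + L_VID + L_NHI + 2 * L_Ptr_i)."""
    m1 = s1 * (L + Lp + Lvid + Lnhi + 2 * Lptr1)
    m2 = s2 * (L + Lp + Lvid + Lnhi + 2 * Lptr2)
    return m1 + m2
```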

Overall Architecture


Memory Management


Virtual Routing Table Update


Scalability

The per-level memory requirement grows exponentially as we go from one level of the tree to the next, so most of the memory is concentrated in the last few levels. Therefore, we can move the last few stages onto external SRAMs.
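A rough illustration of why the last levels dominate, assuming a balanced search tree whose per-level node count grows by a constant fanout (a 2-3 tree node has 2 or 3 children):

```python
def level_sizes(total_nodes: int, fanout: float = 2.0) -> list:
    """Approximate per-level node counts of a balanced search tree with the
    given fanout. The last levels hold most of the nodes, so they are the
    natural candidates for external SRAM."""
    levels, n = [], 1
    while sum(levels) + n < total_nodes:
        levels.append(n)
        n = int(n * fanout)
    levels.append(total_nodes - sum(levels))   # whatever remains in the last level
    return levels

sizes = level_sizes(1_000_000)
print(sum(sizes[-2:]) / 1_000_000)   # the last two levels hold roughly 3/4 of the nodes
```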

IMPLEMENTATION

M = (|S1| + |S2|)(L + L_P + L_VID + L_NHI) + |S1|L_Ptr1 + |S2|L_Ptr2
M = N_S(L + L_P + L_VID + L_NHI + L_Ptr)
M_IPv4 = N_S(32 + 5 + 5 + 6 + 20) = 68 N_S
M_IPv6 = N_S(128 + 7 + 5 + 6 + 20) = 164 N_S

A state-of-the-art FPGA device with 36 Mb of on-chip memory (e.g. Xilinx Virtex-6) can support up to 530K IPv4 prefixes, or up to 220K IPv6 prefixes, without using external SRAM.
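A quick check of these capacity figures, using the per-prefix bit counts from this slide and reading "36 Mb" as 36·10^6 bits (an assumption about the unit):

```python
BITS_IPV4 = 68        # 32 + 5 + 5 + 6 + 20 bits per IPv4 virtual prefix
BITS_IPV6 = 164       # per-prefix bits for IPv6, as given on the slide

ON_CHIP_BITS = 36 * 10**6   # "36 Mb" of on-chip memory, read here as 36e6 bits

print(ON_CHIP_BITS // BITS_IPV4)   # ~529K prefixes -> "up to 530K" for IPv4
print(ON_CHIP_BITS // BITS_IPV6)   # ~219K prefixes -> "up to 220K" for IPv6
```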

IMPLEMENTATION

In our design, external SRAMs can be used to handle even larger routing tables, by moving the last stages of the pipelines onto external SRAM. The architecture can then support up to 16M prefixes for IPv4, or up to 880K prefixes for IPv6.

Experimental Setup


Performance Comparison

Candidates: trie-overlapping (A1), trie braiding (A2)

Time complexity:
- Our algorithm: O(N)
- A1: O(N log N)
- A2: O(N^2)

Memory efficiency:
- Our algorithm: 15 MB for 1.3M prefixes
- A1: 9 MB for 290K prefixes
- A2: 4.5 MB for 290K prefixes

Quick-update capability:
- A1 and A2: must reconstruct the entire lookup data structure
