1

Towards Green Routers: Depth-Bounded Multi-Pipeline Architecture for Power-Efficient IP Lookup

Authors: Weirong Jiang, Viktor K. Prasanna

Publisher: IEEE International Performance, Computing and Communications Conference (IPCCC), 2008

Presenter: Po Ting Huang

Date: 2010/3/16

2

Introduction

Although TCAMs dominate today’s high-end routers, they are not scalable in terms of clock rate and power consumption.

SRAM-based pipeline solutions are considered promising alternatives for high-speed IP lookup engines.

Existing SRAM-based pipeline architectures suffer from high power consumption in the worst cases, due to the large memory size and the long pipeline depth.

3

Outline

Introduction

Pipeline Balancing
  Prefix expansion (Trie Partitioning)
  Height Bounded Split (Subtrie-to-Pipeline Mapping)
  Node-to-Stage Mapping

Architecture
  Index Table
  Pipeline Structure

Performance

4

Prefix expansion (Trie Partitioning)

Initial stride (I): the number of initial bits used to partition the trie.

A larger I produces more, smaller subtries, but results in prefix duplication.

Prefix duplication results in memory inefficiency and may increase the update cost.
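The partitioning step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `partition_by_initial_stride`, the representation of prefixes as bit strings, and the rule that a longer original prefix overrides an expanded shorter one are all assumptions made for the example.

```python
def partition_by_initial_stride(prefixes, stride):
    """Split (bit_string, next_hop) prefixes into subtries keyed by their first `stride` bits.

    Hypothetical helper: a prefix shorter than `stride` is expanded into every
    `stride`-bit root it covers, which is the prefix duplication cost.
    """
    subtries = {}
    # Process shorter prefixes first so that longer (more specific) ones overwrite them.
    for bits, next_hop in sorted(prefixes, key=lambda p: len(p[0])):
        if len(bits) >= stride:
            roots, rest = [bits[:stride]], bits[stride:]
        else:
            missing = stride - len(bits)
            roots = [bits + format(i, "b").zfill(missing) for i in range(2 ** missing)]
            rest = ""
        for root in roots:
            # `rest` is the part of the prefix that remains inside the subtrie.
            subtries.setdefault(root, {})[rest] = next_hop
    return subtries


example = partition_by_initial_stride([("0", "A"), ("01", "B"), ("0110", "C")], stride=2)
```

With stride 2, the one-bit prefix `0` is duplicated into the roots `00` and `01`; in `01` it is then overridden by the more specific prefix `01`. A larger stride would duplicate short prefixes into more roots.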

5

Prefix expansion (Trie Partitioning)

6

Height Bounded Split (Subtrie-to-Pipeline Mapping)

7

Node-to-Stage Mapping

Leaf reduction: Assuming there were 64 next-hop ports in the router, we found that over 99% of the leaf nodes could be removed after the optimization, resulting in up to 35% less memory used.

8

9

Architecture

10

Index Table

#SRAM A = 2^I

#TCAM & SRAM B = 2^(32-Bh)

Index SRAMs store the information associated with each subtrie: (1) the mapped pipeline ID (2) the ID of the stage where the subtrie’s root is stored (3) the address of the subtrie’s root in that stage.
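As a rough illustration of the three fields listed above, an index entry could be modeled like this. The class name `IndexEntry`, the function `index_lookup`, and the choice to key the table by the first I bits of a 32-bit address are assumptions for the sketch; the slides do not specify the exact indexing scheme or field widths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    pipeline_id: int   # (1) the pipeline the subtrie is mapped to
    stage_id: int      # (2) the stage holding the subtrie's root
    root_address: int  # (3) the root's address within that stage


def index_lookup(table, ip, stride):
    """Return the entry for the subtrie selected by the top `stride` bits of a 32-bit address.

    Hypothetical helper: `table` is a plain dict standing in for the index SRAM.
    """
    key = ip >> (32 - stride)
    return table.get(key)


# 192.168.0.1 is 0xC0A80001; its top 12 bits are 0xC0A.
table = {0xC0A: IndexEntry(pipeline_id=2, stage_id=1, root_address=7)}
entry = index_lookup(table, 0xC0A80001, stride=12)
```

The returned entry tells the lookup engine which pipeline to enter and where in that pipeline the search begins.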

11

Pipeline Structure

Constraint 1: If node A is an ancestor of node B in a trie, then A must be mapped to a stage preceding the stage to which B is mapped.

Each node in each pipeline stage stores three pieces of information:

(1) the prefix or the next-hop pointer

(2) the memory address of its child node in the pipeline stage where the child node is stored

(3) the distance to the pipeline stage where the child node is stored.
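A lookup over nodes that carry these three fields might proceed as in the sketch below. It is a software model under stated assumptions, not the hardware design: the names `Node` and `lookup` are hypothetical, each node is assumed to hold one address and one distance per child bit, and each stage's memory is modeled as a Python list.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    next_hop: Optional[str] = None                               # (1) next-hop pointer, if any
    child_addr: Tuple[Optional[int], Optional[int]] = (None, None)  # (2) child address for bit 0 / bit 1
    child_dist: Tuple[int, int] = (0, 0)                         # (3) stages to move forward per child


def lookup(stages, start_stage, start_addr, bits):
    """Walk the pipeline from a subtrie root and return the longest-match next hop."""
    stage, addr, best = start_stage, start_addr, None
    for ch in bits:
        bit = int(ch)
        node = stages[stage][addr]
        if node.next_hop is not None:
            best = node.next_hop
        nxt = node.child_addr[bit]
        if nxt is None:
            return best
        dist = node.child_dist[bit]
        # Constraint 1: a child always sits in a later stage than its ancestor.
        assert dist >= 1
        stage, addr = stage + dist, nxt
    last = stages[stage][addr]
    return last.next_hop if last.next_hop is not None else best


stages = [
    [Node(next_hop="A", child_addr=(None, 0), child_dist=(0, 2))],
    [],  # a stage the search skips over
    [Node(child_addr=(0, None), child_dist=(1, 0))],
    [Node(next_hop="B")],
]
result = lookup(stages, 0, 0, "10")
```

Storing a distance rather than assuming the child is in the very next stage lets a search skip stages, which is what allows nodes to be placed freely as long as Constraint 1 holds.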

12

Performance

Number of pipelines: D = 4

Initial stride: I = 12

Determining the height bound:

A small height bound can reduce the number of memory accesses for IP lookup,

but it yields a large number of subtries, which requires a large table to index those subtries.

13

Selection of Height Bound

Bh = 16, H = Bh + 1 = 17

14

Mapping performance

15

(13 + 4 + 15) bits * 2^14 * 17 * 4 = 4.25 MB

Each stage: 64 KB

Overall clock rate = 1 / (0.8017 ns) = 1.25 GHz

Throughput = 400 Gbps (packet size = 40 bytes)
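The figures on this slide can be checked with a few lines of arithmetic. The sketch assumes that (13 + 4 + 15) is the width of one node entry in bits, that 2^14 is the number of entries per stage, that 17 and 4 are the stage and pipeline counts (H and D), that 0.8017 is a cycle time in nanoseconds, and that one 40-byte packet is looked up per clock cycle. The function names are hypothetical.

```python
def memory_footprint(entry_bits=13 + 4 + 15, entries_per_stage=2 ** 14, stages=17, pipelines=4):
    """Return (bytes per stage, total bytes) for the assumed configuration."""
    stage_bytes = entry_bits * entries_per_stage // 8
    return stage_bytes, stage_bytes * stages * pipelines


def clock_rate_ghz(cycle_ns=0.8017):
    """Clock rate in GHz for a cycle time given in nanoseconds."""
    return 1.0 / cycle_ns


def throughput_gbps(clock_ghz=1.25, packet_bytes=40):
    """Throughput in Gbps, assuming one packet lookup per clock cycle."""
    return clock_ghz * packet_bytes * 8


stage_bytes, total_bytes = memory_footprint()
print(stage_bytes / 1024, "KB per stage")        # 64.0
print(total_bytes / (1024 * 1024), "MB total")   # 4.25
print(round(clock_rate_ghz(), 2), "GHz")         # 1.25
print(throughput_gbps(), "Gbps")                 # 400.0
```

Under these assumptions the per-stage size, total memory, clock rate, and throughput all match the values stated on the slide.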
