Design principles in parser design
Glen Gibb
Dept. of Electrical Engineering
Advisor: Prof. Nick McKeown
Tuesday, May 14, 13
2
Header parsing?
Identify headers & extract fields
> 1 billion packets / second: a new packet every ns
[Figure: a packet whose headers are initially unknown (??) is identified as A, B, C; the Source, Dest., and Proto. fields are extracted]
Routing: Dest. → Next Hop (1, 2, 3, 4)
Firewall (ALLOW / DENY): "Host X can talk to host Y except via HTTP", matched on Source, Dest., and Proto.
3
Almost no prior work
4
Leaping Multiple Headers in a Single Bound: Wire-Speed Parsing Using the Kangaroo System
C. Kozanitis, J. Huber, S. Singh, & G. Varghese
INFOCOM 2010
• Programmable parser
• Parses multiple headers per cycle
• Receives all headers before parsing → high latency
5
400 Gb/s Programmable Packet Parsing on a Single FPGA
M. Attig & G. Brebner
ANCS 2011
• Language to describe header sequences
• Compile into efficient designs on FPGA
• FPGA-centric: commercial switches are ASICs
• Extremely deep pipeline (100+ stages)
6
Neither paper analyzes design trade-offs or presents design principles
7
Outline
1. Packet parsing
2. Understanding parser design
3. Providing flexibility
8
Packet parsing
• Network review
• Parsing process
9
Internet
10
[Figure: a switch with output ports 1 to 4; a table maps Packet Color (⬤) to Output Port (1, 2, 3, 4)]
11
Packet: Header 1 (Ethernet), Header 2 (VLAN), Header 3 (IPv4), Payload
Header: Field 1, Field 2, Field 3, ..., Field n (e.g. Source Address, Destination Address)
Lookup table: Destination → Port: A → 1, B → 2, C → 3, D → 4
12
Switch pipeline: Packets In → Parser → (header fields) → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Packets Out
Headers and their fields:
Ethernet: Src MAC, Dst MAC, Eth Type
VLAN: VLAN ID, Priority
IP: Src IP, Dst IP, Protocol
TCP: Src Port, Dst Port
Fields used by each match table:
Ethernet Forwarding: Src MAC, Dst MAC, Eth Type, VLAN ID
IP Routing: Eth Type, Dst IP
Access Control List: Src MAC, Dst MAC, Eth Type, VLAN ID, Src IP, Dst IP, Protocol, Priority, Src Port, Dst Port
13
Packet parsing
• Network review
• Parsing process
14
Parsing: identify headers & extract fields
Example header sequences: A B C; A D; A B B
Walking a packet whose header types are initially unknown (??):
A: Next: B
B: Len: 20B, Next: C
C: Next: (none)
Extracted fields feed the Next Hop lookup (1, 2, 3, 4)
15
Parse graphs
[Figure: parse graph with nodes A; B, C; D, E; F; example path through the graph: A C D F]
A: Extract fields 1, 2
B: Extract fields 2
C: Extract fields 1
D: Extract fields 2, 4
E: Extract fields 2
F: Extract fields 1, 2
Parse graph is the state machine
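The idea that the parse graph is the state machine can be sketched in a few lines of Python. This is a minimal illustration, not the hardware design from the talk: the per-node field lists come from the slide, the edge set is taken from the A to F example graph used throughout the deck, and the packet representation (a pre-split list of headers, each carrying its own `next` type) is a hypothetical simplification.

```python
# Per-node extraction lists, as shown on the slide (field numbers per header).
EXTRACT = {"A": [1, 2], "B": [2], "C": [1], "D": [2, 4], "E": [2], "F": [1, 2]}

# Edges of the example parse graph (assumed from the A..F example in the deck).
EDGES = {"A": {"B", "C"}, "B": set(), "C": {"D", "E"},
         "D": {"F"}, "E": {"F"}, "F": set()}


def parse(packet, start="A"):
    """Walk the parse graph as a state machine.

    `packet` is a hypothetical pre-split packet: a list of headers, each a
    dict with a 'fields' map (field number -> value) and a 'next' header
    type (or None for the last header).
    Returns a list of (header type, field number, value) tuples.
    """
    state, extracted = start, []
    for header in packet:
        # Field extraction for the current state.
        for field in EXTRACT[state]:
            extracted.append((state, field, header["fields"][field]))
        # Header identification: the current header names the next one.
        nxt = header["next"]
        if nxt is None:
            break
        if nxt not in EDGES[state]:
            raise ValueError(f"no edge {state}->{nxt} in parse graph")
        state = nxt
    return extracted


# The example path A C D F from the slide.
PACKET = [
    {"fields": {1: "a1", 2: "a2"}, "next": "C"},
    {"fields": {1: "c1"}, "next": "D"},
    {"fields": {2: "d2", 4: "d4"}, "next": "F"},
    {"fields": {1: "f1", 2: "f2"}, "next": None},
]
RESULT = parse(PACKET)
```

Each loop iteration corresponds to one state of the machine: extract the fields listed for the current node, then follow the edge named by the header itself.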
16
Parse graphs in the field
Enterprise: Ethernet, VLAN, VLAN, IPv4, IPv6, TCP, UDP, ICMP, ARP/RARP
Enterprise edge: Ethernet, IPv4, IPv6, ARP, RARP, TCP, UDP, GRE, IPsec ESP, IPsec AH, SCTP
Service provider: Ethernet, IPv4, IPv6, MPLS (x5)
Data center: Ethernet, VLAN, VLAN, IPv4, GRE, NVGRE, Ethernet, ARP/RARP, TCP, UDP, VXLAN
[Figure: larger parse graph] Ethernet, VLAN (802.1Q) x2, VLAN (802.1ad), PBB (802.1ah), Ethernet, MPLS x5, EoMPLS, IPv4, IPv6, ARP, RARP, ICMP, ICMPv6, TCP, UDP, GRE, IPsec ESP, IPsec AH, SCTP, VXLAN, NVGRE, IPv4, IPv6
17
What makes parsing hard?
• Many headers
• Many paths
• Variable path lengths
• Variable header lengths
• Header identified by previous header
• Line rate
• Aggressive latency
• Area & power constrained
Example: Ethernet (Len: 14B; Next: IPv4) → IPv4 (Len: 20-60B, here 20B; Next: TCP) → TCP (Len: 20-60B, here 20B) → Payload
64 x 10 Gb/s switch:
• 1 billion pkts/sec
• 250ns port-to-port
• 40W
[Figure: larger parse graph, as on slide 16]
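The "1 billion pkts/sec" figure for a 64 x 10 Gb/s switch can be checked with a short calculation. The sketch below assumes worst-case traffic of minimum-size Ethernet frames (64 bytes, plus 8 bytes of preamble and a 12-byte inter-frame gap on the wire); those framing constants are standard Ethernet values rather than numbers taken from the slide.

```python
PORTS = 64
PORT_RATE_BPS = 10e9          # 10 Gb/s per port, from the slide
MIN_FRAME_BYTES = 64          # minimum Ethernet frame (assumed worst case)
WIRE_OVERHEAD_BYTES = 8 + 12  # preamble/SFD + inter-frame gap


def max_packet_rate(ports, port_rate_bps, frame_bytes=MIN_FRAME_BYTES):
    """Packets per second when every port carries back-to-back frames."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return ports * port_rate_bps / bits_on_wire


PPS = max_packet_rate(PORTS, PORT_RATE_BPS)
NS_PER_PACKET = 1e9 / PPS

print(f"{PPS / 1e9:.3f} billion packets/s, one packet every {NS_PER_PACKET:.2f} ns")
```

Under these assumptions the switch sees roughly 0.95 billion packets per second, or about one packet per nanosecond, which matches the order of magnitude quoted in the talk.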
18
Implementing a parser
Packet data → Header Identification (parse graph: A; B, C; D, E; F) → header types & locations → Field Extraction → Extracted Field Buffer → extracted fields
[Figure: the packet's headers are identified in turn as A, B, C; the Source, Dest., and Proto. fields are copied into the Extracted Field Buffer and used for Access Control (ALLOW / DENY)]
19
Implementing a parser
Header Identification: state machine (parse graph: A; B, C; D, E; F)
Field Extraction: table of fields to extract per header
Header    Extract Fields
A         A1, A2
B         B1
C         C2, C4
⋯         ⋯
20
Data processing width?
[Figure: a packet of unknown headers (??) at packet positions 0, 4, 8, 12 (B), parsed against the graph A; B, C; D, E; F at two processing widths]
Narrower width: 4 cycles, 1 decision/cycle
Wider width: 2 cycles, 2 decisions/cycle
21
Processing width: 1B, 2B, 3B, 16B
Parser construction
Prototype: 2 months
Processing width: 1B, 2B, 2B
Rate: 10 Gb/s, 20 Gb/s, 100 Gb/s
22
Understanding parser design
• Parser generator
• Trade-offs in parser design
23
Inputs: parse graph, processing width, clock, parsers per chip, …
Parser Generator → Parser (Verilog, .v) → Synthesis → netlist, layout, reports (area, power, timing)
24
Parser Generator: Genesis [Shacham et al., IEEE Micro ’10]
Architectural Template + Per-Application Configuration (A = 1, B = 12) → Design Instance
Parser architectural template: mixed Perl/Verilog
//; foreach my $header (@headers) {
//;   my $hdrParser = generate('hdr_parser',
//;                            "hdr_parser_" . $n++,
//;                            Header => $header);
`$hdrParser->instantiate()` (
  .pkt_data (pkt data),
25
Parser design
Inputs: parse graph, processing width → Parser Generator → parser (.v)
Parse Graph & Header Formats:
header {
  name: ____
  fields: ____
  extract: ____
  next-header: ____
}
...
Transitions generated from the parse graph (A; B, C; D, E; F):
A: A→B, A→C
C: C→D, C→E
D: D→F
E: E→F
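The step from header descriptions to per-state transitions can be sketched as follows. This is only an illustration in Python, not the Genesis-based generator from the talk: the dictionary layout is a hypothetical stand-in for the `header { ... }` format, and only the `name` and `next-header` entries are modeled.

```python
# Hypothetical header descriptions for the example graph; only the name and
# the list of possible next headers are modeled.
HEADERS = [
    {"name": "A", "next_header": ["B", "C"]},
    {"name": "B", "next_header": []},
    {"name": "C", "next_header": ["D", "E"]},
    {"name": "D", "next_header": ["F"]},
    {"name": "E", "next_header": ["F"]},
    {"name": "F", "next_header": []},
]


def transitions(headers):
    """Map each header that has successors to its list of 'X→Y' transitions."""
    table = {}
    for header in headers:
        if header["next_header"]:
            table[header["name"]] = [
                f'{header["name"]}→{nxt}' for nxt in header["next_header"]
            ]
    return table


TABLE = transitions(HEADERS)
for state, edges in TABLE.items():
    print(f"{state}: {', '.join(edges)}")
```

Headers without successors (B and F here) produce no transitions, which is why only A, C, D, and E appear in the generated list on the slide.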
26
[Figure: a packet with headers A, B, C, D and the Next Header field that identifies the following header]
Requires buffering to delay processing
Process all data by packet end ⇒ more data some cycles
27
Meeting throughput needs
r = f・w, where r is throughput (rate), f is frequency, and w is data width
One parser of width w, or n parsers (Parser 1 … Parser n) each of width w/n:
r = n・f・w/n
28
Understanding parser design
• Parser generator
• Trade-offs in parser design
29
Data processing width?
r = n・f・w, for n parsers (Parser 1 … Parser n), each of width w; r is fixed for the switch
Single instance: build a single parser of rate r (r = const, n = 1 ⇒ f ∝ 1/w)
Multiple instances: build multiple parsers with total rate r (r = const, f = const ⇒ n ∝ 1/w)
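The relation r = n・f・w can be turned into a small calculator for the two cases on the slide. The sketch below is illustrative only: the function names are made up for this example, and the 1 GHz clock in the multiple-instance case is an assumed value, not one given in the talk.

```python
import math


def required_clock_hz(rate_bps, width_bytes, instances=1):
    """Single-instance case: f = r / (n * w), so f is proportional to 1/w."""
    return rate_bps / (instances * width_bytes * 8)


def required_instances(rate_bps, width_bytes, clock_hz):
    """Multiple-instance case: n = r / (f * w), so n is proportional to 1/w."""
    return math.ceil(rate_bps / (clock_hz * width_bytes * 8))


# One 10 Gb/s parser: doubling the width halves the required clock.
for width in (2, 4, 8, 16):
    mhz = required_clock_hz(10e9, width) / 1e6
    print(f"width {width:2d} B -> {mhz:7.3f} MHz")

# 640 Gb/s aggregate at an assumed 1 GHz clock: doubling the width halves n.
for width in (2, 4, 8):
    print(f"width {width} B -> {required_instances(640e9, width, 1e9)} instances")
```

For the single 10 Gb/s instance this gives 625 MHz at 2 B down to about 78 MHz at 16 B; for the 640 Gb/s aggregate at the assumed clock it gives 40, 20, and 10 instances.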
30
Single parser instance
[Plot: gates (0 to 8 M) and power (0 to 600 mW) vs. processing width (2, 4, 8, 16 B); 10 Gb/s, big parse graph]
Area: narrow width
Power: slow clock
31
Aggregating parsers
[Plot: gates (0 to 2 M) and power (0 to 600 mW) vs. rate per instance (10 to 80 Gb/s); 640 Gb/s, big parse graph]
Area: independent of instance rate and count
Power: prefer fewer fast parsers
32
Parse graph impacts area
[Plot: gates (0 to 2 M) vs. rate per instance (10, 20, 40, 80 Gb/s) for the Enterprise, Enterprise Edge, Service Provider, and Big parse graphs; 640 Gb/s aggregate]
Why?
33
Extracted fields dominate area
[Plot: gates (0 to 2 M), split into Field Result Buffer, Field Extraction, and Header Identification, for the Enterprise, Enterprise Edge, Service Provider, and Composite parse graphs; 640 Gb/s, 40 Gb/s per instance]
Field result buffer widths, in the order above: 672 b, 888 b, 688 b, 1664 b
34
[Plot: gates (0 to 2 M) vs. field result buffer width (0 to 2000 b); 640 Gb/s, 40 Gb/s per instance; points labeled 672b, 888b, 688b, 1672b, plus a graph with 3 headers and 1672b of extracted fields]
Area determined by extracted field buffer size
35
Design principles
Single parser instances
• area → minimize by reducing width
• power → minimize by reducing clock
Aggregating instances for throughput
• area → independent of instance rate & count
• power → minimize using few fast instances
Extracted field buffer dominates area
• area determined by total size of extracted fields
36
Providing flexibility
• RMT model
• Programmable parser
• Generating parse table entries
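Since area tracks the total size of the extracted fields, a first estimate of the field buffer width is just a sum of field widths. The sketch below uses the fields named on the switch-pipeline slide with their standard bit widths (IPv4 addresses and TCP ports are assumed); it is an illustration of the bookkeeping, not the buffer layout used for the 672 b to 1672 b results in the talk, which cover more fields.

```python
# Field widths in bits, from the respective protocol definitions
# (802.3 Ethernet, 802.1Q VLAN tag, IPv4, TCP).
FIELD_BITS = {
    "Src MAC": 48, "Dst MAC": 48, "Eth Type": 16,
    "VLAN ID": 12, "Priority": 3,
    "Src IP": 32, "Dst IP": 32, "Protocol": 8,
    "Src Port": 16, "Dst Port": 16,
}


def buffer_width_bits(fields):
    """Total width of a buffer that holds every listed field once."""
    return sum(FIELD_BITS[name] for name in fields)


ACL_FIELDS = list(FIELD_BITS)               # the access control list uses all of them
IP_ROUTING_FIELDS = ["Eth Type", "Dst IP"]  # routing needs far fewer

ACL_WIDTH = buffer_width_bits(ACL_FIELDS)
ROUTING_WIDTH = buffer_width_bits(IP_ROUTING_FIELDS)
print(ACL_WIDTH, ROUTING_WIDTH)
```

The parser has to provision the buffer for the widest set of fields any match table needs, so adding headers or fields to the parse graph grows this sum directly.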
37
Parser specific to one parse graph
[Figure: a switch whose parser supports a single parse graph, S1]
Switch = S1
38
Switch pipeline: Packets In → Parser → (header fields) → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Packets Out
• CPU
• GPU
• FPGA
• OpenFlow/SDN?
39
Multiple Match Table (MMT): Packets In → Parser → (header fields) → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Packets Out
Reconfigurable Multiple Table (RMT): Packets In → Programmable Parser → Reconfigurable Match + Action Tables → Recombine → Queues → Packets Out
40
RMT architecture
[Figure: IN → Stage 1 … Stage n, each a Match Table plus Action → Recombine → Output Queues → OUT; the packet is split into HEADER (H) and DATA, the header passes through the stages, and the two are recombined]
41
RMT Match Tables
[Figure: Logical Tables 1 to 6 mapped onto Physical Stage 1, Physical Stage 2, …, Physical Stage n]
42
Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN
P. Bosshart, G. Gibb, H.S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica & M. Horowitz
SIGCOMM 2013 [to appear]
43
Providing flexibility
• RMT model
• Programmable parser
• Generating parse table entries
44
Providing programmability
Packet data → Header Identification → header types & locations → Field Extraction → Extracted Field Buffer → extracted fields
Header Identification (hard-coded): C: C→D, C→E; D: D→F; E: E→F
Field Extraction (hard-coded): A: A1, A2; B: B1; C: C2, C4; ⋯
Replace hard-coded logic with programmable logic
Curr. State    Match Values    Next State
A              A1, A2          B
B              B1              --
C              C2, C4          D
⋯              ⋯               ⋯
45
Parse graph: A; B, C; D, E; F
Current State    Match Values    Next State
A                11 (A→B)        B
A                A→C             C
C                C→D, D→F        F
C                C→E             E
46
Parser state table
Columns: Current State, Match Values, Next State, Header Length, Next Match Offsets, Next Lookup Mask, Extract Fields
Storage: TCAM or RAM; RAM
Some columns are optional
Header Length: next header location
Next Match Offsets: next match locations
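A table-driven parser of this kind can be modeled in software as a ternary match on (current state, match values) that returns the next state and header length. The sketch below is a hypothetical simplification: the `Entry` type, the `lookup` helper, and the Ethernet example entries are invented for illustration, and the remaining columns (next match offsets, lookup mask, extract fields) are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    state: str                 # Current State
    value: int                 # Match Values (bits compared against the packet)
    mask: int                  # ternary mask: 1 = must match, 0 = don't care
    next_state: Optional[str]  # Next State (None = stop parsing)
    header_length: int         # bytes to skip to reach the next header


def lookup(table, state, key):
    """Return the first entry matching (state, key), as a TCAM would."""
    for entry in table:
        if entry.state == state and (key & entry.mask) == (entry.value & entry.mask):
            return entry
    return None


# Illustrative entries for an Ethernet header (14 B), keyed on the EtherType.
TABLE = [
    Entry("ethernet", 0x0800, 0xFFFF, "ipv4", 14),
    Entry("ethernet", 0x8100, 0xFFFF, "vlan", 14),
    Entry("ethernet", 0x0000, 0x0000, None, 14),  # wildcard: unknown type, stop
]

HIT = lookup(TABLE, "ethernet", 0x0800)
DEFAULT = lookup(TABLE, "ethernet", 0x86DD)
```

Entry order matters: as in a TCAM, the first matching row wins, so the all-wildcard row must come last.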
47
Cost of programmability
[Plot: gates (0 to 7 M) for Fixed vs. Programmable parsers, split into Extracted Field Buffer, Hdr Ident/Field Extract, TCAM (State Table), and RAM (State Table); area annotations: 4.4 mm², 2.6 mm²]
Programmability costs 1.5-3x
State table size determines area increase
48
Take-aways
Cost of programmability
• 1.5-3x fixed parser area
State table dominates additional area
• area → minimize TCAM and RAM
• Parse graph edge count determines table size
49
Providing flexibility
• RMT model
• Programmable parser
• Generating parse table entries
50
Naïve generation of state table entries
[Plot: TCAM table size (0 to 150 Kb) vs. processing width (1 to 16 B)]
51
State table entry generation
Parse graph: A; B, C; D, E; F
Current State    Match Values    Next State
A                11 (A→B)        B
A                A→C             C
C                C→D, D→F        F
C                C→E             E
Merge nodes to minimize edges
Problem: graph clustering is NP-hard
52
Kangaroo
Intuition: iteratively identify minimal edge clustering starting at leaves
Kangaroo’s algorithm:
• assumes access to data anywhere in header region
• gives non-minimal solutions for non-trees
53
Improving solution for non-trees
Two independent solutions
Solution: solve shared regions independently
54
Streaming-aware algorithm
[Figure: Kangaroo compared with streaming; the Next Hdr fields are marked in the packet]
55
Streaming-aware algorithm
Kangaroo:
$OPT(n, b) = \min_{c \in Clusters(n)} \left( entries(c) + \sum_{j \in Fringe(c)} OPT(j, \ldots) \right)$
Streaming:
$OPT(n, b, w) = \min_{c \in Clusters(n, w)} \left( entries(c) + \sum_{j \in Fringe(c)} OPT(j, \ldots, NewLoc(w, j, c)) \right)$
New parameter: window location (w)
Node clusters restricted by:
• window location
• window size
Updated location for subgraphs: NewLoc(w, j, c)
56
Algorithm performance
O(|E||V|dk)
Method                                  40b TCAM (8b state + 2 x 16b inputs)    56b TCAM (8b state + 3 x 16b inputs)
Naive                                   342 entries, 0.48s                      641 entries, 0.48s
Algorithm (excluding non-tree logic)    177 entries, 2.6s                       170 entries, 5.5s
Algorithm                               112 entries, 128.7s                     106 entries, 207.6s
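The entry counts in the table translate directly into TCAM storage, since each entry occupies one row of the stated width. The short Python calculation below uses only the numbers from the table; the bit totals and reduction factors are derived here and do not appear on the slide.

```python
# Entry counts from the table, keyed by (method, TCAM width in bits).
ENTRIES = {
    ("Naive", 40): 342,
    ("Naive", 56): 641,
    ("Algorithm (excluding non-tree logic)", 40): 177,
    ("Algorithm (excluding non-tree logic)", 56): 170,
    ("Algorithm", 40): 112,
    ("Algorithm", 56): 106,
}

# TCAM bits = entries x row width.
TCAM_BITS = {key: count * key[1] for key, count in ENTRIES.items()}


def reduction(width):
    """How many times fewer entries the full algorithm needs than naive generation."""
    return ENTRIES[("Naive", width)] / ENTRIES[("Algorithm", width)]


for (method, width), bits in TCAM_BITS.items():
    print(f"{method:38s} {width}b: {bits:6d} TCAM bits")
print(f"reduction: {reduction(40):.2f}x (40b), {reduction(56):.2f}x (56b)")
```

With these figures the full algorithm needs 4,480 bits (40b rows) or 5,936 bits (56b rows), roughly 3x and 6x fewer entries than naive generation, at the price of the longer run times listed in the table.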
57
Benefits of parallel lookups?
[Plot: table entries required (0 to 120) and TCAM bits required (0 to 8000) vs. data arrival rate (32, 64 bits/cycle) for 1, 2, 3, and 4 lookups; some configurations are marked "Unable to process at arrival rate"]
Minimize parallel lookups for single instance
58
Contributions
Parser generator
Parser design trade-off analysis & principles
Fixed parsers
• Single parser instances: area → minimize by reducing width; power → minimize by reducing clock
• Aggregating instances for throughput: area → independent of instance rate & count; power → minimize using few fast instances
• Extracted field buffer dominates area
Programmable parsers
• Cost of programmability is low (1.5-3x)
• State table dominates area increase
RMT model
State table generation algorithm
59
Publications
Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN. Bosshart, P., Gibb, G., et al. SIGCOMM 2013 [to appear].
Outsourcing network functionality. Gibb, G., Zeng, H., and McKeown, N. HotSDN '12.
Initial Thoughts on the Waypoint Service. Gibb, G., Zeng, H., and McKeown, N. WISH '11.
Can the Production Network be the Testbed? Sherwood, R., Gibb, G., et al. OSDI '10.
A Packet Generator on the NetFPGA platform. Covington, G.A., Gibb, G., et al. FCCM '09.
NetFPGA – An Open Platform for Teaching How to Build Gigabit-rate Network Switches and Routers. Gibb, G., et al. IEEE Transactions on Education '08.
NetFPGA: Reusable Router Architecture for Experimental Research. Naous, J., Gibb, G., et al. PRESTO '08.
Building a RCP (Rate Control Protocol) Test Network. Dukkipati, N., Gibb, G., et al. Hot Interconnects '07.
NetFPGA--An Open Platform for Gigabit-Rate Network Switching and Routing. Lockwood, J., et al. MSE '07.