Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
BuildingCompilersforReconfigurableSwitches
LavanyaJose,LisaYan,NickMcKeown,andGeorgeVarghese
1ResearchfundedbyAT&T,Intel,OpenNetworkingResearchCenter.
Inthenext20minutes
• Fixed-funcPonswitchchipswillbereplacedbyreconfigurableswitchchips
• WewillprogramthemusinglanguageslikeP4• WeneedacompilertocompileP4programstoreconfigurableswitchchips.
2
Fixed-FuncPonSwitchChips
QueuesL2
StageIPv4StagePa
rser IPv6
StageACLStage
L3
L2
Packet
Packet
3
ControlFlowGraph
QueuesL2
StageIPv4StagePa
rser IPv6
StageACLStageL2
Table
IPv4Table
IPv6
Table
ACLTable
L2v4
v6ACL
ControlFlowGraph
SwitchPipeline
FixedAcPo
n
FixedAcPo
n
AcPon
FixedAcPo
n
4
Fixed-FuncPonSwitchChipsAreLimited
1. Can’taddnewforwardingfuncPonality2. Can’taddnewmonitoringfuncPonality
5
Fixed-FuncPonSwitchChips
QueuesL2
StageIPv4StagePa
rser IPv6
StageACLStageL2
Table
IPv4Table
IPv6
Table
ACLTable
FixedAcPo
n
FixedAcPo
n
AcPo
n
FixedAcPo
n
L2v4
v6ACL
ControlFlowGraph
SwitchPipeline
MyEncapMyEncap
6
7
Fixed-FuncPonSwitchChipsAreLimited
1. Can’taddnewforwardingfuncPonality2. Can’taddnewmonitoringfuncPonality3. Can’tmoveresourcesbetweenfuncPons
QueuesL2Stage
IPv4Stage
Parser IPv6
Stage
ACLStageFi
xedAcPo
n
FixedAcPo
n
AcPo
n
L2Table
FixedAcPo
n
IPv4Table
IPv6
Table
ACLTable
ControlFlowGraph
SwitchPipeline
ReconfigurableSwitchChips
Queues
Parser
FixedAcPo
n
FixedAcPo
n
FixedAcPo
n
L2Table
FixedAcPo
n
IPv4Table
IPv6Table
ACLTable
MatchTable
MatchTable
MatchTable
MatchTable
L2v4
v6ACL
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
8
MatchTable
AcPo
nMacro
MappingControlFlowtoReconfigurableChip.
Queues
Parser
MatchTable
MatchTable
MatchTable
L2Table
IPv4Table
IPv6Table
ACLTable
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
L2v4
v6ACL
ControlFlowGraph
SwitchPipeline
L2
v6ACL
v4
L2AcPon
Macro
v4AcPon
Macro
v6AcPon
ACLAcPo
nMacro
9
ControlFlowGraph
SwitchPipeline
ReconfigurableSwitchChips
Queues
Parser
L2Table
IPv4Table
ACLTable
IPv6
MyEncap
L2v4
v6ACL
MyEncapL2AcPon
Macro
v4AcPon
Macro
ACLAcPo
nMacro
AcPo
n
MyEncap
AcPo
n
IPv4
AcPo
n
IPv4
AcPo
n
10
IPv6
IPv4
AcPo
n
Parser
L2Table
IPv4Table
IPv6Table
ACLTable
MatchMemory
AcPonALU
ProtocolIndependentSwitch
L2AcPon
Macro
v4AcPon
Macro
v6AcPon
Macro
ACLAcPo
nMacro
11
Parser
L2Table
IPv4Table
IPv6
ACL
Table
IPv4Table
IPv4
Table
L2AcPon
Macro
v4AcPon
Macro
AcPo
n
AcPo
n
AcPo
n
12
Match+AcPonProcessor:pipelinedandin-parallel
13
Reconfigurability:thenormin5years
• Reconfigurabilityaddsmostlytologic.• LogicisgeangrelaPvelysmaller.• Thecostofreconfigurabilityisgoingdown.• Fixedswitchchipareatoday:– I/O(40%),Memory(40%),– Wires,Logic
SwitchI/O(30%)
Memory(30%)
Wires(20%)
Logic(20%)
SwitchI/O
Memory
Wires Logic
14
FixedFunc-onBroadcomTomahawk:3.2TbpsReconfigurableCaviumXpliant:3.2Tbps
15
Reconfigurablechipsareinevitable.
16
ConfiguringSwitchChipsP4code
Compiler
CompilerTarget
Queues
Parser
FixedAcPo
n
FixedAcPo
n
FixedAcPo
n
L2Table
FixedAcPo
n
IPv4Table
IPv6Table
ACLTable
MatchTable
MatchTable
MatchTable
MatchTable
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
17
Queues
Parser
FixedAcPo
n
FixedAcPo
n
FixedAcPo
n
L2Table
FixedAcPo
n
IPv4Table
IPv6Table
ACLTable
MatchTable
MatchTable
MatchTable
MatchTable
P4(hgp://p4.org/)
parser parse_ethernet {! extract(ethernet);!select(latest.etherType) {! 0x800 : parse_ipv4;! 0x86DD : parse_ipv6;! }!}!
�table ipv4_lpm {! reads {! ipv4.dstAddr : ! lpm;! }! actions {! set_next_hop;! drop; ! }!}!
control ingress !{! apply(l2_table);! if (valid(ipv4)) {! apply(ipv4_table);! }! if (valid(ipv6)) {! apply(ipv6_table);! }! apply (acl);!}!
L2 v4v6
ACL
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
18
(ANCS’13)Parser
MatchAcPonTables
ControlFlowGraph
Whatdoesreconfigurabilitybuyus?
19
• Useresourcesefficiently– MulPpletablesperstage– BigtableinmulPplestages
• UsefewerstagesL2
IPv4
IPv6ACL
BenefitsofReconfigurability
20
NaïveMapping:ControlFlowGraphParser
MatchTable
MatchTable
MatchTable
MatchTable
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
AcPo
nMacro
L2v4
v6ACL
ControlFlow
SwitchPipeline
Queues
L2Table
IPv4Table
IPv6Table
ACL
Table
L2
v6ACL
v4
AcPo
n
v4AcPon
Macro
v6AcPon
Macro
AcPo
n
21
ControlFlowGraph
L2
TableDependencyGraph(TDG)
v4
v6ACL
L2
v4
v6
ACL
TableDependencyGraph
22
SwitchPipeline
EfficientMapping:TDG
Queues
Parser L2Table
IPv4Table
IPv6Table
TableDependencyGraphControlFlowGraph
L2v4
v6ACLL2
v4
v6
ACLAcPo
n
v4AcPon
Macro
v6AcPon
Macro
23AC
LTable
AcPo
n
L2
ControlFlowGraph
SwitchPipeline
Resourceconstraints
v4
v6ACL
Queues
Parser
L2Table
IPv6
IPv4
L3
L2
v6
v4
L2AcPon
Macro
v4AcPon
Macro
v6AcPon
Macro
AcPo
n
ACLTable
24
25
Headerwidths
AcPonALUinputMemoryType
Tableparallelism
Moreresourceconstraints
AcPonMemory
MapmatchacPontablesinaTDGtoaswitchpipelinewhilerespecPngdependencyand
resourceconstraints.
TheCompilerProblem
26
Step1:P4Program
Step2:ControlFlowGraph
L2v4
v6ACL
27
Step3:TableDependencyGraph
L2 v4
v6ACL
Step4:TableConfiguraPon
Isthatit?
28
TwoSwitchesWeStudied1 2 3 4 32…
14
3 5
2
RMT32Stages
(SIGCOMM2013)
FlexPipe5Stages
(IntelFM6000)
29
AddiPonalswitchfeatures
L2
v4
v6
ACL
L2
L2
v4
v6
TableshapinginRMT TablesharinginFlexPipe
30
31
MapmatchacPontablesinaTDGtoaswitchpipelinewhilerespecPngdependencyand
resourceconstraints.
TheCompilerProblem
Tableshaping Tablesharing
Headerwidths AcPonALUinput
MemoryType
TableparallelismAcPonMemory
Firstapproach:Greedy
• PrioriPzeoneconstraint• Sorttables• MaptablesoneataPme
Queues
Parser
32
12 3
3 Sortby#dependencies
Firstapproach:Greedy
Queues
Parser
1
• PrioriPzeoneconstraint• Sorttables• MaptablesoneataPme
2 3 4
11 Sortbymatchwidth
33
ToomanyconstraintsforGreedy
• AnygreedymustsorttablesbasedonametricthatisafixedfuncPonofconstraints.
• Asthenumberofconstraintsgetslarger,it’sharderforafixedfuncPontorepresenttheinterplaybetweenallconstraints.
• Canwedobegerthangreedy?
34
Secondapproach:IntegerLinearProgramming(ILP)
FindanopPmalmapping.
Pros:• Takesinallconstraints• DifferentobjecPves• Solversexist(CPLEX)
35
Cons:• Blackboxsolver• Encodingisanart• Slow
ILPSetup
min#stagessubjectto:
≥
≤
dependencyconstraints
36
tablesizesassignedmemoriesassigned
tablesizesspecifiedmemoriesinphysicalstage
ExperimentSetup
• 4datacenterusecasesfromIntel,Barefoot
• Differintables,tablesizes,anddependencies
37
ExampleUseCase
ATypicalTDG
38
IPv6-Mcast
EG-ACL1
EG-Phy-Meta
IG-Agg-Int
IG-Dmac
IPv4-Mcast
IPv4-Nexthop
IPv6-Nexthop
IG-Props
IG-Router-Mac
Ipv4-Ecmp
IG-Smac
Ipv4-Ucast-LPM
Ipv4-Ucast-Host
Ipv6-Ucast-Host
Ipv6-Ucast-LPM
Ipv6-Ecmp
IG_ACL2
IG_Bcast_Storm
Ipv4_Urpf
Ipv6_Urpf
IG_ACL1
EG_Props
IG_Phy_Meta
ConfiguraPonforRMT
Metrics:GreedyvsILP
1. Abilitytofitprograminchip
2. OpPmality
3. RunPme
39
Setup:GreedyvsILP
1. Abilitytofit:FlexPipe– Variantsofusecasesin5-stagepipeline.
2. OpPmality:RMT– Minimumstage,pipelinelatency,power
3. RunPme:bothswitches
40
Results:GreedyvsILP
1. CanGreedyfitmyprogram?– Yes,ifresourcesaplenty(RMT,32stages)– No,ifresourcesconstrained(FlexPipe,5stages),
Can’tfit25%ofprograms.2. HowclosetoopPmalisGreedy?– 30%morePmeforpackettogetthroughRMTpipeline.
3. Hmm..lookslikeIneedILP.Howslowisit?– 100xslowerthanGreedy– Reasonableifprogramsdon’tchangeoven.
41
IfwehavePme,weshouldrunILP.
42
UseILPtosuggestbestGreedyforprogramtype.
43
CriPcalconstraints• DependencycriPcal:16à13stages• AddiPonalresourceconstraintslessimportantCriPcalresources• TCAMmemoriescriPcal:16à14stages– ResultsforoneofourdatacenterL2/L3usecases
Conclusion• Challenge:Parallelismandconstraintsinreconfigurablechipsmakescompilingdifficult.
• TDG:highlightsparallelisminprogram.• ILP:begerifenoughPme,fiangiscriPcal,orobjecPvesarecomplicated.
• BestGreedy:ILPcanchoosevianoPonofcri1calconstraintsandcri1calresources.
44
Thankyou!
45ResearchfundedbyAT&T,Intel,OpenNetworkingResearchCenter.
ILPRunPme
• Numberofconstraints?Notobvious.E.g.,RMT– Min.stage:fewsecs.– Min.power:fewsecs.– Min.pipelinelatency10xslower
• Numberofvariables?Howfine-grainedistheresourceassignment?E.g.,FlexPipe– OnematchentryataPme:manydays..– 100-500matchentriesataPme:<1hr