
Slide 1 The Architecture of the M40: A Backbone IP Router Pradeep Sindhu March 11, 2004


Slide 1

The Architecture of the M40: A Backbone IP Router

Pradeep Sindhu

March 11, 2004


Slide 2

The M40: Juniper’s First Product

- Put the entire forwarding path in hardware for the first time
- Achieve line-rate performance for 8 2.5 Gbps interfaces
- Do it against overwhelming competition
- Do it with a limited budget and a small team (40 at FCS)
- Do it in two years

We didn’t understand the full complexity at the beginning - it unfolded only as we went along!

We succeeded only because the M40 team was incredibly talented and driven

We took 2 years and 4 months

Copyright © 2000, Juniper Networks, Inc. Slide 2


Slide 3

So What Is a Backbone IP Router Anyway?

Certain Minimum Qualifications
- Capable of switching IP & MPLS datagrams: L3 forwarding
- Symmetric any-port-to-any-port switching speed
- Delay-bandwidth buffering plus congestion control
- Internet scale routing tables
- Internet scale IS-IS, OSPF, MPLS, BGP4

Today’s Benchmark
- Line rate forwarding on all ports for 40-byte packets
- Performance independent of load
- Support of CoS queuing, shaping and policing
- L2 and L3 VPNs
- Traffic engineering
- Classification and filtering at line rate


Slide 4

Why Are They Hard to Build?

Bottom line: inherent complexity

- Scaling along multiple dimensions
  - Bandwidth, packets per second
  - #interfaces, #channels, #routes, #neighbors, #policies, #filters
- Unpredictable, hostile environment
- Need for reliable, seamless interoperability
- System design and partitioning is non-intuitive
- Deep technical expertise across multiple disciplines
  - Software: routing protocols, embedded systems, network management
  - Hardware: ASIC design, board design, high speed circuit design
  - Mechanical: power, packaging, thermal, emissions
- Changing requirements
- Building routers requires a special viewpoint
  - The network is the system, not the box
  - Routers uniquely integrate the network at scale


Slide 5

Specifications

Hardware
- 20 Gbps line rate; 40 Mpps; 400 ms buffer
- POS 8xOC-48, 32xOC-12, 128xOC-3
- ATM 32xOC-12, 128xOC-3
- 128xDS-3
- 32xGbE
- 34" x 19" x 26"

Software
- BGP4, OSPF, IS-IS, MPLS/RSVP
- DVMRP, PIM SM & DM
- Control, configuration & monitoring
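The hardware figures above imply two useful numbers: the total buffer size and the average time available per packet. The sketch below works them out in Python. It assumes the 400 ms of buffering applies to the full 20 Gbps aggregate rate, which the slide does not state explicitly; the function names are illustrative only.

```python
# Figures taken from the Specifications slide.
LINE_RATE_BPS = 20e9      # 20 Gbps aggregate line rate
PACKET_RATE_PPS = 40e6    # 40 Mpps forwarding rate
BUFFER_SECONDS = 0.400    # 400 ms of buffering


def buffer_bytes(rate_bps: float, seconds: float) -> float:
    """Delay-bandwidth product: bits held for `seconds` at `rate_bps`, in bytes."""
    return rate_bps * seconds / 8


def per_packet_budget_ns(pps: float) -> float:
    """Average time available to forward one packet, in nanoseconds."""
    return 1e9 / pps


# Assumption: the 400 ms figure is measured against the whole 20 Gbps.
print(buffer_bytes(LINE_RATE_BPS, BUFFER_SECONDS) / 1e9, "GB of buffer")
print(per_packet_budget_ns(PACKET_RATE_PPS), "ns per packet")
```

Under that assumption the buffer comes to about 1 GB, and the forwarding path has roughly 25 ns per packet on average.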


Slide 6

Forwarding Engine Approach

- Design is based on highly integrated silicon
- Treat silicon as an empty canvas
- Let technology set the limits, not preconception
- Apply computer design experience
- Use high volume components where possible
- Entire forwarding path in hardware; no corner cases
- Partition design around clean, stable interfaces

Why?
- Every major advance in systems in the last 30 years can be traced ultimately to silicon integration
- Companies that have bet against the compounding power of integration have died
- History will repeat itself because exponentials are hard for people to understand


Slide 7

Design Philosophy

- No compromise line-rate performance. PERIOD.
- No assumptions needed about
  - traffic conditions
  - packet sizes
  - interface types
  - encapsulation types
  - etc.
- Hardware needs to handle worst-case conditions


Slide 8

System Level Partitioning

Problem is broken into two roughly equally complex parts that interact infrequently

[Diagram: Routing Engine (RE), a general-purpose computer (Pentium based) handling control packets only, connected by Fast Ethernet to the Forwarding Engine (FE), specialized hardware handling all packets]

Why this partitioning is good
- Loading of one does not affect the other, eliminating a common failure mode of legacy routers
- Facilitates independent hardware and software development and early software testing
- RE is standard off-the-shelf Intel platform, so it leverages industry advances in computer design
- RE can be leveraged across multiple generations of FEs with no change
- Software structure can now be clean because software is not burdened with real-time considerations

Good architecture is the art of defining clean, stable interfaces; it is the only way we know to build anything complex


Slide 9

Routing Engine

- Standard 233 MHz Pentium PC
- 256 MB memory
- Specialized BIOS for booting
- LS-120
- Flash memory
- Hard Disk for dumps
- 100BT link to Forwarding Engine


Slide 10

Software Structure: JunOS

- Built for scale using modern OS design principles
  - Strong protection
  - Modularity
  - Clean, stable interfaces
- Reliable, maintainable, serviceable
- Average of three to four major releases per year

[Diagram: RPD, DCD, MgD, ChassisD and Apps on top of the JUNOS Kernel; group labels: Routing Protocols, Mgmt Apps]


Slide 11

Forwarding Engine Architecture

[Block diagram; surviving labels: M, C, 603, A1, A2, BI 0 to BI 7, DI 0 to DI 7 (the BI/DI group appears twice), SRAM]


Slide 12

Chipset

- A: implements A1 & A2 (1x)
- B: implements BI, BO, and BM (8x)
- C: implements route lookup (1x)
- D: implements SONET & POS


Slide 13

Physical Structure

- Active Backplane
  - contains A1 and A2 chips
- Up to 8 FPCs
  - each FPC has up to 4 PICs
  - each FPC has 603 control processor
  - each PIC handles up to 622 Mbps line rate
- 1 SCB
  - 603 control processor, memory, Ethernet
  - C chip and route lookup memory
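As a quick consistency check, the physical limits above can be multiplied out and compared with the 20 Gbps line rate quoted on the Specifications slide. This is plain arithmetic on the slide's numbers; the function name is illustrative.

```python
# Figures taken from the Physical Structure slide.
MAX_FPCS = 8            # up to 8 FPCs per chassis
MAX_PICS_PER_FPC = 4    # up to 4 PICs per FPC
PIC_RATE_MBPS = 622     # each PIC handles up to 622 Mbps


def aggregate_gbps(fpcs: int, pics_per_fpc: int, pic_rate_mbps: float) -> float:
    """Total line rate of a fully populated chassis, in Gbps."""
    return fpcs * pics_per_fpc * pic_rate_mbps / 1000


print(MAX_FPCS * MAX_PICS_PER_FPC, "PIC slots")
print(aggregate_gbps(MAX_FPCS, MAX_PICS_PER_FPC, PIC_RATE_MBPS), "Gbps")
```

The 32 PIC slots line up with the 32xOC-12 configuration, and the total of about 19.9 Gbps matches the nominal 20 Gbps figure.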


Slide 14

Card Cage

[Diagram: card cage showing the active backplane, FPCs carrying PICs, the SCB, and the airflow direction]


Slide 15

Terminology

- Stream: source of non-interleaved packets
- Cell: 64 byte datum
- Notification: 16 byte pointer to packet + control bits
- Bank: unit of main memory on one FPC
- Key: variable length quantity used to do route lookup


Slide 16

Memory Organization

- Divided into 64 byte cells
- Logically one giant buffer
- Physically distributed among line-cards
  - Two 72 bit wide DIMMs
  - 125 MHz clock
- Packets read and written as cells
- Cells written as they arrive
- No garbage collection
- Cells chained together via offsets
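The idea of writing cells in arrival order and chaining them via offsets can be illustrated with a toy model. The sketch below is not the M40's actual memory layout: the class `CellMemory`, its fields, and the choice to store a relative offset in each cell are assumptions made for illustration, and wrap-around simply overwrites old cells.

```python
from typing import List, Optional

CELL_BYTES = 64  # cell size from the slide


def cellify(packet: bytes) -> List[bytes]:
    """Split a packet into cells of at most 64 bytes."""
    return [packet[i:i + CELL_BYTES] for i in range(0, len(packet), CELL_BYTES)]


class CellMemory:
    """Toy cell buffer: cells land in arrival order, packets are offset chains."""

    def __init__(self, num_cells: int) -> None:
        self.num_cells = num_cells
        self.payload: List[bytes] = [b""] * num_cells
        # Hypothetical encoding: offset from this cell to the packet's next cell.
        self.next_offset: List[Optional[int]] = [None] * num_cells
        self.write_ptr = 0

    def write_cell(self, data: bytes, prev: Optional[int] = None) -> int:
        """Store one cell at the write pointer and link it to the previous cell."""
        addr = self.write_ptr
        self.write_ptr = (self.write_ptr + 1) % self.num_cells
        self.payload[addr] = data
        self.next_offset[addr] = None
        if prev is not None:
            self.next_offset[prev] = (addr - prev) % self.num_cells
        return addr

    def read_packet(self, first: int) -> bytes:
        """Follow the offset chain from the first cell and reassemble the packet."""
        parts, addr = [], first
        while True:
            parts.append(self.payload[addr])
            offset = self.next_offset[addr]
            if offset is None:
                return b"".join(parts)
            addr = (addr + offset) % self.num_cells


# Two streams whose cells arrive interleaved, so neither packet is contiguous.
mem = CellMemory(num_cells=16)
pkt_a, pkt_b = bytes(range(150)), b"x" * 100
cells_a, cells_b = cellify(pkt_a), cellify(pkt_b)
first_a = last_a = mem.write_cell(cells_a[0])
first_b = last_b = mem.write_cell(cells_b[0])
last_a = mem.write_cell(cells_a[1], last_a)
last_b = mem.write_cell(cells_b[1], last_b)
last_a = mem.write_cell(cells_a[2], last_a)
print(mem.read_packet(first_a) == pkt_a, mem.read_packet(first_b) == pkt_b)
```

Only the address of the first cell is needed to recover a packet, which is what lets a small notification stand in for the packet elsewhere in the system.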


Slide 17

Packet Flow: Input

[Diagram: Line -> DI -> BI -> A1 -> C; link labels: BD interface, Cells to memory, Key+info]

- DI: SONET decapsulation; POS/HDLC
- BI: Layers 2 and 3; cellification; write cells to memory; IIF determination; input accounting
- A1: switch cells to memory; build ICells; forward Key to C


Slide 18

Packet Flow: Route Lookup

[Diagram: A1 sends Key + Info to C; C, backed by SRAM, sends Result + Info to A2]

- Key = variable # bits (up to 31 bytes)
- Result = nexthop_id + destMask


Slide 19

Packet Flow: Output

[Diagram: C -> A2 -> BO -> DO -> Line; link labels: Result+info, Cells from memory, BD interface]

- A2: switch cells from memory; forward notification to BO
- BO: output queueing; read cells from memory; packetization; nexthop lookups; Layers 2 and 3; output accounting
- DO: POS/HDLC; SONET encapsulation


Slide 20

Output Queuing

- Arriving notifications queued by BO
  - 4 queues per stream
  - weighted round robin service
  - random-early drop
- Each notification is 16 bytes
  - pointer to start of packet
  - first few offsets
  - next-hop id
  - control bits
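Weighted round robin over the four per-stream queues can be sketched as follows. This is a generic textbook form of WRR, not the BO chip's actual scheduler: the weights, the queue contents, and the function name are assumptions for illustration, and random-early drop is not modelled.

```python
from collections import deque
from typing import Deque, List


def weighted_round_robin(queues: List[Deque[str]], weights: List[int]) -> List[str]:
    """Drain the queues, serving up to weights[i] notifications from queue i per round.

    Every weight must be at least 1, otherwise a non-empty queue is never served.
    """
    served: List[str] = []
    while any(queues):                      # an empty deque is falsy
        for queue, weight in zip(queues, weights):
            for _ in range(weight):
                if not queue:
                    break
                served.append(queue.popleft())
    return served


# Four queues for one stream; the weights are hypothetical.
queues = [deque(["a1", "a2", "a3"]), deque(["b1", "b2"]), deque(["c1"]), deque()]
weights = [2, 1, 1, 1]
order = weighted_round_robin(queues, weights)
print(order)  # ['a1', 'a2', 'b1', 'c1', 'a3', 'b2']
```

Because only 16-byte notifications move through the queues, the packet data itself stays in cell memory until it is read out for transmission.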


Slide 21

Route Lookup

Generic Problem: Find best (longest) match in table

Our Solution: JTree

[Diagram: a Key steering a walk down a tree with 0/1 branches to a Result; entries shown as Pattern + Mask]
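The generic longest-match problem can be illustrated with a plain binary trie. This is a standard textbook structure and is not a description of JTree, whose internals the slide does not give; the class names, the bit-string keys, and the next-hop labels are all assumptions for illustration.

```python
from typing import List, Optional


class TrieNode:
    """One node of a binary trie; `result` is set where a prefix ends."""

    def __init__(self) -> None:
        self.children: List[Optional["TrieNode"]] = [None, None]
        self.result: Optional[str] = None


class BinaryTrie:
    """Longest-prefix match over keys written as strings of '0' and '1'."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, prefix_bits: str, result: str) -> None:
        node = self.root
        for bit in prefix_bits:
            index = int(bit)
            if node.children[index] is None:
                node.children[index] = TrieNode()
            node = node.children[index]
        node.result = result

    def lookup(self, key_bits: str) -> Optional[str]:
        """Walk the key bit by bit, remembering the deepest result seen."""
        node = self.root
        best = node.result
        for bit in key_bits:
            node = node.children[int(bit)]
            if node is None:
                break
            if node.result is not None:
                best = node.result
        return best


table = BinaryTrie()
table.insert("", "default")        # zero-length prefix matches everything
table.insert("10", "nexthop-A")
table.insert("1011", "nexthop-B")
print(table.lookup("10110000"))    # nexthop-B (longest match is 1011)
print(table.lookup("10010000"))    # nexthop-A (only 10 matches)
print(table.lookup("01000000"))    # default
```

A one-bit-per-step walk like this is the simplest form of the technique; hardware lookup engines typically consume several bits per memory access to reduce the number of steps.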


Slide 22

Input Switch Organization

- Input switch connects BIs to memory
- Memory implemented by multiple banks
- Cells of each stream are written to increasing bank number
- Perfect pattern guarantees freedom from bank conflicts
- Simple TDM discipline suffices!
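One way to see why a fixed TDM schedule can be conflict free is a rotating assignment of inputs to banks. The schedule below, in which input `p` writes to bank `(p + slot) mod N`, is an assumed example rather than the M40's documented schedule; the choice of 8 inputs and 8 banks is likewise an assumption based on the maximum of 8 FPCs.

```python
from typing import List

NUM_PORTS = 8   # assumption: one BI per FPC
NUM_BANKS = 8   # assumption: one memory bank per FPC


def tdm_bank(input_port: int, slot: int, num_banks: int = NUM_BANKS) -> int:
    """Bank written by `input_port` in time slot `slot` under a rotating schedule."""
    return (input_port + slot) % num_banks


def banks_in_slot(slot: int) -> List[int]:
    """Banks targeted by every input in one time slot."""
    return [tdm_bank(port, slot) for port in range(NUM_PORTS)]


for slot in range(NUM_BANKS):
    banks = banks_in_slot(slot)
    # Every input hits a different bank, so no arbitration is needed.
    assert len(set(banks)) == NUM_PORTS
    print(slot, banks)
```

Each input also steps through the banks in increasing order from slot to slot, which matches the slide's description of how a stream's cells are spread across banks.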


Slide 23

Output Switch

- Output switch connects BOs to memory
- Same multiple bank memory
- Reads can be a lot more chaotic
  - deterministic within a packet
  - but not across packets
- Only probabilistically conflict free
- Reservation table handles conflict cases
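The slide does not say how the reservation table works, so the following is only a guess at the general idea: record which bank is busy in which time slot, and push a conflicting read to the next slot in which its bank is free. The class `ReservationTable` and its method are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class ReservationTable:
    """Hypothetical model: at most one read per bank per time slot."""

    def __init__(self) -> None:
        # slot number -> set of banks already reserved in that slot
        self.reserved: Dict[int, Set[int]] = defaultdict(set)

    def reserve(self, bank: int, earliest_slot: int) -> int:
        """Book the first slot at or after `earliest_slot` in which `bank` is free."""
        slot = earliest_slot
        while bank in self.reserved[slot]:
            slot += 1
        self.reserved[slot].add(bank)
        return slot


table = ReservationTable()
print(table.reserve(bank=3, earliest_slot=0))  # 0: bank 3 is free
print(table.reserve(bank=3, earliest_slot=0))  # 1: conflict, deferred one slot
print(table.reserve(bank=5, earliest_slot=0))  # 0: different bank, no conflict
```

In this model conflicts cost latency rather than correctness, which is one reading of "only probabilistically conflict free".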


Slide 24

Control Channel

- Provides PIO channel to chips
- Used for booting & configuration

[Diagram: control channel reaching the A1, C, A2 and B chips]


Slide 25

High Speed Links and Clocking

- Single synchronous domain
- Single ended, low voltage (GTL)
- Clock sent with data
- 250 Mbits/sec per wire
- 16 bit wide data path => 4 Gbps


Slide 26

Retrospective

- 2.3 years from start to product launch
- Small team: 8 ramping to 40
- Combined a lot of different areas of expertise
- No major mistakes, just a lot of little ones
- System building experience was key
  - Average experience was ~10 years
  - Average person had delivered 2 systems
- Implementation was incredibly optimized
- Great one-time leverage of knowledge from computer industry to networking industry
- This product launched the company
- This product changed the way routers are built