Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Stanford Guest Lecture Seminar Course: Technologies in Finance
April 08, 2019
MohanKalkunte,Ph.D.VicePresident,Architecture&Technology
CoreSwitchingGroupBroadcomInc.
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Design of a Switch Chip
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Before you decide to build a chip • BusinessCase• MarketSegments• MajorCustomers• Keyfeaturesofchip• Risks• ROI
• MarketSegment
• HyperscaleDataCenter• e.g.,Facebook,Microsoft,Google,Amazon
• EnterpriseDataCenter• e.g.,LargeFortune500company
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Where in the Network
• TopofRack(ToR)
• Leaf/Spine
• Aggregatoretc.
FacebookDatacenterArchitectureSource:https://code.fb.com/data-center-engineering/f16-minipack/
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Switch Silicon Design: A Highly Constrained Problem Features
Throughput
Capacity
Area(Cost)Power
Latency
Buffering
Optimalswitchsiliconneedstomeetorexceedonallthesevectors
ANDWithinthemarket
window
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Typical Data Center features • Highbandwidth• Lowerlatency• LargeIPRouting• EqualCostMultiPath(ECMP)• Hashing• ACLs• Monitoring
• sFlow• Mirroringetc.
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
• Currenthard-limitonsilicondiesize– 26mmx33mm– dictatedbyreticlesize– Practicalsize~800sqmm– Tightmarginforerror
Constraint: Max Die Size limit
http://www.silicon-edge.co.uk/j/index.php/resources/die-per-wafer
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
• Onetimecost–amortizedovertheproductvolume– Developmentcost– Maskcosts
• Devicecost– Die+package+test– Yield
– Improvesovertimethenflattens– Fallsexponentiallywithsizeorcomplexity
– Repairisamustformemories
• Memoryisrepairable– Rowandcolumnredundancy– Lowercostpersqmmformemoryafterrepair
Constraint: Cost Source:anysilicon.com
Source:www.ee.ryerson.ca/~courses/coe838/lectures/SoC-IC-Basics.pdf
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
• Tomahawk3–256x50G–12.8Tbps
• Singledeviceswitchbandwidthkeepingupwithexponentialincrease
• Criteria– Reach
– CopperCables–Highersignallossperunitdistance
– Optics:lowersignallossperunitdistance– Cost/area
Constraint: IO Speed (Serdes speed)
Source:ethernetalliance.orgroadmap
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Constraint: Power Dissipation
Noredundancyforfanfailure-Fanshavehigherfailurerate
Heatsink
Immersioncooling
Coldplatetechnology
Heatsinkwithheatpipes
PowerDen
sity
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Constraint: Process Geometry
Source:semiengineering.com/knowledge_centers/manufacturing/lithography/impact-of-lithography-on-wafer-costs/
IntelCPUperformanceinSpecIntCPUisrisingatjustthreepercent/year,saidPatterson.Source:ComputerArchitecture:AQuantitativeApproach,2018.
BroadcomSwitchGeometryProgression
Power
Transistor
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Choice of Buffer Architecture • Manybufferarchitecturesarepossible
• Whichisthebestchoice?
• Depends
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
EFFICIENT BUFFER ARCHITECTURE • Highburstabsorption
• Unusedpacketbufferavailablefortransientcongestion
• Fairnessundercongestion• Fairaccesstoallportsandqueuesunderheavytrafficload
• AvoidStarvation• Congestedportshouldnotstarveuncongestedports
• Lowframeloss• Highzero-lossthroughputperformance
• TrafficIndependentPerformance• Buffermanagementwithminimaltuning
• Scalableacrossmultiplegenerations
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Chip Development Process
BusinessCase• Marketing
ProgramCommit• Engineering
RTL• Design
Verification• Arch• block
Netlist• TimingClosure
Layout• Floorplan• Timing
Tapeout• checklists
Samples• SystemVerification
ProductionRelease• VolumeProduction
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
TOMAHAWK FAMILY
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Band
width
Extensibilit
y
Three High Performance Switch Architectures - Broadcom
Features
10.0T
Band
width
5.0T
1.0T
MassiveBandwidthforHyperscaleFabrics
Programmable,Feature-RichSwitchesforEnterpriseandDataCenter
Scale-Out,ConvergedCarrier-GradeInfrastructure
Trident
Jericho
Tomahawk
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Scaling up the Network with Merchant Silicon
2010 2012 2014 2016 2018 2020
0.64Tbps64xserdes10GNRZ
40nm
1.28Tbps128xserdes10GNRZ
40nm
3.2Tbps128xserdes25GNRZ
28nm
6.4Tbps256xserdes25GNRZ
16nm
12.8Tbps256xserdes50GPAM-4
16nm
25.6Tbps50Gor100GPAM-4
7nm
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Data Center Market
Source:https://www.max.ee/sites/default/files/dell_open_networking_vision_and_portfolio_.pdf
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
TOMAHAWK 3
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Tomahawk 3: By the numbers • 12.8Tb/smultilayerLayer3switching• Configurableas32x400GbE,64x200GbE,or128x100GbE• 256dual-mode–56G-PAM4and28G-NRZ• 40%Powerreductionper100GbEport• 75%lowercostper100GbEport• Integratedshared-bufferarchitecture• BroadviewGen3networkinstrumentation• IPforwarding,ECMP• DynamicLoadBalancingandGroupMultipathing• In-bandNetworkTelemetry• 16nmprocessgeometry• InProductionnow
Source:https://www.broadcom.com/blog/broadcom-s-tomahawk-3-ethernet-switch-chip-delivers-12-8-tbps-of-speed-in-a-single-16-nm-device
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Tomahawk 3 Architecture
Source:https://www.linleygroup.com/mpr/article.php?id=11908
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Terminology • VLAN–VirtualLAN
• VirtualLAN• L2Table
• Tablelookedupwithkey=DestinationMACaddress• Determinetheoutgoingport
• L3Table• Tablelookedupwithkey=DestinationIPaddress• Determinetheoutgoinginterface/port
• ACL–AccessControlList• Implementsaccesscontrolpolicies
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Day in the life of a Packet
MAC Parser VLANTable
L2Table
L3Table
ACLTable
PacketEdits
EgressACL MAC
PacketBuffer
.
.
.
.
PacketReceive
Checkframing,CRC
Packetheaderfieldsareparsed
VLANassignment
DestinationMACAddressLookupàPort
DestinationIPAddressLookup
àPort
AccessControlLists
Packetqueuedtooutputport Packetheader
editsEgressAccessControlList
Framing,CRCgeneration Packettransmit
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
PacketssteeredbasedonFlowHashingo Distributeflowsequallyamonglinksasmuchaspossibleo Switchchipshouldhavecapabilitytoprovide
o Sufficientdepthofparsing
o Hashing
o Abilitytohandledifferenttypesofflows
Example: ECMP Load Balancing
ECMPGroup
A
B
C
D
Server
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Tomahawk 3 enables Cost and Power Reduction
12.8T
75%ReductioninSystemPower,85%reductioninSystemCost*
Tomahawk Tomahawk
Tomahawk Tomahawk Tomahawk
Source:Facebook,OCP
*PowerMetricIncludesOptics,CostMetricExcludesOptics
FBBackpack NextGen
128x100GbE 32x400GbE/128x100GbE
128xQSFP28 32xQSFP-DDorOSFP
8UChassis 1UFixed
12x3.2T 1x12.8T
Capacity
Frontpanel
Size
#SwitchChips
3.2T 3.2T 3.2T 3.2T
3.2T 3.2T 3.2T 3.2T 3.2T 3.2T 3.2T 3.2T
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
HardwarePlatform
Industry’s Broadest Ecosystem
Orchestration&Automation
NetworkController
Ryu
NVController
OperatingSystem
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
Key Takeaways • SwitchSilicondevelopmentisabout18to24monthprocess
• Requiresinvestmentof50–100milliondollars
• Coolingtechniquesarechallengingandexpensive
• ProcessGeometryisnotyieldingcostandpoweradvantage
• Monolithicdiesmaybereplacedwithmulti-dieinapackage
©2019Broadcom.AllRightsReserved.Theterm“Broadcom”referstoBroadcomInc.
thank you