Dis-Aggregation as a Vehicle for Hyper-Scalability in

Preview:

Citation preview

Dan Kilper

May 14, 2018

Dis-AggregationasaVehicleforHyper-ScalabilityinOpticalNetworks

HyperScaleComputing

• Methodtoscaledatacentersto‘warehouse’sizes• 100k’sservers• Entiredatacenterbecomesthesystem

• Hardware/softwareseparationenabledDC-widecontrol• Tradeoffserverperformanceforcost&DCperformance

• Merchantsiliconopeneddoorfordatacenteroperatorstodesigntheirownservers

• EnabledholisticDCarchitectures• Computer‘integrators’bouncedbackbydesigningwholerackandpodsolutions

2

DensificationofWirelessAccess

3

MetroCore

LongHaul

MetroCore

DistributionRings

PON

P2P

MicrowaveBH

AccessLink

WDMmmWave

CoreOCS

OLT

LongHaul

WDM-PON

AccessOCS

MacroRH

Micro/pico RH

BBU/DC

CORD/BBUpool

TodayFuture

• Networkoperatorsrequesting10k’sofaccesspointsineachUScity• Eachaccesspoint>10Gb/sbackhaul/fronthaul• Operatorsofferingwholewavelengthaccess(e.g.Pilot)

WhatisDis-Aggregation?• Dis-aggregationiseconomicconcept

• Differentvendorsprovidepartsthatmakeupasystem

• Whethertodisaggregateisusuallydrivenbymarketandsupplychainconsiderations

• Dis-aggregationisanarchitectureconcept• Physicalorcontrolintegrationisseparated• Oftendeterminedbyperformancerequirements

4

MarketDrivenComputerDis-AggregationEnabledHyperscale DCArchitecture

TwoMainDriversforDis-Aggregation• Market

• Whenperformanceislessimportant• Whenscalabilityisneeded• Usemarketcompetitiontodrivedowncost

• Performance• Whencomponentperformanceismoreimportantthansystemperformance

• Whentechnologiesreachnewperformancelevelsenablingdisaggregation

• Usearchitectureenhancementstodrivedowncost

6

ConventionalDataCenter

ToRServerServerServerServerServerServerServerServerServerServer

ToRServerServerServerServerServerServerServerServerServerServer

ToRServerServerServerServerServerServerServerServerServerServer

PackServersintoRacks

ToR

Dis-aggregatedDataCenter

CPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEMCPU/MEM

ToRDISKDISKDISKDISKDISKDISKDISKDISKDISKDISK

ToRSSDSSDSSDSSDSSDSSDSSDSSDSSDSSD

ResourceperShelf

Dis-aggregatedDataCenter

ToRCPU

MEMORYCPU

MEMORYCPU

MEMORYCPU

MEMORYCPU

MEMORY

ToRDISKDISKDISKDISKDISKDISKDISKDISKDISKDISK

ToRSSDSSDSSDSSDSSDSSDSSDSSDSSDSSD

ToR

Dis-aggregatedDataCenter

CPUCPUCPUCPUCPUGPUGPUASICASICASIC

ToRDISKDISKDISKDISKDISKDISKDISKDISKDISKDISK

ToRMEMORYMEMORYMEMORYMEMORY

SSDSSDSSDSSDSSDSSD

WhyDis-AggregateAgain?

• Ifyouhaveopticstothecomponentsthenincreaseinterconnectdistancesto~100m

• Latencyrequirementbecomesthelimitation

• Isserveroptimumcombinationofcpu/memory/disk/storage/NIC?

• Canvirtualizationbemoreefficientifremoveartificialboundariescreatedbyserverarchitecture?

• ServermemorylockedtoCPUs• Doesserverallowforbestnetworkarchitecture?• Optimizethermalmanagementtodevicerequirements

• Atshelfandracklevel

ArchitectureDis-AggregationBenefits

12

BringingOpticsInsidetheComputer• CPUIOBottleneck:

• NeedopticsforCPUtomemoryinterconnects• Itsgoingtobetherenomatterwhat

• Whataretheprospectsforscalingthisto10-100m?• D.A.B.MillerProc.IEEE2009

• Embeddedoptics:movingtheNIContotheboard• ExpandingtheNICandintegratingitonboard

• DataCenterOpticalNetworks• Ifyouhaveanetwork,whynotdis-aggregate?

EmbeddedOptics

CPU

NIC

NIC

HighCapacityElectricalInterconnects

Startstolooklikeanopticalline

card…

Dis-AggregatingOpticalSystems

15

SomeHistory• Late90’s:MCI/Globecom triedtobuildtheirownsystemsfromcomponents

• ~2000:UnifiedcontrolplaneattempttomergecontrolofopticalsystemsintoL3control

• GMPLS/MPLSwasresult• Mid00’s:JDSU/Nortelintroduce‘generic’ROADMbuildingblocksystems

• Late00’s:Coherenttransceiverschangesystemengineering(nodispersionmaps,PMD)

• Early10’s:Enterprises/DCoperatorsbuildtheirownopticalnetworks

• 2020:5Giscoming!

16

OpticalSystemVendors• HistoricallyopticalsystemvendorsNOT‘systemintegrators’

• Opticalsystemsareengineeredproducts• Componentsandsub-systemshighlyspecifictosystemdesign• Tightlycoupledhardwareandsoftwaredesign• LongR&Dandtestcyclestodevelopproduct

• Keyquestion:Canopticalsystemvendorsmovetosystemintegratormodel?

• SimilartoDellorHP• Oroperatingsystemmodel?e.g.Microsoft

17

Hyperscale Attributes

• Largenumbersofaccesspoints(ROADMnodes)• Gofrom100’spercityto10k-100kpercity• Designedatthenetworkleveltoachievescalability

• Unifiedandscalablesoftwarecontrol• Remove‘siloing’– hardwaretiedtosoftware(operatingsystem)

18

ProprietaryOpticalSystems

19

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

NetworkOrchestrator/OperatingSystem

OLSControl OLSControl

OLSManagementSystem OLSManagementSystem

OTN/L2

OTN/L2

OTN/L2OTN/L2

OTN/L2L2/L3

TransceiverDisaggregation(Alienls)

20

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

NetworkOrchestrator/OperatingSystem

OLSControl OLSControl

OLSManagementSystem OLSManagementSystem

OTN/L2

OTN/L2

OTN/L2OTN/L2

OTN/L2L2/L3

Whitebox/openROADM Systems

21

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

NetworkOrchestrator/OperatingSystem

Menara:BuiltinOTN

ComputerSystemIntegration• Stillvalueinmatchingcomponentstomotherboardandgoodsystemdesignprinciples

22

Whitebox/OpenOpticalNetworks

23

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

ROADM

NetworkOrchestrator/OperatingSystem

OLSControl&ManagementSystem

CostModels:Where’stheSavings?

24Riccardi,et.al.JLT2018

TransceiverSavings:AvoidRegens

• Withalmostnoregeneration

25J.Santoset.al.JOCN2018

WithRegeneration

• Disaggregationpenalty&networkdomainsmakeadifference

26J.Santoset.al.JOCN2018

TransmissionReach

27

InMetro&datacenternetworks:Distance=#Hops2000km~20hops

Bosco,et.al.JLT2011

Higherordermodulation

OpticalPowerDynamics• OpticalpowerdynamicsinOADMringnetwork

• Simulations&modelingofchannelpoweroscillationsandinstability

• L.PavelAutomatica 2004• Gorinevsky &FarberJLT2004

28

SustainedOscillationsoverLongPeriods

DynamicDomainPowerControlAlgorithm• Powerdriftsovertimeandnewchannelsareprovisioned:needperiodicpowercontroltostaywithinmargins

• Adjustnodesinparallelwithin‘optically’isolateddomains

• Nodeorderingbasedonchannelroutes

29

[1]

A B

[1,2,3,4] [1,4]

[3]

[2]C

D

[3]

[4]

[ ]

Wait for Chn 1

Ready to Adjust

[i,j,k] = channels adjusting upstream

4

511

12

22

23

3130

29

28

27

2625

24

21

13 15

20

18

19

1614

109

17

87

123

6

Kilper&WhiteOFC2007

Objectives:• tunabledrop(reject)

4-channeltunableadd4+1channelVOA

• 100,000timessmaller• approx.250mW• nomovingparts

WDM Network Node on-a-Chip:Lower performance, but much lower cost

R.Aguinaldo,H.Grant,S.Mookherjea(UCSD)+Sandia

Channel 36 Channel 35

Channel 34 Channel 33

100 ps

A B

C D

1.3mmx0.52mm

8fiberV-groovearray

4x10Gbps addedfromindividualINfiberstocommonOUT

Common (23 ch)OUT (23 ch)Diag (“Test”)A B, C, D (in)

In/Out

all channels on ITU-T 100 GHz grid 30

SystemLevelIssues• Transceiver&systemperformanceinteractions

• Biggerproblemforbleedingedgeperformance• Transceiverscomplexsystemsontheirown

• Blockingbadcornercases• Handlingthewiderangeofsystemfunctions• Systemtestingpullsinmargins

• Toomanyuncertainties

• Controldynamics• Opticalpowerdynamics

31

ResearchQuestions• Atwhatmetroreach(numberofnodehops)dothedifferentdisaggregationmodelsbecomeproblematic?Forwhichtransceivertypes?

• Howdoesphysicallayersoftwarecontrolscalewithnumberofnodes?

• DICONETandotherexamplesforlonghaulneedtobeadaptedhere

• Needtoolstodevelopandtestcontrolatscale(seenexttalk)

• Whatcomponentscanbescaledtoverylargenumbers?

• Needintegratedphotonics

32

Conclusions• Computingsystemsaregoingthroughmultipleroundsofdisaggregationinordertocontinuehyperscale growth

• Marketand/orperformancedrivenarchitecturalchange

• 5Gcreatespotentialforopticalsystemstojumptohyperscale models

• Notjustaboutopeningcompetitionfortransceivers,needfullnetworkdesignforhyperscale growth

• Transmissionengineeringremainsanobstacle• Hardware&Software• Neednewtoolstackleproblem(machinelearning?)

• Savingsneedtocomefromhighvolumes:needtothinkhyperscale

33

www.cian-erc.org

CenterforDis-IntegratedandDis-AggregatedNetworks

Thank You

Our Group:https://wp.optics.arizona.edu/dkilper/

CIAN:www.cian-erc.org

35

Recommended