www.chameleoncloud.org
FEBRUARY 5, 2016
CHAMELEON: BUILDING A RECONFIGURABLE EXPERIMENTAL TESTBED FOR LARGE-SCALE CLOUD RESEARCH
Pierre Riteau, Chameleon Lead DevOps Engineer priteau@uchicago.edu
Grid’5000 Winter School 2016 February 5, 2016 Grenoble, France

TO AVOID ANY MISUNDERSTANDINGS

CHAMELEON DESIGN STRATEGY
- Large-scale: “Big Data, Big Compute, Big Instrument research”
  - ~650 nodes (~14,500 cores), 5 PB disk over two sites, 2 sites connected with 100G network
- Reconfigurable: “As close as possible to having it in your lab”
  - Bare metal reconfiguration, operated as a single instrument
  - Support for repeatable and reproducible experiments
- Connected: “One stop shopping for experimental needs”
  - Workload and Trace Archive
  - Partnerships with production clouds: CERN, OSDC, Rackspace, Google, and others
  - Partnerships with users
- Complementary: “Can’t do everything ourselves”
  - Complementing GENI, Grid’5000, and other experimental testbeds
- Sustainable: “Easy to maintain, easy to share”

CHAMELEON HARDWARE
[Diagram: Chameleon hardware across the Chicago and Austin sites]
- Standard Cloud Units: switch, 42 compute, 4 storage each (x10 at one site, x2 at the other)
- SCUs connect to core and fully connected to each other
- Heterogeneous Cloud Units: alternate processors and networks
- Core Services: 3.6 PB central file systems, front end and data movers; front end and data mover nodes
- Chameleon Core Network: 100 Gbps uplink public network (each site); to UTSA, GENI, future partners
- Totals: 504 x86 compute servers, 48 distributed storage servers, 102 heterogeneous servers, 16 management and storage nodes

CHAMELEON HARDWARE
- Standard Cloud Units (SCU) (deployed)
  - Each of the 12 Standard Cloud Units is a single 48U rack
  - 42 Dell R630 compute servers, each with dual-socket Intel Xeon (Haswell) processors (12 cores, 24 threads) and 128 GB of RAM
  - 4 Dell FX2 storage servers, each with a connected JBOD array of 16 2 TB drives (total of 128 TB per SCU), 2x10 cores, and 64 GB of RAM
  - Allocations can be an entire SCU, multiple SCUs, or within a single SCU, or across SCUs (e.g., storage servers for Hadoop configurations)
  - 48-port Force10 S6000 OpenFlow-enabled switches, 10 Gb to hosts, 40 Gb uplinks to Chameleon core network
  - ConnectX-3 InfiniBand network in one rack at TACC
- Shared infrastructure (deployed)
  - 3.6 PB global storage, 100 Gb Internet connection between sites
- Heterogeneous Cloud Units (to be procured in Y2)
  - ARM microservers, Atom microservers, SSDs, GPUs, FPGAs
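
The per-SCU figures on this slide can be cross-checked against the totals quoted in the deck (504 x86 compute servers, 48 distributed storage servers, 128 TB per SCU). The short sketch below only redoes that arithmetic; the constant and function names are illustrative, not part of any Chameleon tool.

```python
# Per-SCU figures as quoted on the hardware slide.
NUM_SCUS = 12
COMPUTE_PER_SCU = 42
STORAGE_PER_SCU = 4
DRIVES_PER_STORAGE_SERVER = 16
DRIVE_SIZE_TB = 2


def scu_totals():
    """Derive testbed-wide totals from the per-SCU figures."""
    return {
        "compute_servers": NUM_SCUS * COMPUTE_PER_SCU,
        "storage_servers": NUM_SCUS * STORAGE_PER_SCU,
        "storage_tb_per_scu": (
            STORAGE_PER_SCU * DRIVES_PER_STORAGE_SERVER * DRIVE_SIZE_TB
        ),
    }


totals = scu_totals()
print(totals)
```

The computed values (504, 48, and 128) agree with the numbers stated on the slides.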

CAPABILITIES AND SUPPORTED RESEARCH
Supported research:
- Virtualization technology (e.g., SR-IOV, accelerators), systems, networking, infrastructure-level resource management, etc.
- Repeatable experiments in new models, algorithms, platforms, auto-scaling, high-availability, cloud federation, etc.
- Development of new models, algorithms, platforms, auto-scaling, HA, etc., innovative application and educational uses
Capabilities:
- Isolated partition, full bare metal reconfiguration
- Isolated partition, Chameleon Appliances
- Persistent, reliable, shared clouds

IMPLEMENTING THE EXPERIMENTAL WORKFLOW
- Discover resources: fine-grained, complete, up-to-date, versioned, verifiable
- Provision resources: advance reservations and on-demand, fine-grained allocations, isolation
- Configure and interact: bare metal, deeply reconfigurable, multiple appliances to a lease, snapshotting, complex appliances
- Monitor: hardware metrics, fine-grained information, aggregate and archive

BUILDING A TESTBED FROM SCRATCH
- Requirements (proposal stage)
- Architecture (project start)
- Technology Evaluation and Risk Analysis
  - Many options: G5K, Nimbus, LosF, OpenStack
  - Sustainability as design criterion: can a CS testbed be built from commodity components?
  - Technology evaluation: Grid’5000 and OpenStack
  - Architecture-based analysis and implementation proposals
- CHI = OpenStack + Grid’5000 + special sauce

CHI: DISCOVERING AND VERIFYING RESOURCES
- Fine-grained, up-to-date, and complete representation
- Both machine parsable and user friendly representations
- Testbed versioning
  - “What was the drive on the nodes I used 6 months ago?”
- Dynamically verifiable
  - Does reality correspond to description? (e.g., failure handling)
- Grid’5000 registry toolkit + Chameleon portal UI
  - Automated resource description, automated export to RM/Blazar
- g5k-checks (renamed cc-checks for consistency)
  - Can be run after boot, acquires information and compares it with resource catalog description
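
The two ideas on this slide, a versioned description and a check of reality against it, can be illustrated with a small sketch. This is not how g5k-checks or cc-checks is implemented; the catalog layout, field names (`cpu_cores`, `ram_gb`, `disk_model`), node id, and both functions are hypothetical and chosen only to make the idea concrete.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical versioned catalog: node id -> list of (valid_from, description),
# sorted by date. All field names and values are made up for illustration.
CATALOG = {
    "node-1": [
        (date(2015, 6, 1), {"cpu_cores": 24, "ram_gb": 128, "disk_model": "disk-A"}),
        (date(2015, 11, 1), {"cpu_cores": 24, "ram_gb": 128, "disk_model": "disk-B"}),
    ],
}


def description_at(node_id, when):
    """Return the catalog description that was in force on the given date."""
    versions = CATALOG[node_id]
    dates = [valid_from for valid_from, _ in versions]
    index = bisect_right(dates, when) - 1
    if index < 0:
        raise LookupError(f"no description for {node_id} on {when}")
    return versions[index][1]


def check_node(expected, observed):
    """Compare observed hardware facts with the catalog; return the mismatches."""
    mismatches = {}
    for key, want in expected.items():
        got = observed.get(key)
        if got != want:
            mismatches[key] = {"expected": want, "observed": got}
    return mismatches


# "What was the drive on the node I used 6 months ago?"
past = description_at("node-1", date(2015, 8, 15))
print(past["disk_model"])

# A node that boots with half its memory missing fails the check.
observed = {"cpu_cores": 24, "ram_gb": 64, "disk_model": "disk-B"}
print(check_node(description_at("node-1", date(2016, 2, 5)), observed))
```

Keeping every version of the description, rather than overwriting it, is what lets the historical question be answered after hardware has been replaced.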

CHI: PROVISIONING RESOURCES
- Resource leases
- Advance reservations (AR) and on-demand
  - AR facilitates allocating at large scale
- Fine-grain allocation of a range of resources
  - Different node types, switches, etc.
- Isolation between experiments
- Future extensions: matchmaking, testbed allocation management
- OpenStack Nova/Blazar, contributions to Blazar
  - Extensions to support Gantt chart displays and other features
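
To make the notion of an advance reservation concrete, here is a minimal sketch of the admission test a lease manager has to perform: a new lease is accepted only if, at every moment of its time window, the nodes already promised plus the nodes requested fit within capacity. This is an illustration under simplifying assumptions (one node type, integer hours), not Blazar's actual algorithm; `Lease`, `peak_usage`, and `can_reserve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    start: int  # first hour of the lease
    end: int    # hour at which the lease ends (exclusive)
    nodes: int  # number of nodes held for the whole window


def peak_usage(leases, start, end):
    """Largest number of nodes held by existing leases at any time in [start, end)."""
    # Usage only rises at a lease start, so it is enough to sample the window
    # start and every lease start that falls inside the window.
    points = {start} | {l.start for l in leases if start <= l.start < end}
    return max(
        sum(l.nodes for l in leases if l.start <= t < l.end) for t in points
    )


def can_reserve(leases, request, capacity):
    """True if the request fits alongside the existing leases."""
    peak = peak_usage(leases, request.start, request.end)
    return peak + request.nodes <= capacity


existing = [Lease(0, 10, 6), Lease(5, 15, 3)]
print(can_reserve(existing, Lease(8, 12, 2), capacity=10))
print(can_reserve(existing, Lease(10, 12, 2), capacity=10))
```

In this example the first request overlaps both existing leases and does not fit, while the second starts after the larger lease has ended and does.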

CHI: CONFIGURE AND INTERACT
- Bare Metal
- Allow deep reconfigurability (access to console)
- Map multiple appliances to a lease
- Snapshotting for image sharing
- Efficient appliance deployment
- Handle complex appliances
  - Virtual clusters, cloud installations, etc.
- Interact: shape experimental conditions
- OpenStack Ironic, Glance, and user-data/meta-data

CHI: INSTRUMENTATION AND MONITORING
- Enables users to understand what happens during the experiment
- Instrumentation: high-resolution metrics
- Types of monitoring:
  - Infrastructure monitoring (e.g., PDUs)
  - User resource monitoring
  - Custom user metrics
- Aggregation and Archival
- Easily export data for specific experiments
- OpenStack Ceilometer + custom metrics
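
As an illustration of the "aggregation and archival" step, the sketch below averages high-resolution samples into fixed time windows, which is one common way to shrink data before archiving it. It does not use or imitate the Ceilometer API; the `aggregate` function and the sample values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(samples, window_s):
    """Average (timestamp_s, value) samples into windows of window_s seconds."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Each sample belongs to the window that starts at the nearest
        # lower multiple of window_s.
        buckets[timestamp - timestamp % window_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up power readings (seconds, watts), e.g. as a PDU might report them.
power_samples = [(0, 100.0), (30, 110.0), (60, 120.0), (90, 140.0)]
print(aggregate(power_samples, window_s=60))
```

Exporting the data for one experiment then amounts to filtering the samples by the experiment's time range and resources before or after this step.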

CHI: OVERALL ARCHITECTURE
[Diagram: CHI components]
- Components: Portal, Identity Management, Resource discovery, Grid’5000 Reference API, Reservation Service (Blazar), Horizon, Keystone, Nova, Ironic, Neutron, Ceilometer, Glance
- Legend: special sauce, custom development, OpenStack

HOW DOES IT WORK INTERNALLY?
[Diagram: reservation and bare-metal provisioning flow]
- Chameleon user to Blazar: reserve resources (reservations R1, R2)
- Blazar to Nova: create dedicated resource pool (host aggregate); resource pools P1, P2 alongside the freepool
- Chameleon user to Nova: launch bare-metal instances in reservation
- Nova to Ironic: schedule, then request bare-metal deployment
- Ironic to cluster: control and provision (IPMI / PXE / iSCSI)
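
The pool mechanics in this diagram can be sketched as a toy model: hosts sit in a free pool, a reservation moves some of them into a dedicated pool, and instances may only be placed on hosts of their own reservation. The class below is a simplified illustration, not Blazar or Nova code; `ReservationPools` and its methods are hypothetical names.

```python
class ReservationPools:
    """Toy model of hosts moving between a free pool and per-reservation pools."""

    def __init__(self, hosts):
        self.freepool = set(hosts)
        self.pools = {}    # reservation id -> set of hosts
        self.busy = set()  # hosts that currently run an instance

    def start_reservation(self, reservation_id, count):
        """Move `count` hosts from the free pool into a dedicated pool."""
        if reservation_id in self.pools:
            raise ValueError(f"reservation {reservation_id} already active")
        if count > len(self.freepool):
            raise RuntimeError("not enough free hosts")
        chosen = set(sorted(self.freepool)[:count])
        self.freepool -= chosen
        self.pools[reservation_id] = chosen
        return chosen

    def launch(self, reservation_id):
        """Place an instance on an idle host of the reservation's own pool."""
        idle = self.pools[reservation_id] - self.busy
        if not idle:
            raise RuntimeError("no idle host left in this reservation")
        host = min(idle)
        self.busy.add(host)
        return host

    def end_reservation(self, reservation_id):
        """Return the reservation's hosts to the free pool."""
        hosts = self.pools.pop(reservation_id)
        self.busy -= hosts
        self.freepool |= hosts


pools = ReservationPools(["h1", "h2", "h3"])
pools.start_reservation("R1", 2)
print(pools.launch("R1"), sorted(pools.freepool))
```

Because placement only ever looks inside the reservation's pool, two experiments with different reservations cannot land on the same host, which is the isolation property the slides describe.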

DEVELOPED IN THE OPEN
- https://github.com/ChameleonCloud
  - OpenStack patches, Grid’5000 g5k-checks patches
  - User portal, resource discovery, Horizon extensions
  - Testbed configuration with Puppet (not yet open)
- Aim is to provide a Chameleon-in-a-box!

CHAMELEON TIMELINE AND STATUS
- 10/2014: Project starts
- 12/2014: FutureGrid@Chameleon (OpenStack KVM)
- 04/2015: Chameleon Technology Preview on FutureGrid hardware
- 06/2015: Chameleon Early User on new hardware
- 07/2015: Chameleon Public availability (bare metal)
- 09/2015: Chameleon KVM OpenStack cloud available
- 10/2015: Interoperability with GENI (1st phase)
- Today: 600+ users / 150+ projects
- 2016: Heterogeneous hardware available

IN THE PIPELINE…
- Y1 theme was “making things possible”: focus on infrastructure
- Y2 theme is “from possible to easy”: focus on users
- Outreach: webinars, tutorials, user stories
- Experiment management
  - Appliances: snapshotting, sharing, appliance marketplace, community
  - Experiment Blueprint: automation and preservation
- Functionality: from possible to easy
  - Better reconfiguration capabilities
  - Better networking capabilities
  - Better infrastructure monitoring (PDUs, etc.)
  - And others

OPENSTACK: LESSONS LEARNED
- Operating OpenStack can be difficult
  - Forget about traditional UNIX admin: even bare metal needs OVS and IP namespaces
  - Thousands of configuration switches, many with little documentation
  - Must read the code!
  - Inter-dependent components → check all logs with debug enabled
- Upstream development mostly done on KVM
  - Less testing of Ironic → bugs
- Lots of experimental projects with little upstream support
  - We were lucky: the community was interested in reviving Blazar
- Do not put too much hope in blueprints
  - Many abandoned or delayed for multiple releases
- Where to find help and possible fixes?
  - bugs.launchpad.net (bug reports) / review.openstack.org (patches)
  - Most developers available on IRC

VIRTUALIZATION OR CONTAINERIZATION?
- Yuyu Zhou, University of Pittsburgh
- Research: lightweight virtualization
- Testbed requirements:
  - Bare metal reconfiguration
  - Boot from custom kernel
  - Console access
  - Up-to-date hardware
  - Large scale experiments
SC15 Poster: “Comparison of Virtualization and Containerization Techniques for HPC”

TEACHING CLOUD COMPUTING
- Nirav Merchant and Eric Lyons, University of Arizona
- ACIC 2015: project-based learning course
  - Data mining to find exoplanets
  - Scaled analysis pipeline by Jared Males
  - Develop a VM/workflow management appliance and best practice that can be shared with broader community
- Testbed requirements:
  - Easy to use IaaS/KVM installation
  - Minimal startup time
  - Support distributed workers
  - Block store: make copies of many 100 GB datasets

DEFENDING COMPUTING RESOURCES
- Led by Jessie Walker, University of Arkansas at Pine Bluff
- Working on detecting cyber attacks
  - Model and visualize multi-stage intrusion attacks (MAS)
  - Create custom Snort rules to monitor traffic and detect attacks
- Complex and expensive to buy and use their own hardware
- Limited by permissions needed to run cybersecurity attacks inside campuses
- Testbed requirements:
  - Virtual machines to simulate attacks in the cloud and run intrusion detection systems

PARTING THOUGHTS
- From vision to reality with Express Delivery
  - Built from scratch within a year on a shoestring
  - Thanks to experience from other testbeds, esp. Grid’5000
  - Thanks to open-source code from other projects, esp. OpenStack and Grid’5000
- Operational testbed: 600+ users / 150+ projects
- Federation
  - Ongoing efforts with GENI
  - Grid’5000 too?

CHAMELEON TEAM
- Kate Keahey: Chameleon PI, Science Director, Architect, University of Chicago
- Joe Mambretti: Programmable networks, Federation activities, Northwestern University
- Dan Stanzione: Facilities Director, TACC
- Pierre Riteau: DevOps Lead, University of Chicago
- Paul Rad: Industry Liaison, Education and training, UTSA
- DK Panda: High-perf networking, Ohio State University

COME AND WORK WITH US!
- As a collaborator
  - Generalizing results: what would Kameleon or DISTEM look like in the Chameleon context?
  - Also projects in resource management for HPC & Cloud, elastic scaling platform
  - Summer internship opportunities
- As a co-worker
  - Programming postdoc or researching programmer