MESOS AND THE STATE OF APPLICATION DEPLOYS
WHO AM I?
Kory Brown.
DevOps Engineer from Texas.
Site Operations Engineer at Fitbit.
WHAT IS THIS TALK ABOUT?
MESOS!
APPLICATION DEPLOYMENTS!
MORE SPECIFICALLY
How we, as an industry, used to handle application deployments.
How we, as an industry, currently handle application deployments.
How I think we'll handle application deployments in the future.
WAIT, WHAT ABOUT MESOS?
We'll get there.
But to understand the problems Mesos solves for deploys we must first:
Understand what we do today.
Understand how we got here.
Mesos, in this case, is an implementation detail.
WHY SHOULD YOU CARE?
It's super freaking cool
And you know... DevOps
WHAT IS DEVOPS?
This is not a talk on DevOps...
...but it's important for us to have a common definition.
"DevOps is about recognizing that the backing infrastructure is not separate from your application, but rather a vital part of it." -- Me
HOW THINGS USED TO BE
YOU WANT TO DEPLOY
1. Get a Server.
GET A SERVER
Put in a request.
A human allocates a server.
A human installs an operating system.
A human ensures the networking is correct.
A human installs all specified dependencies.
etc, etc, etc
YOU WANT TO DEPLOY
1. Get a Server.
2. Deploy your application.
DEPLOY YOUR APPLICATION
An Ops guy logs into the server.
Downloads your application.
Installs your application.
Configures your application.
DEPLOY YOUR APPLICATION
The application doesn't start.
They call you (probably in the middle of the night).
After an hour of troubleshooting, you realize they typoed the config.
You hate everything.
YOU WANT TO DEPLOY
1. Get a Server.
2. Deploy your application.
3. Repeat n times for scale.
Each time slightly differently
SOMETHING GOES WRONG
(Hardware failure / OS issues / Maintenance / etc)
God Help You.
SOMETHING GOES WRONG
(Hardware failure / OS issues / Maintenance / etc)
File a ticket.
Data center tech finds the machine.
They pull the hard drive.
Which is weird, because I said the RAM tested bad.
Two weeks later the machine is lost in a sea of tickets.
Everything is terrible.
THIS SUCKED
Not uncommon to measure turnaround in weeks.
Little to no automation.
Incredibly error prone.
Requires a person at most every step.
Generally leads to finely crafted artisanal machines.
TL;DR:
Horrifically inefficient, error prone, and time consuming.
HOW THINGS ARE TODAY
YOU WANT TO DEPLOY
1. Spin up a new cloud instance.
2. Run your automation tool of choice (Puppet/Ansible/Chef/etc).
3. Repeat n times.
SOMETHING GOES WRONG
(Hardware failure / OS issues / Maintenance / etc)
Spin up a new instance!
Turnaround times are so low, who cares?
Birth of the "Treat servers like cattle, not pets" thought process.
SUCKS WAY LESS
Turnaround measured in minutes.
Almost entirely automated.
Fairly deterministic.
Doesn't have to involve people at all!
BUT NOT PERFECT
Still manage an entire OS for each application. This includes:
Updates! (Both OS and any applications/libraries)
Backups!
Monitoring/Metrics!
etc, etc, etc
BUT NOT PERFECT
Not fully utilizing the available hardware.
Unless your application runs constantly at 100% CPU and RAM, you are wasting cycles and money!
BUT NOT PERFECT
TL;DR:
Good, but not great past a certain scale.
ENTER MESOS
WHAT IS MESOS?
Open source distributed cluster management tool.
WHAT IS MESOS?
An abstraction layer for computing resources (CPU/RAM/Disk/etc) contained within a pool of servers.
MESOS IS A DISTRIBUTED KERNEL
A traditional Kernel (like Linux!) provides a set of APIs for interacting with available hardware on a local machine.
A distributed Kernel (like Mesos!) provides a set of APIs for interacting with available hardware on a pool of servers.
THE DATACENTER OS
The Kernel itself is a small part of an Operating System.
Many other components that make it something useful.
Most relevant for us right now:
Init system -- Some process to manage the lifecycle of other processes.
Cron -- Some process to run tasks on some specified interval.
Mesos implements this functionality with "Frameworks".
WHAT'S A FRAMEWORK?
A Mesos "Application".
Must contain 2 components:
A Scheduler, which registers with the Master and receives resource offers.
One or more Executors, which launch tasks on slaves.
MESOS FRAMEWORKS
Init System:
Marathon
Aurora
Cron:
Chronos
Aurora
A QUICK ASIDE ON FRAMEWORKS
The Frameworks listed are just ones that are relevant in the context of application deploys.
A QUICK ASIDE ON FRAMEWORKS
Mesos is meant to be used as a generic computing cluster manager, which means you can also use, as a framework:
Jenkins
Hadoop
Spark
Storm
Kafka
Many more things probably!
Perhaps more importantly, all of these frameworks can share the same cluster of machines!
MESOS OFFERS
A list of an agent node's available resources (CPU/RAM/Disk/etc).
Two-Tier Scheduling System.
Agent Node sends resources to Master node.
Master node uses an algorithm called "Dominant Resource Fairness" to fairly pass that resource offer to its registered Frameworks.
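To make "Dominant Resource Fairness" a little more concrete, here is a minimal sketch of the idea, not the allocator Mesos actually ships. It assumes each framework has a fixed per-task demand and that resources are whole numbers. The function name `drf_allocate`, the framework names `A` and `B`, and the cluster size are all made up for illustration. A framework's "dominant share" is the largest fraction it holds of any one resource, and the next task always goes to the framework whose dominant share is lowest.

```python
from fractions import Fraction


def drf_allocate(total, demands):
    """Toy Dominant Resource Fairness over integer resource amounts.

    total:   cluster capacity, e.g. {"cpu": 9, "ram": 18}
    demands: per-task needs of each framework (hypothetical names)
    Returns the number of tasks each framework was able to launch.
    """
    used = {r: 0 for r in total}
    allocated = {f: {r: 0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}
    active = set(demands)

    def dominant_share(f):
        # Largest fractional share this framework holds of any one resource.
        return max(Fraction(allocated[f][r], total[r]) for r in total)

    while active:
        # Serve the framework with the lowest dominant share
        # (ties broken alphabetically so the result is repeatable).
        f = min(sorted(active), key=dominant_share)
        need = demands[f]
        if all(used[r] + need[r] <= total[r] for r in total):
            for r in total:
                used[r] += need[r]
                allocated[f][r] += need[r]
            tasks[f] += 1
        else:
            # Simplification: a framework whose next task no longer fits drops out.
            active.remove(f)
    return tasks


cluster = {"cpu": 9, "ram": 18}
demands = {"A": {"cpu": 1, "ram": 4}, "B": {"cpu": 3, "ram": 1}}
print(drf_allocate(cluster, demands))  # {'A': 3, 'B': 2}
```

Framework A is RAM-heavy and B is CPU-heavy. Both end up holding two thirds of their dominant resource, which is the sense in which the split is "fair".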
OTHER MESOS RELATED WORDS
Master Service: Runs on a machine that has been designated as a Mesos Master. Manages agents.
Agent: Runs tasks that belong to frameworks.
Previously referred to as a Slave, but this is being changed in newer versions.
Task: A unit of work scheduled by the Framework, and executed on the Agent.
ZooKeeper: Distributed consensus framework.
Master uses for leader election.
Frameworks use to share state.
MESOS ARCHITECTURE
Quorum of masters, kept in sync by ZooKeeper.
Slaves send their available resources to the elected master.
The elected master offers those resources to its registered frameworks.
If the offer is accepted, frameworks execute tasks on slaves.
SHOULD MY APP RUN ON MESOS?
STATELESS APPLICATION
Web App (Rails, Django, Play, etc)
Jenkins Build Slaves
Memcached
Any 12factor-ish application
Yes
STATEFUL APPLICATIONS
MySQL
PostgreSQL
Jenkins Master
No
DOES IT ADDRESS CURRENT PROBLEMS?
Managing fewer Operating Systems.
Better utilization.
MANAGING FEWER OPERATING SYSTEMS
Manage an OS for each physical host, NOT each application.
BETTER UTILIZATION
Instead of one application running per host...
...we bin pack them all together!
Any resources your application isn't using can be shared as needed!
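As a rough illustration of what "bin packing" means here, the sketch below places applications onto hosts with a simple first-fit rule, largest CPU request first. It is an assumed toy model, not how Mesos or any framework actually places tasks. The host names, application names, and resource figures are invented.

```python
def first_fit(hosts, apps):
    """Place each app on the first host with enough free CPU and RAM.

    hosts: host name -> {"cpu": cores, "ram": GB}
    apps:  app name  -> {"cpu": cores, "ram": GB}
    Returns app name -> host name (None if the app fits nowhere).
    """
    free = {name: dict(cap) for name, cap in hosts.items()}
    placement = {}
    # Placing the biggest CPU consumers first tends to pack more tightly.
    ordered = sorted(apps.items(), key=lambda kv: kv[1]["cpu"], reverse=True)
    for app, need in ordered:
        for host, avail in free.items():
            if avail["cpu"] >= need["cpu"] and avail["ram"] >= need["ram"]:
                avail["cpu"] -= need["cpu"]
                avail["ram"] -= need["ram"]
                placement[app] = host
                break
        else:
            placement[app] = None  # no host had room
    return placement


hosts = {"host1": {"cpu": 4, "ram": 16}, "host2": {"cpu": 4, "ram": 16}}
apps = {
    "rails": {"cpu": 2, "ram": 4},
    "memcached": {"cpu": 1, "ram": 8},
    "jenkins-agent": {"cpu": 2, "ram": 4},
    "cron-job": {"cpu": 1, "ram": 2},
}
print(first_fit(hosts, apps))
```

With one application per host these four would need four machines. Packed together they fit on two.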
BONUS: REDUNDANCY BAKED IN!
Since we have this big pool of shared resources, if an instance or host dies, we just reschedule it somewhere else.
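The "just reschedule it somewhere else" step can be sketched in a few lines. This is a simplified model under stated assumptions: every host has a fixed number of instance slots, and displaced instances go to whichever surviving host has the most free slots. The function `reschedule` and the node and instance names are hypothetical, not part of Mesos or Aurora.

```python
def reschedule(placement, capacity, dead_host):
    """Return a new placement with every instance moved off dead_host.

    placement: instance name -> host name
    capacity:  host name -> number of instance slots (assumed model)
    """
    survivors = {h: slots for h, slots in capacity.items() if h != dead_host}
    if not survivors:
        raise RuntimeError("no hosts left in the cluster")
    load = {h: 0 for h in survivors}
    new_placement = {}
    displaced = []
    for instance, host in placement.items():
        if host == dead_host:
            displaced.append(instance)
        else:
            new_placement[instance] = host
            load[host] += 1
    for instance in displaced:
        # Pick the surviving host with the most free slots.
        target = max(sorted(survivors), key=lambda h: survivors[h] - load[h])
        if survivors[target] - load[target] <= 0:
            raise RuntimeError("cluster is full, cannot place " + instance)
        new_placement[instance] = target
        load[target] += 1
    return new_placement


capacity = {"node1": 2, "node2": 2, "node3": 2}
placement = {"rails-0": "node1", "rails-1": "node2", "rails-2": "node3"}
print(reschedule(placement, capacity, dead_host="node2"))
```

All three instances are still running after the failure, as long as the pool has spare capacity.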
STATICALLY PROVISIONED CLUSTER
Rails app now at 1/3rd capacity!
DYNAMICALLY PROVISIONED CLUSTER
Two nodes down!
Who cares?
DEPLOYING WITH MESOS
YOU WANT TO DEPLOY
1. Write a Job Config for your Framework of Choice.
We use Aurora -- it's pretty sweet.
2. Tell Aurora to run it.
SOMETHING GOES WRONG
(Hardware failure / OS issues / Maintenance / etc)
Aurora reschedules your application to another node.
HARDLY SUCKS AT ALL!
Turnaround measured in seconds!
Entirely automated!
Deterministic!
Fault Tolerant!
DEVELOPER PERSPECTIVE
TESTING
Mesos can be your dev, qa, staging, and production environments.
Aurora splits permissions and allocations between configured environments.
Prioritizes production. Meaning it will preempt jobs running in "lesser" environments.
Becomes easy to test in a Prod-like environment because it IS the same environment!
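A small sketch of what preempting "lesser" environments might look like. This is an assumed toy model, not Aurora's scheduler: every task takes one slot, and a production task arriving at a full cluster evicts the first non-production task it finds. The function name and task names are invented.

```python
def schedule_with_preemption(running, capacity, new_task):
    """Try to place new_task, evicting a non-prod task if prod needs room.

    running:  list of (name, environment) tuples, one slot each
    capacity: total number of slots in the cluster (assumed model)
    new_task: (name, environment) tuple to place
    Returns (new_running, evicted).
    """
    running = list(running)
    evicted = []
    if len(running) < capacity:
        running.append(new_task)
        return running, evicted
    if new_task[1] == "prod":
        for i, task in enumerate(running):
            if task[1] != "prod":
                # Production wins: the lesser task gives up its slot.
                evicted.append(running.pop(i))
                running.append(new_task)
                return running, evicted
    raise RuntimeError("no capacity for " + new_task[0])


running = [("web", "prod"), ("experiment", "dev")]
print(schedule_with_preemption(running, capacity=2, new_task=("api", "prod")))
```

In this model a dev task never displaces anything. If the cluster is full it simply fails to schedule.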
SELF-MANAGEMENT
No more waiting on Ops for hardware!
Easily automate-able!
Want Heroku styled deploys? This is how Heroku does it.
TL;DR
We've come a long long way.
Things suck way less now.
Sucking less and less every day.
QUESTIONS?