On-Demand HDP Clusters using Cloudbreak and Ambari
Karthik Karuppaiya, Sr. Engineering Manager, CPE
Narendra Bidari, Sr. Software Engineer, CPE
Dublin Hadoop Summit 2016 – Karthik Karuppaiya & Narendra Bidari
Agenda
1. Introduction
2. Big Data Platform Challenges
3. What is the solution?
4. Self Service Analytics
5. Going Hybrid Cloud using Cloudbreak
6. Ingesting Data
Introduction
Symantec
- Symantec is the world leader in providing security software for both enterprises and end users
- Thousands of enterprises and more than 400 million devices (PCs, tablets, and phones) rely on Symantec to help them secure their assets from attacks, including their data centers, emails, and other sensitive data
Cloud Platform Engineering (CPE)
- Build consolidated cloud infrastructure and platform services for next-generation data-powered Symantec applications
- A big data platform for batch and stream analytics integrated with OpenStack
- Open source components as building blocks
- Hadoop and OpenStack
- Bridge feature gaps and contribute back
Big Data Platform Challenge
• Hundreds of millions of users generating billions of events every day from across the globe
• Hundreds of big data application developers developing thousands of applications
• At 12 PB and 500+ nodes, the Cloud Platform Engineering Analytics team built the largest security data lake at Symantec
Big Data Platform Challenge
• Great! Now developers can start building applications on our big data lake
• Hundreds of developers start building applications using different big data tools
Big Data Platform Challenge
• Product team developers want quick changes and the latest versions
• Platform team wants stability!
• Soon, frustration prevails
What is the Solution?
• Build and use your own little cluster for development
• Copy a subset of data for development purposes
• Tear down the cluster after development is complete
• Rinse and repeat
What is the Solution?
• But building clusters is hard and time consuming
• Too many services to install and configure
• Developers are not interested in building and managing clusters
What is the Solution? – Self Service
• What if we make it really easy to build clusters?
• Abstract all the deployment complexities and enable developers to get their own cluster with one click of a button
What is the Solution? – Extend Ambari
• What about the services that are not supported by Ambari out of the box?
• We write our own Ambari custom stack
Self Service Analytics (SSA) Clusters
• RESTful web services to allow creation and management of custom clusters
• Select from pre-defined Ambari Blueprints
• Spins up VMs on our private OpenStack cloud
• Installs the HDP stack specified as part of the Ambari blueprint
• Dashing dashboard to monitor and manage (start/stop/kill) clusters
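
To make the blueprint step concrete, here is a minimal sketch of what selecting a pre-defined Ambari Blueprint and asking Ambari to install HDP could look like. It only builds the JSON bodies and the Ambari REST paths (`/api/v1/blueprints/<name>` and `/api/v1/clusters/<name>`) and does not send anything. The helper names, the `master` host group, the component list, and the example hostnames are all assumptions for illustration; the SSA service's own REST API is not shown in the slides and is not reproduced here.

```python
import json


def build_blueprint(name, stack_version="2.3"):
    """Return a small single-host-group Ambari blueprint as a dict.

    The host group name and component list are illustrative assumptions.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                "components": [
                    {"name": "NAMENODE"},
                    {"name": "SECONDARY_NAMENODE"},
                    {"name": "DATANODE"},
                    {"name": "HDFS_CLIENT"},
                    {"name": "ZOOKEEPER_SERVER"},
                    {"name": "ZOOKEEPER_CLIENT"},
                ],
            }
        ],
    }


def build_cluster_template(blueprint_name, fqdns):
    """Map freshly provisioned VMs (by FQDN) onto the blueprint's host group."""
    return {
        "blueprint": blueprint_name,
        "default_password": "change-me",  # placeholder, not a real credential
        "host_groups": [
            {"name": "master", "hosts": [{"fqdn": fqdn} for fqdn in fqdns]}
        ],
    }


def ambari_requests(ambari_url, cluster_name, blueprint, template):
    """Return (method, url, body) tuples in the order they would be sent.

    Ambari also expects an X-Requested-By header on POST requests.
    """
    bp_name = blueprint["Blueprints"]["blueprint_name"]
    return [
        ("POST", f"{ambari_url}/api/v1/blueprints/{bp_name}", json.dumps(blueprint)),
        ("POST", f"{ambari_url}/api/v1/clusters/{cluster_name}", json.dumps(template)),
    ]
```

Registering the blueprint first and then posting the cluster template is what triggers the actual install, so a self-service layer mostly needs to provision the VMs and fill in their hostnames.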
Environment
• Private cloud on OpenStack
• HDP 2.3.2
• Ambari 2.1.2
• Ability to seamlessly support public clouds like AWS
SSA Pros and Cons
• Pros
– Automates spinning up clusters
– Gets a cluster up and running with HDP in minutes
– Users can spin up/kill clusters at will
– Central dashboard to manage clusters
– Customizable to cater to our private cloud
• Cons
– Tightly coupled with our specific private cloud infrastructure
– Not portable to public cloud vendors
Next Gen SSA
• This is all great! But we are out of capacity on our private OpenStack cloud.
• Just use the same software to set up clusters on AWS and Google Cloud – the same code should work, right?
Next Gen SSA – Cloudbreak
• Cloudbreak
– Cloudbreak helps to simplify the provisioning of HDP clusters in cloud environments
– Supports multiple clouds including AWS, Google, Azure, and OpenStack
– Uses Apache Ambari for HDP installation and management
– Has a nice UI to build and manage clusters
– Supports automated cluster scaling
Hybrid Cloud Using Cloudbreak - Gaps
• Cloudbreak's out-of-the-box support for OpenStack is limited
– No support for Keystone v3 – our private cloud uses Keystone v3
– No support for native OpenStack APIs
– Supports only Heat templates for cluster provisioning
• Now we have two different systems – one for the private OpenStack cloud and one for AWS
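
The Keystone v3 gap is easier to see by comparing the authentication requests of the two API versions. The sketch below builds the request path and JSON body for a password login under Keystone v2.0 and under v3; it does not send anything and is not Cloudbreak's code. The function names and the `Default` domain values are assumptions for illustration.

```python
def keystone_v2_auth(username, password, tenant_name):
    """Keystone v2.0: credentials are scoped to a tenant; there are no domains.

    The token comes back in the response body under access.token.id.
    """
    body = {
        "auth": {
            "tenantName": tenant_name,
            "passwordCredentials": {"username": username, "password": password},
        }
    }
    return "/v2.0/tokens", body


def keystone_v3_auth(username, password, project_name,
                     user_domain="Default", project_domain="Default"):
    """Keystone v3: users and projects both live inside domains.

    The token comes back in the X-Subject-Token response header.
    """
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        "domain": {"name": user_domain},
                        "password": password,
                    }
                },
            },
            "scope": {
                "project": {
                    "name": project_name,
                    "domain": {"name": project_domain},
                }
            },
        }
    }
    return "/v3/auth/tokens", body
```

Because the endpoint, the body layout, and the place the token is returned all differ, a client written only for v2.0 cannot authenticate against a v3-only private cloud without explicit support.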
Symantec's Contribution to Cloudbreak
• Keystone v3 support
– Cloudbreak 1.2 – released 03/2016
• Native OpenStack API support (without using Heat templates)
– Code development is done; it will be contributed back soon.
Where is my data?
• Great! I now have a shiny new cluster – but what do I do with no data?
Data Ingestion Service
• Built as a Storm-Trident topology
• Supports various sources like Kafka, RabbitMQ, HDFS, Cassandra, etc.
• Pluggable interface to add more sources
• Samples the data for testing purposes
• Generalized template provides the ability to add data transformation
• Ability to monitor data transfer and control the rate of transfer
• Trying to achieve exactly-once message delivery
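
The ideas in this list (a pluggable source, sampling, a transformation hook, and exactly-once delivery) can be sketched in a few lines of plain Python. This is not the Storm-Trident topology itself, and every name here is hypothetical. It only illustrates one common way to approach exactly-once semantics: number each batch with a transaction id and have the sink skip any batch it has already committed, which is the general idea behind Trident's transactional state.

```python
class ListSource:
    """Hypothetical pluggable source: any iterable of records would work."""

    def __init__(self, records):
        self._records = list(records)

    def __iter__(self):
        return iter(self._records)


def sample_every(records, n):
    """Yield every n-th record, a deterministic stand-in for sampling."""
    for index, record in enumerate(records):
        if index % n == 0:
            yield record


def batches(records, size):
    """Group records into (txid, batch) pairs with increasing transaction ids."""
    batch, txid = [], 0
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield txid, batch
            batch, txid = [], txid + 1
    if batch:
        yield txid, batch


class IdempotentSink:
    """Applies each batch at most once by remembering the last committed txid."""

    def __init__(self):
        self.rows = []
        self.last_txid = -1

    def write(self, txid, batch):
        if txid <= self.last_txid:
            return False  # replayed batch, already committed: skip it
        self.rows.extend(batch)
        self.last_txid = txid
        return True


def ingest(source, sink, transform, sample_n=1, batch_size=2):
    """Sample, transform, and deliver records from source to sink in batches."""
    sampled = sample_every(source, sample_n)
    transformed = (transform(record) for record in sampled)
    for txid, batch in batches(transformed, batch_size):
        sink.write(txid, batch)
```

With this shape, a failed batch can simply be replayed: the sink ignores it if the earlier attempt had in fact been committed, so records are applied once even though they may be sent more than once.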
Summary and Future Work
• A journey towards one-click cluster deployment
• Cloudbreak - one tool for all clouds
- Enable Cloudbreak to support our version of OpenStack
- Enable Cloudbreak to support bare-metal cluster provisioning
- Single large YARN cluster for a variety of compute and storage loads
• Open source – use and contribute
- Work with the community to address gaps
• SSA and Data Ingestion code already open sourced
- https://github.com/symantec/