On-Demand HDP Clusters using Cloudbreak and Ambari
Karthik Karuppaiya, Sr. Engineering Manager, CPE
Narendra Bidari, Sr. Software Engineer, CPE
Dublin Hadoop Summit 2016 – Karthik Karuppaiya & Narendra Bidari

Agenda

1. Introduction
2. Big Data Platform Challenges
3. What is the solution?
4. Self Service Analytics
5. Going Hybrid Cloud using Cloudbreak
6. Ingesting Data

Introduction

Symantec
- Symantec is the world leader in providing security software for both enterprises and end users
- There are thousands of enterprises and more than 400 million devices (PCs, tablets and phones) that rely on Symantec to help them secure their assets from attacks, including their data centers, emails and other sensitive data

Cloud Platform Engineering (CPE)
- Build consolidated cloud infrastructure and platform services for next-generation data-powered Symantec applications
- A big data platform for batch and stream analytics integrated with Openstack
- Open source components as building blocks
- Hadoop and Openstack
- Bridge feature gaps and contribute back

Big Data Platform Challenge

• Hundreds of millions of users generating billions of events every day from across the globe
• Hundreds of big data application developers developing thousands of applications
• At 12 PB and 500+ nodes, the Cloud Platform Engineering Analytics team built the largest security data lake at Symantec

Big Data Platform Challenge

• Great! Now developers can start building applications on our big data lake
• Hundreds of developers start building applications using different big data tools

Big Data Platform Challenge

• Product team developers want quick changes and the latest versions
• Platform team wants stability!
• Soon, frustration prevails

What is the Solution?

• Build and use your own little cluster for development
• Copy a subset of the data for development purposes
• Tear down the cluster after development is complete
• Rinse and repeat

What is the Solution?

• But building clusters is hard and time consuming
• Too many services to install and configure
• Developers are not interested in building and managing clusters

What is the Solution? – Self Service

• What if we make it really easy to build clusters?
• Abstract all the deployment complexities and enable developers to get their own cluster with one click of a button

What is the Solution? – Extend Ambari

• What about the services that are not supported by Ambari out of the box?
• We write our own Ambari custom stack

Self Service Analytics (SSA) Clusters

• RESTful web services to allow creation and management of custom clusters
• Select from pre-defined Ambari Blueprints
• Spins up VMs on our private Openstack cloud
• Installs the HDP stack specified as part of the Ambari blueprint
• Dashing dashboard to monitor and manage (start/stop/kill) clusters
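To make the blueprint step concrete, here is a minimal sketch of the two payloads Ambari expects when a cluster is installed from a blueprint: the blueprint itself and a cluster creation template that maps host groups to machines. The SSA service code is not shown in the slides, so the function names, host names, and component list below are illustrative assumptions; the component list in particular is not a validated HDP topology. The sketch only builds the requests and does not send them.

```python
import json


def build_blueprint(name, stack_version="2.3"):
    """Build a small two-group Ambari blueprint (illustrative components only)."""
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                "components": [
                    {"name": "NAMENODE"},
                    {"name": "SECONDARY_NAMENODE"},
                    {"name": "ZOOKEEPER_SERVER"},
                ],
            },
            {
                "name": "worker",
                "cardinality": "1+",
                "components": [{"name": "DATANODE"}],
            },
        ],
    }


def build_cluster_template(blueprint_name, hosts_by_group, default_password):
    """Map each blueprint host group to the VMs that were provisioned for it."""
    return {
        "blueprint": blueprint_name,
        "default_password": default_password,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": fqdn} for fqdn in hosts]}
            for group, hosts in hosts_by_group.items()
        ],
    }


def plan_requests(ambari_url, cluster_name, blueprint, template):
    """Return the two POST requests, in order, without sending them."""
    # Ambari rejects modifying requests that lack the X-Requested-By header.
    headers = {"X-Requested-By": "ambari"}
    blueprint_name = blueprint["Blueprints"]["blueprint_name"]
    return [
        {
            "method": "POST",
            "url": f"{ambari_url}/api/v1/blueprints/{blueprint_name}",
            "headers": headers,
            "body": json.dumps(blueprint),
        },
        {
            "method": "POST",
            "url": f"{ambari_url}/api/v1/clusters/{cluster_name}",
            "headers": headers,
            "body": json.dumps(template),
        },
    ]


if __name__ == "__main__":
    bp = build_blueprint("dev-small")
    tpl = build_cluster_template(
        "dev-small",
        {"master": ["m1.example.internal"], "worker": ["w1.example.internal"]},
        "changeme",  # placeholder password for the sketch
    )
    for req in plan_requests("http://ambari.example.internal:8080", "dev1", bp, tpl):
        print(req["method"], req["url"])
```

Registering the blueprint once and then posting one cluster template per request is what lets a self-service layer offer a fixed menu of pre-defined blueprints while only the host list changes per cluster.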

Environment

• Private cloud on Openstack
• HDP 2.3.2
• Ambari 2.1.2
• Ability to seamlessly support public clouds like AWS

SSA Architecture

SSA Services

SSA Demo

SSA Pros and Cons

• Pros
  – Automates spinning up clusters
  – Gets a cluster up and running with HDP in minutes
  – Users can spin up/kill clusters at will
  – Central dashboard to manage clusters
  – Customizable to cater to our private cloud
• Cons
  – Tightly coupled with our specific private cloud infrastructure
  – Not portable to work with other public cloud vendors

Next Gen SSA

• This is all great! But we are out of capacity on our private Openstack cloud.
• Just use the same software to set up clusters on AWS and Google Cloud – the same code should work, right?

Next Gen SSA – Cloudbreak

• Cloudbreak
  – Cloudbreak helps to simplify the provisioning of HDP clusters in cloud environments
  – Supports multiple clouds including AWS, Google, Azure and Openstack
  – Uses Apache Ambari for HDP installation and management
  – Has a nice UI to build and manage clusters
  – Supports automated cluster scaling

Hybrid Cloud Using Cloudbreak - Gaps

• Cloudbreak's out-of-the-box support for Openstack is limited
  – No support for Keystone v3; our private cloud uses Keystone v3
  – No support for native Openstack APIs
  – Supports only Heat templates for cluster provisioning
• Now we have two different systems – one for the private Openstack cloud and one for AWS

Symantec's Contribution to Cloudbreak

• Keystone v3 support
  – Cloudbreak 1.2 – released 03/2016
• Native Openstack API support (without using Heat template)
  – Code development is done, will be contributed back soon.
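For readers unfamiliar with why Keystone v3 needed explicit support: v3 authenticates users within a domain and scopes tokens to a project that also belongs to a domain, so the request body differs from v2. The sketch below builds a project-scoped password authentication body for the Identity v3 `POST /v3/auth/tokens` call. It is not Cloudbreak's code; the function name and the sample user, project, and domain values are assumptions for illustration.

```python
def keystone_v3_auth_body(username, password, user_domain, project, project_domain):
    """Build a project-scoped password auth request body for Keystone v3."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,
                        # v3 resolves user names inside a domain.
                        "domain": {"name": user_domain},
                        "password": password,
                    }
                },
            },
            # Project scope: the project is also resolved inside a domain.
            "scope": {
                "project": {
                    "name": project,
                    "domain": {"name": project_domain},
                }
            },
        }
    }


if __name__ == "__main__":
    body = keystone_v3_auth_body(
        "ssa-service", "not-a-real-password", "Default", "analytics-dev", "Default"
    )
    print(sorted(body["auth"].keys()))
```

On success Keystone returns the token in the `X-Subject-Token` response header rather than in the body, which a client then passes as `X-Auth-Token` on later Openstack API calls.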

Cloudbreak – Keystone V3 Screenshot

Cloudbreak – Keystone V3 Project Scope Screenshot

Where is my data?

• Great! I now have a shiny new cluster – but what do I do with no data?

Data Ingestion Service

• Built as a Storm-Trident topology
• Supports various sources like Kafka, RabbitMQ, HDFS, Cassandra, etc.
• Pluggable interface to add more sources
• Samples the data for testing purposes
• Generalized template provides the ability to add data transformation
• Ability to monitor data transfer and control the rate of transfer
• Trying to achieve exactly-once message delivery
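The ideas in the list above can be sketched in a few lines, independent of Storm. This is a conceptual illustration, not the service's Trident code, which the slides do not show. It assumes a hypothetical source registry, a deterministic every-nth sampler, a user-supplied transformation, and a sink that skips replayed batches by remembering the last transaction id. That last part is a simplified analogue of how Trident's transactional state uses batch transaction ids to avoid applying the same batch twice.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical plugin registry: source name -> factory returning records.
SOURCES: Dict[str, Callable[..., Iterable[dict]]] = {}


def register_source(name: str):
    """Decorator that plugs a new source into the registry."""
    def decorator(factory):
        SOURCES[name] = factory
        return factory
    return decorator


@register_source("memory")
def memory_source(records: List[dict]) -> Iterator[dict]:
    """Stand-in for a real source such as Kafka or HDFS."""
    return iter(records)


def sample_every(records: Iterable[dict], n: int) -> Iterator[dict]:
    """Keep every n-th record, starting with the first."""
    for index, record in enumerate(records):
        if index % n == 0:
            yield record


def ingest(source_name, source_args, n, transform) -> List[dict]:
    """Read from a registered source, sample, then transform each record."""
    records = SOURCES[source_name](**source_args)
    return [transform(record) for record in sample_every(records, n)]


class TransactionalSink:
    """Applies each batch at most once by tracking the last transaction id."""

    def __init__(self):
        self.last_txid = None
        self.rows: List[dict] = []

    def write_batch(self, txid: int, batch: List[dict]) -> bool:
        # A replayed batch carries a txid we have already applied: skip it.
        if self.last_txid is not None and txid <= self.last_txid:
            return False
        self.rows.extend(batch)
        self.last_txid = txid
        return True


if __name__ == "__main__":
    events = [{"id": i} for i in range(10)]
    batch = ingest("memory", {"records": events}, 5, lambda r: {**r, "env": "dev"})
    sink = TransactionalSink()
    sink.write_batch(1, batch)
    sink.write_batch(1, batch)  # replay of the same batch is ignored
    print(len(sink.rows))
```

Sampling before the transformation keeps the development copy small and cheap to produce, and keying writes on a transaction id is what makes a replay after a failure harmless.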

Data Ingestion Service Architecture

Data Ingestion Service Demo

Summary and Future Work

• A journey towards one-click cluster deployment
• Cloudbreak - one tool for all clouds
  - Enable Cloudbreak to support our version of Openstack
  - Enable Cloudbreak to support bare metal cluster provisioning
  - Single large YARN cluster for a variety of compute and storage loads
• Open source – use and contribute
  - Work with the community to address gaps
• SSA and Data Ingestion code already open sourced
  - https://github.com/symantec/

Thank You! Q&A

Karthik Karuppaiya, karthik_karuppaiya@symantec.com
Narendra Bidari, narendra_bidari@symantec.com