Analytics Modernization: Configuring SAS® Grid Manager for Hadoop
Mark Lochbihler, Channels Partner Engineering, Hortonworks
April 21, 2017
© Hortonworks Inc. 2011–2016. All Rights Reserved
Presenter: Mark Lochbihler - Hortonworks, Inc.
Mark is a Principal Architect with 27 years of SAS experience, having spent 17 years within Financial Services. He is currently in his fourth year at Hortonworks and is focused on integrating the Hadoop ecosystem with strategic partner ecosystem products and solutions. Mark has a BS in Computer Science from North Carolina State University and also holds a Six Sigma Black Belt.
Hadoop and YARN 101
[Figure 1: A Hadoop Cluster running Batch, Interactive and Real-Time Engines, including SAS Grid Manager. Diagram labels: Sources (Existing Systems: ERP, CRM, SCM; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured); HDFS (Hadoop Distributed File System); YARN: Data Operating System (Batch, Interactive, Real-Time, SAS Grid Manager); EDW; MPP; Analytics (Data Marts, Applications, Business Analytics, Visualization & Dashboards).]
Agenda
→ Why
→ Architecture
→ Real World Sizing
→ Configuration Details
→ Demo - User Perspective
→ Migration Considerations
→ Call to Action
Why Move SAS Workloads to Hadoop?
→ Lower infrastructure and storage costs
→ Optimize performance
→ Minimize administrative overhead
[Figure 2: SAS Grid Manager for Hadoop Conceptual Architecture (Reference: SAS Grid Manager for Hadoop)]
KEY SAS GRID ARCHITECTURE COMPONENTS
→ SAS Metadata Server – A SAS service supporting, among other objects, the logical-to-physical mapping of SAS Logical Servers to YARN. In our examples, we will be using the local SAS Server, SASGrid.
→ SAS Grid Control Server – A SAS service running on the YARN ResourceManager node. It is called by SAS clients to communicate with the YARN ResourceManager to negotiate resources for SAS jobs.
→ SAS Object Spawner – A SAS service running on the YARN ResourceManager node. It is used to launch SAS containers with YARN.
→ SAS Clients – Includes SAS batch jobs, SASGSUB (a batch grid utility), and interactive clients like SAS Enterprise Guide. SAS Clients will “Connect” and “Disconnect” from a SAS Logical Server, like SASGrid, defined in SAS Metadata and shown in Figure 4.
KEY HADOOP INTEGRATION POINTS FOR SAS GRID
→ YARN ResourceManager – A Hadoop YARN Master Service responsible for controlling global Hadoop cluster resource usage. The ResourceManager enables multi-tenancy and SLAs. It is also responsible for monitoring NodeManager state, submitting ApplicationMaster requests, verifying container launch, and monitoring ApplicationMaster state.
→ YARN NodeManager – This Hadoop YARN Worker Node Service manages local resources on behalf of the requesting service. It also tracks node health and communicates status to the ResourceManager.
→ YARN Capacity Scheduler – A Hadoop YARN service which can be configured to provide job scheduling policies for SLAs, users, groups, and resources.
→ Hadoop DataNodes – Hadoop Distributed File System (HDFS) storage nodes.
→ Kerberos Service – The Hadoop cluster must be Kerberized.
HADOOP MASTER NODE DECISIONS
→ A SAS Grid Control Server and SAS Object Spawner must be deployed on the same Hadoop Master Node as the YARN ResourceManager.
SAS Grid Manager with YARN Architecture Overview
[Figure 3: SAS Grid Manager for Hadoop with YARN Architecture Overview. Diagram labels: SAS Client (SASGSUB, SAS Batch, SAS EG); SAS Metadata Server; YARN ResourceManager with YARN Capacity Scheduler, SAS Grid Control Server, and SAS Object Spawner; twelve YARN NodeManagers hosting Job 1 AM 1, Job 1 Containers 1.1, 1.2, and 1.3, SAS AM 2, and SAS Container 2.1.]
HADOOP WORKER NODE DECISIONS
→ SASHOME and SASCONFIG: Each Hadoop Worker Node which is a candidate to run SAS jobs must be configured so that SASHOME and SASCONFIG are available.
→ SASWORK and SASUTIL: It is critical that each Hadoop Worker Node which is a candidate to run SAS jobs is configured correctly. A large part of the I/O required when running SAS analytics goes to the scratch or temporary locations of SASWORK and SASUTIL. To provide the necessary performance on a heavily loaded system, SAS requires I/O throughput of 100 MB/sec/core for these file systems. Adequate sizing for SASWORK is also necessary.
→ Traditional Storage and Compute Versus Compute Only Worker Nodes: Within Hadoop, it is a common practice to have dual-purpose worker nodes which run math or programs on the same nodes where the Hadoop data resides. With Hadoop 2.x, dedicated Compute Only Hadoop Worker Nodes are also an option. Both are options for SAS Grid Manager for Hadoop. Compute Only Hadoop Worker Nodes no longer host the required services and data for HDFS, giving more computing resources dedicated to the programs running on these nodes. The tradeoff for Compute Only Hadoop Nodes is the loss of HDFS data locality. Your site's SAS workload requirements will determine which type of Worker Nodes to deploy for SAS Grid Manager for Hadoop.
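As a rough illustration of the 100 MB/sec/core guideline for SASWORK and SASUTIL, the aggregate scratch throughput a worker node needs scales linearly with its core count. The sketch below is in Python and assumes a hypothetical 16-core node; the core count and the function name are illustrative and do not come from the presentation.

```python
# Guideline quoted in the slides: SASWORK/SASUTIL file systems should
# sustain 100 MB/sec of I/O throughput per core.
MB_PER_SEC_PER_CORE = 100


def required_scratch_throughput_mb(cores: int) -> int:
    """Return the aggregate MB/sec a node's scratch file systems should sustain."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return cores * MB_PER_SEC_PER_CORE


# Hypothetical 16-core worker node (an assumption, not a figure from the slides).
print(required_scratch_throughput_mb(16))  # 1600
```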
[Figure 4: View of SAS Management Console, with expanded SASGrid Logical Server.]
REAL WORLD CONFIGURATION EXAMPLE

Table 1: Total Cluster YARN Container RAM Available for SAS Users
  Total RAM per Cluster Node:                      256 GB
  Available Container RAM per Node:                192 GB
  # Worker Nodes in Cluster:                       28
  Total Container RAM Available:                   5.376 TB
  Amount of Cluster RAM Allocated to SAS Queue:    50%
  Total Container RAM for SAS Queue:               2.688 TB

Table 2: Average Number of Containers per SAS User
  Average # of Batch Jobs or Interactive Sessions per SAS User:                             2
  # Containers per Job or Session:                                                          2
  Average # of Additional Hadoop Containers Spawned from Initial SAS Job Container:         4
  Average Total # of Containers per SAS User:                                               8
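The Table 1 totals follow from simple arithmetic on the per-node figures. The Python sketch below reproduces them; it assumes 1 TB = 1000 GB, which is the conversion the table's figures imply, and the function name is illustrative.

```python
# Inputs taken from Table 1.
CONTAINER_RAM_PER_NODE_GB = 192  # available container RAM per worker node
WORKER_NODES = 28                # worker nodes in the cluster
SAS_QUEUE_PERCENT = 50           # share of cluster RAM allocated to the SAS queue


def sas_queue_ram_gb(ram_per_node_gb: int, nodes: int, queue_percent: int):
    """Return (total container RAM, SAS queue container RAM), both in GB."""
    total_gb = ram_per_node_gb * nodes
    return total_gb, total_gb * queue_percent // 100


total_gb, sas_gb = sas_queue_ram_gb(
    CONTAINER_RAM_PER_NODE_GB, WORKER_NODES, SAS_QUEUE_PERCENT
)
# Assumes 1 TB = 1000 GB, matching the figures shown in Table 1.
print(f"Total container RAM: {total_gb / 1000:.3f} TB")  # 5.376 TB
print(f"SAS queue RAM:       {sas_gb / 1000:.3f} TB")    # 2.688 TB
```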
REAL WORLD CONFIGURATION EXAMPLE (Continued)

SAS App Type            Container  Anticipated %   Available Cluster  Max # of    Average #       Total # of
(User Type)             Size       of User Type    Memory for         Containers  Containers per  SAS Users
                                   on Server       SAS Jobs                       SAS User
Low (General/Analyst)   2 GB       70%             1.881 TB           940         8               117
Medium                  4 GB       20%             537 GB             134         8               16
High                    8 GB       10%             268 GB             33          8               4
Totals                                             2.688 TB           1107                        134

Table 3: Breakdown of SAS Application Types to be configured for SAS Users
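The per-type rows of Table 3 can be derived from the 2.688 TB SAS queue, the container size, and the anticipated share of each user type. The Python sketch below assumes that fractional containers and users are rounded down, since that reproduces the per-row figures in the table; the names are illustrative.

```python
SAS_QUEUE_RAM_GB = 2688   # Table 1: total container RAM for the SAS queue
CONTAINERS_PER_USER = 8   # Table 2: average total containers per SAS user

# (app type, container size in GB, anticipated % of users) from Table 3.
APP_TYPES = [("Low", 2, 70), ("Medium", 4, 20), ("High", 8, 10)]


def breakdown(queue_ram_gb, app_types, containers_per_user):
    """Return (app type, max containers, users) rows, rounding down (assumed)."""
    rows = []
    for name, size_gb, percent in app_types:
        # Integer arithmetic: (queue RAM * share) / container size, floored.
        max_containers = queue_ram_gb * percent // (100 * size_gb)
        users = max_containers // containers_per_user
        rows.append((name, max_containers, users))
    return rows


rows = breakdown(SAS_QUEUE_RAM_GB, APP_TYPES, CONTAINERS_PER_USER)
for name, max_containers, users in rows:
    print(f"{name:<8}{max_containers:>6} containers{users:>6} users")
print("Total containers:", sum(row[1] for row in rows))  # 1107
```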
REAL WORLD CONFIGURATION EXAMPLE (Continued)
YARN Capacity Scheduler Example: 50% of Cluster RAM allocated to SAS Queue
[Figure 5: YARN Capacity Scheduler Logical View - SAS Queue - 50% Hadoop Cluster RAM. Diagram labels: ResourceManager; Scheduler; Capacity Scheduler; Hierarchical Queues; root queue with Adhoc 30%, SAS 50%, Mrkting 20%; lower-level queues Dev 10%, Reserved 20%, Prod 70%; Prod 80%, Dev 20%; P0 70%, P1 30%.]
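The top-level split in Figure 5 could be expressed as Capacity Scheduler properties along the following lines. This Python sketch only builds the property names and values; it assumes that Adhoc, SAS, and Mrkting are the direct children of root, and it uses the queue names as labeled in the figure rather than the names of any actual cluster. The helper function is hypothetical.

```python
# Top-level queues and capacities as labeled in Figure 5 (assumed to be
# the direct children of the root queue).
TOP_LEVEL_QUEUES = {"Adhoc": 30, "SAS": 50, "Mrkting": 20}


def capacity_properties(queues, parent="root"):
    """Build capacity-scheduler.xml style properties for one queue level."""
    total = sum(queues.values())
    if total != 100:
        # Sibling queue capacities under one parent must add up to 100%.
        raise ValueError(f"capacities under {parent} sum to {total}, not 100")
    props = {f"yarn.scheduler.capacity.{parent}.queues": ",".join(queues)}
    for name, percent in queues.items():
        props[f"yarn.scheduler.capacity.{parent}.{name}.capacity"] = str(percent)
    return props


for key, value in capacity_properties(TOP_LEVEL_QUEUES).items():
    print(f"{key}={value}")
```

Checking that sibling capacities sum to 100 before writing the configuration catches the most common mistake when a new queue, such as one for SAS, is carved out of an existing hierarchy.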
REAL WORLD CONFIGURATION EXAMPLE (Continued)
[Figure 6: YARN Capacity Scheduler Admin View - SAS Queue - 50% Hadoop Cluster RAM]
REAL WORLD CONFIGURATION EXAMPLE (Continued)
(SAS Grid Policy File - Grid Application Type “Low” section)
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<GridPolicy defaultAppType="low">
  <GridApplicationType name="low">
    <jobname>SASLow</jobname>
    <priority>10</priority>
    <nice>0</nice>
    <memory>2048</memory>
    <vcores>1</vcores>
    <runlimit>480</runlimit>
    <queue>sas94_queue</queue>
    <hosts>
      <hostGroup>sas94_work</hostGroup>
    </hosts>
  </GridApplicationType>
... Continued in paper ...
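One way to sanity-check a policy file against the sizing tables is to read the container settings back out of it. The Python sketch below parses only the “Low” section shown on the slide; the closing GridPolicy tag is added here so the fragment is well-formed, the reading of the memory value as megabytes is an assumption (2048 lines up with the 2 GB “Low” container in Table 3), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# "Low" section from the slide. The closing </GridPolicy> tag is added only
# to make this fragment parseable; the rest of the file is not reproduced.
POLICY_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<GridPolicy defaultAppType="low">
  <GridApplicationType name="low">
    <jobname>SASLow</jobname>
    <priority>10</priority>
    <nice>0</nice>
    <memory>2048</memory>
    <vcores>1</vcores>
    <runlimit>480</runlimit>
    <queue>sas94_queue</queue>
    <hosts>
      <hostGroup>sas94_work</hostGroup>
    </hosts>
  </GridApplicationType>
</GridPolicy>
"""


def app_type_settings(policy_xml: str) -> dict:
    """Map each GridApplicationType name to its memory, vcores, and queue."""
    root = ET.fromstring(policy_xml.encode("utf-8"))
    settings = {}
    for app in root.findall("GridApplicationType"):
        settings[app.get("name")] = {
            "memory_mb": int(app.findtext("memory")),  # unit assumed to be MB
            "vcores": int(app.findtext("vcores")),
            "queue": app.findtext("queue"),
        }
    return settings


print(app_type_settings(POLICY_XML))
```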
REAL WORLD CONFIGURATION EXAMPLE (Continued)
[Figure 7: Configuring SAS Metadata Groups to SAS Grid Application Types]
SAS WORKLOAD MIGRATION CONSIDERATIONS
→ Complement your existing SAS infrastructure - it's not a forklift migration
→ Identify SAS Storage Cost Saving Opportunities
  • libname to Hive
  • libname to HDFS
  • filename to HDFS
→ Identify SAS Workload Compute Migration Opportunities
  • SAS jobs that will be using large data sets stored in Hadoop are ideal candidates
  • SAS jobs that would benefit from SAS In-Database Push Down to Hive