Hadoop Summit San Jose 2015: YARN - Past, Present and Future

Transcript


Apache Hadoop YARN: Past, Present & Future
June 9, 2015

We are

Vinod Kumar Vavilapalli
- Long-time Hadooper since 2007
- Apache Hadoop Committer / PMC, Apache Member
- Yahoo! -> Hortonworks
- MapReduce -> YARN from day one

Jian He
- Hadoop contributor since 2012
- Apache Hadoop Committer / PMC
- Hortonworks
- All things YARN

Overview: The Why and the What

Data architectures

Traditional architectures
- Specialized silos
- Per-silo security, management, governance, etc.
- Limited scalability
- Limited cost efficiencies

For the present and the future: a Hadoop repository
- Commodity storage
- Centralized but distributed system
- Scalable
- Uniform org policy enforcement
- Innovation across silos!

[Diagram: Data (HDFS) layered over Cluster Resources]

Resource Management: extracting value out of the centralized data architecture

A messy problem
- Multiple apps, frameworks, their life-cycles and evolution
- Tenancy: "I am running this system for one user"; it almost never stops there
- Groups, teams, users: sharing / isolation needed
- Ad hoc structures get unusable real fast

Varied goals & expectations
On isolation, capacity allocations, scheduling:
- "Faster!" "More!" "Right now!" "SLA!"
- "Best for my cluster"
- Throughput, utilization, elasticity, service uptime, security, ROI
- Everything!

Enter Hadoop YARN

HDFS (scalable, reliable storage) + YARN (cluster resource management) + applications (running natively in Hadoop)

- Store all your data in one place (HDFS)
- Interact with that data in multiple ways (YARN platform + apps): data centric
- Scale as you go, shared, multi-tenant, secure (the Hadoop stack)

[Diagram: queues, admins/users, cluster resources, pipelines]

Queues reflect org structures. Hierarchical in nature.
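To make the hierarchy concrete, here is a minimal sketch of an org-shaped CapacityScheduler queue layout. The queue names and capacities are invented for illustration, and in a real cluster these keys live in capacity-scheduler.xml rather than in code.

```java
import org.apache.hadoop.conf.Configuration;

// Illustrative only: a hierarchical queue layout mirroring an org structure.
// Queue names and capacities are made up; the property keys are the standard
// CapacityScheduler ones, normally set in capacity-scheduler.xml.
public class OrgQueueLayout {
  public static Configuration build() {
    Configuration conf = new Configuration();
    // root has two child queues, one per org.
    conf.set("yarn.scheduler.capacity.root.queues", "engineering,analytics");
    conf.set("yarn.scheduler.capacity.root.engineering.capacity", "60");
    conf.set("yarn.scheduler.capacity.root.analytics.capacity", "40");
    // engineering is itself split per team.
    conf.set("yarn.scheduler.capacity.root.engineering.queues", "platform,apps");
    conf.set("yarn.scheduler.capacity.root.engineering.platform.capacity", "50");
    conf.set("yarn.scheduler.capacity.root.engineering.apps.capacity", "50");
    return conf;
  }
}
```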

Hadoop YARN
- Distributed system: host of frameworks, meta-frameworks, applications
- Varied workloads: batch, interactive, stream processing, NoSQL databases
- Large scale: linear scalability, tens of thousands of nodes, more coming

Past: A quick history

A brief timeline
- Sub-project of Apache Hadoop; releases tied to Hadoop releases
- Alphas and betas: June-July 2010, August 2011, May 2012, August 2013
- In production at several large sites for MapReduce already by that time

GA releases
- 15 October 2013 (1st GA): MR binary compatibility, YARN API cleanup, testing!
- 24 February 2014 (1st post-GA): bug fixes, alpha features
- 07 April 2014: RM fail-over, CS preemption, Timeline Service V1
- 11 August 2014: writable REST APIs, Timeline Service V1 security

Present

Last few Hadoop releases

Apache Hadoop 2.6 (18 November 2014)
- Rolling upgrades
- Services
- Node labels

Apache Hadoop 2.7 (21 April 2015)
- Moving to JDK 7+

Focus on some features next!

Rolling Upgrades


YARN Rolling Upgrades
- Why? No more losing work during upgrades!
- Workflow: servers first (masters followed by per-node agents)
- Upgrade of applications/frameworks is decoupled!
- Work-preserving RM restart: RM recovers state from NMs and apps
- Work-preserving NM restart: NM recovers state from local disk (configuration sketch below)
- RM fail-over is optional
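As a rough, minimal sketch (not from the deck) of the yarn-site.xml settings behind work-preserving restarts, expressed in code for compactness: the property keys are the stock Hadoop 2.6+ ones, while the ZooKeeper quorum and recovery directory are example values to verify against your release.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch of the recovery settings that make rolling upgrades work-preserving.
public class WorkPreservingRestartConfig {
  public static YarnConfiguration build() {
    YarnConfiguration conf = new YarnConfiguration();

    // ResourceManager restart: persist app/attempt state and recover it.
    conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
    conf.setBoolean("yarn.resourcemanager.work-preserving-recovery.enabled", true);
    // State store implementation (ZooKeeper-based store shown here).
    conf.set("yarn.resourcemanager.store.class",
        "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore");
    conf.set("yarn.resourcemanager.zk-address", "zk1:2181,zk2:2181,zk3:2181"); // example quorum

    // NodeManager restart: keep container state on local disk and reconnect.
    conf.setBoolean("yarn.nodemanager.recovery.enabled", true);
    conf.set("yarn.nodemanager.recovery.dir", "/var/lib/hadoop-yarn/nm-recovery"); // example path

    return conf;
  }
}
```

With these enabled, masters and per-node agents can be restarted in place while running containers keep going, which is the "no more losing work" property the slide describes.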

YARN Rolling Upgrades: A Cluster Snapshot

Stack Rolling Upgrades

Related session: "Enterprise grade rolling upgrade of a live Hadoop cluster", Jun 10, 3:25PM-4:05PM, Sanjay Radia & Vinod K V (Hortonworks)

Services on YARN

Long-running services
- You could run them already before 2.6!
- Enhancements needed: logs, security, management/monitoring, sharing and placement, discovery
- Resource sharing across workload types
- Fault tolerance of long-running services: work-preserving AM restart, AM fault forgetting
- Service registry

Project Slider
- Bring your existing services unmodified to YARN: slider.incubator.apache.org/
- HBase, Storm, Kafka already!

[Diagram: the YARN stack - MapReduce, Tez, Storm, Kafka, Spark, HBase, Pig, Hive, Cascading, Apache Slider, more services...]

Related sessions:
- "DeathStar: Easy, Dynamic, Multi-tenant HBase via YARN", June 11, 1:30PM-2:10PM, Ishan Chhabra & Nitin Aggarwal (Rocket Fuel)
- "Authoring and hosting applications on YARN using Slider", Jun 11, 11:00AM-11:40AM, Sumit Mohanty & Jonathan Maron (Hortonworks)

Operational and Developer tooling

Node Labels

Today: partitions
- Admin: "I have machines of different types"
- Impact on capacity planning: "Hey, we bought those GPU machines"
- Types:
  - Exclusive: "This is my Precious!"
  - Non-exclusive: "I get binding preference. Use it for others when idle"

Future: constraints
- "Take me to a machine running JDK version 9"
- No impact on capacity planning

[Diagram: Default Partition (JDK 8, JDK 7), Partition B: GPUs (JDK 7), Partition C: Windows]

Related session: "Node Labels in YARN", Jun 11, 11:00AM-11:40AM, Mayank Bansal (eBay) & Wangda Tan (Hortonworks)
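On the application side, here is a hedged sketch of how an ApplicationMaster could target a labeled partition when asking the RM for containers. The "gpu" label, container size, and priority are made-up example values; the node-label expression assumes the Hadoop 2.6+ ResourceRequest API.

```java
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;
import org.apache.hadoop.yarn.util.Records;

// Illustrative sketch: request containers on the "gpu" partition.
public class GpuContainerRequest {
  public static ResourceRequest newGpuRequest(int containers) {
    ResourceRequest req = Records.newRecord(ResourceRequest.class);
    req.setPriority(Priority.newInstance(1));          // example priority
    req.setResourceName(ResourceRequest.ANY);          // any node in the partition
    req.setCapability(Resource.newInstance(4096, 2));  // 4 GB, 2 vcores per container
    req.setNumContainers(containers);
    req.setRelaxLocality(true);
    req.setNodeLabelExpression("gpu");                 // target the "gpu" partition
    return req;
  }
}
```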

Pluggable ACLs
- Pluggable YARN authorization model
- YARN / Apache Ranger integration

[Diagram: 1. Admin manages ACLs in Apache Ranger; 2. On app submission, the Ranger plugin in YARN enforces queue ACLs]

Related session: "Securing Hadoop with Apache Ranger: Strategies & Best Practices", Jun 11, 3:10PM-3:50PM, Selvamohan Neethiraj & Velmurugan Periasamy (Hortonworks)
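In configuration terms, the pluggable model boils down to pointing YARN at an external authorizer. The sketch below is assumption-heavy: the yarn.authorization-provider key and the Ranger authorizer class name are best-effort guesses, not taken from the slides, so check your Hadoop and Ranger documentation for the exact values.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only: plugging an external authorizer into YARN. The provider
// property name and the Ranger class below are assumptions; verify them
// against your Hadoop and Ranger versions.
public class PluggableAclConfig {
  public static YarnConfiguration build() {
    YarnConfiguration conf = new YarnConfiguration();
    conf.setBoolean("yarn.acl.enable", true);  // turn on YARN ACL checks
    // Replace the built-in ACL checks with an external authorization provider.
    conf.set("yarn.authorization-provider",
        "org.apache.ranger.authorization.yarn.authorizer.RangerYarnAuthorizer");
    return conf;
  }
}
```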


Usability
- Why is my application stuck? What limits did it hit?
- How many rack-local containers did I get?
- What is the number of running containers of my app?
- How healthy is the scheduler?
- Lots more... (see the YarnClient sketch below)
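For the "how many containers does my app have right now" kind of question, a minimal sketch against the stock YarnClient API; the application id is an example value and error handling is elided.

```java
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: inspect an application's state, diagnostics, and container count.
public class AppDiagnostics {
  public static void main(String[] args) throws Exception {
    YarnClient yarn = YarnClient.createYarnClient();
    yarn.init(new YarnConfiguration());
    yarn.start();
    try {
      ApplicationId appId = ApplicationId.newInstance(1433800000000L, 42); // example id
      ApplicationReport report = yarn.getApplicationReport(appId);
      ApplicationResourceUsageReport usage = report.getApplicationResourceUsageReport();
      System.out.println("State: " + report.getYarnApplicationState()
          + ", diagnostics: " + report.getDiagnostics()
          + ", running containers: " + usage.getNumUsedContainers());
    } finally {
      yarn.stop();
    }
  }
}
```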

Future

Per-queue Policy-driven Scheduling

Previously: coarse policies
- One scheduling algorithm in the cluster
- Rigid, difficult to experiment

Now: fine-grained policies
- One scheduling algorithm per queue
- Flexible, very easy to experiment!

[Diagram: two queue trees under root - previously Ingestion, Adhoc, and Batch all use FIFO; now Ingestion and Batch keep FIFO while Adhoc uses user-fairness]
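A hedged sketch of what "one scheduling algorithm per queue" looks like in CapacityScheduler terms: the per-queue ordering-policy keys below reflect how this later surfaced in Hadoop, and the queue names mirror the diagram. Treat both as illustrative assumptions; in practice these are set in capacity-scheduler.xml.

```java
import org.apache.hadoop.conf.Configuration;

// Sketch of per-queue scheduling policies (ordering-policy per queue).
public class PerQueuePolicyConfig {
  public static Configuration build() {
    Configuration conf = new Configuration();
    conf.set("yarn.scheduler.capacity.root.queues", "ingestion,adhoc,batch");
    // Ingestion and batch keep strict FIFO ordering within the queue.
    conf.set("yarn.scheduler.capacity.root.ingestion.ordering-policy", "fifo");
    conf.set("yarn.scheduler.capacity.root.batch.ordering-policy", "fifo");
    // The adhoc queue orders applications by user fairness instead.
    conf.set("yarn.scheduler.capacity.root.adhoc.ordering-policy", "fair");
    return conf;
  }
}
```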

Reservations
- "Run my workload tomorrow at 6AM"
- Next: persistence of the plans

[Diagram: resources-over-time plans showing reserved blocks (Block #1, Block #2) placed around 6:00AM]

Related session: "Reservation-based Scheduling: If You're Late Don't Blame Us!", June 10, 12:05PM-12:45PM, Carlo Curino & Subru Venkatraman Krishnan (Microsoft)
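A minimal sketch of the same "tomorrow at 6AM" request against the Hadoop 2.6-era reservation API; the start time, container shape, counts, and the "batch" queue are example values, and the exact factory signatures shifted in later releases.

```java
import java.util.Collections;
import org.apache.hadoop.yarn.api.protocolrecords.ReservationSubmissionRequest;
import org.apache.hadoop.yarn.api.protocolrecords.ReservationSubmissionResponse;
import org.apache.hadoop.yarn.api.records.ReservationDefinition;
import org.apache.hadoop.yarn.api.records.ReservationRequest;
import org.apache.hadoop.yarn.api.records.ReservationRequestInterpreter;
import org.apache.hadoop.yarn.api.records.ReservationRequests;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: reserve an hour of capacity starting roughly a day from now.
public class TomorrowAtSix {
  public static void main(String[] args) throws Exception {
    YarnClient yarn = YarnClient.createYarnClient();
    yarn.init(new YarnConfiguration());
    yarn.start();
    try {
      long start = System.currentTimeMillis() + 24L * 60 * 60 * 1000; // example: ~6AM tomorrow
      long duration = 60L * 60 * 1000;                                // one hour of capacity

      // Ask for 100 containers of 4 GB / 2 vcores, all available together.
      ReservationRequest block = ReservationRequest.newInstance(
          Resource.newInstance(4096, 2), 100, 100, duration);
      ReservationRequests requests = ReservationRequests.newInstance(
          Collections.singletonList(block), ReservationRequestInterpreter.R_ALL);
      ReservationDefinition definition = ReservationDefinition.newInstance(
          start, start + duration, requests, "nightly-pipeline");

      ReservationSubmissionRequest request =
          ReservationSubmissionRequest.newInstance(definition, "batch");
      ReservationSubmissionResponse response = yarn.submitReservation(request);
      System.out.println("Reserved: " + response.getReservationId());
    } finally {
      yarn.stop();
    }
  }
}
```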

Containerized Applications
- Running containerized applications on YARN
  - As a packaging mechanism
  - As a resource-isolation mechanism: Docker
- Adding the notion of container runtimes
- Multiple use-cases
  - Run my existing service on YARN via Slider + Docker
  - Run my existing MapReduce application on YARN via a Docker image

Related session: "Apache Hadoop YARN and the Docker Ecosystem", June 9, 1:45PM-2:25PM, Sidharta Seethana (Hortonworks) & Abin Shahab (Altiscale)
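This was still future work at the time of the talk. As a rough illustration of the container-runtime idea, the sketch below selects the Docker runtime per container through environment variables on the launch context; the variable names are assumptions drawn from later Hadoop releases, not from these slides.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.util.Records;

// Illustrative only: ask for the Docker container runtime per container.
// The environment variable names are assumptions from later Hadoop releases.
public class DockerizedContainer {
  public static ContainerLaunchContext newDockerContext(String image, String command) {
    Map<String, String> env = new HashMap<>();
    env.put("YARN_CONTAINER_RUNTIME_TYPE", "docker");       // pick the Docker runtime
    env.put("YARN_CONTAINER_RUNTIME_DOCKER_IMAGE", image);  // e.g. "centos:7"

    ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
    ctx.setEnvironment(env);
    ctx.setCommands(Collections.singletonList(command));
    return ctx;
  }
}
```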

Disk Isolation
- Isolation and scheduling dimensions: disk capacity, IOPS, bandwidth
- Today: equal allocation to all containers along all dimensions
- Next: scheduling

[Diagram: processes on a node (DataNode, NodeManager, Map Task, Reduce Task, HBase RegionServer) competing for the node's disks - localization, logs, shuffle reads/writes, spills, remote IO]

Network Isolation
- Isolation and scheduling dimensions: incoming bandwidth, outgoing bandwidth
- Today: equi-share outbound bandwidth
- Next: scheduling

[Diagram: processes on a node (DataNode, NodeManager, Map Task, Reduce Task, Storm Spout, Storm Bolt) sharing the network - write pipeline, localization, logs, shuffle, reading inputs, writing outputs, remote IO]

Timeline Service
- Application history
  - Where did my containers run?
  - MapReduce-specific Job History Server exists; need a generic solution beyond ResourceManager restart
- Cluster history
  - Run analytics on historical apps: user with most resource utilization, largest application run
- Running applications timeline
  - Framework-specific event collection and UIs
  - Show me the counters for my running MapReduce task
  - Show me the slowest Storm stream-processing bolt while it is running

What exists today
- A LevelDB-based implementation
- Integrated into MapReduce, Apache Tez, Apache Hive
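For frameworks that want to publish their own history, a minimal sketch against the Timeline Service v1 client API; the entity type, id, event name, and user are made-up example values.

```java
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEvent;
import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch: a framework publishing its own events to Timeline Service v1.
public class PublishTimelineEvent {
  public static void main(String[] args) throws Exception {
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new YarnConfiguration());
    client.start();
    try {
      TimelineEntity entity = new TimelineEntity();
      entity.setEntityType("MY_FRAMEWORK_TASK");   // example type
      entity.setEntityId("task_0001");             // example id
      entity.addPrimaryFilter("user", "vinod");    // example filter

      TimelineEvent event = new TimelineEvent();
      event.setEventType("TASK_STARTED");
      event.setTimestamp(System.currentTimeMillis());
      entity.addEvent(event);

      TimelinePutResponse response = client.putEntities(entity);
      System.out.println("Put errors: " + response.getErrors().size());
    } finally {
      client.stop();
    }
  }
}
```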

Timeline Service 2.0
- Next generation: today's solution helped us understand the space, but it has limited scalability and availability
- Analyzing Hadoop clusters is becoming a big-data problem
  - Don't want to throw away the Hadoop application metadata
  - Large scale
  - Enable near-real-time analysis: "Find me the user who is hammering the FileSystem with rogue applications. Now."
- Timeline data stored in HBase and accessible to queries

Improved Usability with Timeline Service
- Why is my application slow? Is it really slow?
- Why is my application failing?
- What happened with my application? Did it succeed?
- Why is my cluster slow? Why is my cluster down?
- What happened in my clusters?
- Collect and use past data: to schedule my application better, to do better capacity planning

More...
- Application priorities within a queue
- YARN Federation: 100K+ nodes
- Node anti-affinity: "Do not run two copies of my service daemon on the same machine"
- Gang scheduling: "Run all of my app at once"
- Dynamic scheduling based on actual container utilization
- Time-based policies: 10% cluster capacity for queue A from 6-9AM, but 20% from 9AM-12PM
- Prioritized queues: the admin queue takes precedence over everything else

Lots more...
- HDFS on YARN
- Global scheduling
- User-level preemption
- Container resizing

Community
- Started with just 5 of us!
- 104 contributors and counting
- A few big contributors, and a long tail

Thank you!

Addendum


Work-preserving ResourceManager restart
- ResourceManager remembers some state
- Reconstructs the remaining state from nodes and apps

Work-preserving NodeManager restart
- NodeManager remembers state on each machine
- Reconnects to running containers

ResourceManager Fail-over
- Active/Standby based fail-over
- Depends on fast recovery
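A minimal sketch of the Active/Standby settings, assuming the standard Hadoop 2.x HA property keys; host names and the ZooKeeper quorum are example values.

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch of Active/Standby ResourceManager fail-over configuration.
public class RmFailoverConfig {
  public static YarnConfiguration build() {
    YarnConfiguration conf = new YarnConfiguration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    conf.set("yarn.resourcemanager.hostname.rm1", "rm1.example.com"); // example host
    conf.set("yarn.resourcemanager.hostname.rm2", "rm2.example.com"); // example host
    // Leader election and shared state go through ZooKeeper.
    conf.set("yarn.resourcemanager.zk-address", "zk1:2181,zk2:2181,zk3:2181");
    // Fast fail-over relies on the same recovery/state-store settings used
    // for work-preserving restart (yarn.resourcemanager.recovery.enabled etc.).
    return conf;
  }
}
```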
