YARN - Presented At Dallas Hadoop User Group

Description: Developing Hadoop Solutions with YARN.

Hadoop 2.0 – YARN: Yet Another Resource Negotiator

Rommel Garcia

Solutions Engineer


Agenda

• Hadoop 1.X & 2.X – Concepts Recap
• YARN Architecture – How does this affect MRv1?
• Slots be gone – What does this mean for MapReduce?
• Building YARN Applications
• Q & A

Hadoop 1.X vs. 2.X – A recap of the differences

The 1st Generation of Hadoop: Batch

HADOOP 1.0 – Built for Web-Scale Batch Apps

• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads

[Diagram: a single batch application on HDFS, alongside separate single-application silos – batch, interactive, and online – each running on its own HDFS cluster]

Hadoop MapReduce Classic

• JobTracker
  – Manages cluster resources and job scheduling
• TaskTracker
  – Per-node agent
  – Manages tasks


Hadoop 1

• Limited to about 4,000 nodes per cluster
• Scalability is limited by O(# of tasks in the cluster)
• The JobTracker is a bottleneck: resource management, job scheduling, and monitoring
• Only one namespace for managing HDFS
• Map and Reduce slots are static
• MapReduce is the only kind of job that can run

Hadoop 1.X Stack

[Stack diagram: Hortonworks Data Platform (HDP) 1.x]
• Hadoop Core: HDFS, MapReduce
• Data Services: Hive & HCatalog, Pig, HBase
• Operational Services: Oozie, Ambari
• Load & Extract: Sqoop, Flume, NFS, WebHDFS
• Platform Services – Enterprise Readiness: High Availability, Disaster Recovery, Security and Snapshots
• Runs on: OS, Cloud, VM, Appliance

Our Vision: Hadoop as Next-Gen Platform

• HADOOP 1.0 – Single Use System: batch apps
  – HDFS (redundant, reliable storage)
  – MapReduce (cluster resource management & data processing)
• HADOOP 2.0 – Multi Purpose Platform: batch, interactive, online, streaming, …
  – HDFS2 (redundant, reliable storage)
  – YARN (cluster resource management)
  – MapReduce and others (data processing)

YARN: Taking Hadoop Beyond Batch

Applications run natively IN Hadoop on YARN (cluster resource management) over HDFS2 (redundant, reliable storage):

• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• ONLINE (HBase)
• OTHER (Search, Weave, …)

Store ALL DATA in one place… interact with that data in MULTIPLE WAYS, with predictable performance and quality of service.

Hadoop 2

• Potentially up to 10,000 nodes per cluster
• Scalability is O(cluster size) rather than O(# of tasks)
• Supports multiple namespaces for managing HDFS
• Efficient cluster utilization through YARN
• Backward and forward compatible with MRv1
• Any application can integrate with Hadoop
• Goes beyond Java

Hadoop 2.X Stack

[Stack diagram: Hortonworks Data Platform (HDP) 2.x; components marked * were slated for inclusion in Q1 2013]
• Hadoop Core: HDFS, YARN, MapReduce, Tez*
• Data Services: Hive & HCatalog, Pig, HBase
• Operational Services: Oozie, Ambari, Falcon*
• Load & Extract: Sqoop, Flume, NFS, WebHDFS, Knox*
• Platform Services – Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
• Runs on: OS/VM, Cloud, Appliance

YARN Architecture


A Brief History of YARN

• Originally conceived & architected by the team at Yahoo!
  – Arun Murthy created the original JIRA in 2008 and led the PMC
  – Arun is currently the lead for MapReduce/YARN/Tez at Hortonworks and was formerly the architect of Hadoop MapReduce at Yahoo!
• The team at Hortonworks has been working on YARN for 4 years
• YARN-based architecture running at scale at Yahoo!
  – Deployed on 35,000 nodes for about a year
  – Implemented Storm-on-YARN, which processes 133,000 events per second

Concepts

• Application
  – An application is a job submitted to the framework
  – Example: a MapReduce job
• Container
  – Basic unit of allocation
  – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
    – e.g. container_0 = 2 GB, 1 CPU; container_1 = 1 GB, 6 CPUs
  – Replaces the fixed map/reduce slots

Architecture

• ResourceManager
  – Global resource scheduler
  – Hierarchical queues
  – Application management
• NodeManager
  – Per-machine agent
  – Manages the life-cycle of containers
  – Container resource monitoring
• ApplicationMaster
  – Per-application
  – Manages application scheduling and task execution
  – e.g. the MapReduce ApplicationMaster

YARN – Running Apps

[Diagram: the ResourceManager (ApplicationsManager + Scheduler) partitions cluster resources into containers across NodeManagers on Rack1 through RackN. Hadoop Client 1 and Hadoop Client 2 create and submit app1 and app2; each application gets its own ApplicationMaster (AM1, AM2), which negotiates its containers (C1.1–C1.4, C2.1–C2.3) with the ResourceManager's scheduler queues, while the NodeManagers send status reports back to the ResourceManager.]

Slots be gone! How does MapReduce run on YARN?

Apache Hadoop MapReduce on YARN

• The original use-case for YARN
• The most complex application to build
  – Data locality
  – Fault tolerance
  – ApplicationMaster recovery: checkpoint to HDFS
  – Intra-application priorities: maps vs. reduces
    – Needs a complex feedback mechanism from the ResourceManager
  – Security
  – Isolation
• Binary compatible with Apache Hadoop 1.x

Apache Hadoop MapReduce on YARN

[Diagram: the ResourceManager/Scheduler allocates containers across the NodeManagers. MR AM 1 runs map 1.1, map 1.2, and reduce 1.1 in its containers; MR AM2 runs map 2.1, map 2.2, reduce 2.1, and reduce 2.2. Each MapReduce job gets its own ApplicationMaster, which manages that job's map and reduce tasks.]

Efficiency Gains of YARN

• Key optimizations
  – No hard segmentation of resources into map and reduce slots
  – The YARN scheduler is more efficient
  – All resources are fungible
• Yahoo! has over 30,000 nodes running YARN across over 365 PB of data
• They report running about 400,000 jobs per day for about 10 million hours of compute time
• They estimate a 60% – 150% improvement in node usage per day
• Yahoo! retired a whole colo (a 10,000-node datacenter) because of the increased utilization

An Example Calculating Node Capacity

• Important parameters
  – mapreduce.[map|reduce].memory.mb
    – The physical RAM hard-limit enforced by Hadoop on the task
  – mapreduce.[map|reduce].java.opts
    – The JVM heap size of the task (-Xmx)
  – yarn.scheduler.minimum-allocation-mb
    – The smallest container YARN will allocate
  – yarn.nodemanager.resource.memory-mb
    – The amount of physical RAM on the node available for containers
  – yarn.nodemanager.vmem-pmem-ratio
    – The amount of virtual memory each container is allowed, calculated as containerMemoryRequest * vmem-pmem-ratio
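To make the parameters concrete, here is a minimal sketch (not from the original deck) of setting the per-job values through the standard Hadoop Configuration API. The class name and values are illustrative only; the node-level yarn.* settings normally live in yarn-site.xml rather than in job code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemorySettingsSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Physical-RAM hard limit for each task's container, in MB
    conf.setInt("mapreduce.map.memory.mb", 1536);
    conf.setInt("mapreduce.reduce.memory.mb", 2560);
    // JVM heap inside the container; keep it below the container limit
    conf.set("mapreduce.map.java.opts", "-Xmx1024m");
    conf.set("mapreduce.reduce.java.opts", "-Xmx2048m");
    Job job = Job.getInstance(conf, "memory-settings-sketch");
    // ... set the mapper, reducer, input and output paths here, then submit ...
  }
}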


Calculating Node Capacity Continued

• Let's pretend we need a 1 GB map heap and a 2 GB reduce heap
  – mapreduce.[map|reduce].java.opts = [-Xmx1g | -Xmx2g]
• Remember, a container has more overhead than just your heap! Add 512 MB to the container limit for overhead
  – mapreduce.[map|reduce].memory.mb = [1536 | 2560]
• We have 36 GB per node and a minimum allocation of 512 MB
  – yarn.nodemanager.resource.memory-mb = 36864
  – yarn.scheduler.minimum-allocation-mb = 512
• Virtual memory for each container
  – Map: 1536 MB * vmem-pmem-ratio (default 2.1) = 3225.6 MB
  – Reduce: 2560 MB * vmem-pmem-ratio = 5376 MB
• Our 36 GB node can support
  – 24 maps OR 14 reducers OR any combination allowed by the resources on the node
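As a cross-check, the arithmetic above can be reproduced in a few lines; this small program (not from the original deck, names are illustrative) prints the same numbers.

public class NodeCapacitySketch {
  public static void main(String[] args) {
    int nodeMemoryMb = 36 * 1024;    // yarn.nodemanager.resource.memory-mb = 36864
    int mapContainerMb = 1536;       // 1 GB heap + 512 MB overhead
    int reduceContainerMb = 2560;    // 2 GB heap + 512 MB overhead
    double vmemPmemRatio = 2.1;      // yarn.nodemanager.vmem-pmem-ratio default

    System.out.println("Map vmem limit:    " + mapContainerMb * vmemPmemRatio + " MB");    // 3225.6
    System.out.println("Reduce vmem limit: " + reduceContainerMb * vmemPmemRatio + " MB"); // 5376.0
    System.out.println("Maps per node:     " + nodeMemoryMb / mapContainerMb);             // 24
    System.out.println("Reduces per node:  " + nodeMemoryMb / reduceContainerMb);          // 14
  }
}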


Building YARN Apps – Super Simple APIs

YARN – Implementing Applications

• What APIs do I need to use?
  – Only three protocols
    – Client to ResourceManager – application submission
    – ApplicationMaster to ResourceManager – container allocation
    – ApplicationMaster to NodeManager – container launch
  – Use the client libraries for all 3 actions
    – Module yarn-client
    – Provides both synchronous and asynchronous libraries
  – Or use a 3rd-party library like Weave
    – http://continuuity.github.io/weave/

YARN – Implementing Applications

• What do I need to do?
  – Write a submission client
  – Write an ApplicationMaster (well, copy-paste one)
    – DistributedShell is the new WordCount
  – Get containers, run whatever you want!

YARN – Implementing Applications

• What else do I need to know?
  – Resource allocation & usage
    – ResourceRequest
    – Container
    – ContainerLaunchContext
    – LocalResource
  – ApplicationMaster
    – ApplicationId
    – ApplicationAttemptId
    – ApplicationSubmissionContext

YARN – Resource Allocation & Usage

• ResourceRequest
  – A fine-grained resource ask to the ResourceManager
  – Ask for a specific amount of resources (memory, CPU, etc.) on a specific machine or rack
  – Use the special value * as the resource name to ask for any machine

ResourceRequest fields: priority, resourceName, capability, numContainers

YARN – Resource Allocation & Usage

• ResourceRequest example

  priority   capability       resourceName   numContainers
  0          <2 GB, 1 core>   host01         1
  0          <2 GB, 1 core>   rack0          1
  0          <2 GB, 1 core>   *              1
  1          <4 GB, 1 core>   *              1
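For orientation (not from the original deck), these asks map onto the yarn-client library's ContainerRequest, which the library expands into the node-, rack-, and *-level rows shown above; host01 and rack0 are the slide's example names.

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class ResourceRequestSketch {
  public static void main(String[] args) {
    // Rows 1-3: priority 0, <2 GB, 1 core>, preferred on host01 / rack0, relaxed to *
    ContainerRequest localAsk = new ContainerRequest(
        Resource.newInstance(2048, 1),
        new String[] { "host01" },
        new String[] { "rack0" },
        Priority.newInstance(0));

    // Row 4: priority 1, <4 GB, 1 core>, anywhere in the cluster ("*")
    ContainerRequest anywhereAsk = new ContainerRequest(
        Resource.newInstance(4096, 1),
        null,     // no node preference
        null,     // no rack preference
        Priority.newInstance(1));

    System.out.println(localAsk + " / " + anywhereAsk);
  }
}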


YARN – Resource Allocation & Usage

• Container
  – The basic unit of allocation in YARN
  – The result of a ResourceRequest, provided by the ResourceManager to the ApplicationMaster
  – A specific amount of resources (CPU, memory, etc.) on a specific machine

Container fields: containerId, resourceName, capability, tokens

YARN – Resource Allocation & Usage

• ContainerLaunchContext
  – The context provided by the ApplicationMaster to the NodeManager to launch the Container
  – A complete specification for a process
  – LocalResource is used to specify the container binary and its dependencies
    – The NodeManager is responsible for downloading them from a shared namespace (typically HDFS)

ContainerLaunchContext fields: container, commands, environment, localResources
LocalResource fields: uri, type
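A hedged sketch (not from the original deck) of filling in those fields with the yarn-client records API; the jar path, command, and environment entry are hypothetical placeholders.

import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.ApplicationConstants;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

public class LaunchContextSketch {
  public static ContainerLaunchContext build(Configuration conf) throws Exception {
    // LocalResource: the NodeManager downloads this jar from HDFS before launching
    Path jarOnHdfs = new Path("hdfs:///apps/myapp/task.jar");   // hypothetical path
    FileStatus stat = FileSystem.get(conf).getFileStatus(jarOnHdfs);
    LocalResource jar = Records.newRecord(LocalResource.class);
    jar.setResource(ConverterUtils.getYarnUrlFromPath(jarOnHdfs));
    jar.setSize(stat.getLen());
    jar.setTimestamp(stat.getModificationTime());
    jar.setType(LocalResourceType.FILE);
    jar.setVisibility(LocalResourceVisibility.APPLICATION);

    // Complete process specification: local resources, environment, command line
    ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
    ctx.setLocalResources(Collections.singletonMap("task.jar", jar));
    ctx.setEnvironment(Collections.singletonMap("MYAPP_MODE", "worker"));
    ctx.setCommands(Collections.singletonList(
        "$JAVA_HOME/bin/java -Xmx512m com.example.Task"
        + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
        + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
    return ctx;
  }
}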


YARN - ApplicationMaster

• ApplicationMaster
  – Per-application controller, aka container_0
  – Parent for all containers of the application
    – The ApplicationMaster negotiates all of its containers from the ResourceManager
  – The ApplicationMaster container is a child of the ResourceManager
    – Think of the init process in Unix
    – The RM restarts the ApplicationMaster attempt if required (with a unique ApplicationAttemptId)
  – Code for the application is submitted along with the application itself

YARN - ApplicationMaster

• ApplicationMaster
  – The ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by the client
  – The ResourceManager is responsible for allocating and launching the ApplicationMaster container

ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue

YARN Application API - Overview

• YarnClient is the submission client API
• Both synchronous & asynchronous APIs for resource allocation and container start/stop
  – Synchronous API: AMRMClient, AMNMClient
  – Asynchronous API: AMRMClientAsync, AMNMClientAsync

YARN Application API – The Client

[Diagram: Client2 sends a new application request to the ResourceManager (step 1, YarnClient.createApplication) and then submits the application (step 2, YarnClient.submitApplication). The ResourceManager's Scheduler launches AM2, which runs Containers 2.1–2.4 across the NodeManagers, alongside AM 1 and its Containers 1.1–1.3.]

YARN Application API – The Client

• YarnClient
  – createApplication to create an application
  – submitApplication to start an application
    – The application developer needs to provide an ApplicationSubmissionContext
  – APIs to get other information from the ResourceManager
    – getAllQueues
    – getApplications
    – getNodeReports
  – APIs to manipulate a submitted application, e.g. killApplication
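Putting the client-side calls together, here is a hedged sketch (not from the original deck) of a minimal submission client against the released Hadoop 2.x yarn-client API; the application name, queue, and buildAmLaunchContext() helper are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.Records;

public class SubmissionClientSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // createApplication: ask the ResourceManager for a new ApplicationId
    YarnClientApplication app = yarnClient.createApplication();
    ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
    appContext.setApplicationName("my-yarn-app");               // appName
    appContext.setQueue("default");                             // queue
    appContext.setResource(Resource.newInstance(1024, 1));      // AM container size
    appContext.setAMContainerSpec(buildAmLaunchContext());      // containerLaunchContext

    // submitApplication: the RM allocates and launches the ApplicationMaster
    ApplicationId appId = yarnClient.submitApplication(appContext);
    System.out.println("Submitted " + appId + ", state: "
        + yarnClient.getApplicationReport(appId).getYarnApplicationState());
  }

  // Placeholder: a real client would fill this in with the AM's jar and launch
  // command (see the ContainerLaunchContext sketch earlier)
  private static ContainerLaunchContext buildAmLaunchContext() {
    return Records.newRecord(ContainerLaunchContext.class);
  }
}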


YARN Application API – Resource Allocation

[Diagram: the ApplicationMaster (1) calls registerApplicationMaster on the ResourceManager, (2) calls AMRMClient.allocate, (3) receives Containers back from the Scheduler, and (4) calls unregisterApplicationMaster when it is done.]

YARN Application API – Resource Allocation

• AMRMClient – synchronous API for the ApplicationMaster to interact with the ResourceManager
  – Prologue / epilogue: registerApplicationMaster / unregisterApplicationMaster
  – Resource negotiation with the ResourceManager
    – Internal book-keeping: addContainerRequest / removeContainerRequest / releaseAssignedContainer
    – Main API: allocate
  – Helper APIs for cluster information
    – getAvailableResources
    – getClusterNodeCount
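A hedged sketch (not from the original deck) of the synchronous ApplicationMaster flow against the released Hadoop 2.x yarn-client API: register, ask for one container, heartbeat via allocate(), then unregister. The container size and messages are illustrative.

import java.util.List;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.*;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SyncAppMasterSketch {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new YarnConfiguration());
    rmClient.start();

    // Prologue: register this ApplicationMaster with the ResourceManager
    rmClient.registerApplicationMaster("", 0, "");

    // Book-keeping: ask for one 1 GB / 1-core container anywhere in the cluster
    rmClient.addContainerRequest(new ContainerRequest(
        Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));

    // Main API: allocate() doubles as the heartbeat to the ResourceManager
    List<Container> granted = null;
    while (granted == null || granted.isEmpty()) {
      AllocateResponse response = rmClient.allocate(0.1f);
      granted = response.getAllocatedContainers();
      Thread.sleep(1000);
    }
    System.out.println("Got container " + granted.get(0).getId()
        + " on " + granted.get(0).getNodeId());

    // Epilogue: unregister when the application is done
    rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", null);
    rmClient.stop();
  }
}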


YARN Application API – Resource Allocation

• AMRMClientAsync – asynchronous API for the ApplicationMaster
  – Extension of AMRMClient that takes an asynchronous CallbackHandler
  – Callbacks make it easier for the application developer to build a mental model of the interaction with the ResourceManager
    – onContainersAllocated
    – onContainersCompleted
    – onNodesUpdated
    – onError
    – onShutdownRequest
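A hedged sketch (not from the original deck) of the asynchronous variant: instead of polling allocate(), the ApplicationMaster supplies a CallbackHandler and reacts to ResourceManager events. Only the interesting callbacks carry comments; the rest are left empty.

import java.util.List;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AsyncAppMasterSketch implements AMRMClientAsync.CallbackHandler {
  public void onContainersAllocated(List<Container> containers) {
    // Launch work on each granted container (see the NodeManager client sketches below)
  }
  public void onContainersCompleted(List<ContainerStatus> statuses) {
    // Track finished work; re-request containers for failed tasks if needed
  }
  public void onNodesUpdated(List<NodeReport> nodes) { }
  public void onShutdownRequest() { }
  public void onError(Throwable t) { }
  public float getProgress() { return 0.5f; }   // reported to the RM on each heartbeat

  public static void main(String[] args) throws Exception {
    AMRMClientAsync<ContainerRequest> rmClient =
        AMRMClientAsync.createAMRMClientAsync(1000, new AsyncAppMasterSketch());
    rmClient.init(new YarnConfiguration());
    rmClient.start();
    rmClient.registerApplicationMaster("", 0, "");
    // addContainerRequest(...) here; results arrive in onContainersAllocated()
  }
}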


YARN Application API – Using Resources

[Diagram: AM 1 talks directly to the NodeManager hosting Container 1.1, calling AMNMClient.startContainer and AMNMClient.getContainerStatus; the ResourceManager/Scheduler and the remaining NodeManagers are shown for context.]

YARN Application API – Using Resources

• AMNMClient – synchronous API for the ApplicationMaster to launch / stop containers at the NodeManager
  – Simple (trivial) APIs
    – startContainer
    – stopContainer
    – getContainerStatus
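A hedged sketch (not from the original deck): in released Hadoop 2.x the synchronous NodeManager-side client is exposed as NMClient (the deck's AMNMClient). The Container comes from an earlier allocate() call and the ContainerLaunchContext from the sketch above.

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerLauncherSketch {
  public static void run(Container container, ContainerLaunchContext ctx) throws Exception {
    NMClient nmClient = NMClient.createNMClient();
    nmClient.init(new YarnConfiguration());
    nmClient.start();

    // startContainer: hand the launch context to the container's NodeManager
    nmClient.startContainer(container, ctx);

    // getContainerStatus: ask the NodeManager for the container's current state
    ContainerStatus status =
        nmClient.getContainerStatus(container.getId(), container.getNodeId());
    System.out.println("Container state: " + status.getState());

    // stopContainer: tear the container down when it is no longer needed
    nmClient.stopContainer(container.getId(), container.getNodeId());
    nmClient.stop();
  }
}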


YARN Application API – Using Resources

• AMNMClientAsync – asynchronous API for the ApplicationMaster to launch / stop containers at the NodeManager
  – Simple (trivial) APIs
    – startContainerAsync
    – stopContainerAsync
    – getContainerStatusAsync
  – A CallbackHandler makes it easier for the application developer to build a mental model of the interaction with the NodeManager
    – onContainerStarted
    – onContainerStopped
    – onStartContainerError
    – onContainerStatusReceived
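Correspondingly (not from the original deck), the asynchronous NodeManager client in released Hadoop 2.x is NMClientAsync; its CallbackHandler mirrors the list above (the released interface also includes onGetContainerStatusError and onStopContainerError). A minimal handler sketch:

import java.nio.ByteBuffer;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.client.api.async.NMClientAsync;

public class AsyncContainerHandler implements NMClientAsync.CallbackHandler {
  public void onContainerStarted(ContainerId id, Map<String, ByteBuffer> services) {
    System.out.println("Started " + id);
  }
  public void onContainerStatusReceived(ContainerId id, ContainerStatus status) { }
  public void onContainerStopped(ContainerId id) { }
  public void onStartContainerError(ContainerId id, Throwable t) { }
  public void onGetContainerStatusError(ContainerId id, Throwable t) { }
  public void onStopContainerError(ContainerId id, Throwable t) { }
}

Usage would be NMClientAsync.createNMClientAsync(new AsyncContainerHandler()) followed by startContainerAsync(container, launchContext).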


Hadoop Summit 2014


THANK YOU! Rommel Garcia, Solution Engineer – Big Data

rgarcia@hortonworks.com
