
YARN - Presented At Dallas Hadoop User Group


Description: Developing Hadoop Solutions With YARN.


Page 1: YARN - Presented At Dallas Hadoop User Group

Hadoop 2.0 – YARN: Yet Another Resource Negotiator

Rommel Garcia

Solutions Engineer

Page 2: YARN - Presented At Dallas Hadoop User Group

Agenda

• Hadoop 1.X & 2.X – Concepts Recap
• YARN Architecture – How does this affect MRv1?
• Slots be gone – What does this mean for MapReduce?
• Building YARN Applications
• Q & A

Page 3: YARN - Presented At Dallas Hadoop User Group

Hadoop 1.X vs. 2.X
A recap of the differences

Page 4: YARN - Presented At Dallas Hadoop User Group

The 1st Generation of Hadoop: Batch

HADOOP 1.0 – Built for Web-Scale Batch Apps

[Diagram: separate single-app clusters – Single App (BATCH) on HDFS, Single App (INTERACTIVE) on HDFS, Single App (BATCH) on HDFS, Single App (ONLINE) – each in its own silo]

• All other usage patterns must leverage that same infrastructure
• Forces the creation of silos for managing mixed workloads

Page 5: YARN - Presented At Dallas Hadoop User Group

Hadoop MapReduce Classic

• JobTracker
  – Manages cluster resources and job scheduling
• TaskTracker
  – Per-node agent
  – Manages tasks

Page 6: YARN - Presented At Dallas Hadoop User Group

Hadoop 1

• Limited to ~4,000 nodes per cluster
• Scheduling is O(# of tasks in the cluster)
• JobTracker bottleneck – resource management, job scheduling and monitoring
• Only one namespace for managing HDFS
• Map and Reduce slots are static
• The only job type that runs is MapReduce

Page 7: YARN - Presented At Dallas Hadoop User Group

Hadoop 1.X Stack

[Stack diagram: HORTONWORKS DATA PLATFORM (HDP) – PLATFORM SERVICES (Enterprise Readiness: High Availability, Disaster Recovery, Security and Snapshots); HADOOP CORE (HDFS, MAPREDUCE); DATA SERVICES (HIVE & HCATALOG, PIG, HBASE); OPERATIONAL SERVICES (OOZIE, AMBARI); LOAD & EXTRACT (SQOOP, FLUME, NFS, WebHDFS); runs on OS, Cloud, VM, Appliance]

Page 8: YARN - Presented At Dallas Hadoop User Group

Our Vision: Hadoop as Next-Gen Platform

HADOOP 1.0 – Single Use System (Batch Apps)
• HDFS (redundant, reliable storage)
• MapReduce (cluster resource management & data processing)

HADOOP 2.0 – Multi Purpose Platform (Batch, Interactive, Online, Streaming, …)
• HDFS2 (redundant, reliable storage)
• YARN (cluster resource management)
• MapReduce and others (data processing)

Page 9: YARN - Presented At Dallas Hadoop User Group

YARN: Taking Hadoop Beyond Batch

Applications Run Natively IN Hadoop

• HDFS2 (Redundant, Reliable Storage)
• YARN (Cluster Resource Management)
• Running on YARN: BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), ONLINE (HBase), OTHER (Search, Weave, …)

Store ALL DATA in one place…

Interact with that data in MULTIPLE WAYS

with Predictable Performance and Quality of Service

Page 10: YARN - Presented At Dallas Hadoop User Group

Hadoop 2

• Potentially up to 10,000 nodes per cluster
• Scheduling is O(cluster size)
• Supports multiple namespaces for managing HDFS
• Efficient cluster utilization (YARN)
• Backward and forward compatible with MRv1
• Any application can integrate with Hadoop
• Beyond Java

Page 11: YARN - Presented At Dallas Hadoop User Group

Hadoop 2.X Stack

[Stack diagram: HORTONWORKS DATA PLATFORM (HDP) – PLATFORM SERVICES (Enterprise Readiness: High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots); HADOOP CORE (HDFS, YARN, MAPREDUCE, TEZ*); DATA SERVICES (HIVE & HCATALOG, PIG, HBASE); OPERATIONAL SERVICES (OOZIE, AMBARI, FALCON*); LOAD & EXTRACT (SQOOP, FLUME, NFS, WebHDFS, KNOX*); runs on OS/VM, Cloud, Appliance. *included Q1 2013]

Page 12: YARN - Presented At Dallas Hadoop User Group


YARN Architecture

Page 13: YARN - Presented At Dallas Hadoop User Group

A Brief History of YARN

• Originally conceived & architected by the team at Yahoo!
  – Arun Murthy created the original JIRA in 2008 and led the PMC
  – Arun is currently the lead for MapReduce/YARN/Tez at Hortonworks and was formerly the architect of Hadoop MapReduce at Yahoo!
• The team at Hortonworks has been working on YARN for 4 years
• YARN-based architecture running at scale at Yahoo!
  – Deployed on 35,000 nodes for about a year
  – Implemented Storm-on-YARN, which processes 133,000 events per second

Page 14: YARN - Presented At Dallas Hadoop User Group

Concepts

• Application
  – An application is a job submitted to the framework
  – Example: a MapReduce job
• Container
  – Basic unit of allocation
  – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.)
    – container_0 = 2 GB, 1 CPU
    – container_1 = 1 GB, 6 CPUs
  – Replaces the fixed map/reduce slots (see the sketch below)
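As a rough illustration (not from the slides), a container's capability can be expressed with the Hadoop 2.x records API; the numbers below are just the example values above:

    import org.apache.hadoop.yarn.api.records.Resource;

    public class ContainerAsk {
        public static void main(String[] args) {
            // Capabilities matching the example above (memory in MB, virtual cores)
            Resource container0 = Resource.newInstance(2048, 1); // 2 GB, 1 CPU
            Resource container1 = Resource.newInstance(1024, 6); // 1 GB, 6 CPUs
            System.out.println(container0 + " / " + container1);
        }
    }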

Page 15: YARN - Presented At Dallas Hadoop User Group

Architecture

• ResourceManager
  – Global resource scheduler
  – Hierarchical queues
  – Application management
• NodeManager
  – Per-machine agent
  – Manages the life-cycle of containers
  – Container resource monitoring
• ApplicationMaster
  – Per-application
  – Manages application scheduling and task execution
  – E.g. the MapReduce ApplicationMaster

Page 16: YARN - Presented At Dallas Hadoop User Group

YARN – Running Apps

[Diagram: the ResourceManager (ASM + Scheduler) coordinates NodeManagers across Rack1, Rack2, … RackN. Hadoop Client 1 and Hadoop Client 2 create and submit app1 and app2; the ResourceManager launches ApplicationMasters AM1 and AM2 on NodeManagers, and the AMs run their containers (C1.1–C1.4, C2.1–C2.3) across the racks. NodeManagers send status reports to the ResourceManager. Legend: the ASM negotiates Containers, NodeManagers report to the ASM, and the Scheduler partitions Resources.]

Page 17: YARN - Presented At Dallas Hadoop User Group

Slots be gone! How does MapReduce run on YARN?

Page 18: YARN - Presented At Dallas Hadoop User Group

Apache Hadoop MapReduce on YARN

• The original use case
• The most complex application to build
  – Data-locality
  – Fault tolerance
  – ApplicationMaster recovery: checkpoint to HDFS
  – Intra-application priorities: Maps vs. Reduces
    – Needed a complex feedback mechanism from the ResourceManager
  – Security
  – Isolation
• Binary compatible with Apache Hadoop 1.x

Page 19: YARN - Presented At Dallas Hadoop User Group

Apache Hadoop MapReduce on YARN

[Diagram: the ResourceManager (Scheduler) and a grid of NodeManagers. MR AM 1 runs map 1.1, map 1.2 and reduce 1.1 in containers on several nodes; MR AM 2 runs map 2.1, map 2.2, reduce 2.1 and reduce 2.2. Each job gets its own ApplicationMaster, and its tasks run in containers spread across the cluster.]

Page 20: YARN - Presented At Dallas Hadoop User Group

Efficiency Gains of YARN

• Key optimizations
  – No hard segmentation of resources into map and reduce slots
  – The YARN scheduler is more efficient
  – All resources are fungible
• Yahoo! has over 30,000 nodes running YARN across over 365 PB of data
• They calculate running about 400,000 jobs per day for about 10 million hours of compute time
• They estimate a 60% – 150% improvement in node usage per day
• Yahoo! got rid of a whole colo (a 10,000-node datacenter) because of the increased utilization

Page 21: YARN - Presented At Dallas Hadoop User Group

An Example: Calculating Node Capacity

• Important parameters
  – mapreduce.[map|reduce].memory.mb
    – The physical RAM hard-limit enforced by Hadoop on the task
  – mapreduce.[map|reduce].java.opts
    – The heap size of the JVM (-Xmx)
  – yarn.scheduler.minimum-allocation-mb
    – The smallest container YARN will allow
  – yarn.nodemanager.resource.memory-mb
    – The amount of physical RAM on the node available to containers
  – yarn.nodemanager.vmem-pmem-ratio
    – The amount of virtual memory each container is allowed
    – Calculated as containerMemoryRequest * vmem-pmem-ratio

Page 22: YARN - Presented At Dallas Hadoop User Group

Calculating Node Capacity Continued

• Let's say we need a 1 GB map and a 2 GB reduce
• mapreduce.[map|reduce].java.opts = [-Xmx1g | -Xmx2g]
• Remember, a container has more overhead than just your heap! Add 512 MB to the container limit for overhead
• mapreduce.[map|reduce].memory.mb = [1536 | 2560]
• We have 36 GB per node and a minimum allocation of 512 MB
  – yarn.nodemanager.resource.memory-mb = 36864
  – yarn.scheduler.minimum-allocation-mb = 512
• Virtual memory for each container:
  – Map: 1536 MB * vmem-pmem-ratio (default is 2.1) = 3225.6 MB
  – Reduce: 2560 MB * vmem-pmem-ratio = 5376 MB
• Our 36 GB node can support 24 maps OR 14 reducers OR any combination allowed by the resources on the node (see the sketch below)
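A tiny sketch of the arithmetic above, using the example values from this slide (not universal defaults):

    public class NodeCapacity {
        public static void main(String[] args) {
            int nodeMemoryMb = 36864;      // yarn.nodemanager.resource.memory-mb
            int mapContainerMb = 1536;     // mapreduce.map.memory.mb (1 GB heap + 512 MB overhead)
            int reduceContainerMb = 2560;  // mapreduce.reduce.memory.mb (2 GB heap + 512 MB overhead)
            double vmemPmemRatio = 2.1;    // yarn.nodemanager.vmem-pmem-ratio (default)

            System.out.println("Maps per node:    " + nodeMemoryMb / mapContainerMb);    // 24
            System.out.println("Reduces per node: " + nodeMemoryMb / reduceContainerMb); // 14
            System.out.printf("Map vmem limit:    %.1f MB%n", mapContainerMb * vmemPmemRatio);    // 3225.6
            System.out.printf("Reduce vmem limit: %.1f MB%n", reduceContainerMb * vmemPmemRatio); // 5376.0
        }
    }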

Page 23: YARN - Presented At Dallas Hadoop User Group

Building YARN Apps
Super Simple APIs

Page 24: YARN - Presented At Dallas Hadoop User Group

YARN – Implementing Applications

• What APIs do I need to use?
  – Only three protocols:
    – Client to ResourceManager – application submission
    – ApplicationMaster to ResourceManager – container allocation
    – ApplicationMaster to NodeManager – container launch
  – Use the client libraries for all 3 actions
    – Module yarn-client
    – Provides both synchronous and asynchronous libraries
  – Or use a 3rd-party library like Weave
    – http://continuuity.github.io/weave/

Page 25: YARN - Presented At Dallas Hadoop User Group

YARN – Implementing Applications

• What do I need to do?
  – Write a submission client
  – Write an ApplicationMaster (well, copy-paste one)
    – DistributedShell is the new WordCount
  – Get containers, run whatever you want!

Page 26: YARN - Presented At Dallas Hadoop User Group

YARN – Implementing Applications

• What else do I need to know?
  – Resource allocation & usage
    – ResourceRequest
    – Container
    – ContainerLaunchContext
    – LocalResource
  – ApplicationMaster
    – ApplicationId
    – ApplicationAttemptId
    – ApplicationSubmissionContext

Page 27: YARN - Presented At Dallas Hadoop User Group

YARN – Resource Allocation & Usage

• ResourceRequest
  – A fine-grained resource ask to the ResourceManager
  – Ask for a specific amount of resources (memory, CPU, etc.) on a specific machine or rack
  – Use the special value * as the resource name for "any machine"

ResourceRequest fields: priority, resourceName, capability, numContainers

Page 28: YARN - Presented At Dallas Hadoop User Group

YARN – Resource Allocation & Usage

• ResourceRequest example

  priority | capability    | resourceName | numContainers
  ---------|---------------|--------------|--------------
  0        | <2gb, 1 core> | host01       | 1
  0        | <2gb, 1 core> | rack0        | 1
  0        | <2gb, 1 core> | *            | 1
  1        | <4gb, 1 core> | *            | 1
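As a hedged sketch (not shown in the deck), the same kind of ask can be expressed through the yarn-client library's AMRMClient.ContainerRequest, which takes preferred nodes and racks and falls back to * when locality can be relaxed:

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class AskExample {
        public static void main(String[] args) {
            // <2gb, 1 core>, preferably on host01 / rack0, at priority 0
            ContainerRequest ask = new ContainerRequest(
                    Resource.newInstance(2048, 1),   // capability
                    new String[] { "host01" },       // preferred nodes
                    new String[] { "rack0" },        // preferred racks
                    Priority.newInstance(0));        // priority
            System.out.println(ask);
        }
    }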

Page 29: YARN - Presented At Dallas Hadoop User Group

YARN – Resource Allocation & Usage

• Container
  – The basic unit of allocation in YARN
  – The result of a ResourceRequest, provided by the ResourceManager to the ApplicationMaster
  – A specific amount of resources (CPU, memory, etc.) on a specific machine

Container fields: containerId, resourceName, capability, tokens

Page 30: YARN - Presented At Dallas Hadoop User Group

YARN – Resource Allocation & Usage

• ContainerLaunchContext
  – The context provided by the ApplicationMaster to the NodeManager to launch the Container
  – A complete specification for a process
  – LocalResource is used to specify the container binary and dependencies
    – The NodeManager is responsible for downloading them from a shared namespace (typically HDFS)

ContainerLaunchContext fields: container, commands, environment, localResources
LocalResource fields: uri, type
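A minimal sketch, assuming the Hadoop 2.x records API; the command and environment variable below are made up for illustration:

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;

    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.LocalResource;

    public class LaunchContextExample {
        public static ContainerLaunchContext build(Map<String, LocalResource> localResources) {
            // Command the NodeManager will run inside the container
            List<String> commands = Collections.singletonList(
                    "/bin/date 1>/tmp/stdout 2>/tmp/stderr");
            Map<String, String> environment = Collections.singletonMap("MY_ENV", "value");

            // localResources: binaries/jars the NodeManager downloads (typically from HDFS)
            return ContainerLaunchContext.newInstance(
                    localResources, environment, commands,
                    null /* serviceData */, null /* tokens */, null /* acls */);
        }
    }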

Page 31: YARN - Presented At Dallas Hadoop User Group

YARN - ApplicationMaster

• ApplicationMaster
  – Per-application controller, aka container_0
  – Parent for all containers of the application
    – The ApplicationMaster negotiates all of its containers from the ResourceManager
  – The ApplicationMaster container is a child of the ResourceManager
    – Think of the init process in Unix
    – The RM restarts the ApplicationMaster attempt if required (unique ApplicationAttemptId)
  – Code for the application is submitted along with the application itself

Page 32: YARN - Presented At Dallas Hadoop User Group

YARN - ApplicationMaster

• ApplicationMaster
  – ApplicationSubmissionContext is the complete specification of the ApplicationMaster, provided by the Client
  – The ResourceManager is responsible for allocating and launching the ApplicationMaster container

ApplicationSubmissionContext fields: resourceRequest, containerLaunchContext, appName, queue

Page 33: YARN - Presented At Dallas Hadoop User Group

YARN Application API - Overview

• YarnClient is the submission client API
• Both synchronous & asynchronous APIs for resource allocation and container start/stop
• Synchronous API
  – AMRMClient
  – AMNMClient
• Asynchronous API
  – AMRMClientAsync
  – AMNMClientAsync

Page 34: YARN - Presented At Dallas Hadoop User Group

YARN Application API – The Client

[Diagram: Client2 interacts with the ResourceManager (Scheduler) in two steps: (1) new application request via YarnClient.createApplication, (2) application submission via YarnClient.submitApplication. The ResourceManager then launches AM2, which runs Containers 2.1–2.4 on NodeManagers alongside the already-running AM 1 and its Containers 1.1–1.3.]

Page 35: YARN - Presented At Dallas Hadoop User Group

YARN Application API – The Client

• YarnClient
  – createApplication to create an application
  – submitApplication to start an application
    – The application developer needs to provide an ApplicationSubmissionContext
  – APIs to get other information from the ResourceManager
    – getAllQueues
    – getApplications
    – getNodeReports
  – APIs to manipulate a submitted application, e.g. killApplication
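A hedged sketch of a submission client built on these calls (Hadoop 2.x yarn-client API); the application name is arbitrary and the ApplicationMaster's launch context is elided:

    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class SubmitClient {
        public static void main(String[] args) throws Exception {
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(new YarnConfiguration());
            yarnClient.start();

            // 1. createApplication: ask the ResourceManager for a new ApplicationId
            YarnClientApplication app = yarnClient.createApplication();
            ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
            ctx.setApplicationName("my-yarn-app");
            // ... set the AM's ContainerLaunchContext, Resource, queue, etc. here ...

            // 2. submitApplication: hand the context to the RM, which launches the AM
            ApplicationId appId = yarnClient.submitApplication(ctx);
            System.out.println("Submitted " + appId);
        }
    }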

Page 36: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Resource Allocation

[Diagram: the ApplicationMaster and the ResourceManager (Scheduler): (1) registerApplicationMaster; (2)–(3) repeated AMRMClient.allocate calls, which return Containers; (4) unregisterApplicationMaster when the application completes.]

Page 37: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Resource Allocation

• AMRMClient – synchronous API for the ApplicationMaster to interact with the ResourceManager
  – Prologue / epilogue – registerApplicationMaster / unregisterApplicationMaster
  – Resource negotiation with the ResourceManager
    – Internal book-keeping – addContainerRequest / removeContainerRequest / releaseAssignedContainer
    – Main API – allocate (see the sketch below)
  – Helper APIs for cluster information
    – getAvailableResources
    – getClusterNodeCount
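A rough sketch of that prologue / allocate / epilogue cycle with the synchronous client (Hadoop 2.x API; container launch and error handling are omitted):

    import java.util.List;

    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class SimpleAppMaster {
        public static void main(String[] args) throws Exception {
            AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
            rmClient.init(new YarnConfiguration());
            rmClient.start();

            // Prologue: register with the ResourceManager
            rmClient.registerApplicationMaster("", 0, "");

            // Ask for one <1gb, 1 core> container anywhere
            rmClient.addContainerRequest(new ContainerRequest(
                    Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));

            // Main loop: heartbeat via allocate() until the RM hands back containers
            List<Container> allocated;
            do {
                AllocateResponse response = rmClient.allocate(0.1f);
                allocated = response.getAllocatedContainers();
                Thread.sleep(1000);
            } while (allocated.isEmpty());

            // ... launch work on the containers via the NodeManager client, wait for completion ...

            // Epilogue: unregister when done
            rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
        }
    }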

Page 38: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Resource Allocation

• AMRMClientAsync – asynchronous API for the ApplicationMaster
  – Extension of AMRMClient that provides an asynchronous CallbackHandler
  – Callbacks make it easier for the application developer to build a mental model of the interaction with the ResourceManager
    – onContainersAllocated
    – onContainersCompleted
    – onNodesUpdated
    – onError
    – onShutdownRequest
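A skeletal callback handler as an illustration of that model (Hadoop 2.x AMRMClientAsync; the method bodies are placeholders):

    import java.util.List;

    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.ContainerStatus;
    import org.apache.hadoop.yarn.api.records.NodeReport;
    import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

    public class RmCallbacks implements AMRMClientAsync.CallbackHandler {
        @Override
        public void onContainersAllocated(List<Container> containers) {
            // Launch work on each newly allocated container (via the NodeManager client)
        }

        @Override
        public void onContainersCompleted(List<ContainerStatus> statuses) {
            // Track finished containers; re-request any that failed
        }

        @Override
        public void onNodesUpdated(List<NodeReport> updatedNodes) { }

        @Override
        public void onShutdownRequest() {
            // The RM asked us to shut down; stop cleanly
        }

        @Override
        public void onError(Throwable e) { }

        @Override
        public float getProgress() {
            return 0.0f; // application progress reported on each heartbeat
        }
    }

The handler is then passed to AMRMClientAsync.createAMRMClientAsync(heartbeatIntervalMs, handler) in place of the synchronous client.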

Page 39: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Using Resources

[Diagram: AM 1 talks directly to a NodeManager: AMNMClient.startContainer launches Container 1.1 on that node and AMNMClient.getContainerStatus checks on it. The ResourceManager (Scheduler) is not involved in this interaction.]

Page 40: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Using Resources

• AMNMClient – synchronous API for the ApplicationMaster to launch / stop containers at the NodeManager
  – Simple (trivial) APIs
    – startContainer
    – stopContainer
    – getContainerStatus
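In released Hadoop 2.x the synchronous class is named NMClient; a minimal sketch of launching a container on its NodeManager, assuming the container and launch context were built as shown earlier:

    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.client.api.NMClient;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class LaunchOnNode {
        // Launch the process described by ctx inside an allocated container
        public static void launch(Container container, ContainerLaunchContext ctx) throws Exception {
            NMClient nmClient = NMClient.createNMClient();
            nmClient.init(new YarnConfiguration());
            nmClient.start();

            nmClient.startContainer(container, ctx); // hand the launch context to the container's NodeManager
            // later: nmClient.getContainerStatus(container.getId(), container.getNodeId());
            // and:   nmClient.stopContainer(container.getId(), container.getNodeId());
        }
    }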

Page 41: YARN - Presented At Dallas Hadoop User Group

YARN Application API – Using Resources

• AMNMClientAsync – asynchronous API for the ApplicationMaster to launch / stop containers at the NodeManager
  – Simple (trivial) APIs
    – startContainerAsync
    – stopContainerAsync
    – getContainerStatusAsync
  – A CallbackHandler makes it easier for the application developer to build a mental model of the interaction with the NodeManager
    – onContainerStarted
    – onContainerStopped
    – onStartContainerError
    – onContainerStatusReceived

Page 42: YARN - Presented At Dallas Hadoop User Group

Hadoop Summit 2014

Page 43: YARN - Presented At Dallas Hadoop User Group


THANK YOU! Rommel Garcia, Solution Engineer – Big Data

[email protected]