
Hadoop Fault Detection Using Multiple Heartbeat Threads

Hunter Ingle

Distributed Systems

• Still a relatively new field
• A cluster of machines works together to perform the same task
• Each “node” is given a task to perform
• Master-slave relationship
• Many different connections and pieces of data in use at once

http://csis.pace.edu/~marchese/CS865/Lectures/Chap1/Chapter1a.htm

Fault Tolerance

• What happens if a node is disconnected or cannot function?
• Known as a faulty or “dead” node
• Fault tolerance responses:
• The master node assesses the overall state of the system
• Data replication
• Work continues on the remaining resources

Fault Detection

• How do we find out whether a node is faulty?
• Multiple methods exist
• Focus for this project: the heartbeat method
• A static method
• Every few seconds, each slave node sends a message to the master
• If no message arrives, a timer starts
• If a set amount of time passes with no heartbeat message, the node is declared dead
• Fault tolerance methods are then executed
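The static heartbeat method above can be sketched as a minimal monitor on the master side. This is an illustrative sketch, not a Hadoop API; all class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of static heartbeat fault detection (illustrative names).
class HeartbeatMonitor {
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    private final long timeoutMillis; // fixed ("static") timeout

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a slave node's heartbeat message arrives.
    void recordHeartbeat(String nodeId, long nowMillis) {
        lastHeartbeat.put(nodeId, nowMillis);
    }

    // A node is declared dead once no heartbeat has arrived within the timeout.
    boolean isDead(String nodeId, long nowMillis) {
        Long last = lastHeartbeat.get(nodeId);
        return last == null || nowMillis - last > timeoutMillis;
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic and testable.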

Hadoop Framework

• Java-based distributed system framework
• Released in 2006 and still updated with new releases
• Open source
• Three main components:
• Hadoop Distributed File System (HDFS)
• MapReduce
• YARN (Yet Another Resource Negotiator)

Hadoop – HDFS

• How Hadoop nodes are connected and communicate
• Determines how data and fault tolerance methods are regulated
• In Hadoop:
• Master node = “NameNode”
• Slave node = “DataNode”
• One NameNode to potentially thousands of DataNodes
• More recent releases can use backup NameNodes

HDFS Architecture

• Each DataNode holds data that is replicated across the system in case of failure
• DataNodes send a heartbeat message every 3 seconds and time out after roughly ten minutes
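For reference, HDFS derives the expiry window from two configuration keys, `dfs.heartbeat.interval` (default 3 s) and `dfs.namenode.heartbeat.recheck-interval` (default 300 000 ms). Assuming the standard formula (2 × recheck interval + 10 × heartbeat interval), the defaults give 10.5 minutes, commonly rounded down to "ten minutes":

```java
// Sketch of how HDFS computes the DataNode "dead" timeout from its two
// heartbeat configuration keys (formula as commonly documented for the
// NameNode's DatanodeManager; treat it as an assumption, not gospel).
class HeartbeatExpiry {
    // heartbeatExpireInterval = 2 * recheckIntervalMs + 10 * heartbeatIntervalSec * 1000
    static long expireIntervalMillis(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        // Defaults: 2 * 300000 + 10 * 1000 * 3 = 630000 ms = 10.5 minutes.
        System.out.println(expireIntervalMillis(300_000, 3)); // prints 630000
    }
}
```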

What Happens After a Node Dies?

• The YARN component of the NameNode re-allocates the task to another node
• If the replication factor drops because too many nodes have died:
• The NameNode re-replicates the data
• If the DataNode does reconnect after being declared dead:
• The system deletes the excess data created by re-replication
• The node cannot always resume task execution

Hadoop – MapReduce

• Programming model for task execution throughout the system
• Map functions:
• Take input data and run instructions on the appropriate DataNodes
• Format the data into key-value pairs
• Reduce functions:
• Combine the values from each node based on the key
• The finished output is what users see

https://data-flair.training/blogs/map-only-job-in-hadoop-mapreduce/
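The map and reduce steps above can be illustrated with a toy word count in plain Java. This is a sketch of the programming model only, not Hadoop's actual Mapper/Reducer API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy word count illustrating the MapReduce model (no Hadoop dependencies).
class WordCount {
    // "Map": turn one line of input into (word, 1) key-value pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) pairs.add(Map.entry(word, 1));
        }
        return pairs;
    }

    // "Reduce": combine the values for each key by summing them.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }
}
```

In real Hadoop the pairs emitted by many map tasks are shuffled by key before the reduce tasks combine them; here both phases run in one process.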

Hadoop – YARN

• Resource allocation and management model
• NameNode:
• Resource Manager
• Scheduler
• Sends tasks to the DataNodes
• Applications Manager
• Receives output data

YARN – DataNode

• DataNode:
• Application Master
• Communicates with the Resource Manager about the resources available in the node (via containers)
• Keeps track of the instructions being executed in the DataNode
• Sends output to the Applications Manager
• Exists only per instance of execution
• Node Manager
• Checks the resource containers and generates the block report
• Sends the block report and heartbeat message to the Resource Manager
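The Node Manager's coupling of block report and heartbeat can be sketched as follows. The names are illustrative; the only point is that in this model a missing block report means no heartbeat message at all:

```java
// Sketch of the current (coupled) heartbeat path: the heartbeat can only be
// sent once the block report exists, so a failure while generating the report
// silently suppresses the heartbeat too. All names are hypothetical.
class NodeManagerSketch {
    // Returns the heartbeat payload, or null when no block report
    // could be produced (e.g. local hardware trouble).
    static String buildHeartbeat(String blockReport) {
        if (blockReport == null) {
            return null; // no report -> no heartbeat in the current model
        }
        return "HEARTBEAT{" + blockReport + "}";
    }
}
```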

Related Work

• Alternative fault detection methods:
• Predetermined timeout intervals based on task complexity
• An adaptive failure-aware scheduler
• Pre-determining faulty nodes based on their behavior

The Objective

• The heartbeat method is inefficient
• Tasks can depend on a node that is not declared dead until the standard ten minutes pass
• What if a DataNode is experiencing hardware trouble?
• Or drops its connection entirely?

Current Heartbeat Process

1. The Applications Manager tells the Scheduler in the NameNode’s Resource Manager which resources are available to the nodes.
2. The Scheduler sends compatible tasks to the respective DataNodes to be executed.
3. The DataNode’s Node Manager receives the tasks and sets the machine to execute them.
4. The application’s instance of the Application Master keeps track of the task’s execution in the DataNode.
5. After the task has finished executing, the job submission is sent to the Applications Manager.
6. The Node Manager determines the available resources the DataNode possesses by checking the resource container.
7. The Node Manager generates the block report.
8. The Node Manager attaches the finished block report to its heartbeat message and sends it to the NameNode.

Main Problem with Current Model

• A heartbeat message must contain a block report before it can be sent
• If the block report cannot be generated, the message cannot be sent
• This starts the timer that ends with the node being declared dead
• Hardware failure and other issues can block report generation even though the node can still execute tasks or reconnect to the NameNode
• A network failure can drop the connection, yet the DataNode is not declared dead for another ten minutes
• Meanwhile, the current task may need more than ten minutes to execute

The Algorithm

• Allow the NameNode to dynamically detect lost connections to its DataNodes and respond accordingly

• Implement two threads in the DataNode architecture, working with YARN methods, that let the NameNode know whether hardware failure is currently affecting the node

Node Connection Management

• If a DataNode has lost its connection, fault tolerance methods should execute immediately
• The current system waits ten minutes to do so
• The NameNode already checks its connections to the DataNodes when heartbeat messages arrive
• The new implementation adds that, if a heartbeat message is not delivered:
• The connection to the DataNode is tested
• If the connection was dropped, fault tolerance begins
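The proposed connection check might look like the following sketch, where the actual probe (e.g. a TCP ping to the DataNode) is abstracted behind a predicate and every name is hypothetical:

```java
import java.util.function.Predicate;

// Sketch of the proposed connection management: on a missed heartbeat the
// NameNode probes the DataNode's connection immediately instead of only
// starting the ten-minute timer. All names are illustrative.
class ConnectionManager {
    private final Predicate<String> probe; // e.g. a TCP ping keyed by node id

    ConnectionManager(Predicate<String> probe) {
        this.probe = probe;
    }

    // Called when a DataNode's heartbeat fails to arrive on schedule.
    // Returns true when fault tolerance should begin immediately.
    boolean onMissedHeartbeat(String nodeId) {
        boolean alive = probe.test(nodeId);
        return !alive; // dropped connection -> start fault tolerance now
    }
}
```

Injecting the probe keeps the decision logic separate from the networking details, which also makes the behavior easy to test.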

Two Heartbeat Threads

• One thread performs the usual methods and generates the block report
• The other receives the block report and sends the heartbeat message to the NameNode
• If no block report was generated, an “empty” heartbeat message is sent instead
• When the NameNode receives an empty message, troubleshooting can begin
• The Resource Manager can contact the Application Master to determine whether execution is still underway
• If it is not, hardware failure or other factors are causing the issue
• The connection can be cut, and fault tolerance methods called
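The two-thread design above can be sketched with a queue between the report-generating thread and the sending thread: the sender always produces a heartbeat, falling back to an empty one when no report arrives in time. All names here are hypothetical:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of the proposed two-heartbeat-thread design (illustrative names).
class TwoThreadHeartbeat {
    private final BlockingQueue<String> reports = new LinkedBlockingQueue<>();

    // Thread 1: the usual path, publishing each generated block report.
    void publishReport(String blockReport) {
        reports.offer(blockReport);
    }

    // Thread 2: always produces a heartbeat; an "empty" one signals the
    // NameNode to troubleshoot (e.g. query the Application Master).
    String nextHeartbeat(long waitMillis) {
        try {
            String report = reports.poll(waitMillis, TimeUnit.MILLISECONDS);
            return report != null ? "HEARTBEAT{" + report + "}" : "HEARTBEAT{EMPTY}";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "HEARTBEAT{EMPTY}";
        }
    }
}
```

The key property is that the sending thread no longer blocks on report generation, so the NameNode keeps receiving heartbeats even when the report path is stuck.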

Future Work

• The algorithm still needs to be implemented and tested
• Connection management has been completed, but appropriate testing still needs to be conducted
• A distributed system running Hadoop will be set up in the GA room for proper development and testing
• Benchmark results will be compared between this new implementation and current Hadoop 3.2.1 performance

Questions?