CS6703 - GRID AND CLOUD COMPUTING (Final IT, 7th Sem. Notes, Regulation 2013)
PANIMALAR INSTITUTE OF TECHNOLOGY - DEPARTMENT OF INFORMATION TECHNOLOGY
UNIT IV
Open source grid middleware packages - Globus Toolkit (GT4) Architecture, Configuration - Usage of Globus - Main components and Programming model - Introduction to Hadoop Framework - MapReduce, Input splitting, map and reduce functions, specifying input and output parameters, configuring and running a job - Design of Hadoop file system, HDFS concepts, command line and java interface, dataflow of File read & File write.
1. Discuss the Open Source Grid Middleware Packages.
Open Source Grid Middleware Packages
Two well-known organizations are behind the standards:-
1) Open Grid Forum (OGF)
2) Object Management Group (OMG)
Middleware: software layers that connect software components. It lies between the operating
system and the applications.
Grid Middleware: a layer between hardware and software that enables the sharing of heterogeneous
resources and manages the virtual organizations created around the grid.
Popular Grid Middleware: BOINC – Berkeley Open Infrastructure for Network Computing
1) UNICORE ( open source)
Uniform Interface to Computing Resource.
Focus : High level programming models (Java on Unix).
Developed by the German Grid Computing Community.
A ready-to-run Grid system including client and server software.
2) GLOBUS (GT4 - open source)
Focus : Low level services (C and Java on Unix)
It is a library jointly developed by Argonne National Lab, University of Chicago, and
USC Information Science Institute, funded by DARPA, NSF, & NIH.
It is an open source toolkit for grid computing.
3) GRIDBUS ( open source)
Focus : Abstraction and market models (Java on Unix)
Supports eScience and eBusiness applications.
4) LEGION ( Not open source)
Focus : High level programming models (C++ on Unix)
It was the first integrated grid middleware architected from first principles to address the
complexity of grid environments.
Different types of Grid Environments available in the market:-
Grid and Web Services: Convergence
Grid community meets Web services: Open Grid Services Architecture (OGSA)
Service orientation to virtualize resources - Everything is a service!
From Web services
Standard interface definition mechanisms.
Evolving set of other standards: security, etc.
From Grids (Globus Toolkit)
Service semantics, reliability & security models.
Lifecycle management, discovery, other services
OGSA implementation: WSRF – A framework for the definition & management of composable,
interoperable services.
The WSRF specification
The Web Services Resources Framework is a collection of 4 different specifications:
WS-ResourceProperties
WS-ResourceLifetime
WS-ServiceGroup
WS-BaseFaults
Related specifications
WS-Notification
WS-Addressing
Web services vs. Grid services
“Web services” address discovery & invocation of persistent services.
Interface to persistent state of entire enterprise
In Grids we also need transient services, created/destroyed dynamically.
Interfaces to the states of distributed activities.
E.g. workflow, video conf., dist. data analysis.
Significant implications for how services are managed, named, discovered, and used.
In fact, much of our work is concerned with the management of services.
Open Grid Services Infrastructure (OGSI) Specification
Defines fundamental interfaces (using extended WSDL) and behaviors that define a Grid
Service.
A unifying framework for interoperability & establishment of total system properties.
Defines WSDL conventions and extensions for describing and naming services.
Defines basic patterns of interaction, which can be combined with each other and with custom
patterns in a myriad of ways.
2. Draw and explain the Globus Toolkit architecture. (or) What is GT4? Describe in detail
the components of GT4 with a suitable diagram.
Subtopics: The GT4 Library - Globus Job Workflow - Client-Globus Interactions
Introduction of GT4
It is an open middleware library for the grid computing communities.
It supports many operational grids and their applications on an international basis.
The toolkit addresses common problems and issues related to grid resource discovery,
management, communication, security, fault detection, and portability.
It includes a rich set of service implementations.
Provides tools for building new web services in Java, C, and Python, a powerful
standards-based security infrastructure, and client APIs.
The Globus Toolkit was initially motivated by a desire to remove obstacles that prevent
seamless collaboration, and thus sharing of resources and services, in scientific and
engineering applications.
The shared resources can be computers, storage, data, services, networks, science
instruments (e.g., sensors).
The Globus Toolkit GT4 supports distributed and cluster computing services.
The GT4 Library
GT4 offers the middle-level core services in grid applications.
The high-level services and tools, such as MPI, Condor-G, and Nimrod/G, are developed by third
parties for general-purpose distributed computing applications.
The local services, such as LSF, TCP, Linux, and Condor, are at the bottom level and are
fundamental tools supplied by other developers.
GT4 is based on industry-standard web service technologies.
The GT4 library's functional modules help users discover available resources, move data
between sites, and manage user credentials.
Globus Job Workflow
A typical job execution sequence proceeds as follows:-
The user delegates his credentials to a delegation service.
The user submits a job request to GRAM with the delegation identifier as a parameter.
GRAM parses the request, retrieves the user proxy certificate from the delegation service,
and then acts on behalf of the user.
GRAM sends a transfer request to the RFT (Reliable File Transfer), which applies
GridFTP to bring in the necessary files.
GRAM invokes a local scheduler via a GRAM adapter and the SEG (Scheduler Event
Generator) initiates a set of user jobs.
The local scheduler reports the job state to the SEG.
Once the job is complete, GRAM uses RFT and GridFTP to stage out the resultant files.
The grid monitors the progress of these operations and sends the user a notification when
they succeed, fail, or are delayed.
Client-Globus Interactions
There are strong interactions between provider programs and user code.
GT4 makes heavy use of industry-standard web service protocols and mechanisms in
service description, discovery, access, authentication, authorization, and the like.
User code for GT4 can be written in Java, C, and Python.
Web service mechanisms define specific interfaces for grid computing.
Web services provide flexible, extensible, and widely adopted XML-based interfaces.
GT4 provides a set of infrastructure services for accessing, monitoring, managing, and
controlling access to infrastructure elements.
GT4 implements standards to facilitate construction of operable and reusable user code.
Developers can use these services and libraries to build simple and complex systems
quickly.
A high-security subsystem addresses message protection, authentication, delegation, and
authorization.
The toolkit programs provide a set of useful infrastructure services.
Three containers are used to host user-developed services written in Java, Python, and C,
respectively.
Containers provide implementations of security, management, discovery, state
management, and other mechanisms required when building services.
They extend open source service hosting environments with support for a range of useful
web service specifications, including WSRF, WS-Notification, and WS-Security.
A set of client libraries allow client programs in Java, C, and Python to invoke operations
on both GT4 and user-developed services.
3. Write short notes on Hadoop Framework.
A distributed framework for parallel processing and storage of data.
It is an open source implementation of Google's MapReduce and the Google File
System (GFS).
Top level Apache project.
Written in Java and Runs on Linux, Mac OS/X, Windows, and Solaris.
Client applications can be written in various languages.
Users: Yahoo!, Facebook, Amazon, eBay, etc.
Hadoop - Applications
Types :
Statistical analysis.
Web applications like crawling (scan through Internet pages to create an index of data),
feeds, graphs.
Similarity and tagging applications.
Batch data processing, not real-time / user facing
Being applied to the back-end of web search
Log Processing
Document Analysis and Indexing
Web Graphs and Crawling
However, nowadays it is also used for many real-time / user-facing applications.
Hadoop Core, our flagship sub-project, provides a distributed filesystem (HDFS) and
support for the MapReduce distributed computing metaphor.
HBase builds on Hadoop Core to provide a scalable, distributed database.
Pig is a high-level data-flow language and execution framework for parallel computation.
It is built on top of Hadoop Core.
ZooKeeper is a highly available and reliable coordination system. Distributed
applications use ZooKeeper to store and mediate updates for critical shared state.
Hive is a data warehouse infrastructure built on Hadoop Core that provides data
summarization, ad hoc querying, and analysis of datasets.
Hadoop is a software framework for distributed processing of large datasets across large
clusters of computers
Large datasets -> Terabytes or petabytes of data.
Large clusters -> hundreds or thousands of nodes.
Hadoop is an open-source implementation of Google MapReduce.
Hadoop is based on a simple programming model called MapReduce.
Hadoop is based on a simple data model: any data will fit.
The Hadoop framework consists of two main layers
Distributed file system (HDFS)
Execution engine (MapReduce)
The Hadoop Core project provides the basic services for building a cloud computing
environment with commodity hardware, and the APIs for developing software that will
run on that cloud.
The two fundamental pieces of Hadoop Core are the MapReduce framework (the cloud
computing environment) and the Hadoop Distributed File System (HDFS).
The Hadoop Core MapReduce framework requires a shared file system.
This shared file system does not need to be a system-level file system, as long as
there is a distributed file system plug-in available to the framework.
While Hadoop Core provides HDFS, HDFS is not required.
In Hadoop JIRA (the issue-tracking system), item 4686 is a tracking ticket to
separate HDFS into its own Hadoop project.
In addition to HDFS, Hadoop Core supports the CloudStore (formerly Kosmos)
file system (http://kosmosfs.sourceforge.net/) and the Amazon Simple Storage
Service (S3) file system (http://aws.amazon.com/s3/).
The Hadoop Core framework comes with plug-ins for HDFS, CloudStore, and S3.
Users are also free to use any distributed file system that is visible as a
system-mounted file system, such as Network File System (NFS), Global File System
(GFS), or Lustre.
When HDFS is used as the shared file system, Hadoop is able to take advantage
of knowledge about which node hosts a physical copy of input data, and will
attempt to schedule the task that is to read that data, to run on that machine.
4. Discuss MAPREDUCE with suitable diagrams.
A large-scale parallel data-processing framework.
We can use thousands of CPUs without the hassle of managing them.
The user needs to write only two functions: map and reduce.
It provides automatic partitioning of the job into subtasks, automatic retry on failures, linear
scalability, locality of task execution, and monitoring & status updates.
Simple programming model - solves many problems
A natural fit for log processing
Great for most web-search processing
Fits into workflow frameworks as chains of M/R jobs, e.g., feed processing
A lot of SQL also maps trivially to this construct, e.g., Pig
Map
Applies a given function to each element of a sequence individually to produce an output sequence.
E.g., square x = x * x; map square [1,2,3,4,5] = [1,4,9,16,25]
Reduce
Defines a combining operation on a sequence of data; it folds the sequence to return a single output.
E.g., reduce add [1,4,9,16,25] = 55
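The two operations above can be sketched in plain Java, using the JDK streams API as a stand-in for the framework (the class and method names here are invented for illustration, not Hadoop classes):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapReduceSketch {
    // map: apply a given function to each element of the sequence individually
    static List<Integer> mapSquare(List<Integer> xs) {
        return xs.stream().map(x -> x * x).collect(Collectors.toList());
    }

    // reduce: fold the sequence into a single output value
    static int reduceAdd(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> squared = mapSquare(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(squared);            // [1, 4, 9, 16, 25]
        System.out.println(reduceAdd(squared)); // 55
    }
}
```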
Map function
Takes a key/value pair and generates a set of intermediate key/value pairs
map(k1, v1) -> list(k2, v2)
Reduce function
Merges all intermediate values associated with the same intermediate key
reduce(k2, list(v2)) -> list(k3, v3)
Abstracts functionality common to all Map/Reduce applications.
Distribute tasks to multiple machines.
Sorts, transfers and merges intermediate map outputs to the Reducers.
Monitors task progress.
Handles faulty machines, faulty tasks transparently.
Provides pluggable APIs and configuration mechanisms for writing applications.
Map and Reduce functions
Input formats and splits
Number of tasks, data types, etc…
Provides status about jobs to users.
Application writer specifies
– A pair of functions called Map and Reduce and a set of input files
Workflow
– Input phase generates a number of FileSplits from input files (one per Map task)
– The Map phase executes a user function to transform input kv-pairs into a new set of kv-pairs
– The framework sorts & shuffles the kv-pairs to output nodes
– The Reduce phase combines all kv-pairs with the same key into new kv-pairs
– The output phase writes the resulting pairs to files
All phases are distributed with many tasks doing the work
– Framework handles scheduling of tasks on cluster
– Framework handles recovery when a node fails
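The phases above can be traced with the canonical word-count example, here simulated in plain JDK Java (no Hadoop classes are used; `WordCountSim` and its method names are invented for illustration). The map phase emits (word, 1) pairs, the shuffle groups and sorts them by key, and the reduce phase sums each group:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSim {
    // Map phase: transform each input line into (word, 1) key/value pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) out.add(Map.entry(word, 1));
        }
        return out;
    }

    // Shuffle + reduce: group pairs by key (a TreeMap keeps keys sorted,
    // mirroring the framework's sort), then sum each group's values.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((k, vs) -> result.put(k, vs.stream().mapToInt(Integer::intValue).sum()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("the quick brown fox", "the lazy dog")));
        // {brown=1, dog=1, fox=1, lazy=1, quick=1, the=2}
    }
}
```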
The Hadoop MapReduce environment provides the user with a sophisticated framework
to manage the execution of map and reduce tasks across a cluster of machines.
The user is required to tell the framework the following:
– The location(s) in the distributed file system of the job input
– The location(s) in the distributed file system for the job output
– The input format
– The output format
– The class containing the map function
– Optionally, the class containing the reduce function
– The JAR file(s) containing the map and reduce functions and any support classes
If a job does not need a reduce function, the user does not need to specify a reducer class,
and a reduce phase of the job will not be run.
The framework will partition the input, and schedule and execute map tasks across the
cluster.
If requested, it will sort the results of the map task and execute the reduce task(s) with the
map output.
The final output will be moved to the output directory, and the job status will be reported
to the user.
MapReduce is based on key/value pairs.
The framework will convert each record of input into a key/value pair, and each pair will
be input to the map function once.
The map output is a set of key/value pairs—nominally one pair that is the transformed
input pair, but it is perfectly acceptable to output multiple pairs.
The map output pairs are grouped and sorted by key.
The reduce function is called one time for each key, in sort sequence, with the key and
the set of values that share that key.
The reduce method may output an arbitrary number of key/value pairs, which are written
to the output files in the job output directory.
If the reduce output keys are unchanged from the reduce input keys, the final output will
be sorted.
The framework provides two processes that handle the management of MapReduce jobs:-
TaskTracker manages the execution of individual map and reduce tasks on a compute
node in the cluster.
JobTracker accepts job submissions, provides job monitoring and control, and
manages the distribution of tasks to the TaskTracker nodes.
Hadoop Map/Reduce Architecture
Client: UI for submitting jobs; polls status information.
JobTracker (Master): accepts MR jobs; assigns tasks to slaves; monitors tasks; handles failures.
TaskTracker (Slaves): runs Map and Reduce tasks; manages intermediate output.
Task: runs the Map and Reduce functions; reports progress.
One JobTracker process per cluster and one or more TaskTracker processes per node in
the cluster.
The JobTracker is a single point of failure, but it will work around the
failure of individual TaskTracker processes.
5. Elaborate HDFS concepts with suitable illustrations.
o Blocks
o Namenodes and Datanodes
Blocks
A disk has a block size - the minimum amount of data that it can read or write.
Filesystems for a single disk build on this by dealing with data in blocks, which are an
integral multiple of the disk block size.
Filesystem blocks are typically a few kilobytes in size, while disk blocks are normally
512 bytes.
This is generally transparent to the filesystem user, who is simply reading or writing a
file of whatever length.
There are tools to perform filesystem maintenance, such as df and fsck, that operate on
the filesystem block level.
HDFS has the concept of a block, but it is a much larger unit—64 MB by default.
Files in HDFS are broken into block-sized chunks, stored as independent units.
Having a block abstraction for a distributed filesystem brings several benefits.
1. A file can be larger than any single disk in the network; the blocks of a file can be
stored on any of the disks in the cluster.
2. Making the unit of abstraction a block rather than a file simplifies the storage
subsystem: it deals only with blocks, which simplifies storage management and keeps
metadata concerns out of the storage layer.
3. Blocks fit well with replication for providing fault tolerance and availability.
% hadoop fsck / -files -blocks will list the blocks that make up each file in the filesystem.
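As a concrete illustration of the block abstraction, the following plain-Java sketch (the class name is invented for illustration) computes how a file of a given length is broken into block-sized chunks. Note that, unlike a single-disk filesystem, a file smaller than one block does not occupy a full block's worth of storage:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitter {
    static final long BLOCK_SIZE = 64L * 1024 * 1024; // HDFS default: 64 MB

    // Return the size of each block-sized chunk for a file of the given length.
    static List<Long> split(long fileLength) {
        List<Long> blocks = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += BLOCK_SIZE) {
            blocks.add(Math.min(BLOCK_SIZE, fileLength - offset));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // A 200 MB file -> three full 64 MB blocks plus one 8 MB block.
        System.out.println(split(200L * 1024 * 1024).size()); // 4
        // A 1 MB file -> a single 1 MB block, not a full 64 MB of storage.
        System.out.println(split(1L * 1024 * 1024)); // [1048576]
    }
}
```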
Namenodes and Datanodes
The Hadoop framework consists of two main layers
– Distributed file system (HDFS)
– Execution engine (MapReduce)
An HDFS cluster has two types of node operating in a master-worker pattern:
a namenode (the master) and a number of datanodes (workers).
The namenode manages the filesystem namespace.
o It maintains the filesystem tree and the metadata for all the files and directories in
the tree.
o This information is stored persistently on the local disk in the form of two files:
the namespace image and the edit log.
o The namenode also knows the datanodes on which all the blocks for a given file
are located
A client accesses the filesystem by communicating with the namenode and datanodes.
The client presents a POSIX-like filesystem interface, so the user code does not need to
know about the namenode and datanodes to function.
Datanodes are the workhorses of the filesystem.
They store and retrieve blocks when they are told and they report back to the namenode
periodically with lists of blocks that they are storing.
Without the namenode, the filesystem cannot be used.
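The master-worker division of labour can be sketched as a toy in-memory model (all class and field names here are invented; a real namenode persists this metadata in the namespace image and edit log). The namenode maps each file to its blocks and each block to the datanodes holding replicas; a client resolves a file through the namenode and then contacts datanodes directly:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MiniNamenode {
    // filesystem tree + metadata: file path -> ordered list of block IDs
    private final Map<String, List<String>> fileToBlocks = new HashMap<>();
    // block ID -> datanodes holding a replica of that block
    private final Map<String, List<String>> blockToDatanodes = new HashMap<>();

    void addFile(String path, List<String> blocks) {
        fileToBlocks.put(path, blocks);
    }

    // Datanodes report the blocks they store to the namenode periodically.
    void reportBlock(String block, String datanode) {
        blockToDatanodes.computeIfAbsent(block, b -> new ArrayList<>()).add(datanode);
    }

    // A client asks the namenode where each block of a file lives,
    // then reads the blocks from those datanodes directly.
    List<List<String>> locateBlocks(String path) {
        List<List<String>> locations = new ArrayList<>();
        for (String block : fileToBlocks.get(path)) {
            locations.add(blockToDatanodes.get(block));
        }
        return locations;
    }

    public static void main(String[] args) {
        MiniNamenode nn = new MiniNamenode();
        nn.addFile("/logs/a.log", List.of("blk_1", "blk_2"));
        nn.reportBlock("blk_1", "datanode1");
        nn.reportBlock("blk_1", "datanode2");
        nn.reportBlock("blk_2", "datanode3");
        System.out.println(nn.locateBlocks("/logs/a.log"));
        // [[datanode1, datanode2], [datanode3]]
    }
}
```

The sketch also makes the drawback below concrete: lose the two maps and the blocks on the datanodes can no longer be assembled into files.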
Drawback
If the machine running the namenode were obliterated, all the files on the filesystem
would be lost since there would be no way of knowing how to reconstruct the files from
the blocks on the datanodes.
Hadoop provides two mechanisms to make the namenode resilient to failure,
First Mechanism
Back up the files that make up the persistent state of the filesystem metadata.
Hadoop can be configured so that the namenode writes its persistent state to multiple
filesystems; these writes are synchronous and atomic.
The usual configuration choice is to write to the local disk as well as a remote NFS mount.
Second Mechanism
Run a secondary namenode, which periodically merges the namespace image with the edit log
to prevent the edit log from becoming too large.
It usually runs on a separate physical machine, since it needs plenty of CPU and as much
memory as the namenode to perform the merge.
It keeps a copy of the merged namespace image, which can be used in the event of the
namenode failing.
However, the state of the secondary namenode lags that of the primary, so in the event of
total failure of the primary, some data loss is almost certain.
6. Write short notes on Command-Line and Java Interface.
The command line is the simplest interface to HDFS.
Two properties are set for this configuration:
o fs.default.name, set to hdfs://localhost/ (the default filesystem URI)
o dfs.replication, set to 1 so that HDFS doesn't replicate filesystem blocks by the
default factor of three.
When running with a single datanode, HDFS cannot replicate blocks to three datanodes, so
it would otherwise perpetually warn about under-replicated blocks.
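For a single-datanode (pseudo-distributed) setup, these two properties would typically be set in the Hadoop configuration files. This is a hedged sketch: the file names follow the standard Hadoop configuration layout, and the values are the ones given above.

```xml
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```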
The Java Interface
Reading Data from a Hadoop URL
Reading Data Using the FileSystem API
Writing Data
Directories
Querying the Filesystem
Deleting Data
1) Reading Data from a Hadoop URL
The simplest way to read a file from a Hadoop filesystem is by using a java.net.URL object
to open a stream to read the data from.
2) Reading Data Using the FileSystem API
A file in a Hadoop filesystem is represented by a Hadoop Path object rather than a
java.io.File object. FileSystem is the general filesystem API.
The first step is to retrieve an instance of the filesystem we want to use, here HDFS.
There are two static factory methods for getting a FileSystem instance:
– public static FileSystem get(Configuration conf) throws IOException
– public static FileSystem get(URI uri, Configuration conf) throws IOException
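A minimal read sketch using the second factory method follows. This is based on the standard Hadoop FileSystem API, but it needs a Hadoop client library on the classpath and a running HDFS, so take it as an illustrative sketch rather than a self-contained program:

```java
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {
    public static void main(String[] args) throws Exception {
        String uri = args[0];                       // e.g. an hdfs:// file URI
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf); // second factory method
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));            // returns an FSDataInputStream
            IOUtils.copyBytes(in, System.out, 4096, false); // copy file contents to stdout
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
```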
3) Writing Data
Class : FileSystem
Simplest method : takes a Path object for the file to be created and returns an output
stream to write to:
– public FSDataOutputStream create(Path f) throws IOException
append()
As well as creating a new file, you can append to an existing file using the append()
method (also on FileSystem, in the org.apache.hadoop.fs package):
– public FSDataOutputStream append(Path f) throws IOException
4) Directories
5) Querying the Filesystem
File metadata: the FileStatus class
An important feature of any filesystem is the ability to navigate its directory structure
and retrieve information about the files and directories that it stores.
FileStatus encapsulates filesystem metadata for files and directories, including file length,
block size, replication, modification time, ownership, and permission information.
Method : getFileStatus()
6) Deleting Data
• delete() method on FileSystem to permanently remove files or directories:
– public boolean delete(Path f, boolean recursive) throws IOException
• If f is a file or an empty directory, then the value of recursive is ignored.
• A nonempty directory is only deleted, along with its contents, if recursive is true
(otherwise an IOException is thrown).
7. Illustrate dataflow in HDFS during file read/write operation with suitable diagrams.
Anatomy of a File Read
Fig. A client reading data from HDFS
1) The client opens the file it wishes to read by calling open() on the FileSystem object,
which for HDFS is an instance of DistributedFileSystem.
2) DistributedFileSystem calls the namenode, using RPC, to determine the locations of the
first few blocks in the file. DistributedFileSystem returns an FSDataInputStream (an input
stream that supports file seeks) to the client for it to read data from. FSDataInputStream
in turn wraps a DFSInputStream, which manages the datanode and namenode I/O.
3) The client then calls read() on the stream.
4) Data is streamed from the datanode back to the client, which calls read() repeatedly on the
stream
5) When the end of the block is reached, DFSInputStream will close the connection to the
datanode, then find the best datanode for the next block
6) When the client has finished reading, it calls close() on the FSDataInputStream.
If the DFSInputStream encounters an error while communicating with a datanode, it
will try the next closest one for that block.
The DFSInputStream also verifies checksums for the data transferred to it from the datanode.
If a corrupted block is found, it is reported to the namenode before the DFSInputStream
attempts to read a replica of the block from another datanode.
The client contacts datanodes directly to retrieve data and is guided by the namenode to the
best datanode for each block.
This allows HDFS to scale to a large number of concurrent clients, since the data traffic is
spread across all the datanodes in the cluster.
The namenode merely has to service block location requests; it does not, for example, serve
data, which would quickly become a bottleneck as the number of clients grew.
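The read-failure handling described above (try the next closest datanode for a block, and remember datanodes that have failed) can be simulated in plain Java; `ReplicaReader` and its datanode names are invented for illustration:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class ReplicaReader {
    // Datanodes that have already failed; later block reads skip them.
    private final Set<String> failedDatanodes = new HashSet<>();

    // Try each replica location in order of proximity; return the datanode
    // that served the block, remembering any datanodes that failed.
    String readBlock(List<String> replicas, Predicate<String> isHealthy) {
        for (String datanode : replicas) {
            if (failedDatanodes.contains(datanode)) continue;
            if (isHealthy.test(datanode)) {
                return datanode; // the block is read from this datanode
            }
            failedDatanodes.add(datanode); // remember the failure, try the next replica
        }
        throw new IllegalStateException("no replica available");
    }

    public static void main(String[] args) {
        ReplicaReader reader = new ReplicaReader();
        // datanode1 is down, so the read falls through to datanode2
        String used = reader.readBlock(List.of("datanode1", "datanode2"),
                                       dn -> !dn.equals("datanode1"));
        System.out.println(used); // datanode2
    }
}
```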
Fig. Network distance in Hadoop
Anatomy of a File Write
1. The client creates the file by calling create() on DistributedFileSystem.
2. DistributedFileSystem makes an RPC call to the namenode to create the file in the
filesystem's namespace, with no blocks associated with it.
3. As the client writes data, DFSOutputStream splits it into packets, which it writes to an
internal queue, called the data queue. The data queue is consumed by the DataStreamer,
whose responsibility it is to ask the namenode to allocate new blocks by picking a list of
suitable datanodes to store the replicas.
4. The DataStreamer streams the packets to the first datanode in the pipeline, which stores
the packet and forwards it to the second datanode in the pipeline. Similarly, the second
datanode stores the packet and forwards it to the third (and last) datanode in the pipeline.
5. DFSOutputStream also maintains an internal queue of packets that are waiting to be
acknowledged by datanodes, called the ack queue. A packet is removed from the ack
queue only when it has been acknowledged by all the datanodes in the pipeline.
6. When the client has finished writing data, it calls close() on the stream.
7. This action flushes all the remaining packets to the datanode pipeline and waits for
acknowledgments before contacting the namenode to signal that the file is complete.
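The data-queue/ack-queue mechanism in steps 3-5 can be sketched as a toy simulation (the class and method names are invented; real datanode forwarding and acknowledgments are collapsed into a health check). Each packet travels down the pipeline of datanodes, and it leaves the ack queue only once every datanode in the pipeline has acknowledged it:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Set;

public class WritePipelineSim {
    // Packets waiting to be sent, and packets sent but not yet acknowledged.
    private final Deque<String> dataQueue = new ArrayDeque<>();
    private final Deque<String> ackQueue = new ArrayDeque<>();

    void write(String packet) {
        dataQueue.add(packet);
    }

    // Stream the next packet through the datanode pipeline: each datanode
    // stores the packet and forwards it downstream. The packet leaves the
    // ack queue only if every datanode in the pipeline acknowledges it.
    void streamNext(List<String> pipeline, Set<String> healthyDatanodes) {
        String packet = dataQueue.remove();
        ackQueue.add(packet);
        boolean ackedByAll = pipeline.stream().allMatch(healthyDatanodes::contains);
        if (ackedByAll) {
            ackQueue.remove(packet);
        }
    }

    int pendingAcks() {
        return ackQueue.size();
    }

    public static void main(String[] args) {
        WritePipelineSim out = new WritePipelineSim();
        out.write("packet-1");
        // All three datanodes in the pipeline are healthy -> packet is acked.
        out.streamNext(List.of("dn1", "dn2", "dn3"), Set.of("dn1", "dn2", "dn3"));
        System.out.println(out.pendingAcks()); // 0
    }
}
```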