HADOOP ONLINE TUTORIAL
By
HYDERABADSYS ONLINE TRAINING
Hadoop Online Training Course Content
Basics of Hadoop:
  Motivation for Hadoop
  Large-scale system training
  Survey of data storage literature
  Survey of data processing literature
  Networking constraints
  Requirements for a new approach
Basic concepts of Hadoop:
  What is Hadoop?
  The Hadoop Distributed File System
  How Hadoop MapReduce works
  Anatomy of a Hadoop cluster
Hadoop daemons:
  Master daemons:
    NameNode
    JobTracker
    Secondary NameNode
  Slave daemons:
    TaskTracker
HDFS (Hadoop Distributed File System):
  Blocks and splits
  Input splits
  HDFS splits
  Data replication
  Hadoop rack awareness
  High availability of data
  Block placement and cluster architecture
Case studies
Practices and performance tuning
Development of MapReduce programs:
  Local mode
  Running without HDFS
Hadoop administration:
  Setting up Hadoop clusters: Apache, Cloudera, Greenplum, Hortonworks
  Set up a full Hadoop cluster on a single desktop.
  Install and configure Apache Hadoop on a multi-node cluster.
  Install and configure the Cloudera distribution in distributed mode.
  Install and configure the Hortonworks distribution in fully distributed mode.
  Configure the Greenplum distribution in fully distributed mode.
  Monitor the cluster.
  Get used to the management consoles of Cloudera and Hortonworks.
  NameNode in safe mode
  Data backup
  Case studies
  Monitoring of clusters
Hadoop development:
  Writing a MapReduce program
  A sample MapReduce program
  Basic API concepts
  Driver code
  Mapper
  Reducer
  Hadoop Streaming API
  Performing several Hadoop jobs
  The configure and close methods
  Sequence files
  RecordReader
  RecordWriter
  Reporter and its role
  Counters
  OutputCollector
  Accessing HDFS
  ToolRunner
  Using the distributed cache
Several MapReduce jobs (in detail):
  1. Most effective search using MapReduce
  2. Generating recommendations using MapReduce
  3. Processing log files using MapReduce
  Identification of the mapper
  Identification of the reducer
  Exploring the problems using this application
Debugging MapReduce programs:
  MRUnit testing
  Logging
  Debugging strategies
Advanced MapReduce programming:
  Secondary sort
  Customizing input and output formats
  MapReduce joins
Monitoring and debugging on a production cluster:
  Counters
  Skipping bad records
  Running in local mode
MapReduce performance tuning:
  Reducing network traffic with a combiner
  Partitioners
  Reducing input data
  Using compression
  Reusing the JVM