
Tutorial 11.doc



http://hadoop.apache.org/docs/r2.6.0/
http://tecadmin.net/setup-hadoop-2-4-single-node-cluster-on-linux/

# java -version
openjdk version "1.8.0_45"
OpenJDK Runtime Environment (build 1.8.0_45-b13)
OpenJDK 64-Bit Server VM (build 25.45-b02, mixed mode)

Install Java version 8:

# yum install java-1.8.0-openjdk-devel

Select the alternative Java version (optional):

# alternatives --install /usr/bin/java java /usr/java/latest/bin/java 1
# alternatives --config java

Add an entry to the /etc/hosts file in the form "<ip address> <hostname>". Note: adjust the IP address and hostname to your own machine. In this example, the IP address used is 172.18.107.61 and the hostname is HadoopMstr1:

# vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.18.107.61   HadoopMstr1

# useradd hadoop
# passwd hadoop

# su - hadoop
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys

$ ssh localhost
$ exit

$ cd ~
$ wget http://apache.claz.org/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
$ tar xzf hadoop-2.6.0.tar.gz
$ mv hadoop-2.6.0 hadoop

$ ls -l
total 190688
drwxr-xr-x. 9 hadoop hadoop      4096 Nov 14  2014 hadoop
-rw-rw-r--. 1 hadoop hadoop 195257604 May 26 10:02 hadoop-2.6.0.tar.gz

$ vim .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# User specific aliases and functions
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin

$ source ~/.bashrc
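To confirm the new variables took effect in the current shell, a quick sanity check (not part of the original tutorial):

$ echo $HADOOP_HOME
/home/hadoop/hadoop
$ hadoop version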

Using java-1.7 on CentOS 6.5:

$ vim hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64/jre/

Using java-1.8 on CentOS 6.5:

$ vim hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/jre-openjdk
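If you are unsure of the exact JDK path on your machine, one way to discover it (assuming readlink is available, as on stock CentOS):

$ readlink -f $(which java)   # prints .../jre/bin/java; drop the trailing /bin/java for JAVA_HOME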

$ vim $HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

$ vim $HADOOP_HOME/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>

  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>
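The NameNode and DataNode directories referenced above will be created when the NameNode is formatted and the daemons first start, but you can also pre-create them as the hadoop user (optional):

$ mkdir -p /home/hadoop/hadoopdata/hdfs/namenode
$ mkdir -p /home/hadoop/hadoopdata/hdfs/datanode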


$ vim $HADOOP_HOME/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
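Note: a fresh Hadoop 2.6.0 unpack ships this file only as a template; if mapred-site.xml does not exist yet, copy the bundled template first:

$ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml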

$ vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    <!-- <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value> -->
    <!-- <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value> -->
  </property>

  <property>
    <description>The address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:18088</value>
  </property>

  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:18031</value>
  </property>

  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:18030</value>
  </property>

  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:18032</value>
  </property>

  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>localhost:18033</value>
  </property>

  <property>
    <description>Set to false, to avoid ip check</description>
    <name>hadoop.security.token.service.use_ip</name>
    <value>false</value>
  </property>

  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>1000</value>
    <description>Maximum number of applications in the system which can be concurrently active, both running and pending</description>
  </property>

  <property>
    <description>Whether to use preemption. Note that preemption is experimental in the current version. Defaults to false.</description>
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
  </property>

  <property>
    <description>Whether to allow multiple container assignments in one heartbeat. Defaults to false.</description>
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>true</value>
  </property>
</configuration>

Alternatively, a minimal yarn-site.xml with only the required shuffle service, leaving all addresses at their defaults (e.g. 8088 for the RM web UI):

$ vim $HADOOP_HOME/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Now format the NameNode using the following command; make sure the log output reports that the storage directory has been successfully formatted.

$ hdfs namenode -format

15/02/04 09:58:43 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = java.net.UnknownHostException: HadoopMstr1: HadoopMstr1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0


...

15/02/04 09:58:57 INFO common.Storage: Storage directory /home/hadoop/hadoopdata/hdfs/namenode has been successfully formatted.
15/02/04 09:58:57 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/02/04 09:58:57 INFO util.ExitUtil: Exiting with status 0
15/02/04 09:58:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException: HadoopMstr1: HadoopMstr1
************************************************************/
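The java.net.UnknownHostException: HadoopMstr1 lines above mean the hostname did not resolve; the format still succeeded, but it is worth checking the /etc/hosts entry added earlier (getent is assumed available, as on stock CentOS):

$ hostname
HadoopMstr1
$ getent hosts HadoopMstr1
172.18.107.61   HadoopMstr1

If getent prints nothing, re-check the "172.18.107.61 HadoopMstr1" line in /etc/hosts.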

Let's start the Hadoop cluster using the scripts provided by Hadoop. Just navigate to the Hadoop sbin directory and execute the scripts one by one.

$ cd $HADOOP_HOME/sbin/
$ start-dfs.sh

15/06/08 15:21:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-namenode-HadoopMstr1.out
localhost: starting datanode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-datanode-HadoopMstr1.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 67:e0:c1:86:06:ce:80:7e:1b:a1:b5:5c:9f:e0:d7:97.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop/logs/hadoop-hadoop-secondarynamenode-HadoopMstr1.out
15/06/08 15:22:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Now run the start-yarn.sh script.

$ start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop/logs/yarn-hadoop-resourcemanager-HadoopMstr1.out
localhost: starting nodemanager, logging to /home/hadoop/hadoop/logs/yarn-hadoop-nodemanager-HadoopMstr1.out
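To confirm all five daemons are running, jps (shipped with the JDK) lists the running Java processes; expect NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (PIDs will vary):

$ jps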

Access Hadoop Services in Browser


Access your server on port 50070: http://localhost:50070/

Cluster and all applications (Hadoop 2.6.0, java-1.7, CentOS 6.5): http://localhost:18088/

Cluster and all applications (Hadoop 2.7.0, java-1.8, CentOS 6.5): http://localhost:8088/

Secondary NameNode: http://localhost:50090/

DataNode: http://localhost:50075/
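From the server's shell you can quickly confirm each UI responds before trying a browser (a curl sketch; -I fetches only the HTTP headers):

$ curl -I http://localhost:50070/
$ curl -I http://localhost:18088/
$ curl -I http://localhost:50090/
$ curl -I http://localhost:50075/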


Log in as root to create the directory /var/log/httpd, set its access mode (chmod 777), then switch back to the hadoop user and run the following command:

$ /home/hadoop/hadoop/bin/hdfs dfs -put /var/log/httpd logs
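To verify the upload, list the target directory in HDFS (this assumes the hadoop user's HDFS home directory exists):

$ /home/hadoop/hadoop/bin/hdfs dfs -ls logs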

Reply to pete:

Hi pete, I see you've managed smoothly with no problems. I have followed all the steps written above, but I cannot access 'DataNode' and 'cluster and all applications':

DataNode: http://localhost:50075

cluster and all applications: http://localhost:18088
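Two likely causes, assuming the daemons are actually running: the RM web UI is on a different port than expected (the detailed yarn-site.xml above moves it to 18088, while a stock configuration uses 8088), or the CentOS 6 host firewall is blocking remote access. A hedged troubleshooting sketch:

$ netstat -tlnp | grep -E '50075|18088|8088'   # is anything listening on these ports?
# As root, temporarily open the ports through iptables while testing:
# iptables -I INPUT -p tcp --dport 50075 -j ACCEPT
# iptables -I INPUT -p tcp --dport 18088 -j ACCEPT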