8/11/2019 Hadoop Install Configure
Hadoop Installation and Configuration
1. Set up passphraseless ssh
First, check that you can ssh to localhost without a passphrase:
$ ssh localhost
If you cannot ssh to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
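The two commands above generate a keypair with an empty passphrase and append the public key to authorized_keys, which is what lets sshd accept the login without prompting. The sketch below demonstrates the same mechanics in a throwaway temp directory so it never touches your real ~/.ssh; note it uses an RSA key, since recent OpenSSH releases reject the DSA type shown above:

```shell
# Throwaway demo of the key-generation step in a temp dir.
# RSA instead of DSA because modern OpenSSH disables DSA keys.
tmp=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$tmp/id_rsa" >/dev/null
cat "$tmp/id_rsa.pub" >> "$tmp/authorized_keys"
chmod 600 "$tmp/authorized_keys"   # sshd ignores group/world-writable files
ls "$tmp"
```

For the real setup, use the ~/.ssh paths from the commands above, and keep the 600 permission on authorized_keys or sshd will silently fall back to password prompts.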
2. Download Hadoop from Apache. (Let's use the stable version, which will be
compatible with Mahout in later lectures.)
http://archive.apache.org/dist/hadoop/core/hadoop-1.0.3/
3. Use the command tar xzvf hadoop-1.0.3.tar.gz to extract the package.
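If the tar flags are unfamiliar: x extracts, z runs the archive through gunzip, v lists files as they are unpacked, and f names the archive. The sketch below exercises them on a small throwaway archive (the directory contents are made up for the demo); for the real step, run tar xzvf on the hadoop-1.0.3.tar.gz you downloaded:

```shell
# Demo of the tar x/z/v/f flags on a throwaway archive.
tmp=$(mktemp -d)
mkdir -p "$tmp/hadoop-1.0.3/conf"
echo "demo" > "$tmp/hadoop-1.0.3/README.txt"
tar -C "$tmp" -czf "$tmp/hadoop-1.0.3.tar.gz" hadoop-1.0.3  # build the demo archive
rm -r "$tmp/hadoop-1.0.3"
tar -C "$tmp" -xzvf "$tmp/hadoop-1.0.3.tar.gz"              # extract it again
ls "$tmp/hadoop-1.0.3"
```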
4. Go to the hadoop-1.0.3 directory; you will find the following files and
subdirectories:

-rwxrw-r--@  1 kzhang6 staff  446615 May  8  2012 CHANGES.txt
-rwxrw-r--@  1 kzhang6 staff   13366 May  8  2012 LICENSE.txt
-rwxrw-r--@  1 kzhang6 staff     101 May  8  2012 NOTICE.txt
-rwxrw-r--@  1 kzhang6 staff    1366 May  8  2012 README.txt
drwxrw-r--@ 19 kzhang6 staff     646 May  8  2012 bin
-rwxrw-r--@  1 kzhang6 staff  119875 May  8  2012 build.xml
drwxrw-r--@  4 kzhang6 staff     136 May  8  2012 c++
drwxrw-r--@ 18 kzhang6 staff     612 Nov  4 14:29 conf
drwxrw-r--@ 10 kzhang6 staff     340 May  8  2012 contrib
drwxrw-r--@ 69 kzhang6 staff    2346 May  8  2012 docs
-rwxrw-r--@  1 kzhang6 staff    6840 May  8  2012 hadoop-ant-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff     410 May  8  2012 hadoop-client-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff 3928345 May  8  2012 hadoop-core-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff  142452 May  8  2012 hadoop-examples-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff     413 May  8  2012 hadoop-minicluster-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff 2656632 May  8  2012 hadoop-test-1.0.3.jar
-rwxrw-r--@  1 kzhang6 staff  287807 May  8  2012 hadoop-tools-1.0.3.jar
drwxrw-r--@ 13 kzhang6 staff     442 May  8  2012 ivy
-rwxrw-r--@  1 kzhang6 staff   10525 May  8  2012 ivy.xml
drwxrw-r--@ 52 kzhang6 staff    1768 May  8  2012 lib
drwxrw-r--@  4 kzhang6 staff     136 May  8  2012 libexec
drwxrw-r--  84 kzhang6 staff    2856 Oct 21 18:27 logs
drwxrw-r--@  9 kzhang6 staff     306 May  8  2012 sbin
drwxrw-r--@  3 kzhang6 staff     102 May  8  2012 share
drwxrw-r--@ 18 kzhang6 staff     612 May  8  2012 src
drwxrw-r--@  9 kzhang6 staff     306 May  8  2012 webapps
5. In the conf directory, edit the following four files: hadoop-env.sh,
hdfs-site.xml, core-site.xml, and mapred-site.xml.
hadoop-env.sh:

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME. All others are
# optional. When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use. Required.
# (Set this to your Java home directory.)
export JAVA_HOME=/Library/Java/Home

# Extra Java CLASSPATH elements. Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=8000

# Extra Java runtime options. Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
# export HADOOP_CLIENT_OPTS

# Extra ssh options. Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

# (This line removes some warnings on macOS.)
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

# Where log files are stored. $HADOOP_HOME/logs by default.
# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

# File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.
# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

# host:path where hadoop code should be rsync'd from. Unset by default.
# export HADOOP_MASTER=master:/home/$USER/src/hadoop

# Seconds to sleep between slave commands. Unset by default. This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HADOOP_SLAVE_SLEEP=0.1

# The directory where pid files are stored. /tmp by default.
# export HADOOP_PID_DIR=/var/hadoop/pids

# A string representing this instance of hadoop. $USER by default.
# export HADOOP_IDENT_STRING=$USER

# The scheduling priority for daemon processes. See 'man nice'.
# export HADOOP_NICENESS=10
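If you are not sure what to put in JAVA_HOME, the following sketch is one way to find a candidate value; the macOS helper path (/usr/libexec/java_home) and the PATH-based fallback are both best-effort assumptions, not part of Hadoop itself:

```shell
# Sketch: find a candidate JAVA_HOME. macOS ships a helper binary;
# elsewhere we fall back to resolving the `java` binary on PATH.
if [ -x /usr/libexec/java_home ]; then
  JAVA_HOME=$(/usr/libexec/java_home)
elif command -v java >/dev/null 2>&1; then
  java_bin=$(readlink -f "$(command -v java)")     # resolve symlinks
  JAVA_HOME=$(dirname "$(dirname "$java_bin")")    # strip /bin/java
fi
echo "JAVA_HOME=${JAVA_HOME:-not found}"
```

Whatever this prints, verify it actually contains a JDK before pasting it into hadoop-env.sh.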
8/11/2019 Hadoop Install Configure
3/4
hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/Users/kzhang6/Hadoop-data</value>
    <description>A base for temporary directories</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
mapred-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
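Before starting Hadoop, it is worth checking that the XML files you just edited are well-formed; a single unclosed tag will make the daemons fail at startup. A minimal sketch, shown here for hdfs-site.xml in a scratch directory (python3 is an assumption; any XML validator works):

```shell
# Sketch: write a conf file into a scratch directory and check it is
# well-formed XML. Repeat for core-site.xml and mapred-site.xml.
conf=$(mktemp -d)
cat > "$conf/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF
python3 - "$conf/hdfs-site.xml" <<'EOF'
import sys
import xml.etree.ElementTree as ET
# Parse the file; this raises (and exits nonzero) on malformed XML.
root = ET.parse(sys.argv[1]).getroot()
assert root.tag == "configuration"
print("well-formed:", sys.argv[1])
EOF
```

Run the same check against the real files under conf/ before step 6.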
6. Format your namenode:

10-1-210-140:hadoop-1.0.3 kzhang6$ ./bin/hadoop namenode -format
Warning: $HADOOP_HOME is deprecated.

14/01/27 11:38:47 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = 10-1-208-217.cba.uic.edu/10.1.208.217
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
Re-format filesystem in /Users/kzhang6/Hadoop-data/dfs/name ? (Y or N) Y
14/01/27 11:38:55 INFO util.GSet: VM type       = 64-bit
14/01/27 11:38:55 INFO util.GSet: 2% max memory = 159.6675 MB
14/01/27 11:38:55 INFO util.GSet: capacity      = 2^24 = 16777216 entries
14/01/27 11:38:55 INFO util.GSet: recommended=16777216, actual=16777216
14/01/27 11:38:55 INFO namenode.FSNamesystem: fsOwner=kzhang6
14/01/27 11:38:55 INFO namenode.FSNamesystem: supergroup=supergroup
14/01/27 11:38:55 INFO namenode.FSNamesystem: isPermissionEnabled=true
14/01/27 11:38:55 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
14/01/27 11:38:55 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
14/01/27 11:38:55 INFO namenode.NameNode: Caching file names occuring more than 10 times
14/01/27 11:38:56 INFO common.Storage: Image file of size 113 saved in 0 seconds.
14/01/27 11:38:56 INFO common.Storage: Storage directory /Users/kzhang6/Hadoop-data/dfs/name has been successfully formatted.
14/01/27 11:38:56 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 10-1-208-217.cba.uic.edu/10.1.208.217
************************************************************/
7. Start/Stop Hadoop:

10-1-210-140:hadoop-1.0.3 kzhang6$ ./bin/start-all.sh
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /Users/kzhang6/Software/hadoop-1.0.3/libexec/../logs/hadoop-kzhang6-namenode-10-1-208-217.cba.uic.edu.out
localhost: starting datanode, logging to /Users/kzhang6/Software/hadoop-1.0.3/libexec/../logs/hadoop-kzhang6-datanode-10-1-208-217.cba.uic.edu.out
localhost: starting secondarynamenode, logging to /Users/kzhang6/Software/hadoop-1.0.3/libexec/../logs/hadoop-kzhang6-secondarynamenode-10-1-208-217.cba.uic.edu.out
starting jobtracker, logging to /Users/kzhang6/Software/hadoop-1.0.3/libexec/../logs/hadoop-kzhang6-jobtracker-10-1-208-217.cba.uic.edu.out
localhost: starting tasktracker, logging to /Users/kzhang6/Software/hadoop-1.0.3/libexec/../logs/hadoop-kzhang6-tasktracker-10-1-208-217.cba.uic.edu.out

To stop all five daemons again, run ./bin/stop-all.sh.
8. Check the status via the web interfaces:

NameNode:   http://localhost:50070/dfshealth.jsp
JobTracker: http://localhost:50030/jobtracker.jsp
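Besides opening the pages in a browser, you can probe the two ports from the shell. This sketch uses bash's /dev/tcp pseudo-device (a bash-specific feature, an assumption here); the ports will report "down" unless the daemons from step 7 are actually running:

```shell
# Probe the NameNode (50070) and JobTracker (50030) web UI ports.
check_port() {
  # Succeeds iff something is listening on localhost:$1 (bash /dev/tcp).
  (exec 3<>"/dev/tcp/localhost/$1") 2>/dev/null
}
for p in 50070 50030; do
  if check_port "$p"; then echo "port $p: up"; else echo "port $p: down"; fi
done
```

jps (bundled with the JDK) is another quick check: with everything running it should list NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker.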