Hortonworks Data Platform: Automated Install with Ambari (Apr 1, 2014)


docs.hortonworks.com


Hortonworks Data Platform: Automated Install with Ambari
Copyright 2012, 2014 Hortonworks, Inc. Some rights reserved.

The Hortonworks Data Platform, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing and analyzing large volumes of data. It is designed to deal with data from many sources and formats in a very quick, easy and cost-effective manner. The Hortonworks Data Platform consists of the essential set of Apache Hadoop projects including MapReduce, Hadoop Distributed File System (HDFS), HCatalog, Pig, Hive, HBase, Zookeeper and Ambari. Hortonworks is the major contributor of code and patches to many of these projects. These projects have been integrated and tested as part of the Hortonworks Data Platform release process, and installation and configuration tools have also been included.

Unlike other providers of platforms built using Apache Hadoop, Hortonworks contributes 100% of our code back to the Apache Software Foundation. The Hortonworks Data Platform is Apache-licensed and completely open source. We sell only expert technical support, training and partner-enablement services. All of our technology is, and will remain, free and open source.

Please visit the Hortonworks Data Platform page for more information on Hortonworks technology. For more information on Hortonworks services, please visit either the Support or Training page. Feel free to Contact Us directly to discuss your specific needs.

Except where otherwise noted, this document is licensed under the Creative Commons Attribution-ShareAlike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/legalcode).

Table of Contents

1. Getting Ready
   1.1. Determine Version Compatibility
   1.2. Meet Minimum System Requirements
        1.2.1. Hardware Recommendations
        1.2.2. Operating Systems Requirements
        1.2.3. Browser Requirements
        1.2.4. Software Requirements
        1.2.5. JDK Requirements
        1.2.6. Database Requirements
        1.2.7. File System Partitioning Recommendations
   1.3. Collect Information
   1.4. Prepare the Environment
        1.4.1. Check Existing Installs
        1.4.2. Set Up Password-less SSH
        1.4.3. Set up Users and Groups
        1.4.4. Enable NTP on the Cluster and on the Browser Host
        1.4.5. Check DNS
        1.4.6. Configuring iptables
        1.4.7. Disable SELinux and PackageKit and check the umask Value
   1.5. Optional: Configure Local Repositories
        1.5.1. Setting Up a Local Server Having No Internet Access
        1.5.2. Setting up a Local Server Having Limited Internet Access
2. Running the Ambari Installer
   2.1. Set Up the Bits
        2.1.1. RHEL/CentOS/Oracle Linux 5.x
        2.1.2. RHEL/CentOS/Oracle Linux 6.x
        2.1.3. SLES 11
   2.2. Set Up the Server
        2.2.1. Setup Options
   2.3. Start the Ambari Server
3. Hadoop 1.x - Installing, Configuring, and Deploying the Cluster
   3.1. Log into Apache Ambari
   3.2. Welcome
   3.3. Select Stack
   3.4. Install Options
   3.5. Confirm Hosts
   3.6. Choose Services
   3.7. Assign Masters
   3.8. Assign Slaves and Clients
   3.9. Customize Services
        3.9.1. Service Users and Groups
        3.9.2. Properties That Depend on Service Usernames/Groups
        3.9.3. Recommended Memory Configurations for the MapReduce Service
   3.10. Review
   3.11. Install, Start and Test
   3.12. Summary
4. Hadoop 2.x - Installing, Configuring, and Deploying the Cluster
   4.1. Log into Apache Ambari
   4.2. Welcome
   4.3. Select Stack
   4.4. Install Options
   4.5. Confirm Hosts
   4.6. Choose Services
   4.7. Assign Masters
   4.8. Assign Slaves and Clients
   4.9. Customize Services
        4.9.1. Service Users and Groups
        4.9.2. Properties That Depend on Service Usernames/Groups
   4.10. Review
   4.11. Install, Start and Test
   4.12. Summary
5. NameNode High Availability
   5.1. Setting Up NameNode High Availability
   5.2. Rolling Back NameNode HA
        5.2.1. Stop HBase
        5.2.2. Checkpoint the Active NameNode
        5.2.3. Stop All Services
        5.2.4. Prepare the Ambari Server Host for Rollback
        5.2.5. Restore the HBase Configuration
        5.2.6. Delete ZK Failover Controllers
        5.2.7. Modify HDFS Configurations
        5.2.8. Recreate the Secondary NameNode
        5.2.9. Re-enable the Secondary NameNode
        5.2.10. Delete All JournalNodes
        5.2.11. Delete the Additional NameNode
        5.2.12. Verify your HDFS Components
        5.2.13. Start HDFS

List of Tables

1.1. Deploying HDP - Option I
1.2. Options for $os parameter in repo URL
1.3. Options for $os parameter in repo URL
1.4. Deploying HDP - Option II
1.5. Options for $os parameter in repo URL
1.6. Options for $os parameter in repo URL
2.1. Download the repo
3.1. Service Users
3.2. Service Group
3.3. HDFS Settings: Advanced
3.4. MapReduce Settings: Advanced
4.1. Service Users
4.2. Service Group
4.3. HDFS Settings: Advanced
4.4. MapReduce Settings: Advanced
5.1. Core-site.xml properties and values for NameNode HA on a cluster using Hue
5.2. Set Environment Variables


1. Getting Ready

This section describes the information and materials you need to get ready to install Hadoop using Apache Ambari. Apache Ambari provides an end-to-end management and monitoring solution for Apache Hadoop. With Ambari, you can deploy and operate a Hadoop Stack using a Web UI and REST API to manage configuration changes and monitor services for all the nodes in your cluster from a central point.

    Determine Version Compatibility

    Meet Minimum System Requirements

    Collect Information

    Prepare the Environment

    Optional: Configure Local Repositories for Ambari

1.1. Determine Version Compatibility

Use this table to determine whether your Ambari and Hadoop stack versions are compatible.

Ambari      HDP 1.2.0  HDP 1.2.1  HDP 1.3.0  HDP 1.3.2  HDP 1.3.3  HDP 2.0 [a]

1.5.0       X X X
1.4.4.23    X X X
1.4.3.38    X X X
1.4.2.104   X X X
1.4.1.61    X X X
1.4.1.25    X X
1.2.5.17    X X X
1.2.4.9     X X X
1.2.3.7     X X X
1.2.3.6     X X
1.2.2.5     X X
1.2.2.4     X X
1.2.2.3     X
1.2.1.2     X
1.2.0.1     X

    [a] HDP 2.0.6 stack (or later) patch releases.

    For more information about the latest HDP patch releases, see HDP Documentation.

For more information about using Ambari and the Hadoop 1.x stack, see Hadoop 1.x - Deploying, Configuring, and Upgrading Ambari.

For more information about using Ambari and the Hadoop 2.x stack, see Hadoop 2.x - Deploying, Configuring, and Upgrading Ambari.


1.2. Meet Minimum System Requirements

To run Hadoop, your system must meet minimum requirements.

    Hardware Recommendations

    Operating Systems Requirements

    Browser Requirements

    Software Requirements

    JDK Requirements

    Database Requirements

    File System Partitioning Recommendations

1.2.1. Hardware Recommendations

There is no single hardware requirement set for installing Hadoop.

For more information on the parameters that may affect your installation, see Hardware Recommendations For Apache Hadoop.

1.2.2. Operating Systems Requirements

The following operating systems are supported:

    Red Hat Enterprise Linux (RHEL) v5.x or 6.x (64-bit)

    CentOS v5.x or 6.x (64-bit)

    Oracle Linux v5.x or 6.x (64-bit)

    SUSE Linux Enterprise Server (SLES) 11, SP1 (64-bit)

    Important

The installer pulls many packages from the base OS repos. If you do not have a complete set of base OS repos available to all your machines at the time of installation, you may run into issues.

If you encounter problems with base OS repos being unavailable, please contact your system administrator to arrange for these additional repos to be proxied or mirrored. For more information, see Optional: Configure the Local Repositories.

1.2.3. Browser Requirements

The Ambari Install Wizard runs as a browser-based Web app. You must have a machine capable of running a graphical browser to use this tool. The supported browsers are:


    Windows (Vista, 7)

    Internet Explorer 9.0 and higher (for Vista + Windows 7)

    Firefox latest stable release

    Safari latest stable release

    Google Chrome latest stable release

    Mac OS X (10.6 or later)

    Firefox latest stable release

    Safari latest stable release

    Google Chrome latest stable release

    Linux (RHEL, CentOS, SLES, Oracle Linux)

    Firefox latest stable release

    Google Chrome latest stable release

1.2.4. Software Requirements

On each of your hosts:

    yum and rpm (RHEL/CentOS/Oracle Linux)

    zypper (SLES)

    scp, curl, and wget

    python (2.6 or later)

    Important

The Python version shipped with SUSE 11, 2.6.0-8.12.2, has a critical bug that may cause the Ambari Agent to fail within the first 24 hours. If you are installing on SUSE 11, please update all your hosts to Python version 2.6.8-0.15.1.

1.2.5. JDK Requirements

The following Java runtimes are supported:

    Oracle JDK 1.6.0_31 64-bit

Oracle JDK 1.7.0_45 64-bit (default)

    OpenJDK 7 64-bit


1.2.6. Database Requirements

Hive/HCatalog, Oozie, and Ambari all require their own internal databases.

Hive/HCatalog: By default uses an Ambari-installed MySQL 5.x instance. With appropriate preparation, you can also use an existing MySQL 5.x or Oracle 11g r2 instance. See Using Non-Default Databases for more information on using existing instances.

Oozie: By default uses an Ambari-installed Derby instance. With appropriate preparation, you can also use an existing MySQL 5.x or Oracle 11g r2 instance. See Using Non-Default Databases for more information on using existing instances.

Ambari: By default uses an Ambari-installed PostgreSQL 8.x instance. With appropriate preparation, you can also use an existing Oracle 11g r2 or MySQL 5.x instance. See Using Non-Default Databases for more information on using existing instances.

1.2.7. File System Partitioning Recommendations

For information on setting up file system partitions on master and slave nodes in an HDP cluster, see File System Partitioning Recommendations.

1.3. Collect Information

To deploy your Hadoop installation, you need to collect the following information:

The fully qualified domain name (FQDN) for each host in your system, and which components you want to set up on which host. The Ambari install wizard does not support using IP addresses. You can use hostname -f to check for the FQDN if you do not know it.
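For example, the following check should print a single fully qualified name:

hostname -f
fully.qualified.domain.name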

Note

While it is possible to deploy all of Hadoop on a single host, this is appropriate only for initial evaluation. In general, you should use at least three hosts: one master host and two slaves.

    The base directories you want to use as mount points for storing:

    NameNode data

    DataNodes data

    Secondary NameNode data

    Oozie data

    MapReduce data (Hadoop version 1.x)

    YARN data (Hadoop version 2.x)

    ZooKeeper data, if you install ZooKeeper

    Various log, pid, and db files, depending on your install type


1.4. Prepare the Environment

To deploy your Hadoop instance, you need to prepare your deploy environment:

    Check Existing Installs

    Set up Password-less SSH

    Set up Users and Groups

    Enable NTP on the Cluster

    Check DNS

    Configure iptables

    Disable SELinux, PackageKit and Check umask Value

1.4.1. Check Existing Installs

Ambari automatically installs the correct versions of the files that are necessary for Ambari and Hadoop to run. Versions other than the ones that Ambari uses can cause problems in running the installer, so remove any existing installs that do not match the following lists.

Ambari Server
  RHEL/CentOS/Oracle Linux v5: libffi 3.0.5-1.el5, python26 2.6.8-2.el5, python26-libs 2.6.8-2.el5, postgresql 8.4.13-1.el6_3, postgresql-libs 8.4.13-1.el6_3, postgresql-server 8.4.13-1.el6_3
  RHEL/CentOS/Oracle Linux v6: postgresql 8.4.13-1.el6_3, postgresql-libs 8.4.13-1.el6_3, postgresql-server 8.4.13-1.el6_3
  SLES 11: postgresql 8.3.5-1, postgresql-server 8.3.5-1, postgresql-libs 8.3.5-1

Ambari Agent [a]
  RHEL/CentOS/Oracle Linux v5: libffi 3.0.5-1.el5, python26 2.6.8-2.el5, python26-libs 2.6.8-2.el5
  RHEL/CentOS/Oracle Linux v6: None
  SLES 11: None

Nagios Server [b]
  All three operating systems: nagios 3.5.0-99, nagios-devel 3.5.0-99, nagios-www 3.5.0-99, nagios-plugins 1.4.9-1

Ganglia Server [c]
  RHEL/CentOS/Oracle Linux v5: ganglia-gmetad 3.5.0-99, ganglia-devel 3.5.0-99, libganglia 3.5.0-99, ganglia-web 3.5.7-99, rrdtool 1.4.5-1.el5
  RHEL/CentOS/Oracle Linux v6: ganglia-gmetad 3.5.0-99, ganglia-devel 3.5.0-99, libganglia 3.5.0-99, ganglia-web 3.5.7-99, rrdtool 1.4.5-1.el6
  SLES 11: ganglia-gmetad 3.5.0-99, ganglia-devel 3.5.0-99, libganglia 3.5.0-99, ganglia-web 3.5.7-99, rrdtool 1.4.5-4.5.1

Ganglia Monitor [d]
  All three operating systems: ganglia-gmond 3.5.0-99, libganglia 3.5.0-99

[a] Installed on each host in your cluster. Communicates with the Ambari Server to execute commands.
[b] The host that runs the Nagios server.
[c] The host that runs the Ganglia Server.
[d] Installed on each host in the cluster. Sends metrics data to the Ganglia Collector.
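A quick way to scan a host for the packages listed above (a sketch for RHEL/CentOS/Oracle Linux; rpm -qa works the same way on SLES):

rpm -qa | grep -E 'nagios|ganglia|postgresql|python26|libffi'

Any packages this prints at versions other than those above should be removed before you run the installer.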

1.4.2. Set Up Password-less SSH

To have Ambari Server automatically install Ambari Agents in all your cluster hosts, you must set up password-less SSH connections between the main installation (Ambari Server) host and all other machines. The Ambari Server host acts as the client and uses the key-pair to access the other hosts in the cluster to install the Ambari Agent.

    Note

You can choose to install the Agents on each cluster host manually. In this case you do not need to set up SSH. See Appendix: Installing Ambari Agents Manually for more information.

1. Generate public and private SSH keys on the Ambari Server host.

    ssh-keygen

    2. Copy the SSH Public Key (id_rsa.pub) to the root account on your target hosts.

    .ssh/id_rsa

    .ssh/id_rsa.pub

    3. Add the SSH Public Key to the authorized_keys file on your target hosts.

    cat id_rsa.pub >> authorized_keys

4. Depending on your version of SSH, you may need to set permissions on the .ssh directory (to 700) and the authorized_keys file in that directory (to 600) on the target hosts.

chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

5. From the Ambari Server, make sure you can connect to each host in the cluster using SSH.

    ssh root@{remote.target.host}

    You may see this warning. This happens on your first connection and is normal.

    Are you sure you want to continue connecting (yes/no)?

    6. Retain a copy of the SSH Private Key on the machine from which you will run the web-based Ambari Install Wizard.

    Note

It is possible to use a non-root SSH account, if that account can execute sudo without entering a password.
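Taken together, steps 1 through 5 might look like the following from the Ambari Server host. The hostname is hypothetical, and ssh-copy-id, where available, performs steps 2 through 4 in a single command:

ssh-keygen
ssh-copy-id root@node1.example.com
ssh root@node1.example.com

The final ssh command should log you in without prompting for a password.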


1.4.3. Set up Users and Groups

The Ambari cluster installer automatically creates user and group accounts for you. Ambari preserves any existing user and group accounts, and uses these accounts when configuring Hadoop services. User and group creation applies to user/group accounts on the local operating system and to LDAP/AD accounts.

To set up custom accounts before running the Ambari installer, see Service Users and Groups (for the 1.x stack) or Service Users and Groups (for the 2.x stack) for more information about customizing service users and groups.

1.4.4. Enable NTP on the Cluster and on the Browser Host

The clocks of all the nodes in your cluster and the machine that runs the browser through which you access Ambari Web must be able to synchronize with each other.

1.4.5. Check DNS

All hosts in your system must be configured for DNS and Reverse DNS.

If you are unable to configure DNS and Reverse DNS, you must edit the hosts file on every host in your cluster to contain the address of each of your hosts and to set the Fully Qualified Domain Name hostname of each of those hosts. The following instructions cover basic hostname network setup for generic Linux hosts. Different versions and flavors of Linux might require slightly different commands. Please refer to your specific operating system documentation for the specific details for your system.

1.4.5.1. Edit the Host File

1. Using a text editor, open the hosts file on every host in your cluster. For example:

    vi /etc/hosts

2. Add a line for each host in your cluster. The line should consist of the IP address and the FQDN. For example:

    1.2.3.4 fully.qualified.domain.name

Note

Do not remove the following two lines from your hosts file, or various programs that require network functionality may fail.

127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
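Putting the pieces together, a minimal hosts file for a small cluster might look like this (all names and addresses are hypothetical):

127.0.0.1   localhost.localdomain localhost
::1         localhost6.localdomain6 localhost6
10.0.0.11   master01.example.com master01
10.0.0.12   worker01.example.com worker01
10.0.0.13   worker02.example.com worker02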

1.4.5.2. Set the Hostname

1. Use the "hostname" command to set the hostname on each host in your cluster. For example:

    hostname fully.qualified.domain.name

    2. Confirm that the hostname is set by running the following command:


    hostname -f

    This should return the name you just set.

1.4.5.3. Edit the Network Configuration File

1. Using a text editor, open the network configuration file on every host. This file is used to set the desired network configuration for each host. For example:

    vi /etc/sysconfig/network

    2. Modify the HOSTNAME property to set the fully.qualified.domain.name.

NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=fully.qualified.domain.name

1.4.6. Configuring iptables

For Ambari to communicate during setup with the hosts it deploys to and manages, certain ports must be open and available. The easiest way to do this is to temporarily disable iptables.

chkconfig iptables off
/etc/init.d/iptables stop

    You can restart iptables after setup is complete.

If the security protocols at your installation do not allow you to disable iptables, you can proceed with it enabled, as long as all of the relevant ports are open and available.

During the Ambari Server setup process, Ambari checks to see if iptables is running. If it is, a warning prints to remind you to check that the necessary ports are open and available. The Host Confirm step of the Cluster Install Wizard will also issue a warning for each host that has iptables running.

    Important

If you leave iptables enabled and do not set up the necessary ports, the cluster installation will fail.
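If you must keep iptables running, one approach is to open the required ports explicitly. The sketch below assumes the Ambari defaults (8080 for Ambari Web, 8440 and 8441 for Agent registration); check the full port list for the HDP services you deploy:

iptables -I INPUT -p tcp --dport 8080 -j ACCEPT
iptables -I INPUT -p tcp --dport 8440 -j ACCEPT
iptables -I INPUT -p tcp --dport 8441 -j ACCEPT
service iptables save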

1.4.7. Disable SELinux and PackageKit and check the umask Value

1. SELinux must be temporarily disabled for the Ambari setup to function. Run the following command on each host in your cluster:

    setenforce 0

2. On the RHEL/CentOS installation host, if PackageKit is installed, open /etc/yum/pluginconf.d/refresh-packagekit.conf with a text editor and make this change:

    enabled=0

  • Hortonworks Data Platform Apr 1, 2014

    9

    Note

PackageKit is not enabled by default on SLES. Unless you have specifically enabled it, you do not need to do this step.

    3. Make sure umask is set to 022.
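For example, from a shell on each host: umask alone prints the current value (0022 is what you want), umask 022 sets it for the current session, and the echo line is one common way to apply it to future login shells (whether that fits your environment is an assumption):

umask
umask 022
echo umask 022 >> /etc/profile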

1.5. Optional: Configure Local Repositories

If your cluster includes a firewall that prevents or limits Internet access, you can set up local repositories to deploy Ambari and HDP.

    Before deploying HDP on a cluster having no or limited Internet access:

    1. Review your deployment strategies.

    2. Compare specific deployment options.

3. Write down the Base URL of the local mirror repository for each operating system. You select these Base URLs during the cluster install procedure.

For example, if your system includes hosts running CentOS 6, pointed to an HDP 2.0.10.0 repository, your local repository Base URL would look something like this:

    http://{your.hosted.local.repository}/HDP-2.0.10.0/repos/centos6

    Important

Each relative path must reference your local repository configuration. If your cluster includes multiple operating systems (for example, CentOS 6 and RHEL 6), you must configure a unique repository for each OS. A host in your cluster can retrieve software packages only from a repository configured for that operating system.

4. Choose a JDK version to deploy, how to download it, and where to install it on each host. See JDK Requirements for more information on supported JDKs.

If you have not already installed the JDK on all hosts, and plan to use Oracle JDK 1.7, download jdk-7u45-linux-x64.tar.gz to /var/lib/ambari-server/resources.

If you plan to use a JDK other than Oracle JDK 1.7, you must install that JDK on each host in your cluster.

Set the Java Home path on each host, using the -j option when running ambari-server setup. See Setup Options for more information about using the -j option to set JAVA_HOME.

    Note

If you have already installed the JDK on all your hosts, you must include the -j option when running ambari-server setup.


5. Set up local repositories for Ambari and HDP based on your preferred strategy, as described in one of the following sections:

    Local Server Having no Internet Access

    Local Server Having Limited Internet Access

    6. Set up the Ambari server.

1.5.1. Setting Up a Local Server Having No Internet Access

Complete the following instructions to set up a mirror server that has no access to the Internet:

    Check Your Prerequisites

    Install the Repos

1.5.1.1. Check Your Prerequisites

    Select a mirror server host with the following characteristics:

This server runs on either CentOS (v5.x, v6.x), RHEL (v5.x, v6.x), Oracle Linux (v5.x, v6.x), or SLES 11 and has several GB of storage available.

    This server and the cluster nodes are all running the same OS.

    Note

Supporting repository mirroring for heterogeneous clusters requires a more complex procedure than the one documented here.

The firewall must let all cluster nodes (the servers on which you want to install HDP) access this server.

    Ensure that the mirror server has yum installed.

    Add the yum-utils and createrepo packages on the mirror server.

    yum install yum-utils createrepo

1.5.1.2. Install the Repos

1. Use a workstation with access to the Internet and download the tarball image of the appropriate Hortonworks yum repository.

Table 1.1. Deploying HDP - Option I

Cluster OS: RHEL/CentOS/Oracle Linux 5.x

wget http://public-repo-1.hortonworks.com/HDP/centos5/HDP-2.0.10.0-centos5-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.16/repos/centos5/HDP-UTILS-1.1.0.16-centos5.tar.gz
wget http://public-repo-1.hortonworks.com/ambari/centos5/ambari-1.5.0-centos5.tar.gz

Cluster OS: RHEL/CentOS/Oracle Linux 6.x

wget http://public-repo-1.hortonworks.com/HDP/centos6/HDP-2.0.10.0-centos6-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.16/repos/centos6/HDP-UTILS-1.1.0.16-centos6.tar.gz
wget http://public-repo-1.hortonworks.com/ambari/centos6/ambari-1.5.0-centos6.tar.gz

Cluster OS: SLES 11

wget http://public-repo-1.hortonworks.com/HDP/suse11/HDP-2.0.10.0-suse11-rpm.tar.gz
wget http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.16/repos/suse11/HDP-UTILS-1.1.0.16-suse11.tar.gz
wget http://public-repo-1.hortonworks.com/ambari/suse11/ambari-1.5.0-suse11.tar.gz

    2. Create an HTTP server.

a. On the mirror server, install an HTTP server (such as Apache httpd) using the instructions provided here.

b. Activate this web server.

c. Ensure that the firewall settings (if any) allow inbound HTTP access from your cluster nodes to your mirror server.

    Note

    If you are using EC2, make sure that SELinux is disabled.

    3. On your mirror server, create a directory for your web server.

    For example, from a shell window, type:

    For RHEL/CentOS/Oracle:

mkdir -p /var/www/html/hdp/

    For SLES:

mkdir -p /srv/www/htdocs/rpms

If you are using a symlink, enable the FollowSymLinks option on your web server.

    4. Copy the HDP Repository Tarball to the directory created in step 3, and untar it.
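For example, continuing the RHEL/CentOS 6.x case from the table above:

cd /var/www/html/hdp
tar zxvf HDP-2.0.10.0-centos6-rpm.tar.gz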

    5. Verify the configuration.

The configuration is successful if you can access the above directory through your web browser.

To test this out, browse to the following location: http://$yourwebserver/hdp/$os/HDP-2.0.10.0/.


You should see a directory listing for all the HDP components, along with the RPMs, at $os/HDP-2.0.10.0.
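If the machine you are checking from has no graphical browser, an equivalent check from a shell (centos6 is used here as the example $os value):

curl http://$yourwebserver/hdp/centos6/HDP-2.0.10.0/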

    Note

    If you are installing a 2.x.0 release, use: http://$yourwebserver/hdp/$os/2.x/GA

If you are installing a 2.x.x release, use: http://$yourwebserver/hdp/$os/2.x/updates

where

$os can be centos5, centos6, or suse11. Use the following options table for the $os parameter:

Table 1.2. Options for $os parameter in repo URL

Operating System                    Value
CentOS 5, RHEL 5, Oracle Linux 5    centos5
CentOS 6, RHEL 6, Oracle Linux 6    centos6
SLES 11                             suse11

    6. Configure the yum clients on all the nodes in your cluster.

    a. Fetch the yum configuration file from your mirror server.

http://$yourwebserver/hdp/$os/2.x/updates/2.0.10.0/hdp.repo

    b. Store the hdp.repo file to a temporary location.

c. Edit the hdp.repo file, changing the value of the baseurl property to point to your local repositories based on your cluster OS.

[HDP-2.x]
name=Hortonworks Data Platform Version - HDP-2.x
baseurl=http://$yourwebserver/HDP/$os/2.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.16]
name=Hortonworks Data Platform Utils Version - HDP-UTILS-1.1.0.16
baseurl=http://$yourwebserver/HDP-UTILS-1.1.0.16/repos/$os
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-2.0.10.0]
name=Hortonworks Data Platform HDP-2.0.10.0
baseurl=http://$yourwebserver/HDP/$os/2.x/updates/2.0.10.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

where

$yourwebserver is the FQDN of your local mirror server.

$os can be centos5, centos6, or suse11. Use the following options table for the $os parameter:

Table 1.3. Options for $os parameter in repo URL

Operating System                    Value
CentOS 5, RHEL 5, Oracle Linux 5    centos5
CentOS 6, RHEL 6, Oracle Linux 6    centos6
SLES 11                             suse11

Use scp or pdsh to copy the client yum configuration file to the /etc/yum.repos.d/ directory on every node in the cluster (a sketch follows at the end of this procedure).

d. [Conditional]: If you have multiple repositories configured in your environment, deploy the following plugin on all the nodes in your cluster.

    i. Install the plugin.

For RHEL and CentOS v5.x

    yum install yum-priorities

For RHEL and CentOS v6.x

    yum install yum-plugin-priorities

ii. Edit the /etc/yum/pluginconf.d/priorities.conf file to add the following:

[main]
enabled=1
gpgcheck=0
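As a concrete illustration of the copy in step 6.c, a simple loop from an administration host (the host list file cluster_hosts.txt is hypothetical):

for host in $(cat cluster_hosts.txt); do
  scp hdp.repo root@${host}:/etc/yum.repos.d/
done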


1.5.2. Setting up a Local Server Having Limited Internet Access

Complete the following instructions to set up a mirror server that has limited or temporary access to the Internet:

    1. Check Your Prerequisites

    2. Install the Repos

1.5.2.1. Check Your Prerequisites

    The local mirror server host must have the following characteristics:

This server runs on either CentOS/RHEL/Oracle Linux 5.x or 6.x, or SLES 11 and has several GB of storage available.

The local mirror server and the cluster nodes must have the same OS. If they are not running CentOS or RHEL, the mirror server must not be a member of the Hadoop cluster.

    Note

Supporting repository mirroring for heterogeneous clusters requires a more complex procedure than the one documented here.

The firewall allows all cluster nodes (the servers on which you want to install HDP) to access this server.

    Ensure that the mirror server has yum installed.

    Add the yum-utils and createrepo packages on the mirror server.

    yum install yum-utils createrepo

1.5.2.2. Install the Repos

Temporarily reconfigure your firewall to allow Internet access from your mirror server host.

Execute the following command to download the appropriate Hortonworks yum client configuration file and save it in the /etc/yum.repos.d/ directory on the mirror server host.

Table 1.4. Deploying HDP - Option II

Cluster OS: RHEL/CentOS/Oracle Linux 5.x

wget http://public-repo-1.hortonworks.com/HDP/centos5/2.x/updates/2.0.10.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
wget http://public-repo-1.hortonworks.com/ambari/centos5/1.x/updates/1.5.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

Cluster OS: RHEL/CentOS/Oracle Linux 6.x

wget http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.0.10.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
wget http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.5.0/ambari.repo -O /etc/yum.repos.d/ambari.repo

Cluster OS: SLES 11

wget http://public-repo-1.hortonworks.com/HDP/suse11/2.x/updates/2.0.10.0/hdp.repo -O /etc/zypp/hdp.repo
wget http://public-repo-1.hortonworks.com/ambari/suse11/1.x/updates/1.5.0/ambari.repo -O /etc/zypp/ambari.repo

    Create an HTTP server.

1. On the mirror server, install an HTTP server (such as Apache httpd) using the instructions provided at http://httpd.apache.org/download.cgi.

    2. Activate this web server.

3. Ensure that the firewall settings (if any) allow inbound HTTP access from your cluster nodes to your mirror server.

    Note

    If you are using EC2, make sure that SELinux is disabled.

4. Optional - If your mirror server uses SLES, modify the default-server.conf file to enable the docs root folder listing.

    sed -e "s/Options None/Options Indexes MultiViews/ig" /etc/apache2/default-server.conf > /tmp/tempfile.tmpmv /tmp/tempfile.tmp /etc/apache2/default-server.conf

    On your mirror server, create a directory for your web server.

    For example, from a shell window, type:

    For RHEL/CentOS/Oracle:

mkdir -p /var/www/html/hdp/

    For SLES:

mkdir -p /srv/www/htdocs/rpms

If you are using a symlink, enable the FollowSymLinks option on your web server.

Copy the contents of the entire HDP repository for your desired OS from the remote yum server to your local mirror server.

    Continuing the previous example, from a shell window, type:

    For RHEL/CentOS/Oracle:

    cd /var/www/html/hdp

    For SLES:

    cd /srv/www/htdocs/rpms

  • Hortonworks Data Platform Apr 1, 2014

    16

    Then for all hosts, type:

    HDP Repository

reposync -r HDP
reposync -r HDP-2.0.10.0
reposync -r HDP-UTILS-1.1.0.16

You should see both an HDP-2.0.10.0 directory and an HDP-UTILS-1.1.0.16 directory, each with several subdirectories.

    Optional - Ambari Repository

reposync -r ambari-1.x
reposync -r $release_type-ambari-1.5.0

    Generate appropriate metadata.

    This step defines each directory as a yum repository. From a shell window, type:

    For RHEL/CentOS/Oracle:

    HDP Repository:

createrepo /var/www/html/hdp/HDP-2.0.10.0
createrepo /var/www/html/hdp/HDP-UTILS-1.1.0.16

    Optional - Ambari Repository:

createrepo /var/www/html/hdp/ambari-1.x
createrepo /var/www/html/hdp/$release_type-ambari-1.5.0

    For SLES:

    HDP Repository:

    createrepo /srv/www/htdocs/rpms/hdp/HDP

    Optional - Ambari Repository:

createrepo /srv/www/htdocs/rpms/hdp/ambari-1.x
createrepo /srv/www/htdocs/rpms/hdp/$release_type-ambari-1.5.0

    You should see a new folder called repodata inside both HDP directories.
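To confirm from a shell that the metadata was generated (paths follow the RHEL/CentOS example above):

ls /var/www/html/hdp/HDP-2.0.10.0/repodata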

    Verify the configuration.

The configuration is successful if you can access the above directory through your web browser.

    To test this out, browse to the following location:

    HDP: http://$yourwebserver/hdp/HDP-2.0.10.0/

    Optional - Ambari Repository: http://$yourwebserver/hdp/ambari/$os/1.x/updates/1.5.0


You should now see a directory listing for all the HDP components.

At this point, you can disable external Internet access for the mirror server, so that the mirror server is again entirely within your data center firewall.

Depending on your cluster OS, configure the yum clients on all the nodes in your cluster.

    1. Edit the repo files, changing the value of the baseurl property to the local mirror URL.

Edit the /etc/yum.repos.d/hdp.repo file, changing the value of the baseurl property to point to your local repositories based on your cluster OS.

[HDP-2.x]
name=Hortonworks Data Platform Version - HDP-2.x
baseurl=http://$yourwebserver/HDP/$os/2.x/GA
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.16]
name=Hortonworks Data Platform Utils Version - HDP-UTILS-1.1.0.16
baseurl=http://$yourwebserver/HDP-UTILS-1.1.0.16/repos/$os
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-2.0.10.0]
name=Hortonworks Data Platform HDP-2.0.10.0
baseurl=http://$yourwebserver/HDP/$os/2.x/updates/2.0.10.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

where

$yourwebserver is the FQDN of your local mirror server.

$os can be centos5, centos6, or suse11. Use the following options table for the $os parameter:

Table 1.5. Options for $os parameter in repo URL

Operating System                    Value
CentOS 5, RHEL 5, Oracle Linux 5    centos5
CentOS 6, RHEL 6, Oracle Linux 6    centos6
SLES 11                             suse11

Edit the /etc/yum.repos.d/ambari.repo file, changing the value of the baseurl property to point to your local repositories based on your cluster OS.

[ambari-1.x]
name=Ambari 1.x
baseurl=http://$yourwebserver/hdp/ambari/$os/1.x/updates/ambari.repo
gpgcheck=0
gpgkey=http://public-repo-1.hortonworks.com/ambari/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.16]
name=Hortonworks Data Platform Utils Version - HDP-UTILS-1.1.0.16
baseurl=http://$yourwebserver/HDP-UTILS-1.1.0.16/repos/$os
gpgcheck=0
gpgkey=http://public-repo-1.hortonworks.com/ambari/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[$release_type-ambari-1.5.0]
name=ambari-1.5.0 - updates
baseurl=http://$yourwebserver/ambari/$os/1.x/updates/1.5.0
gpgcheck=0
gpgkey=http://public-repo-1.hortonworks.com/ambari/$os/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

$yourwebserver is the FQDN of your local mirror server.

$os can be centos5, centos6, or suse11. Use the following options table for the $os parameter:

Table 1.6. Options for $os parameter in repo URL

Operating System                    Value
CentOS 5, RHEL 5, Oracle Linux 5    centos5
CentOS 6, RHEL 6, Oracle Linux 6    centos6
SLES 11                             suse11

    2. Copy the yum/zypper client configuration file to all nodes in your cluster.

    RHEL/CentOS/Oracle Linux:


Use scp or pdsh to copy the client yum configuration file to the /etc/yum.repos.d/ directory on every node in the cluster.

    For SLES:

    On every node, invoke the following command:

    HDP Repository: zypper addrepo -r http://$yourwebserver/hdp/HDP/suse11/2.x/updates/2.0.10.0/hdp.repo

    Optional - Ambari Repository: zypper addrepo -r http://$yourwebserver/hdp/ambari/suse11/1.x/updates/1.5.0/ambari.repo

If using Ambari, verify the configuration by deploying the Ambari Server on one of the cluster nodes:

yum install ambari-server

If your cluster runs CentOS, Oracle, or RHEL and you have multiple repositories configured in your environment, deploy the following plugin on all the nodes in your cluster.

    1. Install the plugin.

For RHEL and CentOS v5.x

    yum install yum-priorities

For RHEL and CentOS v6.x

    yum install yum-plugin-priorities

    2. Edit the /etc/yum/pluginconf.d/priorities.conf file to add the following:

[main]
enabled=1
gpgcheck=0


2. Running the Ambari Installer

This section describes the process for installing Apache Ambari. Ambari manages installing and deploying Hadoop.

2.1. Set Up the Bits

1. Log into the machine that is to serve as the Ambari Server as root. You may log in and use sudo if this is what your environment requires. This machine is the main installation host.

2. Download the Ambari repository file and copy it to your repos.d directory.

Table 2.1. Download the repo

RHEL, CentOS, and Oracle Linux 5:

wget http://public-repo-1.hortonworks.com/ambari/centos5/1.x/updates/1.5.0/ambari.repo
cp ambari.repo /etc/yum.repos.d

RHEL, CentOS, and Oracle Linux 6:

wget http://public-repo-1.hortonworks.com/ambari/centos6/1.x/updates/1.5.0/ambari.repo
cp ambari.repo /etc/yum.repos.d

SLES 11:

wget http://public-repo-1.hortonworks.com/ambari/suse11/1.x/updates/1.5.0/ambari.repo
cp ambari.repo /etc/zypp/repos.d

    Note

If your cluster does not have access to the Internet, or you are creating a large cluster and you want to conserve bandwidth, you need to provide access to the bits using an alternative method. For more information, see the Optional: Configure the Local Repositories section.

    When you have the software, continue your installation based on your base platform.

2.1.1. RHEL/CentOS/Oracle Linux 5.x

1. Confirm the repository is configured by checking the repo list.

    yum repolist

You should see the Ambari and HDP utilities repositories in the list. The version values vary depending on the installation.

repo id             repo name
AMBARI-1.x          Ambari 1.x
HDP-UTILS-1.1.0.16  Hortonworks Data Platform Utils

    2. Install the Ambari bits using yum. This also installs PostgreSQL.


    yum install ambari-server

2.1.2. RHEL/CentOS/Oracle Linux 6.x

1. Confirm the repository is configured by checking the repo list.

    yum repolist

You should see the Ambari and HDP utilities repositories in the list. The version values vary depending on the installation.

repo id             repo name
AMBARI-1.x          Ambari 1.x
HDP-UTILS-1.1.0.16  Hortonworks Data Platform Utils

    2. Install the Ambari bits using yum. This also installs PostgreSQL.

    yum install ambari-server

2.1.3. SLES 11

1. Verify that php_curl is installed before you begin.

    zypper se 'php_curl*'

    If php_curl is not installed, install it:

    zypper in php_curl

    2. Confirm the downloaded repository is configured by checking the repo list.

    zypper repos

You should see the Ambari and HDP utilities in the list. The version values vary depending on the installation.

# | Alias              | Name
1 | AMBARI.dev-1.x     | Ambari 1.x
2 | HDP-UTILS-1.1.0.16 | Hortonworks Data Platform Utils

    3. Install the Ambari bits using zypper. This also installs PostgreSQL.

    zypper install ambari-server

2.2. Set Up the Server

The ambari-server command manages the setup process. Run the following command and respond to the prompts:

    ambari-server setup

1. If you have not temporarily disabled SELinux, you may get a warning. Enter y to continue.


2. By default, Ambari Server runs under root. If you want to create a different user to run the Ambari Server instead, or to assign a previously created user, select y at Customize user account for ambari-server daemon and give the prompt the username you want to use.

3. If you have not temporarily disabled iptables, you may get a warning. Enter y to continue.

4. Agree to the Oracle JDK license when asked. You must accept this license to be able to download the necessary JDK from Oracle. The JDK is installed during the deploy phase.

    Note

By default, Ambari Server setup will download and install Oracle JDK 1.7. If you plan to download this JDK and install it on all your hosts, or plan to use a different version of the JDK, skip this step and see Setup Options for more information.

    5. At Enter advanced database configuration:

To use the default PostgreSQL database, named ambari, with the default username and password (ambari/bigdata), enter 1.

    Important

If you are using an existing Oracle or MySQL database instance, you must prepare the instance using the steps detailed in Using Non-Default Databases before running the installer.

To use an existing Oracle 11g r2 instance, and select your own database name, username, and password for that database, enter 2.

Select the database you want to use and provide any information required by the prompts, including hostname, port, Service Name or SID, username, and password.

To use an existing MySQL 5.x database, and select your own database name, username, and password for that database, enter 3.

Select the database you want to use and provide any information required by the prompts, including hostname, port, database name, username, and password.

    6. Setup completes.

    Note

If your host accesses the Internet through a proxy server, you must configure Ambari Server to use this proxy server. See Configure Ambari Server for Internet Proxy for more information.

2.2.1. Setup Options

The following table describes options frequently used for Ambari Server setup.


Option: -j (--java-home)

Specifies the JAVA_HOME path to use on the Ambari Server and all hosts in the cluster. By default, when you do not specify this option, Setup downloads the Oracle JDK 1.7 binary to /var/lib/ambari-server/resources on the Ambari Server and installs the JDK to /usr/jdk64.

Use this option when you plan to use a JDK other than the default Oracle JDK 1.7. See JDK Requirements for more information on the supported JDKs. If you are using an alternate JDK, you must manually install the JDK on all hosts and specify the Java Home path during Ambari Server setup.

This path must be valid on all hosts. For example:

ambari-server setup -j /usr/java/default

Option: -s (--silent)

Setup runs silently. Accepts all default prompt values.

Option: -v (--verbose)

Prints verbose info and warning messages to the console during Setup.

Option: -i (--jdk-location)

Use the specified JDK file in the local filesystem instead of downloading it.

Option: -g (--debug)

Start Ambari Server in debug mode.

2.3. Start the Ambari Server

To start the Ambari Server:

    ambari-server start

    To check the Ambari Server processes:

    ps -ef | grep Ambari

    To stop the Ambari Server:

    ambari-server stop
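Ambari Server can also report its run state; if unsure which commands your version supports, run ambari-server with no arguments to print usage:

ambari-server status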


3. Hadoop 1.x - Installing, Configuring, and Deploying the Cluster

This section describes using the Ambari install wizard in your browser to complete your installation, configuration, and deployment of Hadoop.

3.1. Log into Apache Ambari

Once you have started the Ambari service, you can access the Ambari Install Wizard through your browser.

    1. Point your browser to http://{main.install.hostname}:8080.

2. Log in to the Ambari Server using the default username/password: admin/admin. You can change this later to whatever you want.

3.2. Welcome

The first step creates the cluster name.

1. At the Welcome page, type a name for the cluster you want to create in the text box. No whitespaces or special characters can be used in the name.

    2. Click the Next button.

3.3. Select Stack

The Service Stack (or simply the Stack) is a coordinated and tested set of Hadoop components. Use the radio button to select the Stack version you want to install. To install a Hadoop 1 stack, select HDP 1.3.3 under Stacks.

Under Advanced Repository Options, you can specify the Base URLs of your local repositories for each Operating System you plan to use in your cluster. You should have configured the Base URLs for your local repositories in Optional: Configure Ambari for Local Repositories.


3.4. Install Options

In order to build up the cluster, the install wizard needs to know general information about how you want to set up your cluster. You need to supply the FQDN of each of your hosts. The wizard also needs to access the private key file you created in Set Up Password-less SSH. It uses these to locate all the hosts in the system and to access and interact with them securely.

1. Use the Target Hosts text box to enter your list of host names, one per line. You can use ranges inside brackets to indicate larger sets of hosts. For example, for host01 through host10, use host[01-10].

    Note

    If you are deploying on EC2, use the internal Private DNS hostnames.

2. If you want to let Ambari automatically install the Ambari Agent on all your hosts using SSH, select Provide your SSH Private Key and either use the Choose File button in the Host Registration Information section to find the private key file that matches the public key you installed earlier on all your hosts, or cut and paste the key into the text box manually.

    Note

If you are using IE 9, the Choose File button may not appear. Use the text box to cut and paste your private key manually.

Fill in the username for the SSH key you have selected. If you do not want to use root, you must provide the username for an account that can execute sudo without entering a password.


3. If you do not want Ambari to automatically install the Ambari Agents, select Perform manual registration. See Appendix: Installing Ambari Agents Manually for more information.

    4. Click the Register and Confirm button to continue.

3.5. Confirm Hosts

This screen lets you confirm that Ambari has located the correct hosts for your cluster and to check those hosts to make sure they have the correct directories, packages, and processes to continue the install.

If any hosts were selected in error, you can remove them by selecting the appropriate checkboxes and clicking the grey Remove Selected button. To remove a single host, click the small white Remove button in the Action column.

At the bottom of the screen, you may notice a yellow box that indicates some warnings were encountered during the check process. For example, your host may have already had a copy of wget or curl. Select Click here to see the warnings to see a list of what was checked and what caused the warning. On the same page you can get access to a Python script that can help you clear any issues you may encounter and let you run Rerun Checks.

    Important

If you are deploying HDP using Ambari 1.4 or later on RHEL 6.5, you will likely see Ambari Agents fail to register with Ambari Server during the Confirm Hosts step in the Cluster Install wizard. Click the Failed link on the Wizard page to display the Agent logs. The following log entry indicates the SSL connection between the Agent and Server failed during registration:

INFO 2014-04-02 04:25:22,669 NetUtil.py:55 - Failed to connect to https://:8440/cert/ca due to [Errno 1] _ssl.c:492: error:100AE081:elliptic curve routines:EC_GROUP_new_by_curve_name:unknown group

    For more information about this issue, see the Ambari Troubleshooting Guide.

    When you are satisfied with the list of hosts, click Next.

3.6. Choose Services

Hortonworks Data Platform is made up of a number of services. You must, at minimum, install HDFS. You can decide which of the other services to install.

1. Select all to preselect all items or minimum to preselect only HDFS.

2. Use the checkboxes to unselect (if you have used all) or select (if you have used minimum) to arrive at your desired list of components.

    Note

If you want to use Ambari for monitoring your cluster, make sure you select Nagios and Ganglia. If you do not select them, you get a warning popup when you finish this section. If you are using other monitoring tools, you can ignore the warning.

    3. When you have made your selections, click Next.

    3.7.Assign MastersThe Ambari install wizard attempts to assign the master nodes for various services you haveselected to appropriate hosts in your cluster. The right column shows the current serviceassignments by host, with the hostname and its number of CPU cores and amount of RAMindicated.

    1. To change locations, click the dropdown list next to the service in the left column andselect the appropriate host.

2. To remove a ZooKeeper instance, click the green minus icon next to the host address you want to remove.

    3. When you are satisfied with the assignments, click the Next button.

3.8. Assign Slaves and Clients

The Ambari install wizard attempts to assign the slave components (DataNodes, TaskTrackers, and RegionServers) to appropriate hosts in your cluster. It also attempts to select hosts for installing the appropriate set of clients.

    1. Use all or none to select all of the hosts in the column or none of the hosts, respectively.

If a host has a red asterisk next to it, that host is also running one or more master components. Hover your mouse over the asterisk to see which master components are on that host.

    2. Fine-tune your selections by using the checkboxes next to specific hosts.

    Note

As an option, you can start the HBase REST server manually after the install process is complete. It can be started on any host that has the HBase Master or the Region Server installed. If you attempt to start it on the same host as the Ambari server, however, you need to start it with the -p option, as its default port is 8080 and that conflicts with the Ambari Web default port.

/usr/lib/hbase/bin/hbase-daemon.sh start rest -p <custom_port>
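For example, with a hypothetical (freely chosen) port of 8085:

    /usr/lib/hbase/bin/hbase-daemon.sh start rest -p 8085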

    3. When you are satisfied with your assignments, click the Next button.

3.9. Customize Services

The Customize Services screen presents you with a set of tabs that let you manage configuration settings for Hadoop components. The wizard attempts to set reasonable


defaults for each of the options here, but you can use this set of tabs to tweak those settings, and you are strongly encouraged to do so, as your requirements may be slightly different. Pay particular attention to the directories suggested by the installer.

    Note

In HDFS Services Configs General, make sure to enter an integer value, in bytes, that sets the HDFS maximum edit log size for checkpointing. A typical value is 500000000.
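Assuming the underlying Hadoop 1.x property is fs.checkpoint.size (an assumption; the wizard shows the actual property name), the setting corresponds to an entry like this:

    <property>
      <name>fs.checkpoint.size</name>
      <!-- checkpoint trigger size in bytes; property name assumed -->
      <value>500000000</value>
    </property>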

Hover your mouse over each of the properties to see a brief description of what it does. The number of tabs you see is based on the type of installation you have decided to do. In a complete installation there are nine groups of configuration properties and other related options, such as database settings for Hive and Oozie, and the admin name/password and alert email for Nagios.

The install wizard sets reasonable defaults for all properties except for those related to databases in the Hive tab and the Oozie tab, and two related properties in the Nagios tab. These four are marked in red and are the only ones you must set yourself.

    Note

If you decide to use an existing database instance for Hive/HCatalog or for Oozie, you must have completed the preparations described in Using Non-Default Databases prior to running the install wizard.

    Click the name of the group in each tab to expand and collapse the display.

3.9.1. Service Users and Groups

The individual services in Hadoop are each run under the ownership of a corresponding Unix account. These accounts are known as service users. These service users belong to a special Unix group. In addition, there is a special service user for running smoke tests on components during installation and on demand using the Management Header in the Services View of the Ambari Web GUI. Any of these users and groups can be customized using the Misc tab of the Customize Services step.

If you choose to customize names, Ambari checks to see if these custom accounts already exist. If they do not exist, Ambari creates them. The default accounts are always created during installation whether or not custom accounts are specified. These default accounts are not used and can be removed post-install.

    Note

All new service user accounts, and any existing user accounts used as service users, must have a UID >= 1000.
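For example, you can check the UID of a candidate account with id (the account name here is illustrative):

    id -u hdfs    # must print a value >= 1000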

Table 3.1. Service Users

Service        | Component                              | Default User Account
HDFS           | NameNode, SecondaryNameNode, DataNode  | hdfs
MapReduce      | JobTracker, HistoryServer, TaskTracker | mapred
Hive           | Hive Metastore, HiveServer2            | hive
HCat           | HCatalog Server                        | hcat
WebHCat        | WebHCat Server                         | hcat
Oozie          | Oozie Server                           | oozie
HBase          | MasterServer, RegionServer             | hbase
ZooKeeper      | ZooKeeper                              | zookeeper
Ganglia        | Ganglia Server, Ganglia Collectors     | nobody
Nagios         | Nagios Server                          | nagios [a]
Smoke Test [b] | All                                    | ambari-qa

[a] If you plan to use an existing user account named "nagios", that "nagios" account must be in a group named "nagios". If you customize this account, that account will be created and put in a group "nagios".
[b] The Smoke Test user performs smoke tests against cluster services as part of the install process. It can also perform these on demand from the Ambari Web GUI.

Table 3.2. Service Group

Service | Components | Default Group Account
All     | All        | hadoop

3.9.2. Properties That Depend on Service Usernames/Groups

Some properties must be set to match specific service usernames or service groups. If you have set up non-default, customized service usernames for the HDFS or HBase service or the Hadoop group name, you must edit the following properties:

Table 3.3. HDFS Settings: Advanced

Property Name                    | Value
dfs.permissions.supergroup       | The same as the HDFS username. The default is "hdfs".
dfs.cluster.administrators       | A single space followed by the HDFS username.
dfs.block.local-path-access.user | The HBase username. The default is "hbase".

Table 3.4. MapReduce Settings: Advanced

Property Name                    | Value
mapreduce.tasktracker.group      | The Hadoop group name. The default is "hadoop".
mapreduce.cluster.administrators | A single space followed by the Hadoop group name.
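To make the leading-space requirement concrete, a customized HDFS user named hdfs1 (a hypothetical name) would be entered like this in the HDFS configuration:

    <property>
      <name>dfs.cluster.administrators</name>
      <!-- note the single leading space before the username -->
      <value> hdfs1</value>
    </property>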


3.9.3. Recommended Memory Configurations for the MapReduce Service

The following recommendations can help you determine appropriate memory configurations based on your usage scenario:

Make sure that there is enough memory for all the processes. Remember that system processes take around 10% of the available memory.

For co-deploying an HBase RegionServer and MapReduce service on the same node, reduce the RegionServer's heap size (use the HBase Settings: RegionServer: HBase Region Servers maximum Java heap size property to modify the RegionServer heap size).

For co-deploying an HBase RegionServer and the MapReduce service on the same node, or for memory-intensive MapReduce applications, modify the map and reduce slots as suggested in the following example:

EXAMPLE: For co-deploying an HBase RegionServer and the MapReduce service on a machine with 16 GB of available memory, the following would be a recommended configuration:

    2 GB: system processes

8 GB: MapReduce slots (6 Map + 2 Reduce slots, at 1 GB per task)

    4 GB: HBase RegionServer

    1 GB: TaskTracker

    1 GB: DataNode
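These allocations account for the full capacity: 2 + 8 + 4 + 1 + 1 = 16 GB.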

To change the number of Map and Reduce slots based on the memory requirements of your application, use the following properties:

    MapReduce Settings: TaskTracker: Number of Map slots per node

    MapReduce Settings: TaskTracker: Number of Reduce slots per node

3.10. Review

The assignments you have made are displayed. Check to make sure everything is correct. If you need to make changes, use the left navigation bar to return to the appropriate screen.

    To print your information for later reference, click Print.

When you are satisfied with your choices, click the Deploy button.

3.11. Install, Start and Test

The progress of the install is shown on the screen. Each component is installed and started, and a simple test is run on the component. You are given an overall status on the process in the progress bar at the top of the screen and a host-by-host status in the main section.


To see specific information on what tasks have been completed per host, click the link in the Message column for the appropriate host. In the Tasks pop-up, click the individual task to see the related log files. You can select filter conditions by using the Show dropdown list. To see a larger version of the log contents, click the Open icon; to copy the contents to the clipboard, use the Copy icon.

Depending on which components you are installing, the entire process may take 40 or more minutes. Please be patient.

    When Successfully installed and started the services appears, click Next.

3.12. Summary

The Summary page gives you a summary of the accomplished tasks. Click Complete. You are taken to the Ambari Web GUI.


4. Hadoop 2.x - Installing, Configuring, and Deploying the Cluster

This section describes using the Ambari install wizard in your browser to complete your installation, configuration, and deployment of Hadoop.

4.1. Log into Apache Ambari

Once you have started the Ambari service, you can access the Ambari Install Wizard through your browser.

    1. Point your browser to http://{your.ambari.server}:8080.

2. Log in to the Ambari Server using the default username/password: admin/admin. You can change this later to whatever you want.
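If the login page in step 1 does not load, you can quickly check from a shell whether the Ambari Server port is responding (the hostname is the placeholder from step 1):

    # Expect "200" if the Ambari web UI is up
    curl -s -o /dev/null -w "%{http_code}\n" http://your.ambari.server:8080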

4.2. Welcome

The first step creates the cluster name.

1. At the Welcome page, type a name for the cluster you want to create in the text box. No whitespace or special characters can be used in the name.

    2. Click the Next button.

4.3. Select Stack

The Service Stack (or simply the Stack) is a coordinated and tested set of Hadoop components. Use the radio button to select the Stack version you want to install. To install a Hadoop 2 stack, select HDP 2.0.6 in Stacks.

Under Advanced Repository Options, you can specify the Base URLs of your local repositories for each Operating System you plan to use in your cluster. You should have


configured the Base URLs for your local repositories in Optional: Configure Ambari for Local Repositories.

4.4. Install Options

In order to build up the cluster, the install wizard needs to know general information about how you want to set it up. You need to supply the FQDN of each of your hosts. The wizard also needs to access the private key file you created in Set Up Password-less SSH. It uses these to locate all the hosts in the system and to access and interact with them securely.

1. Use the Target Hosts text box to enter your list of host names, one per line. You can use ranges inside brackets to indicate larger sets of hosts. For example, for host01.domain through host10.domain, use host[01-10].domain.

    Note

    If you are deploying on EC2, use the internal Private DNS hostnames.

2. If you want to let Ambari automatically install the Ambari Agent on all your hosts using SSH, select Provide your SSH Private Key and either use the Choose File button in the Host Registration Information section to find the private key file that matches the public key you installed earlier on all your hosts, or cut and paste the key into the text box manually.

    Note

If you are using IE 9, the Choose File button may not appear. Use the text box to cut and paste your private key manually.

Fill in the username for the SSH key you have selected. If you do not want to use root, you must provide the username for an account that can execute sudo without entering a password.


3. If you do not want Ambari to automatically install the Ambari Agents, select Perform manual registration. See Appendix: Installing Ambari Agents Manually for more information.

    4. Click the Register and Confirm button to continue.

4.5. Confirm Hosts

This screen lets you confirm that Ambari has located the correct hosts for your cluster and to check those hosts to make sure they have the correct directories, packages, and processes to continue the install.

If any hosts were selected in error, you can remove them by selecting the appropriate checkboxes and clicking the grey Remove Selected button. To remove a single host, click the small white Remove button in the Action column.

At the bottom of the screen, you may notice a yellow box that indicates some warnings were encountered during the check process. For example, your host may have already had a copy of wget or curl. Click Click here to see the warnings to see a list of what was checked and what caused the warning. On the same page you can get access to a Python script that can help you clear any issues you may encounter, and you can then use Rerun Checks.

    Important

If you are deploying HDP using Ambari 1.4 or later on RHEL 6.5, you will likely see Ambari Agents fail to register with Ambari Server during the Confirm Hosts step in the Cluster Install wizard. Click the Failed link on the Wizard page to display the Agent logs. The following log entry indicates that the SSL connection between the Agent and Server failed during registration:

INFO 2014-04-02 04:25:22,669 NetUtil.py:55 - Failed to connect to https://<ambari-server>:8440/cert/ca due to [Errno 1] _ssl.c:492: error:100AE081:elliptic curve routines:EC_GROUP_new_by_curve_name:unknown group

    For more information about this issue, see the Ambari Troubleshooting Guide.

    When you are satisfied with the list of hosts, click Next.

4.6. Choose Services

Hortonworks Data Platform is made up of a number of services. You must, at minimum, install HDFS. You can decide which of the other services to install.

    1. Select all to preselect all items or minimum to preselect only HDFS.

2. Use the checkboxes to unselect (if you have used all) or select (if you have used minimum) to arrive at your desired list of components.

    Note

If you want to use Ambari for monitoring your cluster, make sure you select Nagios and Ganglia. If you do not select them, you get a warning popup


when you finish this section. If you are using other monitoring tools, you can ignore the warning.

    3. When you have made your selections, click Next.

4.7. Assign Masters

The Ambari install wizard attempts to assign the master nodes for the various services you have selected to appropriate hosts in your cluster. The right column shows the current service assignments by host, with each hostname and its number of CPU cores and amount of RAM indicated.

1. To change locations, click the dropdown list next to the service in the left column and select the appropriate host.

2. To remove a ZooKeeper instance, click the green minus icon next to the host address you want to remove.

    3. When you are satisfied with the assignments, click the Next button.

4.8. Assign Slaves and Clients

The Ambari install wizard attempts to assign the slave components (DataNodes, NodeManagers, and RegionServers) to appropriate hosts in your cluster. It also attempts to select hosts for installing the appropriate set of clients.

    1. Use all or none to select all of the hosts in the column or none of the hosts, respectively.

If a host has a red asterisk next to it, that host is also running one or more master components. Hover your mouse over the asterisk to see which master components are on that host.

    2. Fine-tune your selections by using the checkboxes next to specific hosts.

    Note

As an option, you can start the HBase REST server manually after the install process is complete. It can be started on any host that has the HBase Master or the Region Server installed. If you attempt to start it on the same host as the Ambari server, however, you need to start it with the -p option, as its default port is 8080 and that conflicts with the Ambari Web default port.

/usr/lib/hbase/bin/hbase-daemon.sh start rest -p <custom_port>

    3. When you are satisfied with your assignments, click the Next button.

4.9. Customize Services

The Customize Services screen presents you with a set of tabs that let you manage configuration settings for Hadoop components. The wizard attempts to set reasonable


defaults for each of the options here, but you can use this set of tabs to tweak those settings, and you are strongly encouraged to do so, as your requirements may be slightly different. Pay particular attention to the directories suggested by the installer.

    Note

In HDFS Services Configs General, make sure to enter an integer value, in bytes, that sets the HDFS maximum edit log size for checkpointing. A typical value is 500000000.

Hover over each of the properties to see a brief description of what it does. The number of tabs you see is based on the type of installation you have decided to do. In a complete installation there are ten groups of configuration properties and other related options, such as database settings for Hive/HCat and Oozie, and the admin name/password and alert email for Nagios.

The install wizard sets reasonable defaults for all properties except for those related to databases in the Hive and the Oozie tabs, and two related properties in the Nagios tab. These four are marked in red and are the only ones you must set yourself.

    Note

If you decide to use an existing database instance for Hive/HCatalog or for Oozie, you must have completed the preparations described in Using Non-Default Databases prior to running the install wizard.

    Click the name of the group in each tab to expand and collapse the display.

4.9.1. Service Users and Groups

The individual services in Hadoop are each run under the ownership of a corresponding Unix account. These accounts are known as service users. These service users belong to a special Unix group. In addition, there is a special service user for running smoke tests on components during installation and on demand using the Management Header in the Services View of the Ambari Web GUI. Any of these users and groups can be customized using the Misc tab of the Customize Services step.

If you choose to customize names, Ambari checks to see if these custom accounts already exist. If they do not exist, Ambari creates them. The default accounts are always created during installation whether or not custom accounts are specified. These default accounts are not used and can be removed post-install.

    Note

All new service user accounts, and any existing user accounts used as service users, must have a UID >= 1000.

Table 4.1. Service Users

Service        | Component                             | Default User Account
HDFS           | NameNode, SecondaryNameNode, DataNode | hdfs
MapReduce2     | HistoryServer                         | mapred
YARN           | NodeManager, ResourceManager          | yarn
Hive           | Hive Metastore, HiveServer2           | hive
HCat           | HCatalog Server                       | hcat
WebHCat        | WebHCat Server                        | hcat
Oozie          | Oozie Server                          | oozie
HBase          | MasterServer, RegionServer            | hbase
ZooKeeper      | ZooKeeper                             | zookeeper
Ganglia        | Ganglia Server, Ganglia Monitors      | nobody
Ganglia        | RRDTool (with Ganglia Server)         | rrdcached [a]
Ganglia        | Apache HTTP Server                    | apache [b]
PostgreSQL     | PostgreSQL (with Ambari Server)       | postgres [c]
Nagios         | Nagios Server                         | nagios [d]
Smoke Test [e] | All                                   | ambari-qa

[a] Created as part of installing RRDTool, which is used to store metrics data collected by Ganglia.
[b] Created as part of installing Apache HTTP Server, which is used to serve the Ganglia web UI.
[c] Created as part of installing the default PostgreSQL database with Ambari Server. If you are not using the Ambari PostgreSQL database, this user is not needed.
[d] If you plan to use an existing user account named nagios, that nagios account must either be in a group named nagios or you must customize the Nagios Group.
[e] The Smoke Test user performs smoke tests against cluster services as part of the install process. It can also perform these on demand from the Ambari Web GUI.

Table 4.2. Service Group

Service | Components                      | Default Group Account
All     | All                             | hadoop
Nagios  | Nagios Server                   | nagios
Ganglia | Ganglia Server, Ganglia Monitor | nobody

4.9.2. Properties That Depend on Service Usernames/Groups

Some properties must be set to match specific service usernames or service groups. If you have set up non-default, customized service usernames for the HDFS or HBase service or the Hadoop group name, you must edit the following properties:

Table 4.3. HDFS Settings: Advanced

Property Name                    | Value
dfs.permissions.superusergroup   | The same as the HDFS username. The default is "hdfs".
dfs.cluster.administrators       | A single space followed by the HDFS username.
dfs.block.local-path-access.user | The HBase username. The default is "hbase".

Table 4.4. MapReduce Settings: Advanced

Property Name                    | Value
mapreduce.cluster.administrators | A single space followed by the Hadoop group name.

4.10. Review

The assignments you have made are displayed. Check to make sure everything is correct. If you need to make changes, use the left navigation bar to return to the appropriate screen.

    To print your information for later reference, click Print.

When you are satisfied with your choices, click the Deploy button.

4.11. Install, Start and Test

The progress of the install is shown on the screen. Each component is installed and started, and a simple test is run on the component. You are given an overall status on the process in the progress bar at the top of the screen and a host-by-host status in the main section.

To see specific information on what tasks have been completed per host, click the link in the Message column for the appropriate host. In the Tasks pop-up, click the individual task to see the related log files. You can select filter conditions by using the Show dropdown list. To see a larger version of the log contents, click the Open icon; to copy the contents to the clipboard, use the Copy icon.

Depending on which components you are installing, the entire process may take 40 or more minutes. Please be patient.

    When Successfully installed and started the services appears, click Next.

4.12. Summary

The Summary page gives you a summary of the accomplished tasks. Click Complete. You are taken to the Ambari Web GUI.


5. NameNode High Availability

Use these instructions to set up NameNode HA using Ambari Web.

5.1. Setting Up NameNode High Availability

On Ambari Web, go to the Admin view. Select High Availability in the left navigation bar.

1. Check to make sure you have at least three hosts in your cluster and are running at least three ZooKeeper servers.

2. Click Enable NameNode HA and follow the Enable NameNode HA Wizard. The wizard describes a set of automated and manual steps you must take to set up NameNode high availability.

3. Get Started: This step gives you an overview of the process and allows you to select a Nameservice ID. You use this Nameservice ID instead of the NameNode FQDN once HA has been set up. Click Next to proceed.
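The wizard applies the HA configuration for you; purely for orientation (a sketch, assuming a Nameservice ID of mycluster), the ID ends up in hdfs-site.xml along these lines:

    <property>
      <name>dfs.nameservices</name>
      <!-- clients then address HDFS as hdfs://mycluster -->
      <value>mycluster</value>
    </property>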


4. Select Hosts: Select a host for the additional NameNode and the JournalNodes. The wizard suggests options that you can adjust using the dropdown lists. Click Next to proceed.

    5. Review: Confirm your host selections and click Next.


6. Create Checkpoints: Follow the instructions in the step. You need to log in to your current NameNode host to run the commands to put your NameNode into safe mode and create a checkpoint. When Ambari detects success, the message on the bottom of the window changes. Click Next.
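The wizard displays the exact commands to run; they typically take the following form, executed as the HDFS service user (the default hdfs account is assumed here):

    sudo su -l hdfs -c 'hdfs dfsadmin -safemode enter'
    sudo su -l hdfs -c 'hdfs dfsadmin -saveNamespace'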

7. Configure Components: The wizard configures your components, displaying progress bars to let you track the steps. Click Next to continue.


8. Initialize JournalNodes: Follow the instructions in the step. You need to log in to your current NameNode host to run the command to initialize the JournalNodes. When Ambari detects success, the message on the bottom of the window changes. Click Next.
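Again, follow the wizard's exact text; as an assumption for orientation, the JournalNode initialization command is typically of this form:

    sudo su -l hdfs -c 'hdfs namenode -initializeSharedEdits'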

9. Start Components: The wizard starts the ZooKeeper servers and the NameNode, displaying progress bars to let you track the steps. Click Next to continue.

10. Initialize Metadata: Follow the instructions in the step. For this step you must log in to both the current NameNode and the additional NameNode. Make sure you are logged in to the correct host for each command. Click Next when you have completed the two commands. A Confirmation popup appears to remind you that you must do both steps. Click OK to confirm.
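The two commands the wizard shows are typically of the following form (an assumption; use the wizard's exact text). The first runs on the current NameNode, the second on the additional NameNode:

    # On the current NameNode host:
    sudo su -l hdfs -c 'hdfs zkfc -formatZK'
    # On the additional NameNode host:
    sudo su -l hdfs -c 'hdfs namenode -bootstrapStandby'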


11. Finalize HA Setup: The wizard finalizes the HA setup, displaying progress bars to let you track the steps. Click Done to finish the wizard. After the Ambari Web GUI reloads, you may see some alert notifications. Wait a few minutes until the services come back up. If necessary, restart any components using Ambari Web.

    Note

Choose Services, then start Nagios, after completing all steps in the HA wizard.

12. If you are using Hive, you must manually change the Hive Metastore FS root to point to the Nameservice URI instead of the NameNode URI. You created the Nameservice ID in the Get Started step.

    a. Check the current FS root. On the Hive host:

hive --config /etc/hive/conf.server --service metatool -listFSRoot

    The output might look like this:

Listing FS Roots..
hdfs://<namenode-host>:8020/apps/hive/warehouse

    b. Use this command to change the FS root:

$ hive --config /etc/hive/conf.server --service metatool -updateLocation <new-location> <old-location>

  • Hortonworks Data Platform Apr 1, 2014

    44

For example, where the Nameservice ID is mycluster:

$ hive --config /etc/hive/conf.server --service metatool -updateLocation hdfs://mycluster:8020/apps/hive/warehouse hdfs://c6401.ambari.apache.org:8020/apps/hive/warehouse

    The output might look like this:

Successfully updated the following locations..
Updated X records in SDS table

13. If you are using Oozie, you must use the Nameservice URI instead of the NameNode URI in your workflow files. For example, where the Nameservice ID is mycluster:

<job-tracker>${jobTracker}</job-tracker>
<name-node>hdfs://mycluster</name-node>

14. If you are using Hue, to enable NameNode High Availability you must use httpfs instead of webhdfs to communicate with the NameNodes inside the cluster. After successfully setting up NameNode High Availability:

    a. Install an httpfs server on any node in the cluster:

    yum install hadoop-httpfs

    b. Ensure that Hue hosts and groups use the httpfs server.

For example, on the httpfs server host, add the following lines to httpfs-site.xml:

<property>
  <name>httpfs.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>httpfs.proxyuser.hue.groups</name>
  <value>*</value>
</property>

c. Ensure that groups and hosts in the cluster use the httpfs server. For example, using Ambari, in Services > HDFS > Configs, add the following properties and values to core-site.xml:

Table 5.1. Core-site.xml properties and values for NameNode HA on a cluster using Hue

Property                       | Value
hadoop.proxyuser.httpfs.groups | *
hadoop.proxyuser.httpfs.hosts  | *
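In core-site.xml itself, these two entries take the usual property form:

    <property>
      <name>hadoop.proxyuser.httpfs.groups</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.httpfs.hosts</name>
      <value>*</value>
    </property>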

d. Using Ambari, in Services > HDFS, restart the HDFS service in your cluster.

e. On the Hue host, configure Hue to use the httpfs server by editing hue.ini to include the following line:

    webhdfs_url=http://{fqdn of httpfs server}:14000/webhdfs/v1/

    f. Restart the Hue service.


5.2. Rolling Back NameNode HA

To roll back NameNode HA to the previous non-HA state, use the following step-by-step manual process. Some of the steps are optional depending on your installation.

    1. Stop HBase

    2. Checkpoint the Active NameNode

    3. Stop All Services

    4. Prepare the Ambari Server Host for Rollback

    5. Restore the HBase Configuration

    6. Delete ZK Failover Controllers

    7. Modify HDFS Configurations

    8. Recreate the Secondary NameNode

    9. Re-enable the Secondary NameNode

10. Delete All JournalNodes

11. Delete the Additional NameNode

12. Verify the HDFS Components

13. Start HDFS

5.2.1. Stop HBase

1. From Ambari Web, go to the Services view and select HBase.

    2. Click Stop on the Management Header.

    3. Wait until HBase has stopped completely before continuing.

5.2.2. Checkpoint the Active NameNode

If HDFS has been in use after you enabled NameNode HA, but you wish to revert to a non-HA state, you must checkpoint the HDFS state before proceeding with the rollback.

If the Enable NameNode HA wizard failed and you need to revert back, you can skip this step and move on to Stop All Services.

    If Kerberos security has not been enabled on the cluster:

On the Active NameNode host, execute the following commands to save the namespace. You must be the HDFS service user ($HDFS_USER) to do this:

sudo su -l $HDFS_USER -c 'hdfs dfsadmin -safemode enter'
sudo su -l $HDFS_USER -c 'hdfs dfsadmin -saveNamespace'


    If Kerberos security has been enabled on the cluster:

sudo su -l $HDFS_USER -c 'kinit -kt /etc/security/keytabs/nn.service.keytab nn/$HOSTNAME@$REALM; hdfs dfsadmin -safemode enter'
sudo su -l $HDFS_USER -c 'kinit -kt /etc/security/keytabs/nn.service.keytab nn/$HOSTNAME@$REALM; hdfs dfsadmin -saveNamespace'

Where $HDFS_USER is the HDFS service user, $HOSTNAME is the Active NameNode hostname, and $REALM is your Kerberos realm.

5.2.3. Stop All Services

Use the Services view in Ambari Web and click Stop All in the Services navigation panel. You must wait until all the services are completely stopped.

5.2.4. Prepare the Ambari Server Host for Rollback

Log in to the Ambari Server host and set the following environment variables to prepare for the rollback procedure:

Table 5.2. Set Environment Variables

Variable                             | Value
export AMBARI_USER=AMBARI_USERNAME   | Substitute the value of the administrative user for Ambari Web. The default value is admin.
export AMBARI_PW=AMBARI_PASSWORD     | Substitute the value of the administrative password for Ambari Web. The default value is admin.
export AMBARI_PORT=AMBARI_PORT       | Substitute the Ambari Web port. The default value is 8080.
export AMBARI_PROTO=AMBARI_PROTOCOL  | Substitute the value of the protocol for connecting to Ambari Web. Options are http or https. The default value is http.
export CLUSTER_NAME=CLUSTER_NAME     | Substitute the name of your cluster, set during the Ambari Install Wizard process. For example: mycluster.
export NAMENODE_HOSTNAME=NN_HOSTNAME | Substitute the FQDN of the host for the non-HA NameNode. For example: nn01.mycompany.com.