
IBM Tivoli Netcool Performance Manager Big Data Extension 1.4.3
Document Revision R2E1

Installing Big Data Extension

IBM

Note
Before using this information and the product it supports, read the information in “Notices”.

This edition applies to version 1.4.3 of Tivoli Netcool Performance Manager Big Data Extension and to all subsequent releases and modifications until otherwise indicated in new editions.

© Copyright IBM Corporation 2017.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

About this information
    Intended audience
    Tivoli Netcool Performance Manager Big Data Extension architecture
    Service Management Connect
    Tivoli Netcool Performance Manager technical training
    Support information

Chapter 1. Getting started

Chapter 2. Planning for Big Data Extension installation
    System requirements
        Hardware requirements
        Software requirements
        Disk space partitioning for Big Data Extension and related directories
    Platform support
        Suggested node and services layout
    Opening the default ports for a typical installation
    Gathering the required information
    Installing prerequisite software

Chapter 3. Preparing your environment
    Downloading Big Data Extension installation media
    Extracting the Big Data Extension software
    Downloading the IBM Open Platform with Apache Spark and Apache Hadoop
    Preparing to run the prerequisite scanner
    Running the prerequisite scanner
    Setting SSH passwordless login from Ambari server to cluster nodes

Chapter 4. Installing Big Data Extension
    Running the Big Data Extension installation packages

Chapter 5. Setting up Big Data Extension clusters and services
    Setting up YARN Service
    Setting up HDFS Service
    Setting up Big Data Extension services
    Setting up communication with Tivoli Netcool Performance Manager system

Chapter 6. Postinstallation tasks
    Verifying the installation
    Installation directory structure

Chapter 7. Uninstalling Big Data Extension
    Listing the working directories
    Uninstalling Ambari agent nodes and Ambari server by using scripts

Chapter 8. Troubleshooting installation

Notices
    Trademarks
    Terms and conditions for product documentation


About this information

This information provides instructions and general information on how to install IBM Tivoli® Netcool® Performance Manager Big Data Extension software and use the performance metric data that is collected by the Tivoli Netcool Performance Manager Wireline component.

Intended audience

The intended audiences for this information are as follows:
• Managers and others who track wireline network performance metrics and want to take advantage of what Big Data Extension offers on the existing Tivoli Netcool Performance Manager system.
• Technicians and engineers who use the Tivoli Netcool Performance Manager - Wireline software to manage and analyze network performance.

Required skills and knowledge

Readers need to be familiar with the following:
• Hadoop-based Big Data architecture
• IBM BigInsights 4.2
• Tivoli Netcool Performance Manager - Wireline component

Tivoli Netcool Performance Manager Big Data Extension architecture

Big Data Extension is a Hadoop-based, simplified microservice orchestration, configuration, and management system that is supported on Apache Ambari. It gives the existing Wireline component additional capability in providing comprehensive, flexible, and scalable performance data management for complex, multi-vendor, multi-technology networks.

The following diagram shows how data flows through the various components and services in Big Data Extension:


IBM® Open Platform with Apache Spark and Apache Hadoop

IBM BigInsights together with IBM Open Platform with Apache Spark and Apache Hadoop (IOP) provide a software platform for discovering, analyzing, and visualizing data from disparate sources. You can use this software to help process and analyze the volume, variety, and velocity of data that continually enters your organization every day. Big Data Extension is a service extension that can be installed on the IBM Open Platform with Apache Spark and Apache Hadoop stack.

Big Data Extension uses the following features of IOP:
• 100% open source Hadoop through IBM Open Platform with Apache Spark and Apache Hadoop
• Default support for rolling upgrades for individual Hadoop services
• Support for long-running applications within YARN for enhanced reliability
• Integration with Apache Spark for extra processing power and a dramatic performance increase
• The Apache Ambari operational framework. Apache Ambari is an open framework for provisioning, managing, and monitoring Apache Hadoop clusters. Ambari provides an intuitive and easy-to-use Hadoop management web UI that is supported by its collection of tools and APIs to simplify the operation of Hadoop clusters.
• The following open source technologies for working with Big Data Extension:
  – Ambari
  – HDFS
  – Kafka
  – Spark
  – MapReduce
  – YARN
  – ZooKeeper

Big Data Extension services

Big Data Extension components run on a microservice architecture, which structures the software application as a suite of small, independently deployable, modular services. Each service runs a unique process and communicates through a well-defined, lightweight mechanism. In this case, the Kafka message bus is used for communication.

Big Data Extension services:
• Foundation services
  – Manager
  – Storage
  – UI
• Entity Metric services
  – Tivoli Netcool Performance Manager Collector

For more information on these services, see Tivoli Netcool Performance Manager - Big Data Extension Overview.

Related information:

• IBM Tivoli Netcool Performance Manager on IBM Knowledge Center: http://www.ibm.com/support/knowledgecenter/SSBNJ7_1.4.3/tnpm_kc_welcome.html
• Introduction to IOP and BigInsights 4.2: https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/c0057605.html
• HDFS Architecture: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#HDFS_Architecture

Documentation for the open source technologies that are listed earlier in this section:
• Ambari: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/ambari.html
• HDFS: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_hadoop.html
• Kafka: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_kafka.html
• Spark: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_spark.html
• MapReduce: https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/c0057842.html
• YARN: https://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/yarn.html
• ZooKeeper: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/bi_zookeeper.html

Service Management Connect

Connect, learn, and share with Service Management professionals: product support technical experts who provide their perspectives and expertise.

Access the Network and Service Assurance community at https://www.ibm.com/developerworks/servicemanagement/nsa/index.html. Use Service Management Connect in the following ways:
• Become involved with transparent development, an ongoing, open engagement between other users and IBM developers of Tivoli products. You can access early designs, sprint demonstrations, product roadmaps, and prerelease code.
• Connect one-on-one with the experts to collaborate and network about Tivoli and the Network and Service Assurance community.
• Read blogs to benefit from the expertise and experience of others.
• Use wikis and forums to collaborate with the broader user community.

Related information:
• Tivoli Netcool Performance Manager 1.4 community on developerWorks: https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=d4b565fd-04e5-47c2-965a-943ccdfabdb9

Tivoli Netcool Performance Manager technical training

For Tivoli Netcool Performance Manager technical training information, see the Tivoli Netcool Performance Manager Training website at https://tnpmsupport.persistentsys.com/training.

Support information

If you have a problem with your IBM software, you want to resolve it quickly. IBM provides the following ways for you to obtain the support you need:

Online
    Access the IBM Software Support site at http://www.ibm.com/software/support/probsub.html.

IBM Support Assistant
    The IBM Support Assistant is a free local software serviceability workbench that helps you resolve questions and problems with IBM software products. The Support Assistant provides quick access to support-related information and serviceability tools for problem determination. To install the Support Assistant software, go to http://www.ibm.com/software/support/isa.

Troubleshooting Guide
    For more information about resolving problems, see the problem determination information for this product.

Chapter 1. Getting started

A bird's-eye view of the tasks that are needed for installing, configuring, and integrating Big Data Extension with the Tivoli Netcool Performance Manager Core component.

Use this image to understand the following tasks:
• Set up the environment
• Install Big Data Extension
• Set up Big Data Extension cluster and services

Chapter 2. Planning for Big Data Extension installation

Before you install the product, read the hardware and software requirements.

System requirements

The complete set of requirements for Tivoli Netcool Performance Manager Big Data Extension.

This section lists the configurations and the supported platforms and components of Big Data Extension.

For requirements of other integrated products, see the related product documentation.

Hardware requirements

Hardware specifications vary according to the size of your network and the server topology that you want to use.

Big Data Extension has the following minimum requirements, which are based on the default functions in a Linux environment in a stand-alone mode of deployment:

Feature: Records per second
Value: 200,000,000 records per hour for Tivoli Netcool Performance Manager
Hardware specifications:
    Big Data Extension Ambari server specification
        CPU:        8 Core CPU (Intel [email protected] GHz)
        Memory:     32 GB RAM
        Hard disk:  100 GB
    Big Data Extension Agent node specification
        CPU:        16 Core CPU (Intel Xeon [email protected] GHz)
        Memory:     64 GB RAM
        Hard disk:  12 TB

Feature: Data retention time
Value:
    • Logs = 10 days
    • Entity metrics RAW data = 10 days
    • Entity metrics 30 minutes aggregated = 30 days
    • Entity metrics 6 hours aggregated = 30 days
    • Entity metrics daily aggregated data = 30 days


Software requirements

The supported operating systems, modules, and third-party applications for Big Data Extension.

Table 1. Supported Operating System

Operating system                     Version
Red Hat Enterprise Linux (RHEL)      7.2

Note: Big Data Extension and its related services are supported on the RHEL operating system only.

Table 2. Supported web browsers

Web browsers                         Version
Internet Explorer                    10, 11
Mozilla Firefox ESR                  38, 45

Note: Enable JavaScript and cookies.

Table 3. Prerequisite software

Software                                   Version
IBM Tivoli Netcool Performance Manager     1.4.3

Table 4. Bundled software

Product                                      Version
IBM Front End Toolkit                        1.5.x
IBM SDK, Java™ Technology Edition 64-bit     8.0.2.10 (Version 8, Service Refresh 2 Fix Pack 10)

Table 5. Supported hypervisors

Hypervisors                              Version
Red Hat Enterprise Linux with KVM        RHEL 7.2
VMware ESXi                              5.0, 5.1

Related information:
• System requirements for BigInsights: http://www-01.ibm.com/support/docview.wss?uid=swg27027565
• IBM Tivoli Netcool Performance Manager - Requirements: https://www.ibm.com/support/knowledgecenter//SSBNJ7_1.4.3/config_recommendations/ctnpm_requirements.html

Disk space partitioning for Big Data Extension and related directories

Based on the hardware specification for the Big Data Extension agent node, refer to the following suggestion on disk space partitioning.

Recommended directory structure for Big Data Extension agent nodes in your cluster:

Directory    Disk space
/home        150 GB
/var         150 GB
/opt         150 GB
/data1       10 TB (Note: Kafka logs might use between 5 and 7 TB of disk space from this partition.)
/            Remaining

Note: You might need to partition the directories differently depending on your environment, the size of your network, and the amount of data you plan to store. If your environment has a different hardware specification, contact IBM Support for more information.

Related concepts:
• “Hardware requirements”: Hardware specifications vary according to the size of your network and the server topology that you want to use.
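The suggested sizes can be compared against an existing agent node with a few lines of shell. This is only a sketch: the function name check_partition_sizes is hypothetical, 10 TB is assumed to mean 10240 GB, and the df options in the usage comment are those of GNU coreutils.

```shell
# Read "mount size_in_GB" pairs on standard input and report every mount
# point that is smaller than the suggested minimum from the table above.
# Assumption: 10 TB is treated as 10240 GB.
check_partition_sizes() {
    awk '
        BEGIN {
            min["/home"] = 150; min["/var"] = 150
            min["/opt"] = 150;  min["/data1"] = 10240
        }
        # "$2 + 0" converts values such as "150G" to the number 150.
        ($1 in min) && ($2 + 0 < min[$1]) {
            printf "%s: %d GB is below the suggested %d GB\n", $1, $2, min[$1]
        }'
}

# Example on a live host (GNU df):
#   df -BG --output=target,size | check_partition_sizes
```

No output means that every listed mount point meets the suggested size. Mount points that are not in the table, including /, are ignored.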

Platform support

All Big Data Extension services and components must be installed on Red Hat Linux, Version 7.2 only.

Co-location rules

Although it is possible to deploy Big Data Extension and all its associated components on a single instance for evaluation purposes, you typically need at least three hosts: one master Ambari server and two Ambari agent slaves for the Tivoli Netcool Performance Manager cluster.

Suggested node and services layout

Use this reference architecture to understand how to configure your IBM Open Platform with Apache Spark and Apache Hadoop and Tivoli Netcool Performance Manager Big Data Extension services in your cluster.

During the cluster deployment, the Big Data Extension service layer and application binaries are deployed to the Ambari agent hosts. The Big Data Extension services are installed in the default location, /opt/IBM/tnpm-bde, and the IBM Open Platform with Apache Spark and Apache Hadoop components are installed in the /usr/iop/current directory.

Multi-node cluster deployment

It is suggested that you have at least one Ambari server node, with the remaining nodes as Ambari agent nodes. In the diagram, HOST A is the Ambari server and HOSTs B, C, and D are the Ambari agent nodes.

Note: Make sure that you install the Manager Service and Kafka Broker on all Ambari agent nodes.

Note: Because ZooKeeper requires a majority, it is best to use an odd number of machines. For example, with four machines ZooKeeper can handle the failure of a single machine; if two machines fail, the remaining two machines do not constitute a majority. However, with five machines ZooKeeper can handle the failure of two machines.

Related information:
• Suggested services layout for IBM Open Platform with Apache Spark and Apache Hadoop and BigInsights value-added services: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/service_layout.html
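The ZooKeeper majority rule in the note above reduces to simple arithmetic, sketched here in shell. The function name zk_tolerated_failures is illustrative only and is not part of any product.

```shell
# Number of ZooKeeper machines that can fail while a majority survives.
# A majority of n machines is floor(n/2) + 1, so n - (floor(n/2) + 1)
# machines can be lost, which equals floor((n - 1) / 2).
zk_tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5; do
    echo "$n machines: tolerates $(zk_tolerated_failures "$n") failure(s)"
done
# 3 machines: tolerates 1 failure(s)
# 4 machines: tolerates 1 failure(s)
# 5 machines: tolerates 2 failure(s)
```

Four machines tolerate no more failures than three, which is why an odd number of machines is the better choice.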

Cluster behavior

Describes how Big Data Extension and its related services map to node behavior in a cluster.

Big Data Extension supports the following types of node behavior.

Cluster Singleton
    A clustered singleton service (also known as an HA singleton) is a service that is deployed on multiple nodes in a cluster but provides its service on only one of the nodes. The node that is running the singleton service is typically called the oldest node.

Load Balancing
    Load balancing improves the distribution of workloads across multiple nodes, where each node serves a different, mutually exclusive set of clients.

Managed Load Balancing
    In managed load balancing, a manager node monitors the load balancing activities and distributes the load among the active nodes.

Data Replication
    A replication strategy determines the nodes where data replicas are placed. Replicas are stored on multiple nodes to ensure reliability and fault tolerance. Data Replication requires at least two nodes that are configured for the supported services.

Monitoring Nodes
    A service that is installed on each node in a cluster to monitor and provide information on the installed nodes.

Single Instance
    A service that is installed on a single node in a cluster and that provides its service across all nodes.

The following table lists the service components and their node behavior. Use the following information as guidance to set up your environment.

Table 6. Cluster node behavior

Services             Type     Service Components                             Cluster node behavior
Big Data Extension   Master   Manager                                        Monitoring Nodes
                     Slave    Entity Analytics                               Cluster Singleton
                     Slave    Tivoli Netcool Performance Manager Collector   Cluster Singleton
                     Slave    Storage                                        Cluster Singleton
                     Slave    UI                                             Load Balancing (see the note after this table)
HDFS                 Master   NameNode                                       Single Instance
                     Master   SNameNode                                      Single Instance
                     Slave    DataNode                                       Data Replication
YARN                 Master   Timeline Server                                Single Instance
                     Master   Resource Manager                               Single Instance
                     Slave    Node Manager                                   Managed Load Balancing
ZooKeeper            Master   ZooKeeper                                      Data Replication
Ambari Metrics       Master   Collector                                      Single Instance
                     Slave    Monitor                                        Monitoring Nodes
Kafka                Master   Kafka Broker                                   Data Replication
                     Master   Kafka Connect                                  Single Instance
MapReduce2           Master   History Server                                 Single Instance

Note: For more information about the UI service, see UI Service in Tivoli Netcool Performance Manager - Big Data Extension Overview.


Opening the default ports for a typical installation

Before you install IBM Open Platform with Apache Spark and Apache Hadoop software, open these ports to avoid any conflicts that might exist in your system.

Table 7. Default port numbers for IOP and Big Data Extension services

Service                                        User        Protocol   Port number
Ambari Metrics                                 ams         tcp        60200, 6188
Ambari Metrics                                 ams         tcp6       37266, 45884, 61181, 61310, 41824
Ambari Server                                  root        tcp6       8670, 8080, 8440, 8441
HDFS                                           hdfs        tcp        50090, 8010, 8020, 50070, 58042, 50010, 50075
Entity Analytics                               netcool     tcp6       2561, 21081, 21443
KAFKA                                          kafka       tcp6       56969, 6667, 39122, 8083
KAFKA Schema Registry                          kafka       tcp6       8093
Manager                                        netcool     tcp6       2564, 20081, 20443
MapReduce                                      mapred      tcp        10020, 19888, 10033
Storage                                        netcool     tcp6       2553, 13081, 13443
SFTP                                           root        tcp6       22
Tivoli Netcool Performance Manager Collector   netcool     tcp6       2564, 24081, 24443
UI                                             netcool     tcp6       2552, 8081, 9443
YUM Repository                                 root        tcp6       9091
YARN                                           yarn        tcp6       7337, 8025, 8030, 8040, 8042, 8050, 8088, 8141, 8188, 10200, 13562, 45454
YARN (for Spark Executors)                     yarn        tcp6       46100 - 46600, 47100 - 47600
ZooKeeper                                      zookeeper   tcp6       2182, 2888, 3888
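One way to look for port conflicts before installation is to check whether anything is already listening on a port from Table 7. The following sketch filters the output of ss -ltn; the function name ports_in_use is hypothetical, and the column layout assumed is the usual one in which the local address and port form the fourth column.

```shell
# Print each port from the argument list that appears as a listening
# socket in `ss -ltn` style output read from standard input.
ports_in_use() {
    awk -v wanted="$*" '
        BEGIN {
            n = split(wanted, w, " ")
            for (i = 1; i <= n; i++) want[w[i]] = 1
        }
        $1 == "LISTEN" {
            # Column 4 is the local address, e.g. 0.0.0.0:8080 or [::]:8080.
            port = $4
            sub(/.*:/, "", port)
            if (port in want) print port
        }' | sort -un
}

# Example on a live host: check the Ambari Server ports from Table 7.
#   ss -ltn | ports_in_use 8670 8080 8440 8441
```

Any port that is printed is already in use on that host and is worth investigating before you continue.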


Related information:
• IBM BigInsights - Get ready to install: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/inst_iop_doPrepare.html

Gathering the required information

Collect the following information before you start your installations.
• The fully qualified domain name (FQDN) for each host in your system, and the components that you want to set up on different hosts. The Ambari installation wizard does not support IP addresses. Use hostname -f to check for the FQDN.
• The base directories for the following components:
  – NameNode data
  – DataNodes data
  – MapReduce data
  – ZooKeeper data
  – Various log, pid, and database files according to your installation type
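Because the Ambari installation wizard needs an FQDN rather than an IP address, a quick sanity check of the hostname -f output can save time. The following is a minimal sketch; is_lowercase_fqdn is a hypothetical helper, and the pattern is a deliberately simple approximation that also enforces lowercase host names.

```shell
# Succeeds if the name looks like a lowercase fully qualified domain name:
# two or more dot-separated labels of lowercase letters, digits, and
# hyphens, with a last label that starts with a letter (rejects IPv4
# addresses such as 123.123.123.123).
is_lowercase_fqdn() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z][a-z0-9-]*$'
}

fqdn=$(hostname -f 2>/dev/null || true)
if is_lowercase_fqdn "$fqdn"; then
    echo "FQDN looks usable: $fqdn"
else
    echo "Check the host name: '$fqdn'"
fi
```

Run the check on every host that will join the cluster and record the names that it reports.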

Users and groups for Big Data Extension
• root
• netcool

The netcool user is created during Big Data Extension installation, and all Big Data Extension services run as the netcool user.

Users and groups for IBM Open Platform with Apache Spark and Apache Hadoop

Service          Group     User
HDFS             hadoop    hdfs
MapReduce        hadoop    mapred
YARN             hadoop    yarn
Ambari Metrics   hadoop    ams
Kafka            hadoop    kafka
Spark            hadoop    spark
ZooKeeper        hadoop    zookeeper

Note: You can see this information on the installed Ambari server from Admin > Service Accounts.
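The same accounts can also be checked from the command line. This sketch assumes that the check is run after installation, because the netcool user is created by the Big Data Extension installation; the function name missing_users is illustrative only.

```shell
# Print each named account that does not exist on this host.
missing_users() {
    for u in "$@"; do
        id -u "$u" >/dev/null 2>&1 || echo "$u"
    done
}

# Example on a cluster node after installation:
#   missing_users netcool hdfs mapred yarn ams kafka spark zookeeper
```

No output means that every listed account exists on the host.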

Installing prerequisite software

Install Tivoli Netcool Performance Manager - Wireline Component and the required technology packs before you install Big Data Extension.

Related information:
• Installing Tivoli Netcool Performance Manager - Wireline Component: https://www.ibm.com/support/knowledgecenter/SSBNJ7_1.4.3/install/ctnpm_install_guide.html

Chapter 3. Preparing your environment

Before you run the installation, you must prepare your target environments.

Before you begin

Before you begin the Big Data Extension installation, install Tivoli Netcool Performance Manager.

Downloading Big Data Extension installation media

How to get the product distribution.

Licensed customers can download the Tivoli Netcool Performance Manager Wireline Big Data Extension electronic image from the IBM Passport Advantage website (http://www-01.ibm.com/software/passportadvantage/).

Related information:
• Download IBM Tivoli Netcool Performance Manager Version 1.4.3: http://www.ibm.com/support/docview.wss?uid=swg24043397

Extracting the Big Data Extension software

Extract the Tivoli Netcool Performance Manager Big Data Extension distribution.

Procedure
1. Create a directory to hold the contents of your Tivoli Netcool Performance Manager Big Data Extension distribution.
   For example: /tmp/INSTALLERS
   Note: Any further references to this directory within the installation are made by using the token <DIST_DIR>.
2. Extract the CNIY0EN.tgz file to <DIST_DIR> by using the following command:
       tar -zxvf <DIST_DIR>/CNIY0EN.tgz -C <DIST_DIR>
   You can find the following package files:
   • tnpm_bde-basecamp-repo-1.4.3.0-<timestamp>.noarch.rpm
   • tnpm_bde-tnpm-bde-repo-1.4.3.0-<timestamp>.noarch.rpm
   • tnpm_bde-basecamp-httpd-1.4.3.0-<timestamp>.noarch.rpm
   • tnpm_bde-tnpm-bde-ambari-1.4.3.0-<timestamp>.noarch.rpm
   • tnpm-bde-installer-tools-1.4.3.0-<timestamp>.noarch.rpm
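After extraction, it can be worth confirming that all five package files are present. The following sketch does that by file name prefix; missing_packages is a hypothetical helper, and the prefixes are copied from the package list in the extraction procedure.

```shell
# Print each expected package prefix for which no matching .rpm file
# exists in the given directory.
missing_packages() {
    dir="$1"
    for prefix in \
        tnpm_bde-basecamp-repo-1.4.3.0- \
        tnpm_bde-tnpm-bde-repo-1.4.3.0- \
        tnpm_bde-basecamp-httpd-1.4.3.0- \
        tnpm_bde-tnpm-bde-ambari-1.4.3.0- \
        tnpm-bde-installer-tools-1.4.3.0-
    do
        # ls fails when the pattern matches no file.
        ls "$dir"/"$prefix"*.noarch.rpm >/dev/null 2>&1 || echo "$prefix"
    done
}

# Example: missing_packages /tmp/INSTALLERS
```

No output means that a file was found for every prefix. The timestamp part of each file name is matched by the wildcard, so it does not need to be known in advance.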

Downloading the IBM Open Platform with Apache Spark and Apache Hadoop

Download the IBM Open Platform with Apache Spark and Apache Hadoop components.

About this task

Download the following packages to <DIST_DIR>:
• ambari-2.2.0.el7.x86_64.tar.gz
• iop-4.2.0.0.el7.x86_64.tar.gz
• iop-utils-1.2.0.0.el7.x86_64.tar.gz

Procedure

Download the following packages:
• Ambari: http://ibm-open-platform.ibm.com/repos/Ambari/rhel/7/x86_64/2.2.x/GA/2.2.0/ambari-2.2.0.el7.x86_64.tar.gz
• IOP: http://ibm-open-platform.ibm.com/repos/IOP/rhel/7/x86_64/4.2.x/GA/4.2.0.0/iop-4.2.0.0.el7.x86_64.tar.gz
• IOP-UTILS: http://ibm-open-platform.ibm.com/repos/IOP-UTILS/rhel/7/x86_64/1.2/iop-utils-1.2.0.0.el7.x86_64.tar.gz

Preparing to run the prerequisite scanner

In addition to Big Data Extension-specific tasks, complete these common tasks before you start an installation.

Before you begin

The Ambari installer pulls some packages from the base operating system repositories. For example, for RHEL systems, make sure that the Red Hat Linux and Red Hat Optional repository channels are configured and set up before installation.

About this task

Use the root user account to perform the following steps.

Procedure
1. Ensure that adequate disk space exists for the root partition.
   You can find the Ambari service directories in /usr/iop/current. The logs for each service can be found in /var/log/<service>.
   You need enough space for these directories and users. A minimum of 80 GB of disk space is required.
2. Edit the /etc/hosts file to include the IP address, fully qualified domain name, and short name of each host in your cluster, separated by spaces.
   Ensure that all characters in host names are lowercase. The format is IP_address domain_name short_name.
   In the following example, assume that node1 is the host that is used for the Ambari setup and the Ambari server:
       127.0.0.1 localhost.localdomain localhost
       123.123.123.123 node1.abc.com node1
       123.123.123.124 node2.abc.com node2
       123.123.123.125 node3.abc.com node3

3. Disable firewalls and IPv6.
   a. Run the following commands in succession to disable the firewall on all nodes in your cluster.
      For RHEL 7.x:
          systemctl stop firewalld.service
          systemctl disable firewalld.service
      Note: Ensure that you reenable the firewall on all nodes in your cluster after installation.
   b. For Linux x86_64 systems only, disable Transparent Huge Pages on each client node in your cluster. Run the following command on each Ambari client node:
          echo never > /sys/kernel/mm/transparent_hugepage/enabled
   c. On all servers in your cluster, disable IPv6 on all interfaces.
      From the command line, enter ifconfig to check whether IPv6 is running. In the output, an entry for inet6 indicates that IPv6 is running.
      Run the following commands. Each command echoes the value that it sets:
          sysctl -w net.ipv6.conf.all.disable_ipv6=1
          net.ipv6.conf.all.disable_ipv6 = 1
          sysctl -w net.ipv6.conf.default.disable_ipv6=1
          net.ipv6.conf.default.disable_ipv6 = 1
4. Ensure that your environment does not include any existing Ambari installation files by running a search for the string ambari.
   The following command returns nothing if no Ambari installation files exist:
       yum list installed | grep -i ambari
5. Ensure that the ulimit properties for your operating system are configured in the /etc/security/limits.conf file as follows:
       * - nofile 65536
       * - nproc 65536
6. Enable the Network Time Protocol (NTP/NTPD) service on the management node and allow the clients to synchronize with the master node.
   The IBM Open Platform with Apache Spark and Apache Hadoop installation program synchronizes the other server clocks with the master server during installation.
7. Ensure that all hosts in your system are configured for DNS and Reverse DNS.
8. Disable SELinux before you install IBM Open Platform with Apache Spark and Apache Hadoop. SELinux must remain disabled for IBM Open Platform with Apache Spark and Apache Hadoop to function.
   To disable SELinux temporarily, run the following command on each host in your cluster:
       setenforce 0
   Then, disable SELinux permanently by editing the SELinux configuration file and setting the SELINUX parameter to disabled on each host. This ensures that SELinux remains disabled if the system is rebooted.
       vi /etc/selinux/config
       # This file controls the state of SELinux on the system.
       # SELINUX= can take one of these three values:
       # enforcing - SELinux security policy is enforced.
       # permissive - SELinux prints warnings instead of enforcing.
       # disabled - SELinux is fully disabled.
       SELINUX=disabled
       # SELINUXTYPE= type of policy in use. Possible values are:
       # targeted - Only targeted network daemons are protected.
       # strict - Full SELinux protection.
       SELINUXTYPE=targeted
9. Ensure that the ZONE parameter value is valid, which means that it must match an actual file name in /usr/share/zoneinfo. Check that the time zone file /usr/share/zoneinfo/<Time Zone file> is valid by running the following command:
       timedatectl status
10. Unset the JAVA_HOME environment variable.
11. Remove HTTPD and its related packages permanently by using the following commands:
        rpm -e apr
        rpm -e apr-util
        rpm -e httpd-tools
        rpm -e httpd
12. Remove the following PostgreSQL related packages:
    • postgresql-server
    • postgresql
    • postgresql-libs
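Several of the preparation steps above can be spot-checked with read-only commands before you run the prerequisite scanner. The following is a minimal sketch under stated assumptions: all function names are hypothetical, the checks cover only the ulimit, Transparent Huge Pages, SELinux, and JAVA_HOME steps, and they do not replace the prereq_check.sh tool.

```shell
# Succeeds if the limits file sets nofile and nproc to 65536 for all users.
limits_ok() {
    file="${1:-/etc/security/limits.conf}"
    grep -Eq '^\*[[:space:]]+-[[:space:]]+nofile[[:space:]]+65536[[:space:]]*$' "$file" &&
    grep -Eq '^\*[[:space:]]+-[[:space:]]+nproc[[:space:]]+65536[[:space:]]*$' "$file"
}

# Succeeds if the SELinux configuration file contains SELINUX=disabled.
selinux_disabled() {
    grep -Eq '^SELINUX=disabled[[:space:]]*$' "${1:-/etc/selinux/config}"
}

# Succeeds if Transparent Huge Pages is set to "never".
# The kernel marks the active value with square brackets.
thp_disabled() {
    grep -q '\[never\]' "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Succeeds if JAVA_HOME is unset or empty in the current shell.
java_home_unset() {
    [ -z "${JAVA_HOME:-}" ]
}

# Run every check against the default system paths and print one line each.
run_prep_checks() {
    for check in limits_ok selinux_disabled thp_disabled java_home_unset; do
        if "$check" 2>/dev/null; then echo "OK    $check"; else echo "CHECK $check"; fi
    done
}
```

Call run_prep_checks as root on each node. A CHECK line means that the corresponding step is worth revisiting.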


Related information:
• Get ready to install: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/inst_iop_doPrepare.html
• IBM BigInsights: Preparing your environment: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/install_prepare.html
• Directories created when installing IBM Open Platform with Apache Spark and Apache Hadoop: http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/c0057868.html

Running the prerequisite scanner

The prereq_check.sh script that is available in the installation package checks the prerequisites. It is a scanning tool that identifies, checks, and verifies the prerequisites for Big Data Extension software before the actual installation takes place.

About this task

The prereq_check.sh tool is located in /opt/IBM/tnpm-bde/tnpm-bde-installer-tools/ambari.

Procedure

Run the tool as follows:
    ./prereq_check.sh

The output from the scan is written to a file at the following location:
    /tmp/prereq_check_<timestamp>.out
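Because the prerequisite scanner writes a new time-stamped output file on each run, a small helper can locate the most recent one. This is a sketch only: latest_prereq_report is a hypothetical name, and the format of the report contents is not assumed.

```shell
# Print the most recently modified prereq_check_*.out file in a directory
# (default: /tmp). Prints nothing if no such file exists.
latest_prereq_report() {
    ls -t "${1:-/tmp}"/prereq_check_*.out 2>/dev/null | head -n 1
}

# Example: page through the newest scanner report, if there is one.
#   report=$(latest_prereq_report)
#   [ -n "$report" ] && less "$report"
```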

Setting SSH passwordless login from Ambari server to cluster nodes

You must set up passwordless SSH connections so that the Ambari server host can remotely connect to all other Ambari agent hosts in the cluster without entering a password. SSH (Secure Shell) is a network protocol that is used to log in to remote servers to run commands and programs.

Procedure
1. Log in as the root user to the system where you want to install the Ambari server host.
2. On this Ambari server host, generate the public and private SSH keys with the following command:
       ssh-keygen -t rsa
   Accept all the default values at the prompts.
   A private key (id_rsa) and a public key (id_rsa.pub) are generated on the Ambari server host under the .ssh directory of the root user.
3. From the Ambari server host, copy the SSH public key (id_rsa.pub) to the root account on the Ambari agent hosts by using the following commands. The ssh-copy-id command adds the key to the authorized_keys file on each agent host:
       ssh-copy-id -i ~/.ssh/id_rsa.pub root@<myserver1.ibm.com>
       ssh-copy-id -i ~/.ssh/id_rsa.pub root@<myserver2.ibm.com>
       ssh-copy-id -i ~/.ssh/id_rsa.pub root@<myserver3.ibm.com>
   Note: You might have to enter the password the first time to copy the key.
4. Ensure that permissions on your .ssh directory are set to 700 and that the permissions on the authorized_keys file in that directory are set to either 600 or 640.
       chmod 700 /root/.ssh
       chmod 600 /root/.ssh/authorized_keys
5. From the Ambari server host, connect to each Ambari agent host in the cluster by using SSH to test your connections. For example, enter the following command:
       ssh root@<myserver1.ibm.com>
   Enter yes if you encounter this message:
       Are you sure you want to continue connecting (yes/no)?
6. Repeat the connection attempt from the Ambari server host to each Ambari agent host to make sure that the Ambari server can connect to each Ambari agent.
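The permission and connection checks in the steps above can be scripted once the keys are copied. The following sketch assumes GNU stat (as shipped with RHEL) and uses the standard OpenSSH BatchMode and ConnectTimeout options; the function names and the host names in the example are placeholders.

```shell
# Succeeds if the directory has mode 700 and its authorized_keys file
# has mode 600 or 640.
ssh_perms_ok() {
    dir="${1:-/root/.ssh}"
    [ "$(stat -c %a "$dir")" = "700" ] || return 1
    mode=$(stat -c %a "$dir/authorized_keys") || return 1
    [ "$mode" = "600" ] || [ "$mode" = "640" ]
}

# Try a non-interactive login to each host. With BatchMode=yes, ssh fails
# instead of prompting for a password, so a prompt counts as a failure.
check_passwordless() {
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true; then
            echo "OK     $host"
        else
            echo "FAILED $host"
        fi
    done
}

# Example, run from the Ambari server host:
#   check_passwordless myserver1.ibm.com myserver2.ibm.com myserver3.ibm.com
```

A host also reports FAILED when its host key is not yet accepted, so complete the interactive connection test for each host before you rely on this loop.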



Chapter 4. Installing Big Data Extension

Use this information to install Big Data Extension for the first time in a fresh,stand-alone environment.

Before you beginv Ensure that Tivoli Netcool Performance Manager and your entitled technology

packs are installed.v Ensure that the necessary user permissions are in place for all the installation

directories.

Running the Big Data Extension installation packages

Install Big Data Extension and its related software by using the yum utility, which is an open-source command-line package-management utility for computers that run the Linux operating system.

About this task

The following tasks are accomplished after the installation completes:
• Set up Big Data Extension repositories.
• Set up Big Data Extension as IBM BigInsights extended service stack on Ambari.

Procedure

1. Create the /var/www/html/repos directory, if it does not exist, with the following command:

   mkdir -p /var/www/html/repos
2. Extract the IBM Open Platform with Apache Spark and Apache Hadoop files with the following commands:

   tar -zxvf <DIST_DIR>/ambari-2.2.0.el7.x86_64.tar.gz -C /var/www/html/repos
   tar -zxvf <DIST_DIR>/iop-4.2.0.0.el7.x86_64.tar.gz -C /var/www/html/repos
   tar -zxvf <DIST_DIR>/iop-utils-1.2.0.0.el7.x86_64.tar.gz -C /var/www/html/repos
3. Install the Big Data Extension installer tools package with the following command:

   yum install -y tnpm-bde-installer-tools-1.4.3.0-<timestamp>.noarch.rpm

   The package contains the following scripts:
   • prereq_check.sh
   • agent_setup_nonRoot.sh
   • cleanup.sh
   • host_cleanup.sh
4. Install the Big Data Extension basecamp package with the following commands:

   yum clean all
   yum install -y tnpm-bde-basecamp-repo-1.4.3.0-<timestamp>.noarch.rpm
   createrepo /var/www/html/repos/tnpm-bde/

   Note: createrepo is a Linux utility that creates a repomd (XML-based rpm metadata) repository from a set of RPMs. This command creates the necessary metadata for your yum repository and the sqlite database for speeding up yum operations.
5. Install the Big Data Extension services package with the following commands:

   yum install -y tnpm-bde-tnpm-bde-repo-1.4.3.0-<timestamp>.noarch.rpm
   createrepo /var/www/html/repos/tnpm-bde/

   This step performs the following action:
   • Sets up the Big Data Extension Tivoli Netcool Performance Manager Collector package in the /var/www/html/repos/tnpm-bde folder.
6. Install the Big Data Extension basecamp httpd package with the following command:

   yum install -y tnpm-bde-basecamp-httpd-1.4.3.0-<timestamp>.noarch.rpm

   This step performs the following actions:
   • Installs dependent packages:
     – apr
     – apr-util
     – mailcap
   • Installs Apache Hypertext Transfer Protocol Server (httpd).
   • Updates the httpd port to 9091.
7. Install the Ambari services package with the following command:

   yum install -y tnpm-bde-tnpm-bde-ambari-1.4.3.0-<timestamp>.noarch.rpm

   This step performs the following actions:
   • Installs dependent packages:
     – postgresql
     – postgresql-libs
     – postgresql-server
   • Installs Ambari server.
   • Sets up Ambari server.
   • Configures Ambari server to auto restart Big Data Extension services and components.
   • Updates related repo files in /etc/yum.repos.d/ to point to local yum repositories.
   • Updates the Big Data Extension service stack repoinfo.xml file to point to local RPM repositories.
   • Starts Ambari server.
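Before you run step 2, it can help to confirm that all three archives are present in the distribution directory. The following is a minimal sketch under that assumption: the archive names are taken from step 2, while the function name missing_archives and the variable REQUIRED_ARCHIVES are hypothetical and not part of the product.

```shell
#!/bin/sh
# Archive names as listed in step 2 of the procedure.
REQUIRED_ARCHIVES="ambari-2.2.0.el7.x86_64.tar.gz iop-4.2.0.0.el7.x86_64.tar.gz iop-utils-1.2.0.0.el7.x86_64.tar.gz"

# Print the name of every required archive that is missing from the given directory.
# Prints nothing when all archives are present.
missing_archives() {
  dist_dir=$1
  for archive in $REQUIRED_ARCHIVES; do
    [ -f "$dist_dir/$archive" ] || printf '%s\n' "$archive"
  done
}
```

For example, `missing_archives "$DIST_DIR"` lists the files that still need to be downloaded, where DIST_DIR is assumed to hold the path that the procedure refers to as <DIST_DIR>.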


Chapter 5. Setting up Big Data Extension clusters and services

Use the Ambari installation wizard in your browser to complete the deployment of clusters and the setup of Big Data Extension services and Hadoop components.

Before you begin

• Ensure that you have the SSH private key for the root user on the Ambari server host.
• Ensure that you have configured SSH passwordless login to all target hosts.

Procedure

1. Open a browser and access the Ambari server dashboard. Use the following default URL:

   http://<myserver.ibm.com>:8080

   Note: You can use the fully qualified domain name (FQDN) or the IP address of the server.

   The default user name is admin, and the default password is admin.
2. Click Launch Install Wizard on the Ambari Welcome page. The CLUSTER INSTALL WIZARD opens.
3. Enter a name for the cluster that you want to create on the Get Started page and click Next.

   Note: The name cannot contain blank spaces or special characters.
4. Select the BigInsights 4.2.TNPM_BDE stack on the Select Stack page and click Next.
5. Complete the following steps on the Install Options page:
   a. List all of the nodes that are used in the IBM Open Platform with Apache Spark and Apache Hadoop cluster in the Target Hosts pane. Specify one node per line, as in the following example:

      node1.abc.com
      node2.abc.com
      node3.abc.com

      Note: The host name must be the FQDN. For example, <myserver.ibm.com>.
   b. Select Provide your SSH Private Key to automatically register hosts and click the SSH Private Key link on the Host Registration Information pane. If the root user installed the Ambari server, the private key file is /root/.ssh/id_rsa. You can browse to the .ssh/id_rsa file and the Ambari web interface uploads the contents of the key file, or you can open the file and copy and paste the contents into the SSH key field.
   c. Click Register and Confirm.
6. Verify that the correct hosts for your cluster are located successfully on the Confirm Hosts page. If hosts that are selected are incorrect, remove the hosts one by one by following these steps:
   a. Click the box next to the server to be removed.
   b. Click Remove in the Action column.

      Note:
      • If warnings are found during the check process, click Click here to see the warnings. The Host Checks page identifies any issues with the hosts. For example, a host might have Transparent Huge Pages or firewall issues.
      • Ignore the process issues that are not related to Big Data Extension.
   c. After you resolve the issues, click Rerun Checks on the Host Checks page.

   After you have confirmed the hosts, click Next.
7. Select the following services on the Choose Services page:

Service                           Version  Description
HDFS                              2.7.2    Apache Hadoop Distributed File System (HDFS)
YARN + MapReduce2                 2.7.2    Apache Hadoop NextGen MapReduce (YARN)
ZooKeeper                         3.4.6    Centralized service that provides reliable distributed
                                           coordination.
Ambari Metrics                    0.1.0    A system for metric collection that provides storage and
                                           retrieval capability for metrics that are collected from
                                           the cluster.
Kafka                             0.9.0.1  A high-throughput messaging system.
TNPM BDE                          1.4.3    Tivoli Netcool Performance Manager Big Data Extension
                                           cluster service
TNPM BDE Spark Client Scala 2.11  2.0.1    Apache Spark is an engine for large-scale data
                                           processing. The Apache Spark client library is compiled
                                           on Scala 2.11 and is specific to Big Data Extension.

8. Click Next.
9. Assign the master nodes to hosts in your cluster for the services that you selected on the Assign Masters page and click Next. You can accept the current default assignments. To assign a new host to run services, click the list next to the master node in the left column and select a new host.
10. Click Next.
11. Assign the slave and client components to hosts in your cluster on the Assign Slaves and Clients page. Select all services for assignment. Click all or none to decide the host assignments. Alternatively, you can select one or more components next to a selected host.
12. Click Next.
13. Update the configuration settings for the following services on the Customize Services pane. A set of tabs is displayed from which you can manage the settings for Hadoop and Big Data Extension components.
    • “Setting up YARN Service” on page 21
    • “Setting up HDFS Service” on page 21
    • “Setting up Big Data Extension services” on page 22
    • “Setting up communication with Tivoli Netcool Performance Manager system” on page 23
14. Click Next after you have reviewed your settings and completed the configuration of the services.
15. Verify that your settings are correct and click Deploy on the Review page.
16. See the progress of the installation on the Install, Start, and Test page. The progress bar gives the overall status and the main section of the page gives the status for each host. Click a task to display the log for that task.
17. Click Next after the services are installed successfully.
18. Review the completed tasks on the Summary page and click Complete.

Results

It might take a while for Ambari to start all the services. To see the status of all the services in a host, click the Hosts tab in the Ambari server host, and then select a host. You can see the services that are started from the Summary page.
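The wizard expects a cluster name without blank spaces or special characters and host names that are fully qualified. The following is a minimal sketch for checking both before you open the wizard. It rests on assumptions: the guide does not define "special characters", so this sketch accepts only letters and digits in the cluster name, and it treats a host name as fully qualified when it contains at least one dot between non-empty labels. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-checks; the exact rules that Ambari applies may differ.

# Return 0 if the name is non-empty and contains only letters and digits.
valid_cluster_name() {
  case $1 in
    ''|*[!A-Za-z0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Return 0 if the host name looks fully qualified:
# only letters, digits, dots, and hyphens, with at least one dot
# and no empty label at the start, at the end, or in the middle.
looks_like_fqdn() {
  case $1 in
    ''|.*|*.|*..*|*[!A-Za-z0-9.-]*) return 1 ;;
    *.*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `looks_like_fqdn node1.abc.com` succeeds, while `looks_like_fqdn node1` fails because the short host name has no domain part.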

Setting up YARN Service

YARN decouples resource management and scheduling capabilities from the data processing component. The YARN framework uses a ResourceManager service, a NodeManager service, and an ApplicationMaster service.

Procedure

1. Click YARN > Configs > Settings.
2. Configure the required settings as follows:
   • Node memory: 16000 MB
   • Minimum Container size: 3072 MB
   • Maximum Container size: 16000 MB
   • Ensure that the maximum memory per container is 16000 MB.
   • CPU Minimum Container Size (VCores): 1
   • CPU Maximum Container Size (VCores): 16
   • Percentage of physical CPU allocated for all containers on a node: 80%
   • CPU Number of virtual cores: 16
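As a rough illustration of what these values imply, the following sketch computes how many minimum-size containers fit on one node. It is simple integer arithmetic on the values listed above and only an upper bound: the actual number of containers depends on the scheduler and on the sizes that applications request. The variable names are illustrative.

```shell
#!/bin/sh
# Values from the settings listed above.
NODE_MEMORY_MB=16000
MIN_CONTAINER_MB=3072
NODE_VCORES=16
MIN_CONTAINER_VCORES=1

# Integer division: how many minimum-size containers fit on one node.
max_by_memory=$((NODE_MEMORY_MB / MIN_CONTAINER_MB))
max_by_vcores=$((NODE_VCORES / MIN_CONTAINER_VCORES))

echo "Upper bound by memory: $max_by_memory containers"
echo "Upper bound by virtual cores: $max_by_vcores containers"
```

With these settings, memory is the tighter limit: five containers of 3072 MB use 15360 MB of the 16000 MB that is available on the node.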


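For more information, see YARN at http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.product.doc/doc/yarn.html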

Setting up HDFS Service

Set properties for the NameNode, Secondary NameNode, DataNodes, and some general and advanced properties. Click the name of the group to expand and collapse the display. Set these values for fine-tuning and better performance.

Procedure

1. Click HDFS > Configs > Settings from the Ambari web interface and specify the following recommended settings:
   • NameNode directories:
     – /<data1>/hadoop/1/hdfs/namenode

     Note: You can place the NameNodes and DataNodes in any directory that has 10 TB of disk space.
   • DataNode directories:
     – /<data1>/hadoop/1/hdfs/data
   • NameNode Java heap size: 3 GB
   • NameNode server threads: 800
   • Minimum replicated blocks: 100%
   • DataNode failed disk tolerance: 0
   • DataNode maximum Java heap size: 1 GB
   • DataNode max data transfer threads: 4096

2. Click HDFS > Configs > Settings > Advanced from the Ambari web interface and specify the following settings:
   • NameNode new generation size: 384 MB
   • NameNode maximum new generation size: 384 MB
   • NameNode permanent generation size: 128 MB
   • NameNode maximum permanent generation size: 512 MB

Setting up Big Data Extension services

Use these instructions to set up all the Big Data Extension services from the web-based Ambari user interface. The configuration settings from the Ambari UI are written to the application.conf files that are located in the /conf directory of each microservice.

About this task

Typically, you can accept the following default values.

Procedure

1. Click Services > TNPM BDE > Settings.
2. Make sure that you are in the Configs tab and change the default values in the following fields:

Table 8. Common settings

Option                Description                                 Default value
storage.jdbc-service  Used to build the path to the storage       <myserver.ibm.com>:13081
                      location with the HTTP port for the JDBC
                      service.
kafka.zk-connect      ZooKeeper URL with Kafka znode. The string  {{zookeeper.connect}}
                      {{zookeeper.connect}} is populated with
                      settings in zookeeper.connect.
kafka.broker-list     List of Kafka brokers. The string           {{kafka.broker-list}}
                      {{kafka.broker-list}} is populated with the
                      cluster's Kafka hosts and ports.

Table 9. Manager settings

Option                   Description       Default value
manager.ambari.user      Ambari user name  admin
manager.ambari.password  Ambari password   admin

Table 10. Web Services settings

Option      Description                                     Default value
http.port   The HTTP port on which the Big Data Extension   8081
            application console can be accessed.
https.port  The HTTPS port on which the Big Data Extension  9443
            application console can be accessed.

Setting up communication with Tivoli Netcool Performance Manager system

These settings are required for communication with Tivoli Netcool Performance Manager.

About this task

Make sure that you configure the Tivoli Netcool Performance Manager database server correctly. Avoid changing the server later because the change might conflict with your existing data. If you must change the Tivoli Netcool Performance Manager database server for any reason, make sure that you clean up all the existing data before you point to the new server.

Procedure

1. Click TNPM BDE > Settings.
2. Make sure that you are in the Configs tab and change the values in the following fields:

   Note:
   • Use the db2jcc-4.19.49.jar JDBC driver that is available in the /opt/IBM/tnpm-bde/tnpm-bde-connect/libs folder to connect to the IBM DB2 database for Tivoli Netcool Performance Manager. For more information about compatible drivers, see DB2 JDBC Driver Versions and Downloads (http://www-01.ibm.com/support/docview.wss?uid=swg21363866).
   • Use the ojdbc6-11gR2.jar JDBC driver that is available in the /opt/IBM/tnpm-bde/tnpm-bde-connect/libs folder to connect to the Oracle database.

Table 11. TNPM Collector settings

Option                                      Description                             Example
tnpm.platform                               The database platform for Tivoli        DB2 or ORACLE
                                            Netcool Performance Manager. You can
                                            select Oracle or DB2 from the list.
tnpm.host                                   Name of the host where the Tivoli       <myserver.ibm.com>
                                            Netcool Performance Manager database
                                            is installed.
tnpm.port                                   The network port to connect to Tivoli   • 60000
                                            Netcool Performance Manager             • 1521
tnpm.username                               An authorized database user name        • PV_ADMIN
                                                                                    • PV_ADMIN
tnpm.password                               Password for the authorized database    • pv
                                            user                                    • pv
tnpm.database                               Database name                           PV
tnpm.schema                                 The schema name                         PV_ADMIN
tnpm.ftpUsername                            The FTP user name                       pvuser
tnpm.ftppassword                            The FTP password                        pv
collector.tnpm.kafka.connect.rest.url       Kafka connect REST URL                  http://<myserver.ibm.com>:8083/connectors
                                                                                    Note: Make sure that the server that is
                                                                                    specified has Storage Service installed.
collector.tnpm.kafka.connect.rest.realm     Kafka connect REST realm                http://<myserver.ibm.com>:8083/connectors
                                                                                    Note: Make sure that the server that is
                                                                                    specified has Storage Service installed.
collector.tnpm.kafka.connect.rest.username  User name for Kafka connector           You can leave it blank.
collector.tnpm.kafka.connect.rest.password  Password for Kafka connector            You can leave it blank.
kafka.zk-connect                            ZooKeeper URL with Kafka znode. The     {{zookeeper.connect}}
                                            string {{zookeeper.connect}} is
                                            auto-populated.
kafka.broker-list                           List of Kafka brokers. The string       {{kafka.broker-list}}
                                            {{kafka.broker-list}} is auto-populated
                                            with Kafka hosts and ports in the
                                            cluster.
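Before you enter the Kafka connect REST URL, you can check that the endpoint answers. The following is a minimal sketch, assuming that curl is installed on the host where you run it. The port 8083 and the /connectors path come from the example in Table 11; the function names are hypothetical. A GET request on /connectors is part of the Kafka Connect REST interface and returns a JSON array of connector names.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with Big Data Extension.

# Build the Kafka connect REST URL for a host, with the port and path from Table 11.
kafka_connect_url() {
  printf 'http://%s:8083/connectors\n' "$1"
}

# Query the endpoint. --fail makes curl return a non-zero status on HTTP errors,
# and --max-time stops the request after 10 seconds.
check_kafka_connect() {
  curl --silent --fail --max-time 10 "$(kafka_connect_url "$1")"
}
```

For example, `check_kafka_connect myserver.ibm.com || echo "Kafka connect is not reachable"` reports a host where the service does not respond.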



Chapter 6. Postinstallation tasks

Perform these postinstallation tasks after the installation of Big Data Extension is complete.

To make sure that all the services start automatically when the Ambari server host is restarted, run the following command in a single line on the Ambari server host:

unlink /etc/rc.d/init.d/ambari-server && cp -a /usr/sbin/ambari-server /etc/rc.d/init.d/ambari-server && systemctl daemon-reload

Note: If you do not run this command, some services that are available on the Ambari server host might not start.

Verifying the installation

You can verify the Big Data Extension installation status.

Procedure

Log in to the Ambari server at the following URL with the default user name admin and the default password admin:

http://<myserver.ibm.com>:8080

Important: If you access the Ambari server remotely, use the IP address of the server instead of the FQDN.

Installation directory structure

Use this information to understand the default directories that are created during installation.

These directories are created in the /opt/IBM/tnpm-bde path. Typically, all the microservices have the directory stack as follows:

The logs directory contains a separate log file for the microservice.

tnpm-bde-connect
    Contains the Kafka connect script that is called from Ambari to start the service. It also contains the JDBC driver files that are needed to connect to IBM DB2, Oracle, and for Kafka to connect to the Tivoli Netcool Performance Manager database.

tnpm-bde-entity-analytics
    Contains the directories and files that are required for the Entity Analytics service to function.

tnpm-bde-jre
    Contains the JRE that is bundled with Big Data Extension.

tnpm-bde-manager
    Contains the directories and files that are required for the Big Data Extension Manager service to function.

tnpm-bde-schema-registry
    Schema Registry provides a serving layer for your metadata. It stores a versioned history of all schemas, provides multiple compatibility settings, and allows evolution of schemas according to the configured compatibility setting.

tnpm-bde-spark
    Contains the Spark libraries.

tnpm-bde-storage
    Contains the directories and files that are required for the Big Data Extension Storage service to function.

tnpm-bde-tnpm-collector
    Contains the directories and files that are required for the Tivoli Netcool Performance Manager - Collector service to function.

tnpm-bde-tools
    Contains the encryption script that Ambari uses for encrypting the passwords.

tnpm-bde-ui
    Contains the directories and files that are required for the UI service to function.

tnpm-bde-installer-tools
    Contains the scripts that are required for prerequisite checking and uninstallation of Ambari. This directory is available on the Ambari server host only.

tnpm-bde-basecamp-httpd
    This directory is available on the Ambari server host only.


Chapter 7. Uninstalling Big Data Extension

Uninstall Big Data Extension and the related software from the system.

About this task

Uninstall the following components that you installed:
• IBM Open Platform with Apache Hadoop components, including YARN, HDFS, and ZooKeeper services
• Ambari agent nodes that contain Big Data Extension instances
• Ambari server

Listing the working directories

If you installed Big Data Extension related components in a non-root path, you need to manually remove the related working directories. If your installations are in the root path, the host_cleanup.sh script can remove them.

About this task

List the working directories before you run the uninstallation scripts so that you can make sure that they are removed.

Procedure

1. Log in to the Ambari server host as follows:

   http://<ambari_server_host>:8080
2. Click the Services > Configs tab.
3. Note down the following directories:

Services        Ambari Component directory                          Non-root installation path
Kafka           Kafka > Configs > Kafka Broker > log.dirs           <data1>/kafka-logs
HDFS            HDFS > Configs > Settings > NameNode                <data1>/hadoop/hdfs/namenode
                HDFS > Configs > Settings > DataNode                <data1>/hadoop/hdfs/datanode
YARN            YARN > Configs > Advanced > Application Timeline    <data1>/var/log/hadoop-yarn/timeline
                Server >
                yarn.timeline-service.leveldb-timeline-store.path
Ambari Metrics  Ambari Metrics > Configs > Advanced                 <data1>/var/lib/ambari-metrics-collector/hbase
                ams-hbase-site > hbase.rootdir
ZooKeeper       ZooKeeper > Configs > ZooKeeper > ZooKeeper         <data1>/hadoop/zookeeper
                directory


Uninstalling Ambari agent nodes and Ambari server by using scripts

Run the cleanup scripts to uninstall the Ambari agent nodes and the Ambari server.

Procedure

1. Log in to the Ambari server and stop all services of the Ambari agent nodes in your cluster.
2. Copy the host_cleanup.sh script from the /opt/IBM/tnpm-bde/tnpm-bde-installer-tools/ambari folder on the Ambari server to all the Ambari agent nodes that you want to uninstall. For example, /tmp/host_cleanup.sh.
3. Copy the cleanup.sh script from the /opt/IBM/tnpm-bde/tnpm-bde-installer-tools/ambari folder on the Ambari server to any other location, because the /opt/IBM/tnpm-bde/tnpm-bde-installer-tools/ambari folder is deleted when the host_cleanup.sh script is run. For example, /tmp/cleanup.sh.
4. Run the host_cleanup.sh script as the root user as follows:

   cd /tmp
   ./host_cleanup.sh

   The host_cleanup.sh script performs the following functions:
   • Checks that the user who is running the script is root.
   • Checks for the HostCleanup.ini file.
   • Stops the Ambari server and the Ambari agent, if they are still running.
   • Stops the Linux processes that are started by a list of service users. The users are defined in the HostCleanup.ini file. You can also specify a list of Linux processes to be stopped.
   • Removes the RPM packages that are listed in the HostCleanup.ini file.
   • Removes the Big Data Extension packages and working folders.
   • Removes the service users that are listed in the HostCleanup.ini file.
   • Deletes directories, symbolic links, and files that are listed in the HostCleanup.ini file.
   • Deletes repositories that are defined in the HostCleanup.ini file.
5. Run the cleanup.sh script on the Ambari server by using the following commands:

   cd /tmp
   ./cleanup.sh

   This script uninstalls the IBM Open Platform with Apache Spark and Apache Hadoop, the Ambari server, and the remaining Big Data Extension working directories on the Ambari server.
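After both scripts finish, the working directories that you noted earlier can be checked to confirm that nothing is left behind, which matters most for non-root installation paths. The following is a minimal sketch; the function name remaining_dirs is hypothetical, and the paths that you pass to it are the ones from your own list.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with Big Data Extension.

# Print every given path that still exists as a directory.
# Prints nothing when all working directories are gone.
remaining_dirs() {
  for dir in "$@"; do
    [ -d "$dir" ] && printf '%s\n' "$dir"
  done
  return 0
}
```

For example, `remaining_dirs /data1/kafka-logs /data1/hadoop/zookeeper` prints the directories that you still need to remove manually, where /data1 stands for your own <data1> path.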


Related information: Cleaning up nodes before reinstalling software (http://www.ibm.com/support/knowledgecenter/SSPT3X_4.2.0/com.ibm.swg.im.infosphere.biginsights.install.doc/doc/bi_install_cleanup_nodes_for_reinstall.html)

Chapter 8. Troubleshooting installation

Problems that might occur during an installation and how to resolve them.

About this task

For all troubleshooting issues in the installation of Big Data Extension, see the Troubleshooting installation and uninstallation section in Troubleshooting Tivoli Netcool Performance Manager - Big Data Extension.



Notices

This information was developed for products and services offered in the US. This material might be available from IBM in other languages. However, you may be required to own a copy of the product or product version in that language in order to access it.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:

IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
US

For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to:

Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
19-21, Nihonbashi-Hakozakicho, Chuo-ku
Tokyo 103-8510, Japan

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any obligation to you.

Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:

IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
US

Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee.

The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us.

The performance data discussed herein is presented as derived under specific operating conditions. Actual results may vary.

The client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary.

This information is for planning purposes only. The information herein is subject to change before the products described become available.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to actual people or business enterprises is entirely coincidental.

COPYRIGHT LICENSE:


This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

Each copy or any portion of these sample programs or any derivative work must include a copyright notice as follows:

© (your company name) (year).
Portions of this code are derived from IBM Corp. Sample Programs.
© Copyright IBM Corp. _enter the year or years_.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.

Adobe, Acrobat, PostScript and all Adobe-based trademarks are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, other countries, or both.

IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.

ITIL is a registered trademark, and a registered community trademark of The Minister for the Cabinet Office, and is registered in the U.S. Patent and Trademark Office.

UNIX is a registered trademark of The Open Group in the United States and other countries.


Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.

Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.

Terms and conditions for product documentation

Permissions for the use of these publications are granted subject to the following terms and conditions.

Applicability

These terms and conditions are in addition to any terms of use for the IBM website.

Personal use

You may reproduce these publications for your personal, noncommercial use provided that all proprietary notices are preserved. You may not distribute, display or make derivative work of these publications, or any portion thereof, without the express consent of IBM.

Commercial use

You may reproduce, distribute and display these publications solely within your enterprise provided that all proprietary notices are preserved. You may not make derivative works of these publications, or reproduce, distribute or display these publications or any portion thereof outside your enterprise, without the express consent of IBM.

Rights

Except as expressly granted in this permission, no other permissions, licenses or rights are granted, either express or implied, to the publications or any information, data, software or other intellectual property contained therein.

IBM reserves the right to withdraw the permissions granted herein whenever, in its discretion, the use of the publications is detrimental to its interest or, as determined by IBM, the above instructions are not being properly followed.

You may not download, export or re-export this information except in full compliance with all applicable laws and regulations, including all United States export laws and regulations.

IBM MAKES NO GUARANTEE ABOUT THE CONTENT OF THESE PUBLICATIONS. THE PUBLICATIONS ARE PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, AND FITNESS FOR A PARTICULAR PURPOSE.



IBM®

Printed in USA
