aix Q


  • 7/22/2019 aix Q


    IBM-AIX Interview Questions:

    1. Why is "." not included in the PATH?
    2. How will you mirror a volume, and how will you find out whether a volume is mirrored? (lsvg -l <volume group name>)
    3. A system answers ping, but you cannot log in via telnet. Why? (Check /etc/services and /etc/inetd.conf.)
    4. What is the migration path from 4.3 to 5.1 (or between any other versions)?
    5. A system can ping within the network but not outside it. Why?
    6. What are the components of HACMP? (Did you use a serial interface?)
    7. What are the resource groups in HACMP?
    8. Have you installed any APARs? If so, for what?
    9. How will you log in, or in what mode will you start the system, if you don't know the root password?
    10. What do IP address and subnet mean?
    11. What types of SAN or NAS devices were used?
    12. Was the storage on hard disk, or was tape used?
    13. How will you check whether a system is paging excessively?
    14. Processor utilization is too high. What could be the reason?
    15. How is paging space allocated?
    16. How will you configure sendmail?
    17. How will you temporarily assign superuser privileges to an ordinary user? (sudo)
    18. On what basis would you choose between shell and Perl scripting?
    19. What is the difference between telnet and ssh?
    20. How will you truncate a log file?
    21. What is the sticky bit, and what is its effect on a file and on a directory?

    What is the command to view the active VGs?
    How do you configure a disk?
    What are the steps to configure a VG?
    How do you add a new disk to a VG?
    What are the attributes of LVM?
    Describe the advantages and disadvantages of LVM.
    How do you find out whether a fix is installed?
    How do you extend a file system?
    What are the attributes of a file system?
    How do you list all the LVs in the system?
    How do you find the PP size of a VG?
    How do you create a VG with a PP size of 32 MB?
    What are the limitations of a VG?
    How do you disable a paging space?
    What is an LPAR?

    Can you explain the steps for mirroring rootvg in your environment?

    Mirroring rootvg protects the operating system from a disk failure. Mirroring rootvg requires a couple of extra steps compared to other volume groups: the mirrored rootvg disk must be bootable *and* in the bootlist. Otherwise, if the primary disk fails, you'll continue to run, but you won't be able to reboot.

    In brief, the procedure to mirror rootvg on hdisk0 to hdisk1 is:

    1. Add hdisk1 to rootvg: extendvg rootvg hdisk1

    2. Mirror rootvg to hdisk1: mirrorvg rootvg hdisk1 (or smitty mirrorvg)

    3. Create boot images on hdisk1: bosboot -ad /dev/hdisk1

    4. Add hdisk1 to the bootlist: bootlist -m normal hdisk0 hdisk1

    5. Reboot to disable quorum checking on rootvg. The mirrorvg command turns off quorum by default, but the system needs to be rebooted for the change to take effect.
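The five steps above can be written down as an ordered command plan that is reviewed before anything is run. The following is only a minimal Python sketch: the function name `rootvg_mirror_plan` is hypothetical, the disk names default to the hdisk0/hdisk1 example from the text, and `shutdown -Fr` is an assumed choice of reboot command (the text only says "reboot"). The sketch prints the commands and does not execute them.

```python
def rootvg_mirror_plan(source_disk="hdisk0", mirror_disk="hdisk1"):
    """Return the rootvg mirroring commands in the order listed in the text."""
    return [
        f"extendvg rootvg {mirror_disk}",                   # 1. add the new disk to rootvg
        f"mirrorvg rootvg {mirror_disk}",                   # 2. mirror rootvg onto it
        f"bosboot -ad /dev/{mirror_disk}",                  # 3. create a boot image on it
        f"bootlist -m normal {source_disk} {mirror_disk}",  # 4. put both disks in the bootlist
        "shutdown -Fr",                                     # 5. reboot (assumed command)
    ]


if __name__ == "__main__":
    # Dry run: show the plan so it can be checked before running it by hand.
    for step, command in enumerate(rootvg_mirror_plan(), start=1):
        print(f"{step}. {command}")
```

Keeping the plan as data makes it easy to confirm that the bosboot and bootlist steps both name the mirror disk, which is the part the text warns about.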

    What is a VPN and how does it work?

    A VPN (virtual private network) is a private network that uses a public network (usually the Internet) to connect remote sites or users together. Instead of using a dedicated, real-world connection such as a leased line, a VPN uses virtual connections routed through the Internet from the company's private network to the remote site or employee.

    What is a daemon?

    A daemon (pronounced DEE-muhn) is a program that runs continuously and exists for the purpose of handling periodic service requests that a computer system expects to receive. The daemon program forwards the requests to other programs (or processes) as appropriate. Each server of pages on the Web has an HTTPD, or Hypertext Transfer Protocol daemon, that continually waits for requests to come in from Web clients and their users.

    There are several daemons in an AIX environment, such as sshd and inetd.

    Can you describe a SAN in your own words?

    A storage area network (SAN) is a high-speed special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with associated data servers on behalf of a larger network of users. Typically, a storage area network is part of the overall network of computing resources for an enterprise. A storage area network is usually clustered in close proximity to other computing resources such as IBM Power5 boxes, but may also extend to remote locations for backup and archival storage, using wide area network carrier technologies such as ATM or SONET.

    A storage area network can use existing communication technology such as IBM's optical fiber ESCON, or it may use the newer Fibre Channel technology. Some SAN system integrators liken it to the common storage bus (flow of data) in a personal computer that is shared by different kinds of storage devices such as a hard disk or a CD-ROM player.

    SANs support disk mirroring, backup and restore, archival and retrieval of archived data, data migration from one storage device to another, and the sharing of data among different servers in a network. SANs can incorporate subnetworks with network-attached storage (NAS) systems.

    You mentioned NAS. What is NAS?

    Network-attached storage (NAS) is hard disk storage that is set up with its own network address rather than being attached to the department computer that is serving applications to a network's workstation users. By removing storage access and its management from the department server, both application programming and files can be served faster because they are not competing for the same processor resources. The network-attached storage device is attached to a local area network (typically, an Ethernet network) and assigned an IP address. File requests are mapped by the main server to the NAS file server.

    Network-attached storage consists of hard disk storage, including multi-disk RAID systems, and software for configuring and mapping file locations to the network-attached device. Network-attached storage can be a step toward, and included as part of, a more sophisticated storage system known as a storage area network (SAN).

    NAS software can usually handle a number of network protocols, including Microsoft's Internetwork Packet Exchange and NetBEUI, Novell's NetWare Internetwork Packet Exchange, and Sun Microsystems' Network File System. Configuration, including the setting of user access priorities, is usually possible using a Web browser.

    What is SMTP and how does it work?

    SMTP (Simple Mail Transfer Protocol) is a TCP/IP protocol used in sending and receiving e-mail. However, since it is limited in its ability to queue messages at the receiving end, it is usually used with one of two other protocols, POP3 or IMAP, that let the user save messages in a server mailbox and download them periodically from the server. In other words, users typically use a program that uses SMTP for sending e-mail and either POP3 or IMAP for receiving e-mail. On Unix-based systems, sendmail is the most widely used SMTP server for e-mail. A commercial package, Sendmail, includes a POP3 server. Microsoft Exchange includes an SMTP server and can also be set up to include POP3 support.

    SMTP is usually implemented to operate over TCP port 25.

    Do you have any idea about NAT?

    NAT is short for Network Address Translation, an Internet standard that enables a local-area network (LAN) to use one set of IP addresses for internal traffic and a second set of addresses for external traffic. A NAT box located where the LAN meets the Internet makes all necessary IP address translations.

    NAT serves three main purposes:

    * Provides a type of firewall by hiding internal IP addresses.
    * Enables a company to use more internal IP addresses. Since they're used internally only, there's no possibility of conflict with IP addresses used by other companies and organizations.
    * Allows a company to combine multiple ISDN connections into a single Internet connection.

    Explain DHCP and its use in an environment.

    DHCP is short for Dynamic Host Configuration Protocol, a protocol for assigning dynamic IP addresses to devices on a network. With dynamic addressing, a device can have a different IP address every time it connects to the network. In some systems, the device's IP address can even change while it is still connected. DHCP also supports a mix of static and dynamic IP addresses.

    Dynamic addressing simplifies network administration because the software keeps track of IP addresses rather than requiring an administrator to manage the task. This means that a new computer can be added to a network without the hassle of manually assigning it a unique IP address. Many ISPs use dynamic IP addressing for dial-up users.

    What does SNMP stand for?

    SNMP is short for Simple Network Management Protocol, a set of protocols for managing complex networks. SNMP works by sending messages, called Protocol Data Units, to different parts of a network. SNMP-compliant devices, called agents, store data about themselves in Management Information Bases and return this data to the SNMP requesters.


    What do you know about tcpdump?

    Tcpdump is a common computer network debugging tool that runs from the command line. It allows the user to intercept and display TCP/IP and other packets being transmitted or received over a network to which the computer is attached. Tcpdump works on most Unix-like platforms: Linux, Solaris, BSD, Mac OS X, HP-UX and AIX among others. On Windows, WinDump can be used; it's a port of tcpdump to Windows.

    You must have root or superuser authority to use tcpdump in a UNIX-like environment.

    How do I remove a volume group with no disks?

    This is a very common question about AIX LVM, and I knew you would ask it. Within a volume group there is a Volume Group Descriptor Area (VGDA), which is kind of a suitcase of LVM information. This is what allows you to pick up your drives, take them to another machine, importvg them, and get the filesystems automatically defined.

    When you importvg the volume group, the command reads the VGDA and finds out about all the logical volumes and filesystems that may exist on the volume group. It then checks for clashes (name conflicts, etc.) on its own machine and populates its own database with information about the new volume group and its associated logical volumes. For file systems, it goes into the /etc/filesystems file and adds the new filesystem entries that came along with the imported volume group.

    The main question I see is "I've taken away the disks, but how do I get rid of the volume group?" The question should really be "How do I get rid of the volume group INFORMATION?", since that's all you have on the system. You've got possible entries in /etc/filesystems and definitely entries in the ODM. Just do:

        exportvg <vgname>

    It does a reverse importvg, except it doesn't go off and read the VGDA. It nukes anything relating to the volume group in /etc/filesystems and the ODM. The only time this won't work is if the system detects that the volume group is varied on. Then it would be like trying to change tires on a moving car; we won't let you do it!

    How do you get rid of a disk that is no longer really in the VG?

    In this case, you DON'T want to do an exportvg. What you want to do is tell the system to cut out the memory of the old, bad disk from the RS/6000 AND from the VGDA of the volume group. You simply do:

        reducevg -d -f <vgname> <hdname>

    or, if the hdname can't be found:

        reducevg -d -f <vgname> <PVID>

    Be careful with this command. Unlike the exportvg command, actions done with this command WILL affect the VGDA information on the platter.

    What is Capacity on Demand?

    Capacity on Demand (CoD) encompasses the various capabilities that let you dynamically activate one or more resources on your server as your business peaks dictate. You can activate inactive processors or memory units that are already installed on your server on a temporary or permanent basis.

    Capacity on Demand is usually used on IBM System i5 and eServer i5 models and on IBM System p5 and eServer p5 520, 550, 570, 590, and 595 models. Some servers include a number of active and inactive resources. Active processors and active memory units are resources that are available for use on your server when it comes from the manufacturer. Inactive processors and inactive memory units are resources that are included with your server but are not available for use until you activate them.

    What is the Hardware Management Console (HMC)?

    The HMC is a server or standalone machine that provides a graphical user interface for managing several Power Systems servers. The HMC manages a system through the hypervisor and the operating system. From version 7 it is truly web based, and you can configure, install, manage, partition, and virtualize most of your POWER5 and POWER6 boxes via the HMC. There are many tasks you can do with the HMC, such as:

    * Power partitions off and on
    * Configure and activate resources for the system
    * Create and store LPAR profiles and allocate resources to them
    * Perform dynamic memory reconfiguration of a partition
    * Set up VIO servers and VIO clients, and create micro-partitions, storage pools, and processor pools
    * Provide a virtual console to each partition

    Most of the time we install dual HMCs for redundancy, to achieve more uptime across a large environment.

    Why do I need a Hardware Management Console, anyway?

    You need an HMC if you plan to:

    * Configure and manage logical partitions and partition profiles (selected models can configure Linux partitions without an HMC).
    * Perform DLPAR (dynamic LPAR) functions.
    * Activate and manage Capacity on Demand resources.

    You can also use the HMC to:

    * Perform service functions.
    * Manage frames (towers), IOPs, and IOAs. Note that you cannot see below the IOA to the device level.
    * Manage system profiles (yes, you can have more than one!).
    * Power on and power down. The Service Processor is always hot if there is power to the server.
    * Activate and manage Virtualization Engine technologies.
    * Use 5250 emulation to get a console on an i5/OS partition, or a virtual terminal window for AIX or Linux.

    What is a kernel?

    The kernel is the essential center of a computer operating system, the core that provides basic services for all other parts of the operating system. A synonym is nucleus. A kernel can be contrasted with a shell, the outermost part of an operating system, which interacts with user commands. Kernel and shell are terms used more frequently in UNIX operating systems than in IBM mainframe or Microsoft Windows systems.

    Typically, a kernel (or any comparable center of an operating system) includes an interrupt handler that handles all requests or completed I/O operations that compete for the kernel's services, a scheduler that determines which programs share the kernel's processing time and in what order, and a supervisor that actually gives use of the computer to each process when it is scheduled. A kernel may also include a manager of the operating system's address spaces in memory or storage, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by application programs through a specified set of program interfaces sometimes known as system calls.

    What is RMC?

    The Resource Monitoring and Control (RMC) subsystem is the scalable backbone of RSCT that provides a generalized framework for managing resources within a single system or a cluster. Its generalized framework is used by cluster management tools to monitor, query, modify, and control cluster resources. RMC provides a single monitoring and management infrastructure for both RSCT peer domains and management domains. RMC can also be used on a single machine, enabling you to monitor and manage the resources of that machine. However, when a group of machines, each running RMC, are clustered together, the RMC framework allows a process on any node to perform an operation on one or more resources on any other node in the domain.

    What information is stored in the Object Data Manager (ODM)?

    The ODM is a database of system and device configuration information integrated into IBM's AIX operating system. The ODM is unique to AIX compared to other UNIX operating systems.

    Examples of information stored in the ODM database are:

    * Network configuration
    * Logical volume management configuration
    * Installed software information
    * Devices that AIX has drivers for
    * Logical devices or software drivers
    * Physical hardware devices installed
    * Menus, screens and commands that SMIT uses

    Explain a little about Vital Product Data (VPD).

    VPD in AIX and Linux is a collection of configuration and informational data associated with a particular set of hardware or software. VPD refers to a subset of database tables in the Object Data Manager (ODM); therefore the terms VPD and ODM are sometimes used interchangeably.

    Vital product data (VPD) stores information such as part numbers, serial numbers, and engineering change levels, taken from the Customized VPD object class or from platform-specific areas. Not all devices contain VPD data.

    Does HACMP work on different operating systems?

    Yes. HACMP is tightly integrated with the AIX 5L operating system and System p servers, allowing for a rich set of features that are not available with any other combination of operating system and hardware. HACMP V5 introduces support for the Linux operating system on POWER servers. HACMP for Linux supports a subset of the features available on AIX 5L; however, this multi-platform support provides a common availability infrastructure for your entire enterprise.

    What applications work with HACMP?

    All popular applications work with HACMP, including DB2, Oracle, SAP, WebSphere, etc. HACMP provides Smart Assist agents to let you quickly and easily configure HACMP with specific applications. HACMP includes flexible configuration parameters that let you easily set it up for just about any application there is.


    Does HACMP support dynamic LPAR, CUoD, On/Off CoD, or CBU?

    HACMP supports Dynamic Logical Partitioning, Capacity Upgrade on Demand, On/Off Capacity on Demand and Capacity Backup Upgrade.

    If a server has LPAR capability, can two or more LPARs be configured with unique instances of HACMP running on them without incurring additional license charges?

    Yes. HACMP is a server product that has one charge unit: the number of processors on which HACMP will be installed or run. Regardless of how many LPARs or instances of AIX 5L run in the server, you are charged based on the number of active processors in the server that is running HACMP. Note that HACMP configurations containing multiple LPARs within a single server may represent a potential single point of failure. To avoid this, it is recommended that the backup for an LPAR be an LPAR on a different server or a standalone server.

    Does HACMP support non-IBM hardware or operating systems?

    Yes. HACMP for AIX 5L supports the hardware and operating systems specified in the manual, and HACMP V5.4 includes support for Red Hat and SUSE Linux.

    What does the nmon tool do?

    The nmon tool is designed for AIX and Linux performance specialists to use for monitoring and analyzing performance data, including:

    * CPU utilization
    * Memory use
    * Kernel statistics and run queue information
    * Disk I/O rates, transfers, and read/write ratios
    * Free space on file systems
    * Disk adapters
    * Network I/O rates, transfers, and read/write ratios
    * Paging space and paging rates
    * CPU and AIX specification
    * Top processes
    * IBM HTTP Web cache
    * User-defined disk groups
    * Machine details and resources
    * Asynchronous I/O (AIX only)
    * Workload Manager (WLM) (AIX only)
    * IBM TotalStorage Enterprise Storage Server (ESS) disks (AIX only)
    * Network File System (NFS)
    * Dynamic LPAR (DLPAR) changes (only pSeries p5 and OpenPower, for either AIX or Linux)

    Also included is a new tool to generate graphs from the nmon output and create .gif files that can be displayed on a Web site.

    What does Logical Volume Manager (LVM) mean?

    The set of operating system commands, library subroutines and other tools that allow you to establish and control logical volume storage is called the Logical Volume Manager (LVM).

    What is a logical partition?

    A logical partition (LPAR) is the division of a computer's processors, memory, and hardware resources into multiple environments so that each environment can be operated independently with its own operating system and applications.


    Explain the Network File System (NFS).

    The Network File System (NFS) is a distributed file system that allows users to access files and directories on remote servers as if they were local. The key terms are:

    Server: a computer that makes its file systems, directories, and other resources available for remote access.

    Client: a computer, or one of its processes, that uses a server's resources.

    Export: the act of making file systems available to remote clients.

    Mount: the act of a client accessing the file systems that a server exports.

    What is Network Information Service (NIS)?

    NIS was developed to simplify the task of administering a number of machines over a network. In particular, it addresses the requirement to maintain copies of common files (e.g. password, group, and host files) across different systems.

    What do software RAID levels do?

    Redundant Array of Independent Disks (RAID) is formally defined as a method to store data on any type of disk medium. In practice, a RAID level describes how data is striped, mirrored, or protected with parity across multiple disks to improve performance, availability, or both.

    LDAP

    The Lightweight Directory Access Protocol (LDAP) defines a standard method for accessing and updating information in a directory (a database), either locally or remotely, in a client-server model.


    AIX commands and tools for DB2 troubleshooting

    November 12, 2009, 9:52 PM

    Why it is here

    Though this site is for interview questions and related postings, my recent web surfing turned up an excellent how-to that I felt compelled to post here. I hope our readers will get help from this post too. Enjoy.

    Introduction

    There are many scenarios where troubleshooting DB2 issues can benefit from gathering operating system level data and analyzing it to understand the issues further.

    This article discusses a number of problems you may face with your database, including CPU usage problems, orphan processes, database corruption, memory leaks, hangs and unresponsive applications.

    Here the author explains some AIX utilities and commands to help you understand and resolve each of these troublesome issues. The data you collect from running these commands can be sent to the IBM Technical Support Team when opening a problem management request (PMR) in order to expedite the PMR support process. The end of each section of this article discusses the documents you should gather to send to the Technical Support Team. While this article gives troubleshooting tips to use as a guideline, you should contact the IBM Technical Support Team for official advice about these problems.

    Monitor CPU usage

    In working with your database, you might notice a certain DB2 process consuming a high amount of CPU. This section describes some AIX utilities and commands that you can use either to analyse the issue yourself or to gather data before submitting a PMR to IBM Technical Support.

    Through ps Command

    A ps command reveals the current status of an active process. You can use

    ps auxw | sort -r +3 | head -10

    to list the top 10 resource-consuming processes. Note that in the old-style sort key syntax, +3 skips the first three fields, so the sort key starts at the fourth column (%MEM); use sort -r +2 to start the key at %CPU instead. Listing 1 shows the ps output:

    Listing 1. Sample ps output

    root@mavrickit $ ps auxw|sort -r +3|head -10
    USER         PID %CPU %MEM     SZ    RSS TTY STAT  STIME  TIME COMMAND
    scot     1658958  0.1  9.0 218016 214804   - A    Sep 13 38:16 db2agent (idle) 0
    dpf      1036486  0.0  1.0  14376  14068   - A    Sep 17  3:10 db2hmon 0
    scot     1822932  0.0  1.0  12196  11608   - A    Sep 12  6:41 db2hmon 0
    dpf      1011760  0.0  0.0   9264   9060   - A    Sep 17  3:03 db2hmon 3
    dpf      1532116  0.0  0.0   9264   9020   - A    Sep 17  3:04 db2hmon 2
    dpf       786672  0.0  0.0   9264   8984   - A    Sep 17  3:02 db2hmon 5
    dpf      1077470  0.0  0.0   9264   8968   - A    Sep 17  3:03 db2hmon 1
    dpf      1269798  0.0  0.0   9248   9044   - A    Sep 17  2:50 db2hmon 4
    db2inst1  454756  0.0  0.0   9012   7120   - A    Jul 19  0:52 db2sysc 0
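When ps output like Listing 1 has been captured to a file, the ranking can also be done numerically on a named column instead of relying on sort key positions. The following is only a sketch: the function name `top_by_column` is hypothetical, the sample text is copied from Listing 1, and the parser assumes the chosen column sits before STIME (whose value, such as "Sep 13", contains a space and shifts later fields).

```python
PS_SAMPLE = """\
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
scot 1658958 0.1 9.0 218016 214804 - A Sep 13 38:16 db2agent (idle) 0
dpf 1036486 0.0 1.0 14376 14068 - A Sep 17 3:10 db2hmon 0
scot 1822932 0.0 1.0 12196 11608 - A Sep 12 6:41 db2hmon 0
dpf 1011760 0.0 0.0 9264 9060 - A Sep 17 3:03 db2hmon 3
dpf 1532116 0.0 0.0 9264 9020 - A Sep 17 3:04 db2hmon 2
dpf 786672 0.0 0.0 9264 8984 - A Sep 17 3:02 db2hmon 5
dpf 1077470 0.0 0.0 9264 8968 - A Sep 17 3:03 db2hmon 1
dpf 1269798 0.0 0.0 9248 9044 - A Sep 17 2:50 db2hmon 4
db2inst1 454756 0.0 0.0 9012 7120 - A Jul 19 0:52 db2sysc 0
"""


def top_by_column(ps_text, column="%CPU", limit=10):
    """Return (value, user, pid) tuples sorted by the given numeric column, highest first."""
    lines = ps_text.strip().splitlines()
    index = lines[0].split().index(column)  # locate the column by its header name
    rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= index:
            continue  # skip truncated lines
        rows.append((float(fields[index]), fields[0], int(fields[1])))
    rows.sort(key=lambda row: row[0], reverse=True)  # numeric, not string, comparison
    return rows[:limit]


if __name__ == "__main__":
    for value, user, pid in top_by_column(PS_SAMPLE, "%CPU", limit=3):
        print(f"{user:<10} {pid:>8} {value:>5}")
```

Passing "%MEM" instead of "%CPU" ranks the same sample by memory share.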


    Through topas Command

    When executing a ps -ef command, you see the CPU usage of a certain process. You can also use the topas command to get further details. Similar to the ps command, the topas command retrieves selected statistics about the activity on the local system. Listing 2 is a sample topas output that shows a DB2 process consuming 33.3% CPU. You can use the topas output to get specific information such as the process ID, the CPU usage and the instance owner who started the process. It is normal to see several db2sysc processes for a single instance owner. DB2 processes are renamed depending on the utility being used to list process information:

    Listing 2. Sample topas output

    Name       PID  CPU%  PgSp Owner
    db2sysc 105428  33.3  11.7 udbtest
    db2sysc  38994  14.0  11.9 udbtest
    test     14480   1.4   0.0 root
    db2sysc  36348   0.8   1.6 udbtest
    db2sysc 116978   0.5   1.6 udbtest
    db2sysc 120548   0.5   1.5 udbtest
    sharon   30318   0.3   0.5 root
    lrud      9030   0.3   0.0 root
    db2sysc 130252   0.3   1.6 udbtest
    db2sysc 130936   0.3   1.6 udbtest
    topas   120598   0.3   3.0 udbtest
    db2sysc  62248   0.2   1.6 udbtest
    db2sysc  83970   0.2   1.6 udbtest
    db2sysc 113870   0.2   1.7 root
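Because several db2sysc processes usually belong to one instance owner, it can help to total the CPU% column per owner. The sketch below does this for the rows of Listing 2; the function name `cpu_by_owner` is hypothetical, and the parser assumes the five-column process section shown in the listing rather than the full topas screen.

```python
from collections import defaultdict

TOPAS_SAMPLE = """\
Name PID CPU% PgSp Owner
db2sysc 105428 33.3 11.7 udbtest
db2sysc 38994 14.0 11.9 udbtest
test 14480 1.4 0.0 root
db2sysc 36348 0.8 1.6 udbtest
db2sysc 116978 0.5 1.6 udbtest
db2sysc 120548 0.5 1.5 udbtest
sharon 30318 0.3 0.5 root
lrud 9030 0.3 0.0 root
db2sysc 130252 0.3 1.6 udbtest
db2sysc 130936 0.3 1.6 udbtest
topas 120598 0.3 3.0 udbtest
db2sysc 62248 0.2 1.6 udbtest
db2sysc 83970 0.2 1.6 udbtest
db2sysc 113870 0.2 1.7 root
"""


def cpu_by_owner(topas_text, process_name=None):
    """Sum CPU% per owner, optionally restricted to one process name."""
    totals = defaultdict(float)
    for line in topas_text.strip().splitlines()[1:]:  # skip the header row
        name, _pid, cpu, _pgsp, owner = line.split()
        if process_name is None or name == process_name:
            totals[owner] += float(cpu)
    # Round to one decimal to match the precision of the input column.
    return {owner: round(total, 1) for owner, total in totals.items()}


if __name__ == "__main__":
    print(cpu_by_owner(TOPAS_SAMPLE, "db2sysc"))
```

For the sample, the db2sysc processes owned by udbtest add up to roughly half of the CPU, which is the kind of summary worth noting alongside the per-process figures.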

    Through vmstat Command

    The vmstat command can be used to monitor CPU utilization; you can get details on the amount of user CPU utilization as well as system CPU usage. Listing 3 shows the output from a vmstat command:

    Listing 3. Sample vmstat output

    kthr  memory        page              faults           cpu
    ----- ------------- ----------------- ---------------- -----------
     r  b     avm   fre re pi po fr sr cy   in    sy    cs us sy id wa
    32  3 1673185 44373  0  0  0  0  0  0 4009 60051  9744 62 38  0  0
    24  0 1673442 44296  0  0  0  0  0  0 4237 63775  9214 67 33  0  0
    30  3 1678417 39478  0  0  0  0  0  0 3955 70833  8457 69 31  0  0
    33  1 1677126 40816  0  0  0  0  0  0 4101 68745  8336 68 31  0  0
    28  0 1678606 39183  0  0  0  0  0  0 4525 75183  8708 63 37  0  0
    35  1 1676959 40793  0  0  0  0  0  0 4085 70195  9271 72 28  0  0
    23  0 1671318 46504  0  0  0  0  0  0 4780 68416  9360 64 36  0  0
    30  0 1677740 40178  0  0  0  0  0  0 4326 58747  9201 66 34  0  0
    30  1 1683402 34425  0  0  0  0  0  0 4419 76528 10042 60 40  0  0
     0  0 1684160 33808  0  0  0  0  0  0 4186 72187  9661 73 27  0  0

    When reading vmstat output such as the above, you can ignore the first line. The important columns to look at are us, sy, id and wa:

    id: Time spent idle.
    wa: Time spent waiting for I/O.
    us: Time spent running non-kernel code (user time).
    sy: Time spent running kernel code (system time).

    In Listing 3, the system is averaging about 65% user CPU usage and 35% system CPU usage. The pi and po values are equal to 0, so there are no paging issues. The wa column shows that there do not seem to be any I/O issues.
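The averages quoted for Listing 3 can be checked with a few lines of Python. This is a sketch only: the function name `cpu_averages` and the `skip_first` flag are hypothetical, the sample rows are copied from Listing 3, and the parser assumes the 17-column layout shown there, taking the last four columns as us, sy, id and wa (the column name sy appears twice in the header, so position is used rather than name).

```python
VMSTAT_SAMPLE = """\
r b avm fre re pi po fr sr cy in sy cs us sy id wa
32 3 1673185 44373 0 0 0 0 0 0 4009 60051 9744 62 38 0 0
24 0 1673442 44296 0 0 0 0 0 0 4237 63775 9214 67 33 0 0
30 3 1678417 39478 0 0 0 0 0 0 3955 70833 8457 69 31 0 0
33 1 1677126 40816 0 0 0 0 0 0 4101 68745 8336 68 31 0 0
28 0 1678606 39183 0 0 0 0 0 0 4525 75183 8708 63 37 0 0
35 1 1676959 40793 0 0 0 0 0 0 4085 70195 9271 72 28 0 0
23 0 1671318 46504 0 0 0 0 0 0 4780 68416 9360 64 36 0 0
30 0 1677740 40178 0 0 0 0 0 0 4326 58747 9201 66 34 0 0
30 1 1683402 34425 0 0 0 0 0 0 4419 76528 10042 60 40 0 0
0 0 1684160 33808 0 0 0 0 0 0 4186 72187 9661 73 27 0 0
"""


def cpu_averages(vmstat_text, skip_first=False):
    """Average the us, sy, id and wa columns over the numeric rows."""
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Data rows have exactly 17 all-numeric fields; headers are skipped.
        if len(fields) == 17 and all(field.isdigit() for field in fields):
            samples.append([int(field) for field in fields[-4:]])
    if skip_first:
        samples = samples[1:]  # drop the first report if it is to be ignored
    if not samples:
        raise ValueError("no vmstat data rows found")
    names = ("us", "sy", "id", "wa")
    return {
        name: round(sum(sample[i] for sample in samples) / len(samples), 1)
        for i, name in enumerate(names)
    }


if __name__ == "__main__":
    print(cpu_averages(VMSTAT_SAMPLE))
```

Over the ten rows shown, this gives 66.4% user and 33.5% system time, in line with the rounded 65% and 35% figures.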

    Listing 4 shows the wa (waiting on I/O) value to be unusually high, which indicates that there might be I/O bottlenecks on the system, which in turn cause the CPU to be used inefficiently. You can check the errpt -a output to see if there are any reported issues with the media or I/O on the system.

    Listing 4. Sample vmstat output showing I/O issues

    kthr  memory      page                   faults          cpu
    ----- ----------- ---------------------- --------------- -----------
     r  b    avm  fre re pi po   fr    sr cy   in    sy   cs us sy id wa
     2  8 495803 3344  0  0  0  929  1689  0  998  6066 1832  4  3 76 16
     0 30 495807 3340  0  0  0    0     0  0 1093  4697 1326  0  2  0 98
     0 30 495807 3340  0  0  0    0     0  0 1055  2291 1289  0  1  0 99
     0 30 495807 3676  0  2  0  376   656  0 1128  6803 2210  1  2  0 97
     0 29 495807 3292  0  1  3 2266  3219  0 1921  8089 2528 14  4  0 82
     1 29 495810 3226  0  1  0 5427  7572  0 3175 16788 4257 37 11  0 52
     4 24 495810 3247  0  3  0 6830 10018  0 2483 10691 2498 40  7  0 53
     4 25 495810 3247  0  0  0 3969  6752  0 1900 14037 1960 33  5  1 61
     2 26 495810 3262  0  2  0 5558  9587  0 2162 10629 2695 50  8  0 42
     3 22 495810 3245  0  1  0 4084  7547  0 1894 10866 1970 53 17  0 30
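A quick way to see how persistent the I/O wait in Listing 4 is: count the samples whose wa value exceeds a chosen limit, together with the number of blocked kernel threads (the b column). The sketch below assumes the same 17-column vmstat layout; the function name `high_iowait_samples` is hypothetical, and the default limit of 25% is purely an illustrative assumption, not a documented threshold.

```python
VMSTAT_IO_SAMPLE = """\
r b avm fre re pi po fr sr cy in sy cs us sy id wa
2 8 495803 3344 0 0 0 929 1689 0 998 6066 1832 4 3 76 16
0 30 495807 3340 0 0 0 0 0 0 1093 4697 1326 0 2 0 98
0 30 495807 3340 0 0 0 0 0 0 1055 2291 1289 0 1 0 99
0 30 495807 3676 0 2 0 376 656 0 1128 6803 2210 1 2 0 97
0 29 495807 3292 0 1 3 2266 3219 0 1921 8089 2528 14 4 0 82
1 29 495810 3226 0 1 0 5427 7572 0 3175 16788 4257 37 11 0 52
4 24 495810 3247 0 3 0 6830 10018 0 2483 10691 2498 40 7 0 53
4 25 495810 3247 0 0 0 3969 6752 0 1900 14037 1960 33 5 1 61
2 26 495810 3262 0 2 0 5558 9587 0 2162 10629 2695 50 8 0 42
3 22 495810 3245 0 1 0 4084 7547 0 1894 10866 1970 53 17 0 30
"""


def high_iowait_samples(vmstat_text, wa_threshold=25):
    """Return (blocked_threads, wa) pairs for samples whose wa exceeds the threshold."""
    flagged = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) != 17 or not all(field.isdigit() for field in fields):
            continue  # header or malformed line
        blocked, wa = int(fields[1]), int(fields[-1])
        if wa > wa_threshold:  # threshold is an illustrative assumption
            flagged.append((blocked, wa))
    return flagged


if __name__ == "__main__":
    flagged = high_iowait_samples(VMSTAT_IO_SAMPLE)
    print(f"{len(flagged)} of 10 samples exceed the wa limit")
```

In the sample, nine of the ten rows are flagged, and each of them also shows more than twenty blocked threads.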

    Through iostat Command

    An iostat command quickly tells you if your system has a disk I/O-bound performance problem. Listing 5 is an example of an iostat command output:

    Listing 5. Sample iostat output

    System configuration: lcpu=4 disk=331

    tty:      tin         tout   avg-cpu:  % user    % sys     % idle    % iowait
              0.0        724.0               17.9     12.3       0.0       69.7

    Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
    hdisk119        100.0     5159.2    394.4      1560     24236
    hdisk115        100.0     5129.6    393.0      1656     23992
    hdiskpower26    100.0    10288.8    790.8      3216     48228

    %tm_act: Reports the percentage of time that the physical disk was active, or the total time of disk requests.
    Kbps: Reports the amount of data transferred to the drive in kilobytes per second.
    tps: Reports the number of transfers per second issued to the physical disk.
    Kb_read: Reports the total data (kilobytes) read from the physical volumes during your measured interval.
    Kb_wrtn: Reports the amount of data (kilobytes) written to the physical volumes during your measured interval.

    To check if you are experiencing resource contention, focus on the %tm_act value in the above output. An increase in this value, especially above 40%, implies that processes are waiting for I/O to complete and that you have an I/O issue on your hands. Checking which hard disks have the highest activity percentage, and whether DB2 uses those disks, gives you a better idea of whether the two factors are related.
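The 40% rule of thumb can be applied mechanically to captured iostat output. The following is a sketch under stated assumptions: the function name `busy_disks` is hypothetical, the sample is the disk section of Listing 5, and the parser assumes six whitespace-separated fields per disk line following a line that starts with "Disks:".

```python
IOSTAT_SAMPLE = """\
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk119 100.0 5159.2 394.4 1560 24236
hdisk115 100.0 5129.6 393.0 1656 23992
hdiskpower26 100.0 10288.8 790.8 3216 48228
"""


def busy_disks(iostat_text, tm_act_threshold=40.0):
    """Return (disk, tm_act) pairs whose %tm_act is above the threshold."""
    busy = []
    in_disk_section = False
    for line in iostat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "Disks:":
            in_disk_section = True  # disk rows follow this header
            continue
        if in_disk_section and len(fields) == 6:
            name, tm_act = fields[0], float(fields[1])
            if tm_act > tm_act_threshold:
                busy.append((name, tm_act))
    return busy


if __name__ == "__main__":
    for name, tm_act in busy_disks(IOSTAT_SAMPLE):
        print(f"{name}: {tm_act}% active")
```

The resulting disk names can then be compared against the disks that hold the DB2 containers and logs, as the paragraph above suggests.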


    What to collect

    You should collect the following information before opening a PMR with IBM Technical Support:

    *db2support.zip

    *of high cpu process

    *of high cpu process

    Technical support might also send you the db2service.perf1 script, which collects data repeatedly over a period of time. The output of the script needs to be bundled and sent back to the support team for further analysis.

    Troubleshoot orphan processes

    There are scenarios when, even after doing a db2stop, you notice (by doing a ps -ef | grep db2) certain DB2 processes, such as the db2fmp process, still running and consuming resources. If there was an abnormal shutdown, it is advised to run ipclean after the instance has been stopped. Doing a db2stop should inherently shut down all DB2-related processes; however, if an application using those processes was abnormally terminated, the related DB2 processes might become orphan processes.

    Orphan DB2 processes are those that are not attached or linked to any other DB2 process. Abnormal termination of an application includes shutting it down with Ctrl+C, closing the KSH session, or killing it with the -9 option.

    One way of confirming that a process is orphaned is to try to match its process ID (PID) from the ps -ef output with the Coordinator pid/thread column of the db2 list applications show detail output. If the PID cannot be found in the db2 list applications output, then it is an orphan process. For example, if you issue a db2 list applications show detail command, you get this output:

    Listing 6. Sample list applications output

    (columns shown one per line for readability)

    CONNECT Auth Id                  : JDE

    Application Name                 : test.exe

    Appl. Handle                     : 2079

    Application Id                   : AC1C5C38.G80D.011F44162421

    Seq#                             : 0001

    Number of Agents                 : 1

    Coordinating DB partition number : 0

    Coordinator pid/thread           : 2068646

    Status                           : UOW Waiting

    Status Change Time               : 04/04/2006 09:25:17.036230

    DB Name                          : PTPROD

    DB Path                          : /db2pd/otprod/ptprod/otprod/NODE0000/SQL00001/

    --NOTICE PID 2068646. This is the PID on the local server.

    Part of the ps -ef output from the server:

    ps -ef |grep 2068646

    otprod 2068646 483566 0 09:06:28 - 0:59 db2agent (PTPROD) 0

    This output shows that the process with PID 2068646 is not an orphaned process and is still attached to a DB2 process.
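The comparison described above can be sketched as a set difference between two PID lists. This is only an illustration: the function name orphan_pids and the two input files are hypothetical, and extracting the PIDs from the ps -ef and db2 list applications show detail output is left to you because the column layout varies between releases.

```sh
# orphan_pids PS_PIDS_FILE COORDINATOR_PIDS_FILE   (hypothetical helper)
# Prints every PID in PS_PIDS_FILE (one per line, taken from ps -ef) that
# does not appear in COORDINATOR_PIDS_FILE (one per line, taken from the
# Coordinator pid/thread column of "db2 list applications show detail").
orphan_pids() {
    # -v inverts the match, -x matches whole lines only, -F treats the
    # patterns as fixed strings, -f reads the patterns from a file.
    grep -vxFf "$2" "$1"
}

# Possible use:
#   orphan_pids ps_pids.txt coordinator_pids.txt
```

A PID that shows up in the result is only a candidate orphan; confirm it against the full ps -ef line before taking any action.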

    In order to avoid orphan processes, you may want to do the following: Make normal, clean exits at the client side so that DB2 is aware and can clean up resources on the server. Tweak values of TCPKEEPIDLE time to a number less than the default, and tune the DB2CHECKCLIENTINTERVAL and KEEPALIVE values.

    What to collect

    If you do notice orphan processes and wish to investigate this issue, you should collect the following information before opening a PMR with IBM Technical Support:

    - grep db2 output

    -db2support.zip with -c option

    - A call stack of the process, collected using dbx, db2pd -stack or kill -36 <pid>. The dbx command is a popular command-line debugger used on both Solaris and AIX systems. The dbx output is helpful and can be collected as follows:

    Listing 7. The dbx command

    dbx -a <pid>

    At the dbx prompt type

    th --- Displays all threads for the process

    th info --- Displays additional info about the threads

    where --- Get stack trace for thread 1

    th current 1 --- Makes t1 current

    where --- Displays stack for thread 1

    th current 2 --- Makes thread 2 current

    where --- Displays stack for thread 2.

    ... continue for all threads of the process

    detach - --- Detach from process

    dbx -a

    Detect database corruption

    You can start to investigate whether the database is corrupted if a user complains of not being able to access certain database objects or is unable to connect to a specific database partition. The following section highlights some of the errors that are logged by DB2 and how you can ensure that there are no operating system (OS) level issues affecting or causing DB2 database corruption. You might notice errors similar to those in Listing 8 being logged in the db2diag.log:

    Listing 8. Corruption errors

    RETCODE : ZRC=0x87040001=-2029780991=SQLD_BADPAGE "Bad Data Page"

    DIA8500C A data file error has occurred, record id is "".

    Or

    RETCODE: ZRC=0x86020019=-2046689255=SQLB_CSUM "Bad Page, Checksum Error"

    DIA8426C A invalid page checksum was found for page "".

    Or

    2007-07-09-11.29.45.696176+120 I16992C16377 LEVEL: Severe

    PID : 68098 TID : 1 PROC : db2agent (sample)

    INSTANCE: instest NODE : 000 DB : sample

    APPHDL : 0-635 APPID: *LOCAL.instest.070709082609

    FUNCTION: DB2 UDB, buffer pool services, sqlbcres, probe:20

    MESSAGE : Important: CBIT Error

    DATA #1 : Hexdump, 4096 bytes

    These errors are logged when DB2 tries to access data in a container and there is some form of corruption. In such an instance, when DB2 cannot access the data, the database might be marked as bad. You can narrow down where the corruption might be. In the db2diag.log, look for messages similar to the following:

    Listing 9. Corruption errors showing database object details

    2006-04-15-03.15.37.271601-360 I235258C487 LEVEL: Error

    PID : 152482 TID : 1 PROC : db2reorg (SAMPLE) 0

    INSTANCE: instest NODE : 000 DB : SAMPLE

    APPHDL : 0-68 APPID: *LOCAL.SAMPLE.060415091532

    FUNCTION: DB2 UDB, buffer pool services, sqlbrdpg, probe:1146

    DATA #1 : String, 124 bytes

    Obj={pool:5;obj:517;type:0} State=x27 Parent={5;517}, EM=55456,

    PP0=55488 Page=55520 Cont=0 Offset=55552 BlkSize=12

    BadPage

    The above errors indicate corruption has occurred in tablespace:5 and tableid:517. To check which table this refers to, execute the following SQL query:

    Listing 10. Query to find a table with corruption

    db2 "select tabname, tbspace from syscat.tables where tbspaceid = 5 and tableid = 517"

    At the operating system (OS) level, the most common causes of corruption are either hardware issues or file system corruption. For example, in the db2diag.log you might see the database being marked damaged with an ECORRUPT (89) error as follows:

    Listing 11. Sample file system-related corruption errors

    2007-05-22-13.45.52.268785-240 E20501C453 LEVEL: Error (OS)

    PID : 1646696 TID : 1 PROC : db2agent (SAMPLE) 0

    INSTANCE: tprod NODE : 000 DB : SAMPLE

    APPHDL : 0-32 APPID: GA260B45.M505.012BC2174219

    FUNCTION: DB2 UDB, oper system services, sqloopenp, probe:80

    CALLED : OS, -, unspecified_system_function

    OSERR : ECORRUPT (89) "Invalid file system control data detected."

    You can check the following:

    1. Review the errpt -a output and look for hardware I/O or disk-related messages. Listing 12 is an example of errpt -a output which shows a file system corruption:

    Listing 12. Sample errpt output

    LABEL: J2_FSCK_REQUIRED

    IDENTIFIER: B6DB68E0

    Date/Time: Thu Jun 7 20:59:49 DFT 2007

    Sequence Number: 139206

    Machine Id: 000BA256D600

    Node Id: cmab

    Class: O

    Type: INFO

    Resource Name: SYSJ2

    Description

    FILE SYSTEM RECOVERY REQUIRED

    Probable Causes

    INVALID FILE SYSTEM CONTROL DATA DETECTED

    Recommended Actions

    PERFORM FULL FILE SYSTEM RECOVERY USING FSCK UTILITY

    OBTAIN DUMP

    CHECK ERROR LOG FOR ADDITIONAL RELATED ENTRIES

    Detail Data

    ERROR CODE

    0000 0005

    JFS2 MAJOR/MINOR DEVICE NUMBER

    0032 0004

    CALLER

    0028 8EC8

    CALLER

    0025 D5E4

    CALLER

    002B 4AC8

    2. Run the fsck command on the file system where the container resides to be sure that it is sound. fsck interactively checks and repairs any file system malfunction. The pSeries and AIX Information Center gives the following examples of using the fsck command.

    Listing 13. The fsck command

    To check all the default file systems enter:

    fsck

    This form of the fsck command asks you for permission

    before making any changes to a file system.

    To check the file system /dev/hd1, enter:

    fsck /dev/hd1

    This checks the unmounted file system located on the /dev/hd1 device.
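Before running fsck, it can help to confirm that the error log really contains a file system recovery request. The sketch below counts such entries in errpt -a output; the function name fsck_required_count is hypothetical, and the only assumption about the format is that each entry has a line of the form "LABEL: J2_FSCK_REQUIRED", as in Listing 12.

```sh
# fsck_required_count   (hypothetical helper)
# Reads "errpt -a" output on stdin and prints how many entries carry the
# J2_FSCK_REQUIRED label shown in Listing 12.
fsck_required_count() {
    awk '$1 == "LABEL:" && $2 == "J2_FSCK_REQUIRED" { n++ } END { print n + 0 }'
}

# Possible use on AIX:
#   errpt -a | fsck_required_count
```

A non-zero count suggests checking the affected file system with fsck while it is unmounted, as described above.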

    What to collect

    You should collect the following information before opening a PMR with IBM Technical Support:

    1. errpt -a

    2. db2support.zip

    3. fsck results

    Debug memory leaks

    It is important to distinguish, if possible, between a memory leak and a system-wide performance degradation due to increased demands for memory. So first check that nothing has changed in the environment that could explain increased memory usage. The rest of this section discusses how to use AIX operating system techniques to spot, track and debug those leaks. The article does not discuss detailed DB2 tools and techniques, although there is some mention where necessary.

    What is a memory leak?

    A memory leak is a particular kind of unintentional memory consumption in which a program fails to release memory that it no longer needs. This condition is normally the result of a bug in the program that prevents it from freeing the memory. The term is meant as a humorous misnomer, since memory is not physically lost from the computer. Rather, memory is allocated to a program, and that program subsequently loses the ability to access it due to program logic flaws.

    Specifically, it is a bug in the code whereby malloc() memory allocation calls are not matched by corresponding free() calls, so the allocated blocks are never freed. Typically this is a slow process and occurs over days or weeks, particularly if the process is left active, as is often the case. Some leaks are not even detectable, particularly if the application terminates and its processes are destroyed.

    Listing 14 is an example of a C code snippet that demonstrates a memory leak. In this instance, memory was available and pointed to by the variable 's', but the pointer was not saved. After this function returns, the pointer is destroyed and the allocated memory becomes unreachable, but it remains allocated.

    Listing 14. Sample C code

    #include <stdio.h>

    #include <stdlib.h>

    void f(void)

    {

    void* s;

    s = malloc(50); /* get memory */

    return; /* memory leak - see note below */

    /*

    * Memory was available and pointed to by s, but not saved.

    * After this function returns, the pointer is destroyed,

    * and the allocated memory becomes unreachable.

    *

    * To "fix" this code, either the f() function itself

    * needs to add "free(s)" somewhere or the s needs

    * to be returned from the f() and the caller of f() needs

    * to do the free().

    */

    }

    int main(void)

    {

    /* this is an infinite loop calling the above function */

    while (1) f(); /* Malloc will return NULL sooner or later, due to lack of memory */

    return 0;

    }

    How to spot, track and debug memory leaks

    To begin with, you should call IBM if you suspect a DB2 process is leaking memory. But how do you know that you are experiencing this situation? This section discusses some of the options.

    The first option is to use the ps utility. The ps utility can be used to quickly and simply determine if a process is leaking. This example demonstrates how a particular process is growing in size:

    Listing 15. Sample 'ps aux' output showing the process growing in size

    ps aux:

    1st iteration:

    USER     PID    %CPU %MEM SZ    RSS   TTY STAT STIME    TIME  COMMAND

    db2inst1 225284 0.2  0.0  19468 18280 -   A    11:26:06 10:34 db2logmgr

    2nd iteration:

    db2inst1 225284 0.1  0.0  19696 18512 -   A    11:26:06 10:34 db2logmgr

    3rd iteration:

    db2inst1 225284 0.1  0.0  19908 18724 -   A    11:26:06 10:36 db2logmgr

    4th iteration:

    db2inst1 225284 0.1  0.0  20116 18932 -   A    11:26:06 10:36 db2logmgr

    5th iteration:

    db2inst1 225284 0.1  0.0  20312 19128 -   A    11:26:06 10:37 db2logmgr

    ps -kelf:

    1st iteration:

    F     S UID      PID    PPID   C PRI NI ADDR      SZ    WCHAN STIME    TTY TIME  CMD

    40001 A db2inst1 225284 254158 0 60  20 580e59400 18466       11:26:06 -   10:34 db2logmgr (***) 0

    2nd iteration:

    40001 A db2inst1 225284 254158 1 60  20 580e59400 18696       11:26:06 -   10:34 db2logmgr (***) 0

    3rd iteration:

    40001 A db2inst1 225284 254158 0 60  20 580e59400 18900       11:26:06 -   10:36 db2logmgr (***) 0

    4th iteration:

    40001 A db2inst1 225284 254158 0 60  20 580e59400 20106       11:26:06 -   10:36 db2logmgr (***) 0

    5th iteration:

    40001 A db2inst1 225284 254158 0 60  20 580e59400 20312       11:26:06 -   10:37 db2logmgr (***) 0

    The SZ and RSS values in the ps aux output are the two key columns to focus on when trying to spot a potential memory leak. As you can see, both values increase with every iteration. This alone is not sufficient to determine the root cause, however, and more debugging is certainly required. Again, please raise this issue with IBM Technical Support; what follows are some likely problem determination steps IBM will take.

    Debug using procmap and gencore

    As root:

    1. procmap <pid> > procmap.1

    2. ps aux > ps_aux.1

    3. ps -kelf > ps_kelf.1

    4. gencore <pid> <file>, then sleep for a period of time

    Then:

    1. procmap <pid> > procmap.2

    2. ps aux > ps_aux.2

    3. ps -kelf > ps_kelf.2

    4. gencore <pid> <file>

    Then repeat these steps for another 2 or 3 iterations. Please note that on 64-bit AIX, gencore creates very large files. Regardless of the word size, fullcore needs to be enabled. The following command can be used to check that the environment is set up correctly:

    Listing 16. The lsattr command

    lsattr -El sys0 | grep -i core

    fullcore true Enable full CORE dump True
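If you want to script that check, the sketch below reads the lsattr output and succeeds only when fullcore is reported as true. The function name fullcore_enabled is hypothetical, and it assumes the attribute name and value are the first two fields of the line, as in Listing 16.

```sh
# fullcore_enabled   (hypothetical helper)
# Reads "lsattr -El sys0" output on stdin and succeeds only if the fullcore
# attribute (field 1) has the value true (field 2).
fullcore_enabled() {
    awk '$1 == "fullcore" { found = ($2 == "true") } END { exit found ? 0 : 1 }'
}

# Possible use on AIX:
#   lsattr -El sys0 | fullcore_enabled || echo "fullcore is not enabled"
```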

    The limits for the instance owner also need to be set appropriately. You may well be asked to enable MALLOCDEBUG and export it to the DB2 environment. What follows is an example of this:

    To start DB2 memory debugging the next time the instance is started, run: db2set DB2MEMDBG=FFDC

    To start malloc debugging the next time the instance is started, run: export MALLOCDEBUG=log:extended,stack_depth:12

    Then append MALLOCDEBUG to the DB2 registry variable DB2ENVLIST: db2set DB2ENVLIST=MALLOCDEBUG

    Then stop and restart DB2.

    Once the core files have been created, you can use snapcore to bundle the core files and libraries into a pax file. An example of snapcore is as follows:

    Listing 17. Sample snapcore

    snapcore /home/db2inst1/sqllib/db2dump/c123456/core \
        /home/db2inst1/sqllib/adm/db2sysc

    This creates a file with a *.pax extension in /tmp/snapcore by default. The core file is useless without the executable that cored. In this case that was db2sysc, not db2logmgr (the one seen to be growing), because db2logmgr is a process, not an executable. DB2 support is then able to interrogate the core to track the DB2 malloc() allocations against free() calls.

    Recover from hangs

    What is a hang?

    A hang occurs when a process has not moved forward or changed after a period of time. This can happen if a thread or process reaches a point in its execution where it can go no further and is waiting for a response. It also occurs when the process is in a very tight loop and never completes the function.

    The first step is to identify whether what you are experiencing is a hang or a severe degradation. Then you need to understand what is affected, or the scope. Some simple questions can help a lot:

    * Why do you think it has hung?

    * Are all DB2 commands hanging?

    * How long has the command been running for?

    * How long does it normally run for?

    Then, to assess the scope:

    * Are OS commands hanging too? If the answer to this is yes, then you need to get assistance from the AIX support team.

    * Are db2 connect statements affected?

    * Can SQL be issued over existing connections?

    * If in a DPF environment, can you issue commands against other partitions?

    * Can you issue commands against other databases?

    Recovery

    Remember, please collect the stacks before you recover. Once you have the stacks, the only choice you have is to issue db2_kill. Then check for any processes and IPC resources (shared memory, message queues and semaphores) left lying around after the kill. You may have to remove any you find manually. You could also try ipclean to remove these resources. If the IPCs are not cleared out by ipclean or ipcrm and the processes are not removed by kill -9, then the process is most likely hung in the kernel and you need to call AIX support.

    Once it has come down, restart with db2start and then do a restart db command.

    What to collect

    The single most important piece of information to collect is a stack trace of the process that is believed to be hung. IBM DB2 support cannot debug a hang without this, and the stack trace must be collected prior to recovering DB2. If this is not done, you may have another outage in the future.

    There will be pressure to restart DB2, but you must resist. The system must be in a hung state in order to diagnose the root cause of the problem and do the necessary debugging. A restart clears the situation and you have lost the window of opportunity to make the necessary changes. More seriously, you cannot provide any confidence that it won't recur. Thus, you need to resist the pressure to restart DB2 until you have collected all the diagnostics.

    The following lists compare good problem determination (PD) and data capture with bad PD and data capture. Note that the best PD and data capture requires the fewest steps and has a better chance of success in determining root cause.

    Poor PD and data capture:

    * Occurrence

    * Detection

    * Recovery

    * FFDC on (requires restart)

    * Restart (outage #2). Schedule outage; hopefully the problem does not reoccur before

    * Occurrence (outage #3)

    * Detection

    * Data Collection

    * Recovery

    * Diagnosis (clock ticking)

    Better PD and data capture:

    * Occurrence (outage #1)

    * Detection

    * Recovery

    * FFDC on

    * Occurrence (outage #2)

    * Detection

    * Data Collection

    * Recovery

    * Diagnosis (clock ticking)

    Good PD and data capture:

    * Occurrence (outage #1)

    * Detection

    * Data Collection

    * Recovery

    * Diagnosis (clock ticking)

    Stack traces

    A stack trace is a snapshot of the function calls at a particular point in time. So multiple stack traces, a few minutes apart, provide a sense of motion. There are a variety of ways to collect stack traces; the following are, in my opinion, the most reliable:

    procstack <pid> >> pid.pstack.out

    This is an AIX utility that just dumps the stack to a file. In this instance, I am appending to the file because the command is run again later and I do not want to overwrite the earlier output.

    kill -36 <pid>

    This command does not kill the process; it sends a signal that makes the process dump its stack. This creates a fully formatted trap file in the DIAGPATH area of DB2. Because it gives more information than procstack, and because of the way it works internally, it is generally more expensive, particularly if there are hundreds of processes, which is often the case. The main focus of this article is to discuss AIX operating system tools to debug DB2, but no discussion of hang problem determination is complete without mentioning db2pd, so the following invocations can be used to generate stack traces:

    db2pd -stacks (This generates stack dumps for all PIDs)

    db2pd -stack <pid> (This generates a stack dump for the PID specified)

    The trap file is created in the DIAGPATH area. Listing 18 shows an example of its usage:

    Listing 18. db2pd -stacks usage

    1. -stacks

    $ db2pd -stacks

    Attempting to dump all stack traces for instance.

    See current DIAGPATH for trapfiles.

    2. -stack

    $ db2pd -stack 1454326

    Attempting to dump stack trace for pid 1454326.

    See current DIAGPATH for trapfile.

    DB2 support will ask you to tar and compress the DIAGPATH area. Most commonly they will ask you to run a db2support command, which does it for you, provided the correct flags are used. However, if you use the OS method of procstack, you have to submit the output files yourself.

    Truss

    The truss command can be used but is not as effective as a stack dump and is only likely to reveal anything if the process is looping and the problem can be reproduced. If the process is hung, only a stack dump can reveal how it got there.

    ps

    It is also a good idea to collect ps listings for all partitions, if applicable, before and after the stack dumps. If you collect the data manually, the pseudo-code looks like this:

    Listing 19. procstack

    procstack <pid> >> procstack.out    # repeat for each PID of interest

    ps -eafl >> pseafl.out

    ps aux >> psaux.out

    sleep 120

    Repeat for at least 3 iterations.

    Or:

    kill -36 <pid>                      # repeat for each PID of interest

    ps -eafl >> pseafl.out

    ps aux >> psaux.out

    sleep 120

    Repeat for at least 3 iterations.

    NB: IBM DB2 support can provide a data collect script which automates this process.
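Until you have that script, the manual loop in Listing 19 can be sketched as a small shell function. This is an illustration, not the IBM data collection script: the function name collect_stacks, its arguments, and the DRY_RUN switch are hypothetical, while the commands and output file names are the ones from Listing 19.

```sh
# collect_stacks PID [ITERATIONS] [INTERVAL_SECONDS]   (hypothetical helper)
# Appends a stack dump and two ps listings to files in the current
# directory, then sleeps, for the requested number of iterations.
# Set DRY_RUN=1 to print the commands instead of running them.
collect_stacks() {
    pid=$1
    iterations=${2:-3}
    interval=${3:-120}
    i=1
    while [ "$i" -le "$iterations" ]; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "procstack $pid >> procstack.out"
            echo "ps -eafl >> pseafl.out"
            echo "ps aux >> psaux.out"
        else
            procstack "$pid" >> procstack.out 2>&1
            ps -eafl >> pseafl.out 2>&1
            ps aux >> psaux.out 2>&1
            # No need to wait after the final sample.
            if [ "$i" -lt "$iterations" ]; then
                sleep "$interval"
            fi
        fi
        i=$((i + 1))
    done
}

# Possible use on AIX, run before any recovery action:
#   collect_stacks 1454326 3 120
```

Running it once with DRY_RUN=1 is a cheap way to confirm the PID and file names before collecting data from a hung system.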

    Investigate unresponsive applications

    Sometimes applications are merely unresponsive, and you have to figure out why and how to get them to respond. If you issue a force application and the application does not respond, you may be left wondering what you can do. First of all, it is important to know that force makes no guarantees to force. It is simply a wrapper around an OS kill command.

    Without going into the architectural details of DB2, there are some situations in which it is dangerous to force. In those situations, the db2agent sets its priority level to be higher than that of the force. Under these circumstances, force does not work, and this is by design.

    The bottom line is that not every unresponsive application is caused by a bug. It is possible that the application is just doing something important and will not respond to any additional commands until it completes its current task.

    Recovery

    Recovery almost certainly requires a db2stop and db2start, as DB2 does not take kindly to key engine processes being killed. It tends to invoke panic and bring the instance down. I would assess the impact the rogue application is having and, if possible, leave it in situ until you can recycle. It may be holding locks that are contending with other users, for example, and this is adversely affecting the application, in which case you may have to take an outage to remove it.

    What to collect

    The debugging of an unresponsive application is treated in the same way as a hang, but clearly the scope is narrower. You need to collect the following elements to send to IBM Technical Support:

    - Iterative stack traces of the db2agent or DB2 process that is unresponsive.

    - ps listings and other items, like: db2level, dbm cfg, db cfg, db2diag.log and possibly an application snapshot.

    Conclusion

    Problem determination in DB2 is made simpler because of the tools and utilities available in AIX. Often it is necessary to use both AIX and DB2 tools and commands to figure out what the problem is. This article discusses some of the problems associated with troubleshooting in DB2 and has hopefully given you the tools you need to fix your database.

    JOB INTERVIEW QUESTIONS

    AIX Interview Questions and Answers

    November 12, 2009, 5:36 PM

    Category: AIX Interview Questions & Answers, DB2 Interview Questions, Database Interview Questions, Unix Interview Questions

    I collected some useful interview questions from various sites and I thought these questions might help our readers or job seekers to strengthen their knowledge. Most of the questions are AIX, HACMP, and network related. Enjoy.

    Can you explain the steps to mirror rootvg in your environment?

    Mirroring "rootvg" protects the operating system from a disk failure. Mirroring "rootvg" requires a couple of extra steps compared to other volume groups. The mirrored rootvg disk must be bootable *and* in the bootlist. Otherwise, if the primary disk fails, you'll continue to run, but you won't be able to reboot.

    In brief, the procedure to mirror rootvg on hdisk0 to hdisk1 is:

    1. Add hdisk1 to rootvg: extendvg rootvg hdisk1

    2. Mirror rootvg to hdisk1: mirrorvg rootvg hdisk1 (or smitty mirrorvg)

    3. Create boot images on hdisk1: bosboot -ad /dev/hdisk1

    4. Add hdisk1 to the bootlist: bootlist -m normal hdisk0 hdisk1

    5. Reboot to disable quorum checking on rootvg. The mirrorvg command turns off quorum by default, but the system needs to be rebooted for it to take effect.
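The four commands above can be sketched as a function that prints the sequence for a given pair of disks, so that you can review it before running anything as root. The function name mirror_rootvg_plan is hypothetical, and the disk names default to the hdisk0 and hdisk1 used in the example; nothing is executed.

```sh
# mirror_rootvg_plan [SOURCE_DISK] [TARGET_DISK]   (hypothetical helper)
# Prints, in order, the commands from steps 1 to 4 above. Nothing is
# executed; review the output and run the commands yourself as root,
# then schedule the reboot from step 5.
mirror_rootvg_plan() {
    src=${1:-hdisk0}
    dst=${2:-hdisk1}
    echo "extendvg rootvg $dst"
    echo "mirrorvg rootvg $dst"
    echo "bosboot -ad /dev/$dst"
    echo "bootlist -m normal $src $dst"
}

# Possible use:
#   mirror_rootvg_plan hdisk0 hdisk1
```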

    What is a VPN and how does it work?

    A VPN is a private network that uses a public network (usually the Internet) to connect remote sites or users together. Instead of using a dedicated, real-world connection such as a leased line, a VPN uses "virtual" connections routed through the Internet from the company's private network to the remote site or employee.

    What is a daemon?

    A daemon (pronounced DEE-muhn) is a program that runs continuously and exists for the purpose of handling periodic service requests that a computer system expects to receive. The daemon program forwards the requests to other programs (or processes) as appropriate. Each server of pages on the Web has an HTTPD, or Hypertext Transfer Protocol daemon, that continually waits for requests to come in from Web clients and their users.

    There are several daemons in the AIX environment, such as sshd, inetd, and so on.

    Can you describe SAN in your own words?

    A storage area network (SAN) is a high-speed special-purpose network (or subnetwork) that interconnects different kinds of data storage devices with associated data servers on behalf of a larger network of users. Typically, a storage area network is part of the overall network of computing resources for an enterprise. A storage area network is usually clustered in close proximity to other computing resources such as IBM Power5 boxes but may also extend to remote locations for backup and archival storage, using wide area network carrier technologies such as ATM or SONET.

    A storage area network can use existing communication technology such as IBM's optical fiber ESCON, or it may use the newer Fibre Channel technology. Some SAN system integrators liken it to the common storage bus (flow of data) in a personal computer that is shared by different kinds of storage devices such as a hard disk or a CD-ROM player.

    SANs support disk mirroring, backup and restore, archival and retrieval of archived data, data migration from one storage device to another, and the sharing of data among different servers in a network. SANs can incorporate subnetworks with network-attached storage (NAS) systems.

    So you mentioned NAS, but what is NAS?

    Network-attached storage (NAS) is hard disk storage that is set up with its own network address rather than being attached to the department computer that is serving applications to a network's workstation users. By removing storage access and its management from the department server, both application programming and files can be served faster because they are not competing for the same processor resources. The network-attached storage device is attached to a local area network (typically, an Ethernet network) and assigned an IP address. File requests are mapped by the main server to the NAS file server.

    Network-attached storage consists of hard disk storage, including multi-disk RAID systems, and software for configuring and mapping file locations to the network-attached device. Network-attached storage can be a step toward, and included as part of, a more sophisticated storage system known as a storage area network (SAN).

    NAS software can usually handle a number of network protocols, including Microsoft's Internetwork Packet Exchange and NetBEUI, Novell's Netware Internetwork Packet Exchange, and Sun Microsystems' Network File System. Configuration, including the setting of user access priorities, is usually possible using a Web browser.

    What is SMTP and how does it work?

    SMTP (Simple Mail Transfer Protocol) is a TCP/IP protocol used in sending and receiving e-mail. However, since it is limited in its ability to queue messages at the receiving end, it is usually used with one of two other protocols, POP3 or IMAP, that let the user save messages in a server mailbox and download them periodically from the server. In other words, users typically use a program that uses SMTP for sending e-mail and either POP3 or IMAP for receiving e-mail. On Unix-based systems, sendmail is the most widely used SMTP server for e-mail. A commercial package, Sendmail, includes a POP3 server. Microsoft Exchange includes an SMTP server and can also be set up to include POP3 support.

    SMTP usually is implemented to operate over Internet port 25.

    Do you have any idea about NAT?

    Short for Network Address Translation, an Internet standard that enables a local-area network (LAN) to use one set of IP addresses for internal traffic and a second set of addresses for external traffic. A NAT box located where the LAN meets the Internet makes all necessary IP address translations.

    NAT serves three main purposes:

    * Provides a type of firewall by hiding internal IP addresses.

    * Enables a company to use more internal IP addresses. Since they're used internally only, there's no possibility of conflict with IP addresses used by other companies and organizations.

    * Allows a company to combine multiple ISDN connections into a single Internet connection.

    Explain DHCP and its uses to an environment?

    Short for Dynamic Host Configuration Protocol, a protocol for assigning dynamic IP addresses to devices on a network. With dynamic addressing, a device can have a different IP address every time it connects to the network. In some systems, the device's IP address can even change while it is still connected. DHCP also supports a mix of static and dynamic IP addresses.

    Dynamic addressing simplifies network administration because the software keeps track of IP addresses rather than requiring an administrator to manage the task. This means that a new computer can be added to a network without the hassle of manually assigning it a unique IP address. Many ISPs use dynamic IP addressing for dial-up users.

    What does SNMP stand for?

    Short for Simple Network Management Protocol, a set of protocols for managing complex networks. SNMP works by sending messages, called Protocol Data Units, to different parts of a network. SNMP-compliant devices, called Agents, store data about themselves in Management Information Bases and return this data to the SNMP requesters.

    What do you know about TCPDump?

    Tcpdump is a common computer network debugging tool that runs on the command line. It allows the user to intercept and display TCP/IP and other packets being transmitted or received over a network to which the computer is attached. Tcpdump works on most Unix-like platforms: Linux, Solaris, BSD, Mac OS X, HP-UX and AIX among others. On Windows, WinDump can be used; it's a port of tcpdump to Windows.

    You must have root or superuser authority to use tcpdump in a UNIX-like environment.

    How do I remove a volume group with no disks?

    This is a very common question about AIX LVM and I knew that you would ask me this one. Within a volume group there is a Volume Group Descriptor Area (VGDA), which is kind of a "suitcase" of LVM information. This is what allows you to pick up your drives and take them to another machine, importvg them, and get filesystems automatically defined.

    What happens is that when you importvg the volume group, the command goes out and reads the VGDA and finds out about all the logical volumes and filesystems that may exist on the volume group. It then checks for clashes (name conflicts, etc.) on its own machine and then populates its own database with information about the new volume group and its associated logical volumes. In the case of file systems, it will go into the /etc/filesystems file and add the new filesystem entries that came along with the imported volume group.

    The main question I see is "I've taken away the disks, but how do I get rid of the volume group". The question should really say, "How do I get rid of the volume group INFORMATION", since that's all you have on the system. You've got possible entries in the /etc/filesystems file and definitely entries in the ODM. Just do: exportvg <vgname>

    It does a reverse importvg, except it doesn't go off and read the VGDA. It nukes anything relating to the volume group in the /etc/filesystems file and the ODM. The only time this won't work is if the system detects that the volume group is varied on. Then, it would be like trying to change tires on a moving car; we won't let you do it!

    How do you get rid of a disk that is no longer really in the VG?

    In this case, you DON'T want to do an exportvg. What you want to do is tell the system to cut out the memory of the old, bad disk from the RS/6000 AND from the VGDA of the volume group. You simply do:

    reducevg -d -f <vgname> <hdiskname>

    or, if the hdisk name can't be found:

    reducevg -d -f <vgname> <PVID>

    Be careful with this command. Unlike the exportvg command, actions done with this command WILL affect the VGDA information on the platter.

    What is Capacity on Demand?

    Capacity on Demand (CoD) covers the various capabilities that let you dynamically activate one or more resources on your server as your business peaks dictate. You can activate inactive processors or memory units that are already installed on your server on either a temporary or a permanent basis.

    Capacity on Demand is usually used on IBM System i5 and eServer i5 and on IBM System p5 and eServer p5 520, 550, 570, 590, and 595 models. Some servers include a number of active and inactive resources. Active processors and active memory units are resources that are available for use on your server when it comes from the manufacturer. Inactive processors and inactive memory units are resources that are included with your server but are not available for use until you activate them.

    What is Hardware Management Console (HMC)?

    The HMC is a server or standalone machine that provides a graphical user interface for managing several Power Systems servers. The HMC manages a system through the hypervisor and the operating system. From version 7 it is fully web based, and you can configure, install, manage, partition, and virtualize most of your Power5 and Power6 boxes via the HMC. There are many tasks you can do with the HMC, such as:

    * Powering partitions off and on
    * Configuring and activating resources on the system
    * Creating and storing LPAR profiles and allocating resources to them
    * Performing dynamic memory reconfiguration of a partition
    * Setting up the VIO server and VIO clients, creating micro-partitions, and creating storage pools and processor pools
    * Providing a virtual console to the partition

    Most of the time we install dual HMCs for redundancy, to achieve more uptime system-wide.

    Why do I need a Hardware Management Console, anyway?

    You need an HMC if you plan to:

    * Configure and manage logical partitions and partition profiles (selected models can configure Linux partitions without an HMC).
    * Perform DLPAR (dynamic LPAR) functions.
    * Activate and manage Capacity on Demand resources.

    You can also use the HMC to:

    * Perform service functions.
    * Manage frames (towers), IOPs and IOAs. Note that you cannot see below the IOA to the device level.
    * Manage system profiles (yes, you can have more than one!).
    * Power on and power down. The Service Processor is always hot if there is power to the server.
    * Activate and manage Virtualization Engine technologies.
    * Use 5250 emulation to get a console up on an i5/OS partition, or a virtual terminal window for AIX or Linux.

    What is kernel?

    The kernel is the essential center of a computer operating system, the core that provides basic services for all other parts of the operating system. A synonym is nucleus. A kernel can be contrasted with a shell, the outermost part of an operating system, which interacts with user commands. Kernel and shell are terms used more frequently in UNIX operating systems than in IBM mainframe or Microsoft Windows systems.

    Typically, a kernel (or any comparable center of an operating system) includes an interrupt handler that handles all requests or completed I/O operations that compete for the kernel's services, a scheduler that determines which programs share the kernel's processing time and in what order, and a supervisor that actually gives use of the computer to each process when it is scheduled. A kernel may also include a manager of the operating system's address spaces in memory or storage, sharing these among all components and other users of the kernel's services. A kernel's services are requested by other parts of the operating system or by application programs through a specified set of program interfaces, sometimes known as system calls.

    What is RMC?

    The Resource Monitoring and Control (RMC) subsystem is the scalable backbone of RSCT that provides a generalized framework for managing resources within a single system or a cluster. Its generalized framework is used by cluster management tools to monitor, query, modify, and control cluster resources. RMC provides a single monitoring and management infrastructure for both RSCT peer domains and management domains. RMC can also be used on a single machine, enabling you to monitor and manage the resources of that machine. However, when a group of machines, each running RMC, are clustered together, the RMC framework allows a process on any node to perform an operation on one or more resources on any other node in the domain.

    What information is stored in Object Data Manager?

    It is a database of system and device configuration information integrated into IBM's AIX operating system. The ODM is unique to AIX compared to other UNIX operating systems.

    Examples of information stored in the ODM database are:

    * Network configuration
    * Logical volume management configuration
    * Installed software information
    * Devices that AIX has drivers for
    * Logical devices or software drivers
    * Physical hardware devices installed
    * Menus, screens and commands that SMIT uses
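
    To make the device-configuration part concrete, the sketch below queries the customized devices object class (CuDv) with odmget and pulls one field out of the returned stanza. The function name and the example device hdisk0 are hypothetical, and the stanza layout (lines of the form parent = "scsi0") is assumed from typical odmget output.

```shell
# Sketch: print the parent device recorded in the ODM for a given device.
# odm_device_parent and hdisk0 are illustrative names.
odm_device_parent() {
    # odmget -q selects objects matching the criteria from the named object class;
    # splitting on double quotes leaves the quoted value in field 2
    odmget -q "name=$1" CuDv | awk -F'"' '/parent =/ { print $2 }'
}

# Usage: odm_device_parent hdisk0
```

    For everyday work, higher-level commands such as lsdev and lsattr read the same data and are usually the safer choice.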

    Explain a little about Vital Product Data (VPD)?

    VPD in AIX and Linux is a collection of configuration and informational data associated with a particular set of hardware or software. VPD refers to a subset of database tables in the Object Data Manager (ODM), so the terms VPD and ODM are sometimes used interchangeably.

    Vital product data (VPD) stores information such as part numbers, serial numbers, and engineering change levels, taken from the Customized VPD object class or from platform-specific areas. Not all devices contain VPD data.

    Does HACMP work on different operating systems?

    Yes. HACMP is tightly integrated with the AIX 5L operating system and System p servers, allowing for a rich set of features that are not available with any other combination of operating system and hardware. HACMP V5 introduces support for the Linux operating system on POWER servers. HACMP for Linux supports a subset of the features available on AIX 5L; however, this multi-platform support provides a common availability infrastructure for your entire enterprise.

    What applications work with HACMP?

    All popular applications work with HACMP, including DB2, Oracle, SAP, WebSphere, etc. HACMP provides Smart Assist agents to let you quickly and easily configure HACMP with specific applications. HACMP includes flexible configuration parameters that let you easily set it up for just about any application there is.

    Does HACMP support dynamic LPAR, CUoD, On/Off CoD, or CBU?

    HACMP supports Dynamic Logical Partitioning, Capacity Upgrade on Demand, On/Off Capacity on Demand and Capacity Backup Upgrade.

    If a server has LPAR capability, can two or more LPARs be configured with unique instances of HACMP running on them without incurring additional license charges?

    Yes. HACMP is a server product that has one charge unit: the number of processors on which HACMP will be installed or run. Regardless of how many LPARs or instances of AIX 5L run in the server, you are charged based on the number of active processors in the server that is running HACMP. Note that HACMP configurations containing multiple LPARs within a single server may represent a potential single point of failure. To avoid this, it is recommended that the backup for an LPAR be an LPAR on a different server or a standalone server.

    Does HACMP support non-IBM hardware or operating systems?

    Yes. HACMP for AIX 5L supports the hardware and operating systems specified in the manual, and HACMP V5.4 includes support for Red Hat and SUSE Linux.

    What does the nmon tool do?

    The nmon tool is designed for AIX and Linux performance specialists to use for monitoring and analyzing performance data, including:

    * CPU utilization
    * Memory use
    * Kernel statistics and run queue information
    * Disk I/O rates, transfers, and read/write ratios
    * Free space on file systems
    * Disk adapters
    * Network I/O rates, transfers, and read/write ratios
    * Paging space and paging rates
    * CPU and AIX specification
    * Top processors
    * IBM HTTP Web cache
    * User-defined disk groups
    * Machine details and resources
    * Asynchronous I/O (AIX only)
    * Workload Manager (WLM) (AIX only)
    * IBM TotalStorage Enterprise Storage Server (ESS) disks (AIX only)
    * Network File System (NFS)
    * Dynamic LPAR (DLPAR) changes (only pSeries p5 and OpenPower, for either AIX or Linux)

    Also included is a new tool to generate graphs from the nmon output and create .gif files that can be displayed on a Web site.
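
    Graphing tools work from data that nmon has recorded to a file, so a recording run is the usual starting point. The sketch below turns a sampling interval and a total duration into the snapshot count that nmon expects. The function name is hypothetical, and the interval and duration in the usage line are example values only.

```shell
# Sketch: start an nmon recording for a given interval (seconds) and duration (minutes).
# nmon_record is an illustrative name.
nmon_record() {
    nr_interval=$1    # seconds between snapshots
    nr_minutes=$2     # total recording time in minutes

    # number of snapshots = total seconds / interval
    nr_count=$(( nr_minutes * 60 / nr_interval ))

    # -f records to a file instead of the screen, -s sets the interval, -c the snapshot count
    nmon -f -s "$nr_interval" -c "$nr_count"
}

# Usage: nmon_record 30 60    (one hour at 30-second intervals, i.e. 120 snapshots)
```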

    What does Logical Volume Manager (LVM) mean?

    The set of operating system commands, library subroutines and other tools that allow you to establish and control logical volume storage is called the Logical Volume Manager (LVM).

    What is a Logical partition?

    A logical partition (LPAR) is the division of a computer's processors, memory, and hardware resources into multiple environments so that each environment can be operated independently with its own operating system and applications.

    Explain Network File System (NFS).

    The Network File System (NFS) is a distributed file system that allows users to access files and directories on remote servers as if they were local. The key terms are:

    * Server: a computer that makes its file systems, directories, and other resources available for remote access.
    * Clients: computers, or their processes, that use a server's resources.
    * Export: the act of making file systems available to remote clients.
    * Mount: the act of a client accessing the file systems that a server exports.
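
    The export and mount steps can be tied together from the client side. The sketch below asks the server for its export list and mounts a directory only if it appears there. The function name, the server name srv1, and the paths are hypothetical, and the showmount -e output is assumed to be a header line followed by one exported path per line.

```shell
# Sketch: mount an NFS directory only if the server actually exports it.
# nfs_mount_checked, srv1, /export/home and /mnt/home are illustrative names.
nfs_mount_checked() {
    nmc_server=$1    # NFS server host name
    nmc_dir=$2       # directory exported by the server
    nmc_mnt=$3       # local mount point, which must already exist

    # showmount -e prints the server's export list; match the path in the first column
    if showmount -e "$nmc_server" |
        awk -v d="$nmc_dir" '$1 == d { found = 1 } END { exit !found }'; then
        mount "$nmc_server:$nmc_dir" "$nmc_mnt"
    else
        echo "$nmc_server does not export $nmc_dir" >&2
        return 1
    fi
}

# Usage: nfs_mount_checked srv1 /export/home /mnt/home
```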

    What is Network Information Service (NIS)?

    NIS was developed to simplify the task of administering a number of machines over a network. In particular, it addresses the need to maintain copies of common files (e.g. password, group and host) across different systems.

    What do software RAID levels do?

    Redundant Arrays of Independent Disks (RAID) is formally defined as a method to store data on any type of disk medium.

    LDAP

    The Lightweight Directory Access Protocol (LDAP) defines a standard method for accessing and updating information in a directory (a database) either locally or remotely in a client-server model.

    Thank you for reading this post; I hope the next post will come soon. That's all for today.