VSE Management Suite Troubleshooting Guide

Introduction
  Terms and Abbreviations
Troubleshooting Matrix
1. Incompatible WBEM Services and WBEM Providers
  Problem, Symptoms, Resolution
2. Missing WBEM Providers and Agents
  Problem, Symptoms, Resolution
3. WBEM Authentication Configuration
  Problem, Symptoms, Resolution
4. Out-of-date WBEM Software or Missing Patches
  Problem, Symptoms, Resolution
5. Invalid or Conflicting Names
  Problem, Symptoms, Resolution
6. Incorrect System Discovery
  Problem, Symptoms, Resolution
7. Unreliable Hostname Resolution
  Problem, Symptoms, Resolution
8. SSH Authentication Configuration
  Problem, Symptoms, Resolution
9. gWLM Auto-Start Configuration
  Problem, Symptoms, Resolution
10. Trial License Expiration and Missing LTU’s
  Problem, Symptoms, Resolution
Conclusion
For more information

Introduction
This document is a guide for troubleshooting the most common issues encountered when configuring an environment to be managed by HP SIM and the VSE Management Suite (VSEMgmt). The issues described here were taken directly from interactions with HP customers and early adopters of VSEMgmt. The scope of this document is the following components:

• HP Integrity Essentials Virtualization Manager (Vman)
• HP Integrity Essentials Capacity Advisor (CapAd)
• HP Integrity Essentials Global Workload Manager (gWLM)

The document is written for HP-UX 11i environments running on HP 9000 or HP Integrity servers.

To use this document, start by reading through the list of common problems in the Troubleshooting Matrix. Follow the given section numbers for the likely causes and read through the problem, symptoms and resolutions. This information should confirm whether the problem, symptom, and resolution described apply to the problem you are encountering. Finally, if you are experiencing a problem that is not listed in the matrix, scan through each section to determine if any of the detailed symptoms match with the behavior you are seeing.

Terms and Abbreviations
The following terms, product names, and abbreviations are used throughout this document:

• Compartment - the generic term used to describe the partitioning technologies in HP’s Virtual Server Environment. In the context of this paper, complexes, nPartitions, Virtual Partitions, Integrity Virtual Machines, and standalone servers are compartments.
• HP Systems Insight Manager (HP SIM) - HP’s unified server and storage management tool.
• Integrity VM - A virtual machine running on an Integrity Virtual Machines Host.
• Integrity Virtual Machines - HP’s Integrity Virtual Machines product offers operating system isolation and fine-grained, dynamic resource allocation.
• Integrity Virtual Machines Host - The host operating system that supports multiple Integrity VMs.
• nPartition (nPar) - HP’s hard partition technology offering electrical and operating system isolation.
• nPartition Complex - an HP 9000 or HP Integrity server that is capable of hosting one or more nPartitions. These servers are commonly referred to as an nPar server, a cell-based server, or simply a complex.
• Standalone server - an HP 9000 or HP Integrity server that is not partitioned with any of HP’s partitioning technologies in the Virtual Server Environment.
• Virtual Partition (vPar) - HP’s Virtual Partitions offer virtualized partitions with operating system isolation and dynamic resource migration.

Troubleshooting Matrix
This matrix maps the most commonly observed symptoms to their most likely causes. Use the section numbers in the second column to focus your troubleshooting efforts.

Table 1: VSE Management Troubleshooting Matrix

| I’m seeing… | Likely Cause (section number) |
|---|---|
| A vPar that does not have the “vPar” link and is not displayed under the correct server or nPar. | 1, 2, 3, 4 |
| A vPar that has a “vPar” link and has the correct color for a vPar, but displayed in Virtualization Manager as a separate box. | 5 |
| An Integrity VM that is not displayed under the Integrity Virtual Machines Host. | 2 |
| A utilization meter with the text “No Perm”. | 3 |
| A utilization meter with the text “No Data”. | 2, 4 |
| A system that was converted from an nPar to a vPar or a VM but is not drawn under the correct box in Virtualization Manager. | 6 |
| A system that experienced a WBEM provider issue that has now been fixed, yet Virtualization Manager does not show the system under the correct box. | 6 |
| A system that was managed by gWLM and was working fine, but when the system was rebooted, it’s no longer working. | 9, 10 |
| A system that was previously shown in Virtualization Manager is no longer visible even though HP SIM still shows the system. | 8, 10 |

1. Incompatible WBEM Services and WBEM Providers

Problem
WBEM Services versions A.02.00.09 and later are incompatible with vPar WBEM provider versions B.11.11.01.02 for HP-UX 11iv1 and B.11.23.01.03 for HP-UX 11iv2. When these products are installed together, the vPar WBEM provider does not function correctly.

Symptoms
vPars appear in HP SIM and in VSE Management as either nPars or standalone servers, depending on the type of hardware hosting the vPars.

On systems exhibiting this problem, an incompatible version of the vPar WBEM provider is generally installed and running. In fact, as shown in Listing 1, the HP_VParProviderModule has a status of OK when viewed using the cimprovider WBEM command. However, this does not indicate all of the functions provided by the vPar WBEM provider are operating normally.

Listing 1: Incomplete WBEM Provider Status

    # cimprovider -ls
    MODULE                          STATUS
    OperatingSystemModule           OK
    ComputerSystemModule            OK
    ProcessModule                   OK
    IPProviderModule                OK
    DNSProviderModule               OK
    NTPProviderModule               OK
    NISProviderModule               OK
    SDProviderModule                OK
    IOTreeModule                    OK
    HP_UtilizationProviderModule    OK
    HP_NParProviderModule           OK
    HP_VParProviderModule           OK

The HP SIM system list in Figure 1 is the most likely symptom observed when this problem is occurring. Notice there are no associations listed next to the system names, such as “puny01v0 Hosted by puny01”.

Figure 1: HP SIM System List with Incorrect vPar Discovery

The HP SIM system list in Figure 2 shows the correct representation of these systems’ configuration. The nPar complex named puny contains one nPar named puny01, and that nPar contains two vPars named puny01v0 and puny01v1. Notice the associations are correctly listed in the table as “puny01 in Complex puny” and “puny01v1 Hosted by puny01”. If this information is not displayed in the HP SIM system lists, the VSE Management suite will not display the systems properly in the graphical views provided by Virtualization Manager.

Figure 2: HP SIM System List with Correct vPar Discovery

The final common symptom observed when this problem is occurring is that the system subtype is not discovered correctly by HP SIM. Figure 3 shows the correct system subtype for a vPar that is hosted in an nPar. If the subtype for a vPar does not include “HP Virtual Partition”, then HP SIM and the VSE Management software will not correctly display the vPars.

Figure 3: HP SIM System Page with Correct vPar Subtype

Resolution
This problem is correctable by performing one of the following two options:

1. Run the following command on each vPar:

    /opt/wbem/bin/cimmof -n root/cimv2/vpar \
        /opt/vparprovider/mof/HPUX_VParLocalPartition.mof

2. Remove the installed version of the vPar WBEM provider and install the latest version on each vPar.
   – For HP-UX 11iv1, a compatible version of the vPar WBEM provider, B.11.11.01.02 or later, can be found on the June 2006 HP-UX AR media.
   – For HP-UX 11iv2, a compatible version of the vPar WBEM provider, B.11.23.01.04 or later, can be found on the June 2006 HP-UX AR media.
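When option 1 has to be applied to many vPars, it can help to generate the per-host commands first and review them before running anything. The following is a minimal sketch under stated assumptions: the use of ssh to reach each vPar and the host names passed in are illustrative choices, not part of the procedure above. The function only prints commands and does not execute them.

```shell
#!/bin/sh
# Sketch: print (do not run) the option 1 command for each vPar host.
# Assumption: each vPar is reachable over ssh as root; the host names are
# whatever the caller passes in.
CIMMOF_CMD='/opt/wbem/bin/cimmof -n root/cimv2/vpar /opt/vparprovider/mof/HPUX_VParLocalPartition.mof'

print_vpar_fix_commands() {
    for host in "$@"; do
        # One reviewable command line per vPar.
        printf "ssh %s '%s'\n" "$host" "$CIMMOF_CMD"
    done
}
```

For example, `print_vpar_fix_commands puny01v0 puny01v1` prints two ssh command lines that can be inspected and then run by hand.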

2. Missing WBEM Providers and Agents

Problem
This problem occurs when the gWLM agent or one or more WBEM providers are missing from a managed system. Each partitioning technology in HP’s Virtual Server Environment that runs an isolated operating system has a corresponding WBEM provider. If the appropriate WBEM provider for a given partitioning technology is not installed on the managed node, then the system will not be represented correctly in HP SIM or the VSE Management suite.

As of the December 2005 release, all required agents and WBEM providers are included as part of the base HP-UX 11i operating system. However, production environments commonly lag HP-UX releases, so one or more of these dependencies are often missing from managed systems.

The nPar WBEM provider has been included with the base HP-UX 11i operating system for several years, so this provider is usually installed in current production environments. One of the WBEM providers that is commonly not installed is the Integrity VM WBEM provider, especially inside Integrity VM guest operating systems. The vPar WBEM provider is fairly new, and is not found on many existing vPar environments. The Utilization WBEM provider is also new and is not found on most existing production environments. Finally, the Global Workload Manager agent must be installed on systems being managed by Global Workload Manager.

Symptoms
When a WBEM provider is missing from a system, the system will not appear in HP SIM or in the VSE Management suite as the appropriate type of system, and it will generally not be associated with its containing system. Figure 4 shows the HP SIM system page for an Integrity VM. Notice the subtype is correctly identified and the association is correctly shown for this virtual machine. Regardless of the technology, this page must show the correct subtype for the system and the associations must be correct in order to ensure HP SIM and the VSE Management suite are able to visualize and manage the system.

Figure 4: HP SIM System Page with Correct Integrity VM Subtype and Associations

Another indication that one or more WBEM providers are missing from a managed system is that the system shows up in the Virtualization Manager system view in a separate box even though the system is not a standalone server. For example, if an Integrity VM does not have the Integrity VM WBEM provider installed, it will show up in Virtualization Manager as a standalone server with the correct model string. Figure 5 shows the Integrity VM puny0v10 as a separate box instead of nesting the system under the puny00 Integrity VM host system. This is a result of the Integrity VM WBEM provider missing from the puny0v10 system.

Figure 5: Virtualization Manager Missing Integrity VM WBEM Provider

When the Utilization WBEM provider is not installed on a system, the utilization meters in Virtualization Manager will report “No Data” as shown in Figure 6.

There are two situations in which the “No Data” indicator is expected and does not mean the Utilization WBEM Provider is missing. The first is when a new workload is created. It takes between 5 and 10 minutes for data to be reported, and during this time the meters for the workload will show “No Data”. The second is when the Utilization WBEM Provider has just been started, either because the system rebooted or because the provider was just installed. In general, if the meter still shows “No Data” after 5 to 10 minutes have passed, it is an indication that the WBEM provider is missing.

Figure 6: Virtualization Manager with Missing Utilization WBEM Provider

When a system is missing the Utilization WBEM Provider, Capacity Advisor will display the error message shown in Listing 2.

Listing 2: Capcollect Missing Utilization WBEM Provider Error

    # capcollect rowan
    rowan: The Utilization WBEM Provider is not installed.

Finally, when the gWLM Agent is not installed, the Manage Systems and Workloads wizard displays an error on any attempt to include the system in a gWLM shared resource domain. The error is shown in Figure 7.

Figure 7: Error when gWLM Agent Not Installed

Resolution
Install the appropriate WBEM provider for each technology. The WBEM providers and the gWLM agent are designed so that installing them for technologies that are not in use on a system is benign. For example, if the vPar WBEM provider is installed on a standalone server that is not running vPars, the vPar WBEM provider responds to requests indicating the system is not a vPar. Similarly, if the gWLM agent is installed on a system that is not being managed by gWLM, the agent is not started and does not run unless it is manually started.

To verify that the VSE Management providers and agents are installed, run the command shown in Listing 3 and compare the result with the listing, which shows the expected minimum output. If any of the highlighted entries are missing, the system may not be discovered correctly by HP SIM.

Listing 3: Verify WBEM Providers Are Installed

    # cimprovider -ls
    MODULE                          STATUS
    OperatingSystemModule           OK
    ComputerSystemModule            OK
    ProcessModule                   OK
    IPProviderModule                OK
    DNSProviderModule               OK
    NTPProviderModule               OK
    NISProviderModule               OK
    SDProviderModule                OK
    IOTreeModule                    OK
    EProviderModule                 OK
    HPUXLVMProviderModule           OK
    HP_NParProviderModule           OK
    HP_VParProviderModule           OK
    HP_UtilizationProviderModule    OK
    HPVMProviderModule              OK
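The comparison against Listing 3 can be scripted. The sketch below reads `cimprovider -ls` output on standard input and prints any VSE-related module that is absent or not reported as OK. The list of required modules is an assumption: it takes the nPar, vPar, Utilization, and Integrity VM module names from Listing 3 as the ones that matter for VSE discovery, in line with the best practice of installing every provider on every system.

```shell
#!/bin/sh
# Sketch: report VSE-related provider modules missing from `cimprovider -ls`
# output supplied on standard input.
# Assumption: these four module names (taken from Listing 3) are the ones
# required for VSE discovery.
REQUIRED_MODULES="HP_NParProviderModule HP_VParProviderModule HP_UtilizationProviderModule HPVMProviderModule"

missing_vse_modules() {
    awk -v required="$REQUIRED_MODULES" '
        $2 == "OK" { ok[$1] = 1 }            # remember every module reported OK
        END {
            n = split(required, want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in ok)) print want[i]
        }'
}
```

On a managed system this would be used as `cimprovider -ls | missing_vse_modules`; empty output means all four modules were listed with a status of OK.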

The command for verifying the gWLM agent is installed is shown in Listing 4.

Listing 4: Verify Global Workload Manager Agent is Installed

    # swlist T2743AA
    # Initializing...
    # Contacting target "puny01v1"...
    #
    # Target:  puny01v1:/
    #
    # T2743AA                 A.02.50.00.04  HP Global Workload Manager Agent
      T2743AA.gWLM-Agent      A.02.50.00.04  HP Global Workload Manager Agent
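For scripted checks, the agent version can be pulled out of output shaped like Listing 4. This is a small sketch that assumes the product line begins with `T2743AA.gWLM-Agent` followed by the version, as in the listing; it returns a non-zero status when no such line is present.

```shell
#!/bin/sh
# Sketch: extract the gWLM agent version from `swlist T2743AA` output read
# on standard input. Assumes the product line layout shown in Listing 4.
gwlm_agent_version() {
    awk '$1 == "T2743AA.gWLM-Agent" { print $2; found = 1 }
         END { exit !found }'
}
```

A typical use would be `swlist T2743AA 2>/dev/null | gwlm_agent_version || echo "gWLM agent not found"`.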

VSE Management Best Practice

Keep system configurations simple by installing all of the agents and providers on all systems. This allows HP SIM and VSE Management to correctly discover the system without having to maintain separate images for every type of system being managed. Selectively installing providers and agents quickly becomes a maintenance burden.

For environments that require customized configurations for each type of system, Table 2 lists each of the partitioning technologies in HP’s Virtual Server Environment and the minimum required WBEM providers and agents. The gWLM agent is only required if the system is being managed by gWLM. It should be noted that HP SIM requires additional WBEM providers to gather basic system information such as model number and operating system type. However, these WBEM providers are included with the WBEM Services software component, so they are not listed explicitly.

Table 2: Required WBEM Providers

| HP Virtual Server Environment System Type | Required WBEM Providers and Agents (Bundle Name) |
|---|---|
| nPartition | nPartition WBEM Provider (nParProvider); Utilization WBEM Provider (UtilProvider); Global Workload Manager Agent (T2743AA) |
| Virtual Partition | nPartition WBEM Provider (nParProvider); Virtual Partition WBEM Provider (vParProvider [1]); Utilization WBEM Provider (UtilProvider); Global Workload Manager Agent (T2743AA) |
| Integrity Virtual Machines Host | nPartition WBEM Provider (nParProvider); Integrity Virtual Machines WBEM Provider (VMProvider); Utilization WBEM Provider (UtilProvider); Global Workload Manager Agent (T2743AA) |
| Integrity VM | Integrity Virtual Machines WBEM Provider (VMProvider); Utilization WBEM Provider (UtilProvider); Global Workload Manager Agent (T2743AA) |
| Standalone Server | Utilization WBEM Provider (UtilProvider); Global Workload Manager Agent (T2743AA) |
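Sites that build per-type images may want the contents of Table 2 in a form a script can consume. The sketch below encodes the table as a lookup. The type keywords (`npar`, `vpar`, `vmhost`, `vm`, `standalone`) and the optional `gwlm` flag are hypothetical labels chosen for this example, not names used by HP SIM.

```shell
#!/bin/sh
# Sketch: print the minimum bundles from Table 2 for one system type.
# The type keywords and the "gwlm" flag are hypothetical labels.
# Note: on HP-UX 11iv1 the vPar bundle is named "vparprovider" (lower case).
required_bundles() {
    case "$1" in
        npar)       bundles="nParProvider UtilProvider" ;;
        vpar)       bundles="nParProvider vParProvider UtilProvider" ;;
        vmhost)     bundles="nParProvider VMProvider UtilProvider" ;;
        vm)         bundles="VMProvider UtilProvider" ;;
        standalone) bundles="UtilProvider" ;;
        *)          echo "unknown system type: $1" >&2; return 1 ;;
    esac
    # The gWLM agent is only required when gWLM manages the system.
    [ "$2" = "gwlm" ] && bundles="$bundles T2743AA"
    echo "$bundles"
}
```

For example, `required_bundles vpar gwlm` prints the four bundle names a gWLM-managed vPar needs.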

[1] The actual bundle name of the Virtual Partition WBEM Provider on HP-UX 11iv1 (B.11.11) is “vparprovider” in all lower case letters. On HP-UX 11iv2 the bundle name is “vParProvider”. The name will be changed on HP-UX 11iv1 to “vParProvider” in late 2006 to be consistent.

3. WBEM Authentication Configuration

Problem
HP SIM must have valid authentication credentials for every managed system in order to gather data from WBEM services. While some data is available using SNMP, SNMP is frequently disabled in production environments, and in environments where it is enabled, the SNMP agent does not have complete VSE information. Therefore, in order to manage systems with HP SIM and the VSE Management suite, WBEM communication must be enabled on all managed systems and HP SIM must be configured with valid authentication credentials for each system.

In addition to authentication credentials, a related problem is an incompatibility between WBEM and LDAP-UX version B.03.xx and older.

Symptoms
This problem is identifiable in Virtualization Manager as shown in Figure 8. The utilization meter shows the message “No Perm”, indicating the product was not able to authenticate to the WBEM server. In addition, the model string will generally be very terse, as is the information shown on the HP SIM system page. Compare this view with that shown in Figure 6 to see the differences in behavior when WBEM authentication fails.

Another symptom appears when the Capacity Advisor capcollect command is run against a system for which the authentication credentials are not correct. The error shown in Listing 5 is displayed.

Listing 5: Capcollect Incorrect WBEM Authentication Credentials

    # capcollect rowan
    rowan: The WBEM credentials provided by HP-SIM were not accepted by the WBEM server.

A slightly different error message is displayed by capcollect when there are no credentials for the system specified in HP SIM. The error message in this case will be that shown in Listing 6.

Listing 6: Capcollect Missing WBEM Authentication Credentials

    # capcollect rowan
    No WBEM credentials are available for system rowan.
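Because each capcollect message points at a different section of this guide, a small lookup can speed up triage. The sketch below matches on the message fragments quoted in Listings 2, 5, and 6 and on the “Unable to contact the WBEM server” message described in section 4. The function name is hypothetical, and matching on substrings is an assumption about how stable the message text is across releases.

```shell
#!/bin/sh
# Sketch: map a capcollect error message to the section of this guide that
# discusses it. Matching is by substring; the function name is hypothetical.
classify_capcollect_error() {
    case "$1" in
        *"Utilization WBEM Provider is not installed"*)            echo 2 ;;
        *"WBEM credentials provided by HP-SIM were not accepted"*) echo 3 ;;
        *"No WBEM credentials are available"*)                     echo 3 ;;
        *"Unable to contact the WBEM server"*)                     echo 4 ;;
        *)                                                         echo unknown ;;
    esac
}
```

For example, passing the message from Listing 6 prints `3`, which points to the Global Protocol Settings step in the resolution for this section.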

Figure 8: Virtualization Manager with Invalid WBEM Authentication

Resolution
Some or all of the following changes may be required to correct this problem.

1. Environments using LDAP-UX with HP SIM and the VSE Management suite must upgrade to at least version B.04.00.xx on all managed systems.

2. Use the HP SIM Options -> Protocol Settings -> Global Protocol Settings screen to configure a valid username and password to be used by HP SIM for reading data from WBEM providers. This is the most appropriate starting point for resolving this problem when the error message in Listing 6 is displayed.

3. Modify the configuration of the managed system to include the user account specified in the previous step. This may involve adding an entry to the password file, or ensuring the system is set up to use NIS or LDAP authentication services.

4. Execute HP SIM discovery after changing the Global Protocol Settings or the System Protocol Settings by selecting Options -> Discovery and clicking the Run Now button. This will force HP SIM to retry the WBEM connection and use the newly specified authentication credentials.

VSE Management Best Practice

Create a user account in either NIS or LDAP for the WBEM user configured in HP SIM. HP recommends giving this user a non-privileged user id and not providing a login shell. Using this process greatly simplifies management of the WBEM user account and is generally acceptable to security groups.

4. Out-of-date WBEM Software or Missing Patches

Problem
There have been several problems addressed in the recent versions of the WBEMServices product. In addition, a couple of patches have been issued to resolve problems related to WBEM. When WBEM is out-of-date or missing one or more patches, the most common problems are:

• Excessive use of CPU resources by the cimserver process
• Excessive consumption of memory by the cimserver process
• Inability to collect data for Capacity Advisor using the capcollect command

Symptoms
The primary symptom of this problem is the Virtualization Manager view showing “No Data” for the utilization meters and Capacity Advisor’s capcollect command failing with an error message that says “system_name: Unable to contact the WBEM server. See the capcollect(1M) manual page”. If the WBEM server is unreachable when HP SIM is attempting to perform the initial discovery on a system, the symptoms will be the same as those discussed in section 3 of this paper. To verify the state of the WBEM server, run the commands shown in Listing 7; when this problem is present, they give errors instead of their usual display. The last command shows the large amount of CPU time that has been consumed by the cimserver process. This is also an indication that either the WBEMServices software is out-of-date or the system is missing a required patch.

Listing 7: WBEM Troubleshooting Commands

    # cimprovider -ls
    Empty HTTP response message.
    # osinfo
    osinfo error: Empty HTTP response message.
    # ps -ef | grep cimserver$
    root 1372 1 0 Feb 3 ? 1609:16 /opt/wbem/sbin/cimserver
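The CPU-time check in the last command of Listing 7 can be automated. The sketch below reads `ps -ef` output on standard input and reports the accumulated CPU minutes of the cimserver process. It assumes the TIME column is formatted as minutes:seconds and sits immediately before the command path, as in Listing 7. The threshold passed to `cimserver_cpu_high` is a hypothetical value that each site would need to choose for itself.

```shell
#!/bin/sh
# Sketch: flag a cimserver process that has accumulated a lot of CPU time.
# Assumptions: TIME is "minutes:seconds" and is the field just before the
# command path; the command line has no trailing arguments.
cimserver_cpu_minutes() {
    awk '$NF ~ /\/cimserver$/ {
        split($(NF - 1), t, ":")     # t[1] = minutes, t[2] = seconds
        print t[1] + 0
    }'
}

# cimserver_cpu_high THRESHOLD_MINUTES: succeeds when the first cimserver
# line on standard input shows at least that many CPU minutes.
cimserver_cpu_high() {
    mins=$(cimserver_cpu_minutes | head -n 1)
    [ -n "$mins" ] && [ "$mins" -ge "$1" ]
}
```

A typical use would be `ps -ef | cimserver_cpu_high 1000 && echo "cimserver CPU time is high; see section 4"`.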

Resolution
Resolving this problem first requires that the cimserver process be restarted. Listing 8 shows the commands to stop and restart the WBEM server process. Notice the first command gave an error message. In some cases, it may be necessary to manually kill the cimserver process before restarting.

Listing 8: WBEM Server Restart Commands

    # /opt/wbem/bin/cimserver -s
    Shutdown timeout expired. Forced shutdown initiated.
    # /opt/wbem/bin/cimserver

After issuing these commands, the following steps are required:

1. Ensure at least WBEMServices version A.02.00.08 is installed on each managed system and the CMS.

2. For HP-UX 11iv2 operating systems, PHSS_33349 must be installed. On HP Integrity servers, the dependant patch PHSS_32213 is also required. These patches prevent memory leaks in the cimserver process.

3. If WBEMServices A.02.00.09 is being used, install the required patches for Capacity Advisor to be able to collect data.

– For HP-UX 11iv1 install the patch PHSS_34428.
– For HP-UX 11iv2 install the patch PHSS_34429.
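The minimum-version check in step 1 can be sketched as a small comparison function. This is an illustration under stated assumptions: version_at_least is a hypothetical name, and both version strings are assumed to follow the A.NN.NN.NN form used in this paper. How the installed version string is obtained (for example from the software inventory reported by swlist) is left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is at least version $2.
# Assumes both strings look like A.02.00.08 (letter prefix, dotted numbers).
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN {
        # Drop the leading letter prefix, e.g. "A." in "A.02.00.08".
        sub(/^[A-Za-z]+\./, "", have)
        sub(/^[A-Za-z]+\./, "", need)
        nh = split(have, h, ".")
        nn = split(need, n, ".")
        max = (nh > nn) ? nh : nn
        # Compare field by field as numbers; the first difference decides.
        for (i = 1; i <= max; i++) {
            a = h[i] + 0
            b = n[i] + 0
            if (a > b) exit 0
            if (a < b) exit 1
        }
        exit 0
    }'
}
```

For example, `version_at_least A.02.00.09 A.02.00.08` succeeds, while `version_at_least A.02.00.07 A.02.00.08` fails.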

5. Invalid or Conflicting Names

Problem

HP SIM requires every managed system to have a non-empty and unique name. This includes the names of complexes, nPars, vPars, and Integrity VMs. This creates a problem in some environments because these compartments have not previously been required to have unique names. HP SIM uses the network hostname for compartments that are network addressable, but it must also have a unique name for each compartment that is not network addressable. The most common problems in this area are the following:

1. Complexes do not have unique names. If names are not specified at the time of order, the default complex name from the HP factory is “Complex 01”. In several environments, this has caused a problem because multiple systems in HP SIM were discovered to have the same name.

2. The names of nPartitions that are hosting vPars are not unique. The vPar monitor is not network addressable, so HP SIM uses the name of the nPar for the vPar monitor layer. If the nPar name conflicts with any other name in the managed environment, this causes a problem for HP SIM’s discovery.

3. The character set allowed for compartment names is broader than HP SIM’s. Therefore, names that contain characters outside the HP SIM allowed set will not be discovered correctly.

4. Compartments with empty names. HP SIM cannot properly discover a system with an empty name.

Symptoms

Each variation of this problem has a slightly different symptom.

1. When multiple complexes have the same name, all of the nPars contained in the complexes are drawn in Virtualization Manager under a single box. Thus all the nPars appear to be in the same complex even though they reside in separate complexes.

2. If a vPar and its containing nPar have the same name, the nPar will be drawn with a color and link indicating the system is a vPar, but its box will contain vPars. In Figure 9, the vPar named zoo25 is contained in an nPar that is also named zoo25. The zoo26 vPar is a sibling vPar contained in the same nPar. The links on the zoo25 nPar indicate the system is a vPar, and its box is color-coded as a vPar. As is evident, systems with the same name result in a confusing illustration in Virtualization Manager.

Figure 9: Virtualization Manager with Duplicate Names between nPar and Contained vPar

3. When the name of a complex with nPars, or the name of an nPar hosting vPars, contains invalid characters, the contained compartments will be drawn outside their containing box. For example, if the name of an nPar contains spaces and the nPar is hosting vPars, the vPars will not be drawn in the nPar’s box in Virtualization Manager. Instead, the vPars will be drawn separately. In Figure 10, the CMS is a vPar with the hostname puny01v0. The containing nPar is named “puny 01”. The two other vPars in the nPar are incorrectly drawn outside the nPar as shown in the figure.

Figure 10: Virtualization Manager with nPar Name Containing Spaces

4. Complexes are the compartments most commonly found with empty names. In these scenarios, the nPars in the complex are drawn by Virtualization Manager as standalone boxes with the “nPar” link and the correct model number, but they are not grouped together under their complex and their complex is not drawn in Virtualization Manager.

Resolution

To resolve this problem, one or more compartments must be renamed. Depending on the symptoms, one or more of these steps may be necessary.

1. Ensure all complexes have unique names. If any complexes have the same name, rename them so they are unique.

2. Verify that no nPar hosting vPars has the same name as one of the vPars. If any of the names are the same, rename them so they are unique.

3. Make sure every complex and nPar that is hosting vPars has a name with only letters [A-Za-z], numbers [0-9], dashes (-), underscores (_), and periods (.). If any names contain spaces or other disallowed characters, rename them to use the allowed set of characters.

4. Ensure no compartments, especially complexes, have empty names. If any names are empty, rename them so they are non-empty, unique, and contain only the valid set of characters.
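The checks in the steps above can be sketched as two small shell functions. These are hypothetical helpers, not HP SIM commands: valid_sim_name tests a single name against the character set listed in step 3, and duplicate_names reports names that occur more than once in a list supplied on standard input. Gathering the list of compartment names is left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a compartment name is non-empty and uses
# only letters, digits, dashes, underscores, and periods (see step 3).
valid_sim_name() {
    case $1 in
        '') return 1 ;;                   # empty names cannot be discovered
        *[!A-Za-z0-9._-]*) return 1 ;;    # contains a disallowed character
        *) return 0 ;;
    esac
}

# Hypothetical helper: read one name per line on standard input and print
# each name that appears more than once.
duplicate_names() {
    sort | uniq -d
}
```

For example, `valid_sim_name "puny 01"` fails because of the space, and `printf 'zoo25\nzoo26\nzoo25\n' | duplicate_names` prints zoo25.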

Changing names as described does not require a reboot of any of the managed systems, and it has no effect on network names or IP addresses. However, the systems must be rediscovered after the names have been changed. See issue 6 for details on updating HP SIM’s discovery. Table 3 provides the command syntax for changing compartment names.

Table 3: Commands for Changing Compartment Names

To change the name of a...    Run the command...
Complex                       /usr/sbin/cplxmodify -N <new complex name>
nPartition                    /usr/sbin/parmodify -p <partition id> -P <new name>
Virtual Partition             /usr/sbin/vparmodify -p <old name> -P <new name>
Integrity VM                  /opt/hpvm/bin/hpvmmodify -p <old name> -N <new name>

6. Incorrect System Discovery

Problem

When a system is discovered incorrectly for any of the reasons described in this paper, it is necessary to rediscover the system after correcting the particular problem. However, rediscovery is often unsuccessful and does not correct the graphical views shown in Virtualization Manager.

Symptoms

The symptoms of this problem appear in three sequential steps.

First, as described in other tips in this paper, a system is incorrectly discovered and does not appear correctly in Virtualization Manager. The configuration of the system is corrected and a rediscovery is performed in HP SIM.

Second, a known issue in HP SIM prevents a rediscovery from updating a system’s associations. After the system’s configuration problem has been corrected and the system has been rediscovered, HP SIM and Virtualization Manager continue to show the incorrect system configuration.

Third, when an attempt is made to delete the system in order to force a rediscovery, the error message shown in Figure 11 is displayed. This message incorrectly states the system is “...used to manage other systems such as a storage node”.

Figure 11: HP SIM Delete System Error

Resolution

To resolve this series of issues, select all the systems in the nPar server, including the complex, and delete all of them at the same time. Be aware that this operation causes all authorizations, collections, and system settings in HP SIM to be lost for the selected systems. Then rediscover the systems.

In the case where the CMS happens to be in a complex that needs to be deleted, the only way to correct the issue is to perform an mxinitconfig -r. The ramifications of this operation should be understood before it is undertaken. Alternatively, the CMS can be upgraded to HP SIM 5.0 with Update 2 for HP-UX. This release allows individual systems within an nPar server to be deleted, which avoids the majority of situations where this step is required.

7. Unreliable Hostname Resolution

Problem

Hostname resolution is required between the CMS and the managed systems. In addition, for systems managed by gWLM, managed systems must be able to resolve the hostnames of the CMS and every other system in their shared resource domain. When systems cannot consistently resolve the hostnames of their peers in the shared resource domain or the CMS, gWLM cannot manage the system or deploy a shared resource domain.

Symptoms

Failure to have consistent name resolution on every system will lead to a variety of errors in VSEMgmt, with the most critical generally being the inability to deploy a gWLM shared resource domain. When errors are encountered in this process, the commands shown in Listing 9 will generally expose the problem.

First determine the IP address of the primary LAN interface using the ifconfig command. With this information, perform a reverse DNS lookup to ensure the network name matches the expected value. Then perform a DNS lookup with the hostname and ensure the returned address matches the expected value. Finally, ensure the actual hostname of the system is the same as the hostname portion of the fully qualified domain name returned by the DNS name resolution. If any of these steps fail, the behavior of VSEMgmt, and particularly gWLM, will not be correct.

Listing 9: Name Resolution Troubleshooting Commands

# ifconfig lan0
lan0: flags=843<UP,BROADCAST,RUNNING,MULTICAST>
        inet 15.1.51.36 netmask fffff800 broadcast 15.1.55.255
# nslookup 15.1.51.36
Name Server: udltools.fc.hp.com
Address: 15.1.48.11
Trying DNS
Name: zoo3.fc.hp.com
Address: 15.1.51.36
# nslookup zoo3
Name Server: udltools.fc.hp.com
Address: 15.1.48.11
Trying DNS
Non-authoritative answer:
Name: zoo3.fc.hp.com
Address: 15.1.51.36
# hostname
zoo3
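The comparisons described before Listing 9 can be sketched as a shell function. This is an illustration only: short_name and check_resolution are hypothetical names, and the five values are assumed to be copied by hand from the output of ifconfig, nslookup, and hostname, because the output format of these commands varies between systems.

```shell
#!/bin/sh
# Hypothetical helper: the hostname portion of a fully qualified domain name.
short_name() {
    echo "${1%%.*}"
}

# Hypothetical helper: succeed only when all name resolution checks agree.
#   $1  IP address of the primary LAN interface (from ifconfig)
#   $2  local hostname (from hostname)
#   $3  name returned by the reverse lookup of $1
#   $4  name returned by the forward lookup of $2
#   $5  address returned by the forward lookup of $2
check_resolution() {
    ip=$1
    host=$2
    reverse_name=$3
    forward_name=$4
    forward_addr=$5
    [ "$(short_name "$reverse_name")" = "$host" ] || return 1
    [ "$(short_name "$forward_name")" = "$host" ] || return 1
    [ "$forward_addr" = "$ip" ] || return 1
    return 0
}
```

With the values from Listing 9, `check_resolution 15.1.51.36 zoo3 zoo3.fc.hp.com zoo3.fc.hp.com 15.1.51.36` succeeds.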

Resolution

Resolving this problem depends on the managed environment, but it generally involves ensuring the DNS service is updated with all systems in the managed environment. However, most environments where this problem is found are using file-based name resolution in /etc/hosts. The common errors are failing to ensure all hosts can resolve the hostnames of all the other hosts, and failing to include the CMS’s hostname in the hosts file for each managed node.

VSE Management Best Practice

Deploy a name service technology such as DNS to consistently resolve hostnames throughout the data center. This ensures all systems managed by VSEMgmt can communicate with one another and reduces the amount of manual maintenance of host files.

8. SSH Authentication Configuration

Problem

SSH authentication is required for VSEMgmt for many reasons. Many tasks performed through HP SIM and VSEMgmt rely on ssh as the secure communication protocol for invoking commands remotely. In addition, VSEMgmt uses ssh to scan for installed LTUs on managed systems, and without the ability to detect LTUs, systems will not be manageable by VSEMgmt.

Symptoms

When SSH authentication is not configured, tasks will fail with an error message indicating SshAuthentication failed. Figure 12 shows an example of the most common failure. The tool used in this example is accessible through Tools -> Command Line Tools -> UNIX/Linux -> bdf.... A similar type of error will be received when performing many HP SIM and VSEMgmt tasks.

Figure 12: HP SIM Task without SSH Authentication Configured

Resolution

There are three options available to resolve this issue. The first is to use the HP SIM task Configure -> Configure or Repair Agents.... This is a three-step wizard that walks through the process of configuring SSH authentication. Step 1 is where systems are selected for configuration. The systems specified in this step must all have the same root password; systems with different root passwords must be configured in separate runs of the task. Step 2 is where the username and password for the systems are specified. The password is not permanently stored by HP SIM; it is used only to copy the public key to the managed system. Step 3 is shown in Figure 13. The options that pertain directly to ssh authentication have been highlighted. The other configuration options in this step are not required for ssh authentication.

Figure 13: HP SIM SSH Authentication Configuration

The second option is the mxagentconfig(1M) command line. This command is executed on the CMS and configures ssh authentication. The password specified to this command is the password corresponding to the root user on the system. While this password is used only to copy the public key to the managed system, it may be visible to other users on the system using commands such as “ps”.

/opt/mx/bin/mxagentconfig -a -n <hostname> -u root -o host -p <password>

The third option for configuring ssh authentication is to manually copy the public key from HP SIM to the managed system. HP SIM’s public key is stored in the file /etc/opt/mx/config/sshtools/.dtfSshKey.pub. This file must be appended to the file ~root/.ssh/authorized_keys2 on each managed system.

VSE Management Best Practice

Include the HP SIM public key in standard images that are built for managed systems. This simplifies the configuration process and ensures that updated or newly deployed systems already have ssh authentication from HP SIM configured.

The third option, manually copying the public key, is also particularly useful in the following two situations:

• Some environments do not allow system administrators to have access to the root password. However, there exists a mechanism for administrators to elevate their privileges to root without using the root password. In these environments, the previous two options are not feasible, but this solution allows the key to be copied using traditional command-line tools and thus is permitted without special authorizations or exception processes.

• Some ssh servers are not directly compatible with OpenSSH. In such cases, it may be necessary to convert HP SIM’s public key to a different format and then distribute it to the managed systems. This mechanism allows HP SIM to interoperate with some variants of OpenSSH that are not shipped with HP-UX.
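Manually appending the public key can be sketched as follows. This is a minimal illustration under stated assumptions: append_key_once is a hypothetical helper, the key file is assumed to have already been copied from the CMS to the managed system, and the grep on the managed system is assumed to support the -q, -x, and -F options. The function skips key lines that are already present, so it can be run repeatedly without creating duplicate entries.

```shell
#!/bin/sh
# Hypothetical helper: append each line of a public key file to an
# authorized keys file unless that exact line is already present.
#   $1  path to the copied public key file
#   $2  path to the authorized keys file (created if it does not exist)
append_key_once() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")" || return 1
    touch "$auth" || return 1
    while IFS= read -r line || [ -n "$line" ]; do
        # Skip blank lines in the key file.
        [ -n "$line" ] || continue
        # Append only when no identical line exists in the target file.
        grep -qxF -e "$line" "$auth" || printf '%s\n' "$line" >> "$auth"
    done < "$pub"
    return 0
}
```

On a managed system, this might be invoked as `append_key_once /tmp/.dtfSshKey.pub ~root/.ssh/authorized_keys2`, where /tmp/.dtfSshKey.pub is an assumed location for the copied key.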

9. gWLM Auto-Start Configuration

Problem

Every system managed by gWLM requires an agent to be running. This agent must be configured to restart when the system is rebooted. Since the agent is delivered with the foundation HP-UX operating system, it is not started by default. Therefore, it is common for administrators to start the gWLM agent when a system is initially managed by gWLM, but fail to change the configuration file to ensure the agent is restarted when the system is rebooted.

Symptoms

When this problem occurs, gWLM functions as expected until one of the systems in the shared resource domain is rebooted. At that point, the system no longer joins the shared resource domain and its resources are no longer managed by gWLM. Figure 14 shows the Shared Resource Domain tab with an Integrity Virtual Machine host whose agent did not restart after booting.

Figure 14: Global Workload Manager without Agent Running

Resolution

There are three mechanisms to ensure the gWLM agent is started after every reboot:

1. Run the command on each managed system: /opt/gwlm/bin/gwlmagent --enable_start_on_boot

2. From the CMS, use the tool under Configure -> Configure VSE Agents -> Start gWLM Agent... to start the agent and update the configuration file so it is automatically started at boot time.

3. On each managed system, manually edit the file /etc/rc.config.d/gwlmCtl and change the variable GWLM_AGENT_START to 1.
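Whichever mechanism is used, the resulting setting can be verified with a short check. The sketch below is a hypothetical helper, not part of gWLM; it assumes the configuration file consists of shell variable assignments that can be read with the shell's dot command, and it reads the file in a subshell so the calling environment is not changed.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given configuration file sets
# GWLM_AGENT_START to 1. Assumes the file contains shell assignments.
agent_start_enabled() {
    (
        # Start from an empty value so a stale environment variable is ignored.
        GWLM_AGENT_START=
        . "$1" >/dev/null 2>&1 || exit 1
        [ "$GWLM_AGENT_START" = "1" ]
    )
}
```

For example, `agent_start_enabled /etc/rc.config.d/gwlmCtl || echo "gWLM agent is not configured to start at boot"`.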

10. Trial License Expiration and Missing LTU’s

Problem

When VSEMgmt first discovers a system to be managed, it allocates a 90-day trial license for the system on the CMS. During the 90-day trial period, a permanent license must be purchased for the products that will be used. Failure to purchase and install a license will result in the inability to manage systems with the VSEMgmt products.

The VSEMgmt products are individually licensable. The following products can be purchased and installed on each system that will be managed by the appropriate tool. A license for Virtualization Manager is included with both Capacity Advisor and Global Workload Manager. Therefore, a system is considered licensed for Virtualization Manager if any of the following LTUs are installed.

• T2782AC A.02.00.00.07 HP Virtualization Manager for HPUX LTU
• T2762AA A.02.00.00.07 HP Global Workload Manager Agent LTU
• T2784AC A.02.00.00.07 HP Capacity Advisor for HPUX LTU

Alternatively, HP has provided a suite LTU that allows all the VSEMgmt products to be used in addition to either vPars or Integrity Virtual Machines. When the suite LTU is installed, it will actually install the LTU for the suite and all of the individual products’ LTUs for the components of VSEMgmt.

T2786AC A.02.00.00.07 HP VSE Suite for HP-UX 11i LTU

Symptoms

When a trial license expires for Virtualization Manager, the system will disappear from the system and workload tabs. Figure 15 shows the VSEMgmt screen accessible from the menu Tools -> VSEMgmt Licenses.... This page shows the systems with expired trial licenses. In this case, the zoo3 system will not show up in Virtualization Manager but will be visible in HP SIM.

Figure 15: VSEMgmt License Scan with Expired Trial Licenses

The Capacity Advisor create scenario wizard only allows systems with valid trial or permanent licenses to be added to a scenario. Systems with expired trial licenses cannot be included in a new scenario. In addition, Capacity Advisor will not collect data from systems with expired trial licenses. Listing 10 shows the error reported when data collection is attempted from a system, zoo3, with an expired trial license.

Listing 10: Capacity Advisor Expired Trial License Error

# capcollect zoo3
The system "zoo3" is not licensed for Capacity Advisor.

Global Workload Manager trial licenses are enforced by the gWLM agent since the agents are designed to be operable without relying on the CMS for operation. The gwlm command can be used to display the license status of systems where the gWLM agent is running as shown in Listing 11. Any systems with expired trial licenses will continue to operate until the gWLM agent is restarted or a shared resource domain is redeployed.

Listing 11: Global Workload License Status

# gwlm license
SRD         Host     Status  License
__________  _______  ______  _____________________________________________
puny00.srd  puny00   OK      License will expire Fri Sep 15 12:59:48 2006.
vse01.srd   vse01    OK      License is unrestricted.
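Output in the form of Listing 11 can be filtered to find the hosts that are still running on trial licenses. The sketch below is a hypothetical filter, not part of gWLM; it assumes the host name is the second whitespace-separated column and that trial licenses are reported with the text “License will expire”, as in the listing.

```shell
#!/bin/sh
# Hypothetical filter: read license status output on standard input and
# print the host name (second column) of every line reporting an expiry date.
expiring_hosts() {
    awk '/License will expire/ { print $2 }'
}
```

On a system where the gwlm command is available, this might be used as `gwlm license | expiring_hosts`.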

Resolution

Resolving this issue involves two steps. First, one or more LTUs must be purchased and installed on each managed system. Second, after installing the LTUs, use the Tools -> VSEMgmt Licenses... page to invoke a scan for licenses as shown in Figure 16.

The license scan process requires that ssh authentication be configured between the CMS and managed systems as discussed in topic 8 of this paper. If ssh authentication is not configured, the license scan will not be able to detect the installed LTUs and the license status of the system will not be updated.

Figure 16: VSE Management Scan for LTU Task

Conclusion

The VSEMgmt suite brings together a wealth of information from a variety of sources. The integration of the technologies in HP’s Virtual Server Environment is extremely powerful, but when initial configuration problems are encountered, it can be difficult to diagnose them without a guide to focus your efforts and provide a starting place for diagnosis. See the contact points at the end of this paper if you have suggestions for making this paper more useful.

For more information

Visit www.hp.com/go/tryintegrityessentials
Visit www.hp.com/go/vse
Send feedback to: [email protected]

© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Itanium is a trademark or registered trademark of Intel Corporation or its subsidiaries in the United States and other countries.

4AA0-XXXXENW, July 2006, Revision 1.0