
Status Code_213(No Storage Units Available for Use)


Status Code 213: No storage units available for use. Page 1 of 7

Status Code: 213 No storage units available for use

The NetBackup scheduler process (bpsched) did not find any of its storage units available for use. Either all storage units are unavailable, or all storage units are configured for “On demand” only and the policy and schedule do not require a specific storage unit.

Troubleshooting flow for Status Code 213:

1. Is this NetBackup 4.5 FP3 or earlier? If YES, are all master/media servers at the same patch level? If NO, go to Section 1.1 of this technote.
2. Is the storage unit correctly configured? If NO, go to Section 1.2 of this technote.
3. Are there down drives? If YES, go to Section 1.3 of this technote.
4. Is BPCD listening? If NO, go to Section 1.4 of this technote.
5. Are there host name resolution issues? If YES, go to Section 1.5 of this technote.
6. Are the drives over committed? If YES, go to Section 1.6 of this technote. If NO, contact Veritas Technical Support.
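The decision flow above can be sketched as a small function. This is only an illustration of the ordering of the checks; the `Findings` class and its field names are hypothetical and are not part of NetBackup.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """Answers to the flowchart questions (all field names are hypothetical)."""
    pre_45_fp3: bool              # NetBackup 4.5 FP3 or earlier?
    same_patch_level: bool        # master/media servers at the same patch level?
    storage_unit_ok: bool         # storage unit correctly configured?
    drives_down: bool             # are there down drives?
    bpcd_listening: bool          # is bpcd listening?
    name_resolution_issue: bool   # host name resolution issues?
    drives_over_committed: bool   # drives over-committed (SSO)?


def next_step(f: Findings) -> str:
    """Return the first technote section the flowchart points to."""
    if f.pre_45_fp3 and not f.same_patch_level:
        return "Section 1.1"
    if not f.storage_unit_ok:
        return "Section 1.2"
    if f.drives_down:
        return "Section 1.3"
    if not f.bpcd_listening:
        return "Section 1.4"
    if f.name_resolution_issue:
        return "Section 1.5"
    if f.drives_over_committed:
        return "Section 1.6"
    return "Contact Veritas Technical Support"
```

The checks are evaluated in the same order as the flowchart, so the first failing check decides which section to read.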

- On a Unix media server, go to the /usr/openv/netbackup/logs directory
- On a Windows media server, go to the <INSTALL_PATH>\Veritas\NetBackup\logs directory

Then create the bptm, bpsched, and bpbrm folders and cycle the NetBackup daemons/services.
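Creating the log folders can be scripted. The sketch below is a minimal example, assuming the caller passes in whichever log directory applies on the platform; the function name `create_log_dirs` is hypothetical, and cycling the daemons/services is still a separate manual step.

```python
from pathlib import Path


def create_log_dirs(log_root, names=("bptm", "bpsched", "bpbrm")):
    """Create one log folder per process name under log_root.

    Existing folders are left untouched. Returns the folder paths.
    """
    root = Path(log_root)
    created = []
    for name in names:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```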


Table of Contents

1 Common causes of Status Code 213
1.1 Verify NetBackup Version and Patch Level on NetBackup Versions Below 4.5 FP3
1.2 Verify Storage Unit Configuration
1.3 Policy has Storage Unit set to “Any Available” and the Drives are Down
1.3.1 How to determine if drives are down
1.3.2 How to bring the drives up
1.3.3 Drive fails to come up
1.4 Verify BPCD is Listening
1.5 Hostname Or IP Resolution Issue
1.6 Determine if Drives are Over-Committed

1 Common causes of Status Code 213.

Status code 213 falls into six main categories:

1. Prior to NetBackup 4.5 FP3, the master and media servers were required to be at the same NetBackup version/patch level. For these versions, the cause of a status code 213 is the media server not being at the same version as the master.
2. Storage unit(s) incorrectly configured.
3. Policy has storage unit set to “any available” and drives are down.
4. The bpcd process is not listening.
5. There is a hostname or IP resolution issue and the master server cannot communicate with the media server, or vice-versa.
6. Within an SSO environment, drives are over-committed.

1.1 Verify NetBackup Version and Patch Level on NetBackup Versions Below 4.5 FP3

In order to verify NetBackup patch levels, review the patch.history file (for a Unix server) or the history.log file (for a Windows server) located in the correct directory:

UNIX - /usr/openv/patch or /usr/openv/pack

Windows - <install_path>\VERITAS\Patch\

If the patch levels are not the same, and the servers are running binaries earlier than NetBackup 4.5 FP3, bring the media server experiencing the status 213 to the same NetBackup patch level as the master server.

1.2 Verify Storage Unit Configuration

1. Go to NetBackup Management > Storage Units in the Java GUI or the NetBackup Administration Console and double-click on the storage unit name.
2. Verify none of the storage units have their “Maximum concurrent jobs” attribute set to 0 (for Disk storage units) or “Maximum concurrent drives used for backup” attribute set to 0 (for Media Manager storage units).
3. Verify that the Robot number and Media server name in the storage unit configuration match the Media Manager device configuration.
4. The "On Demand Only" attribute should only be used when a class/policy and schedule combination requires one and only one particular storage unit. If the class/policy and schedule combination does not require a specific storage unit at all times, then "On Demand Only" should not be selected. Determine if all storage units are set to "On Demand Only" for a class/policy and schedule combination that does not require a specific storage unit. If this is the case, either specify a storage unit for the class and schedule combination or turn off "On Demand Only" for a storage unit.
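The checks in steps 2 and 4 can be expressed as a short sketch. It assumes the storage unit attributes have already been read from the GUI into plain dictionaries; the keys `name`, `max_concurrent`, and `on_demand_only` are hypothetical and do not correspond to any NetBackup API.

```python
def check_storage_units(units, policy_requires_specific_unit):
    """Return a list of configuration problems that can lead to status 213.

    units: list of dicts with hypothetical keys 'name', 'max_concurrent'
           (jobs for disk units, drives for Media Manager units) and
           'on_demand_only'.
    """
    problems = []
    for unit in units:
        if unit["max_concurrent"] == 0:
            problems.append(f"{unit['name']}: concurrency limit is 0")
    # Every unit is On Demand Only, yet the policy/schedule names none of them.
    if units and all(u["on_demand_only"] for u in units) \
            and not policy_requires_specific_unit:
        problems.append("all storage units are On Demand Only "
                        "but the policy does not specify one")
    return problems
```

An empty list means neither condition was found; it does not cover step 3, which requires comparing against the Media Manager device configuration.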

1.3 Policy has Storage Unit set to “Any Available” and the Drives are Down.

A status code 213 is returned for any policy where the policy’s storage unit is set to "Any available" and there are no available drives. (A status code 219 is returned for any policy where the storage unit is specified and there are no available drives.)

The bpsched log file shows the following, where rumpunch is the media server and policya is the policy name:

11:29:30.600 [11372] <2> get_num_avail_drives: NUM UP 0 0 0 0 0 0
11:29:30.601 [11372] <4> log_in_errorDB: no drives up on storage unit <rumpunch-dlt>
11:29:30.602 [11372] <2> ?: available drives = 0, shared drives = 0, allow_mult_ret = 0
11:29:32.607 [11372] <2> ?: /usr/openv/netbackup/bin/backup_exit_notify rumpunch policya full FULL 213 0 &
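When the bpsched log is large, the relevant entries can be picked out with a regular expression. The sketch below assumes only the message format shown in the log excerpt above; the function name is hypothetical.

```python
import re

# Matches the log_in_errorDB message shown in the bpsched excerpt.
NO_DRIVES_UP = re.compile(
    r"log_in_errorDB: no drives up on storage unit <([^>]+)>"
)


def storage_units_with_no_drives(log_text):
    """Return the sorted, de-duplicated storage unit names reported with no drives up."""
    return sorted(set(NO_DRIVES_UP.findall(log_text)))
```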

1.3.1 How to determine if drives are down.

Use one of the following methods to determine the status of the drives in the Media Manager storage unit.

Command line:

To determine the status of the drives, run the vmoprcmd command below on the media server in question.

UNIX: /usr/openv/volmgr/bin/vmoprcmd -d

Windows: <install_path>\VERITAS\volmgr\bin\vmoprcmd -d

A down drive shows DOWN in the Control column of the vmoprcmd output:

DRIVE STATUS
Drv Type Control User Label RecMID ExtMID Ready Wr.Enbl. ReqId
0 4mm DOWN - No - -
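If the vmoprcmd output has been captured to a text file or string, the down drives can be listed with a short parser. This is a sketch that assumes the layout shown above, where each data row starts with a numeric drive index followed by the drive type and the Control value; other columns are ignored because they may be blank. The function name is hypothetical.

```python
def down_drives(vmoprcmd_output):
    """Return the drive indexes whose Control column starts with DOWN."""
    down = []
    for line in vmoprcmd_output.splitlines():
        fields = line.split()
        # Data rows: <drive index> <type> <control> ...; header rows are skipped
        # because their first field is not a number.
        if (len(fields) >= 3 and fields[0].isdigit()
                and fields[2].upper().startswith("DOWN")):
            down.append(int(fields[0]))
    return down
```

The returned indexes are the values to pass as <drive_index> when bringing the drives up.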

GUI:

The drive status can also be viewed in the Java GUI or NetBackup Administration Console under Media and Device Management > Device Monitor. For drives in a robot, check the Control field and confirm that it is not AVR.

1.3.2 How to bring the drives up.

If the drives are down, bring them back up and retry the backup. Drives can be brought up using either the command line or the GUI.

Command line:

UNIX: /usr/openv/volmgr/bin/vmoprcmd -up <drive_index>

Windows: <install_path>\VERITAS\volmgr\bin\vmoprcmd -up <drive_index>

GUI:

Drive status is viewed by selecting Media and Device Management > Device Monitor. If the drive status is DOWN, the drive can be brought UP in the Java GUI or NetBackup Administration Console under Media and Device Management > Device Monitor. Right-click on the drive and choose Up Drive.


1.3.3 Drive fails to come up.

If the drive does not come UP, review the operating system error logs for errors. The system error logs on a UNIX media server are located at:

• Solaris: /var/adm/messages
• HP-UX: /var/adm/syslog/syslog.log
• AIX: syslog logging is not turned on by default. Check the /etc/syslog.conf file for the location of the *.debug and *.error log files. If these entries are commented out (there is a # in front of *.debug and *.error), syslog is not recording errors at the OS level. To troubleshoot drive issues, turn on syslog logging. The AIX errpt command can also be used to look for hardware errors.
• Linux: /var/log/syslog

On a Windows media server, the system error logs are the Event Viewer Application and System logs.

If the reason for the drives going down is not obvious, please contact the operating system manufacturer and the drive manufacturer for additional assistance in troubleshooting the drive issues.

1.4 Verify BPCD is Listening

If bpsched on the master cannot communicate with bpcd on the media server, bpsched sets the number of available drives in that storage unit to 0 and further backups to that storage unit during the backup session fail. The number of available drives remains at 0 until the scheduler is initialized again. If bpcd is not listening, the bpsched log file shows something similar to:

00:09:10 [7703] <16> getsockconnected: exceeded timeout of 30 seconds
00:09:10 [7703] <2> getsockconnected: sockfd:-1 timo:32
00:09:10 [7703] <16> bpcr_connect: Can't connect to client server1
00:09:10 [7703] <16> start_bptm: connection refused by host server11
00:09:10 [7703] <16> get_stunits: get_num_avail_drives failed with stat 204
00:09:10 [7703] <4> get_db_info: no available storage units
00:09:10 [7703] <8> bpsched_main: failed getting database information
00:09:10 [7703] <16> log_in_errorDB: scheduler exiting - no storage units available for use (213)

Verify that the master server can communicate with the bpcd process on the server that has the storage unit.

a. On a UNIX server, executing:

# netstat -a | grep bpcd

should return something similar to the following:

*.bpcd *.* 0 0 0 0 LISTEN

If it doesn't, check the /etc/services file and verify this line exists:

bpcd 13782/tcp bpcd

Also check /etc/inetd.conf for this line and make sure the path to bpcd is correct:

bpcd stream tcp nowait root /usr/openv/netbackup/bin/bpcd bpcd

b. On a Windows server, from a command line, issue the following command and confirm a connection is established:

telnet <media_server_name> 13782

If it isn't, check the %SystemRoot%\system32\drivers\etc\services file and verify this line exists:

bpcd 13782/tcp

Additionally, confirm the NetBackup Client Service is running on the media server, as this confirms bpcd is running.

c. If bpcd is operating correctly, create bpsched and bpcd activity log directories and retry the operation. Check the resulting activity logs for any failures.
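The telnet test in step b amounts to opening a TCP connection to port 13782 on the media server. A minimal sketch of the same check, usable from either platform, is shown below; the function name and the 5-second default timeout are assumptions, not NetBackup settings.

```python
import socket


def bpcd_port_open(host, port=13782, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A True result only shows that something accepts connections on the port; it does not prove that bpcd itself is healthy, so the activity logs from step c are still worth checking.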

1.5 Hostname Or IP Resolution Issue.

If hostname resolution is not working, the master cannot communicate with the media server, and the storage unit is marked as unavailable. When the master cannot communicate with the media server, the bpsched log shows messages similar to:

17:13:22.899 [7136] <16> start_bptm: cannot connect to servera
17:13:22.899 [7136] <16> start_bptm: bpcd exit: cannot connect to server backup restore manager (205)
17:13:22.899 [7136] <16> get_stunits: get_num_avail_drives failed with stat 205
17:13:22.899 [7136] <16> log_in_errorDB: cannot connect to server servera; marking storage unit servera-hcart-robot-tld-0 as unavailable
17:13:22.920 [7136] <2> ?: available drives = 0, shared drives = 0, allow_mult_ret = 0

Verify forward and reverse (by IP) host name resolution is working. One method of verification is to use /usr/openv/netbackup/bin/bpclntcmd.

Please see the TechNote referenced below in the "Related Documents" section.
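As a quick cross-check that is independent of NetBackup, forward and reverse lookups can also be exercised through the operating system resolver. The sketch below is not a substitute for bpclntcmd; the function name and the returned dictionary keys are hypothetical, and the comparison looks only at the short host name.

```python
import socket


def check_resolution(hostname,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve the IP back, and compare the names.

    forward/reverse can be replaced with other callables for testing.
    """
    result = {"hostname": hostname, "ip": None,
              "reverse_name": None, "consistent": False}
    try:
        result["ip"] = forward(hostname)
        result["reverse_name"] = reverse(result["ip"])[0]
    except OSError:
        # gaierror and herror are OSError subclasses: a lookup failed.
        return result

    def short(name):
        return name.split(".")[0].lower()

    result["consistent"] = short(result["reverse_name"]) == short(hostname)
    return result
```

Run it on the master against the media server name and on the media server against the master name, since the problem can exist in either direction.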


1.6 Determine if Drives are Over-Committed

In an SSO environment, when drives are over-committed the jobs typically requeue with a status code 134. In certain situations, due to timing, this scenario may result in a status code 213. For example: in an environment containing one shared drive, two jobs are submitted at the same time. After the two jobs finish, a third job is submitted:

1. The first two jobs initially go active at the same time because the bpsched process sees the drive as available and current jobs as 0.
2. The first job submitted goes active, reserves the drive, and runs to completion.
3. The second job submitted goes active, tries to mount the drive, sees the drive is busy, and continuously requeues with a status code 134 until the first job completes.
4. After the first job has completed, the second job goes active, reserves the drive, and completes.

After both jobs are DONE in the Activity Monitor, but prior to the drive being unmounted, another job is submitted. This job fails with a status code 213.

The second job requeued with a status code 134 because the first job was active: a status code 134 is reported if bptm tries to mount the drive and finds it busy.

The third job fails with a status code 213 because the bpsched process sees the previous jobs as finished, so it queues the job. When bpsched queries the storage unit for an available drive, ltid reports back that the drive is “in use”. The drive is considered “in use” because the tape has not been unloaded. The difference is that the third job was not yet active: because it was still in the queued state, it received a 213 rather than a 134.

Check the bpsched log for:

10:14:10.994 [16724] <2> find_stunit: (megatron-dlt-robot-tld-1) strec.aj=0 strec.cj=2 strec.ad=1 aj=0 cj=1 ndtr=2 ajtr=0
10:14:10.994 [16724] <2> is_ok_to_use_stuint: stunit <megatron-dlt-robot-tld-1> overcommited, need 1 copies
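The two log messages above can be extracted from a bpsched log with the sketch below. It relies only on the message formats shown in the excerpt, including the spellings `is_ok_to_use_stuint` and `overcommited` as they appear in the log. The counter names (`strec.cj`, `ndtr`, and so on) are returned as-is because this technote does not define them; the function names are hypothetical.

```python
import re

FIND_STUNIT = re.compile(r"find_stunit: \(([^)]+)\)\s+(.*)")
OVERCOMMITTED = re.compile(
    r"is_ok_to_use_stuint: stunit <([^>]+)> overcommited"
)


def parse_find_stunit(line):
    """Return (storage_unit, {counter: value}) for a find_stunit line, else None."""
    match = FIND_STUNIT.search(line)
    if not match:
        return None
    counters = {key: int(value)
                for key, value in re.findall(r"([\w.]+)=(-?\d+)", match.group(2))}
    return match.group(1), counters


def overcommitted_units(log_text):
    """Return the sorted storage unit names flagged as over-committed."""
    return sorted(set(OVERCOMMITTED.findall(log_text)))
```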

Over-commits apply to Windows as well as UNIX. There is a touch file that disables the over-commit check. This empty file is placed on the master server.

On UNIX, the location is:

/usr/openv/netbackup/oc_check_disable

On Windows, the location is:

<install_path>\netbackup\oc_check_disable

IMPORTANT NOTE: The oc_check_disable file should ONLY be used in SSO environments. In an SSO environment, status 134 errors may periodically occur because the drives are actually over-committed. The touch file oc_check_disable should ONLY be used in 4.5 pre-MP4 / FP4 and in 5.1 for specific timing issues resulting in a 213 error instead of a 134 requeue. Remove this file if it has no positive effect on the problem, and remove it after an upgrade.

