SCSI-3 PGR Support on Symm

Page 1: SCSI-3 PGR Support on Symm

Copyright © 2013 EMC Corporation. All Rights Reserved.

SCSI-3 PGR Support on Symm

Kevin Wang, Nov 2014

Page 2: SCSI-3 PGR Support on Symm

SCSI-3 Persistent Group Reservation

Page 3: SCSI-3 PGR Support on Symm

SCSI-3 Persistent Group Reservation (PGR)

• This is a reservation-key-based device locking method: each initiator path (registrant) registers itself with a device using a set reservation key, and only one of the registrants can hold the Active lock at any point in time.

• The reservation is “persistent” because the lock information is held on the SFS within the Symmetrix and is not affected by a SCSI bus reset.

– This means that the host can be shut down but the lock will stay “Active” in the Symm until one of the registrants with the same reservation key issues a PREEMPT or RELEASE against the Active reservation.
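
The locking behaviour described above can be sketched as a small state model. This is only an illustration that follows the wording of the slide, not the full SPC-3 rules and not the Enginuity implementation; the class name `PgrDevice`, the `Path` tuple and all method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# A registrant is one path: (initiator WWN, FA port). Names here are hypothetical.
Path = Tuple[str, str]


@dataclass
class PgrDevice:
    """Toy PGR state for one device; not the Enginuity implementation."""
    registrations: Dict[Path, int] = field(default_factory=dict)  # path -> key
    holder: Optional[Path] = None  # path that holds the Active lock, if any

    def register(self, path: Path, key: int) -> None:
        self.registrations[path] = key

    def _same_key_as_holder(self, path: Path) -> bool:
        if self.holder is None or path not in self.registrations:
            return False
        return self.registrations[path] == self.registrations[self.holder]

    def reserve(self, path: Path) -> bool:
        # Only a registered path may reserve, and only one Active lock exists.
        if path not in self.registrations or self.holder not in (None, path):
            return False
        self.holder = path
        return True

    def release(self, path: Path) -> bool:
        # Per the slide: a registrant with the same key can release the lock.
        if not self._same_key_as_holder(path):
            return False
        self.holder = None
        return True

    def preempt(self, path: Path) -> bool:
        # Per the slide: a registrant with the same key can take the lock over.
        if not self._same_key_as_holder(path):
            return False
        self.holder = path
        return True

    def bus_reset(self) -> None:
        # Persistent: a SCSI bus reset (or host shutdown) changes nothing.
        pass
```

In this sketch a bus reset deliberately does nothing, which is the point of "persistent": the lock only goes away through an explicit release or preempt by a registrant with the matching key.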

Page 4: SCSI-3 PGR Support on Symm

SCSI-3 Persistent Group Reservation (PGR)

• Usually, all initiators (HBA ports) from a single host register with the same reservation key (much like EMC Grouped Reservation).

– The same initiator can be registered multiple times if it is presented to multiple FA ports. Each “path” per initiator needs to register.

• Symmetrix has a maximum of 340 (decimal) registrations per device.

– This limit caused issues with Microsoft Cluster Shared Volumes, where every host in the cluster registered each of its paths to the device simultaneously and the total exceeded 340.
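
As a rough illustration of how a cluster can run into the 340-registration limit, the sketch below multiplies hosts, HBA ports per host and FA ports per HBA port. It assumes every host has the same path layout; the function names and the host counts in the usage note are hypothetical, and only the limit of 340 comes from the text.

```python
MAX_REGISTRATIONS_PER_DEVICE = 340  # decimal limit quoted in the text


def registrations_per_device(hosts: int, hba_ports_per_host: int,
                             fa_ports_per_hba_port: int) -> int:
    """Each path (HBA port seen through one FA port) registers once."""
    return hosts * hba_ports_per_host * fa_ports_per_hba_port


def exceeds_limit(hosts: int, hba_ports_per_host: int,
                  fa_ports_per_hba_port: int) -> bool:
    total = registrations_per_device(hosts, hba_ports_per_host,
                                     fa_ports_per_hba_port)
    return total > MAX_REGISTRATIONS_PER_DEVICE
```

For example, 16 hosts with 4 HBA ports each, zoned to 4 FA ports per HBA port, give 256 registrations and stay under the limit, while 32 such hosts give 512 and exceed it.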

Page 5: SCSI-3 PGR Support on Symm

SCSI-3 Persistent Group Reservation (PGR)

• The SC3 port flag (seen with 8F,,,<port>) needs to be enabled so that the Inquiry data returned to the host by any device on the port reports that the Symmetrix supports SCSI-3.

• From Enginuity 5875, the SCSI-3 Pers Resv (PER) bit is enabled by default on Symm devices.

– To check: D1,<dv>,C from the FA

– Having it enabled should not have any negative effect on SCSI-2-only hosts.

Page 6: SCSI-3 PGR Support on Symm

D1,<dv>,C

• This shows the device flags. The SCSI-3 Pers Rsv flag is enabled by default on all devices at Enginuity 5875 or above.

Page 7: SCSI-3 PGR Support on Symm

D1,<dv>,A

• First use A1,D,<dv>,<cnt> to determine which director holds the device lock.

Page 8: SCSI-3 PGR Support on Symm

8F,'PGR',VIEW,<dv>

• This command shows the last update time of the lock as well as the registrants and the reservation key.

• The initiator shown in CYAN is the current active lock holder; this can change often when there is IO to the device.

Page 9: SCSI-3 PGR Support on Symm

SCSI-3 PGR

[Diagram. Labels: Node A, Node B; 7c:0, 10c:1; 0x195c; KEY = 1234123412341234 (shown twice); KEY = 0000000000000000 (shown twice)]

Page 10: SCSI-3 PGR Support on Symm

How to release SCSI Reservation

Page 11: SCSI-3 PGR Support on Symm

Solutions Enabler to clear SCSI Reservation

• Exclusive Reservation (SCSI-2) and Group Reservation (SCSI-2) can be cleared with Solutions Enabler SYMCLI.

– symld -g dg_name break LdevName

• Persistent Group Reservation (SCSI-3) cannot be displayed or released using SYMCLI commands.

– It can only be cleared with an Inlines command or a dedicated host application that can clear SCSI-3 PGR.

Page 12: SCSI-3 PGR Support on Symm

How to clear a lock for an open systems device on a Symm4, Symm5, Symm6, Symm7 or Symm8 with Inlines

Page 13: SCSI-3 PGR Support on Symm

Releasing SCSI-3 PGR Reservation

1. Run A1,D to find which FA the lock is active on.

2. Go to the FA holding the active reservation.

3. Run the following Inlines command:

– BC,BF,F0,RCVR,ALLI,ALLP,<dv>,PRSV

Note: The BC,BF, prefix broadcasts the command to all FAs. This is done because the lock holder may change over time.
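
When several devices need the same treatment, the command text from step 3 can be generated per device. The helper below is only a convenience sketch that fills `<dv>` into the string quoted above; the function name is hypothetical, and it assumes device numbers are written in hexadecimal, as they are elsewhere in this deck (for example 1EA or 12A7). It produces text only and does not talk to the array.

```python
from typing import Iterable, List


def pgr_release_inline(device: str) -> str:
    """Fill <dv> into the broadcast PRSV inline quoted in step 3."""
    dv = device.strip()
    int(dv, 16)  # assumption: device numbers are hex; raises ValueError otherwise
    return f"BC,BF,F0,RCVR,ALLI,ALLP,{dv},PRSV"


def pgr_release_inlines(devices: Iterable[str]) -> List[str]:
    """One command line per device, in the order given."""
    return [pgr_release_inline(dv) for dv in devices]
```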

Page 14: SCSI-3 PGR Support on Symm

Lab Example

Page 15: SCSI-3 PGR Support on Symm

Lab example – PGR Reserve

After the host has performed a reserve:

D1,20,A

8F,'PGR',VIEW,20

Page 16: SCSI-3 PGR Support on Symm

Lab example – PGR Release

After the host releases the reservation:

8F,'PGR',VIEW,20

– Notice that the CYAN-colored text is gone.

– The initiator is still registered. (It may be cleared depending on host action.)

D1,20,A

– Does not show any active lock on the device now.

Page 17: SCSI-3 PGR Support on Symm

Lab example – PGR Clear

• After the host performs a SCSI-3 PGR Clear (0x03):

– All the registrations and the reservation are cleared, meaning the device is cleared of SCSI-3 reservation.

8F,'PGR',VIEW,20

D1,20,A
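
The 0x03 quoted for Clear is the service action code of the SCSI PERSISTENT RESERVE OUT command (opcode 0x5F). For reference, the codes defined for that command in SPC-3 can be listed as a small enumeration; the Python names below are illustrative, while the numeric values are the standard ones.

```python
from enum import IntEnum

PERSISTENT_RESERVE_OUT_OPCODE = 0x5F  # SCSI opcode of the command


class PrOutServiceAction(IntEnum):
    """Service action codes of PERSISTENT RESERVE OUT."""
    REGISTER = 0x00
    RESERVE = 0x01
    RELEASE = 0x02
    CLEAR = 0x03
    PREEMPT = 0x04
    PREEMPT_AND_ABORT = 0x05
    REGISTER_AND_IGNORE_EXISTING_KEY = 0x06
    REGISTER_AND_MOVE = 0x07


def describe(code: int) -> str:
    """Name of a service action code; raises ValueError for unknown codes."""
    return PrOutServiceAction(code).name
```

The three lab slides correspond to RESERVE (0x01), RELEASE (0x02) and CLEAR (0x03); only CLEAR removes the registrations as well as the reservation.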

Page 18: SCSI-3 PGR Support on Symm

Case Study

Page 19: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• Symmetrix DMX-3

• The customer's Windows cluster failed the validation test that runs before the cluster is created.

• Investigation pointed to SCSI-3 PGR reservation as the likely root cause.

• Manual intervention was required to fix this issue.

Page 20: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• The customer's Windows cluster failed the validation test that runs before the cluster is created.

• Investigation pointed to SCSI-3 PGR reservation as the likely root cause.

Page 21: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• At the customer’s request, the affected devices 191 and 1EA were checked, and no SCSI reservation was found on them.

Page 22: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• Further investigation showed that SCSI-3 PGR was not enabled at either the FA port level or the device level. The customer needs to take the following actions to fix this issue.

• Enable the SCSI-3 PGR bit at the FA level with a command like this: symconfigure -sid xxx -cmd "set port xxx:x SCSI_3=enable;" commit -v -noprompt

• Enable SCSI-3 PGR support on each device that needs this feature with a command like this: symconfigure -sid xxx -cmd "set device <dev_num> attribute=SCSI3_persist_reserv;" commit -v -noprompt
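
If the same change has to be made on one port and several devices, the two command templates above can be filled in programmatically. The sketch below only builds the command strings exactly as quoted on this slide; the function name and the example values in the usage note are hypothetical, and nothing is executed.

```python
from typing import List


def scsi3_enable_commands(sid: str, port: str, devices: List[str]) -> List[str]:
    """Return the symconfigure command lines quoted on the slide, filled in."""
    commands = [
        f'symconfigure -sid {sid} -cmd "set port {port} SCSI_3=enable;" '
        "commit -v -noprompt"
    ]
    for dev in devices:
        commands.append(
            f'symconfigure -sid {sid} -cmd "set device {dev} '
            'attribute=SCSI3_persist_reserv;" commit -v -noprompt'
        )
    return commands
```

Calling it with a hypothetical SID, one port and devices 191 and 1EA returns three lines: one port command followed by one device command per device, ready to be reviewed before anyone runs them.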

Page 23: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• Enable the SCSI-3 PGR bit at the FA level with a command like this: symconfigure -sid xxx -cmd "set port xxx:x SCSI_3=enable;" commit -v -noprompt

Page 24: SCSI-3 PGR Support on Symm

Case Study – Ranbaxy Laboratories Limited, SR 64251214

• Enable SCSI-3 PGR support on each device that needs this feature with a command like this: symconfigure -sid xxx -cmd "set device <dev_num> attribute=SCSI3_persist_reserv;" commit -v -noprompt

Page 25: SCSI-3 PGR Support on Symm

Case Study – TI Telecom Italia, SR 53441840

• Symmetrix VMAX 20K

• Device 1263 was removed from an SG (or its masking view was deleted) without first being removed from the cluster configuration, which left a PGR reservation behind on the device.

• PSE was engaged to fix this issue

Page 26: SCSI-3 PGR Support on Symm

Case Study – TI Telecom Italia, SR 53441840

• Error DB0D.22 is logging on FA port 16f against device 1263, which is also locked by director 16f.

Page 27: SCSI-3 PGR Support on Symm

Case Study – TI Telecom Italia, SR 53441840

• Device 1263 is locked by WWN 2312 from director 16f.

Page 28: SCSI-3 PGR Support on Symm

Case Study – TI Telecom Italia, SR 53441840

• No initiator at FA:16f is registered with device 1263

Page 29: SCSI-3 PGR Support on Symm

Case Study – TI Telecom Italia, SR 53441840

• The reservation lock was cleared and the error stopped.

Page 30: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• 70.1222.D0 - Persistent group reserve found on non-pgr device

• 75.DB0D.43 - Persistent group reserve error

• 1222.D0 and DB0D.43 have been logging since April 2014 against devices 1050, 1178, 12A7 and 12A9, and are streaming on FA 6G:0.

Page 31: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• Devices are mapped on FA 6G:0, 7G:0, 10G:0 and 11G:0

Page 32: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• A check of the director flag settings showed that the SCSI-3 PGR bit had not been enabled on FA 6G:0.

Page 33: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• SCSI-3 PGR support had not been enabled on those four devices, so correct PGR registration information could not be obtained.

Page 34: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• Checking the lock status on one device showed the following problems (the lock status was changing all the time); all four devices have the same issue. The reservation type code is WRITE EXCLUSIVE REGISTRANTS ONLY, but no initiators on this director are registered, which means the function did not work as designed.

Page 35: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• Later we issued A1,D,<device>,<count> on the four affected devices, and the system did not report any live SCSI locks, which was unexpected.

Page 36: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• At this point, can we find clues to the cause of this issue in the previous inline output?

• Even though 8F,PGR,VIEW,<device> did not work in this case, we can still find useful information in the following inline output.

• All four affected devices were reported as locked by lp_id 0203. The lp_id can be converted to the index in the first column of the 8F,DVIN,VIEW output: index 0 represents lp_id 1, index 1 represents lp_id 2, and so on. Here index 202 represents lp_id 203.
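
The off-by-one mapping between lp_id and the 8F,DVIN,VIEW index can be written down as two small helpers. This sketch assumes both values are displayed in hexadecimal; the example on the slide (lp_id 0203, index 202) fits that assumption but does not prove it. The function names are hypothetical.

```python
def lp_id_to_index(lp_id: str) -> str:
    """Index in the first column of 8F,DVIN,VIEW for a given lp_id."""
    value = int(lp_id, 16)  # assumption: lp_id is shown in hex
    if value < 1:
        raise ValueError("lp_id starts at 1")
    return format(value - 1, "X")


def index_to_lp_id(index: str) -> str:
    """lp_id (zero-padded to four digits, as on the slide) for a given index."""
    return format(int(index, 16) + 1, "04X")
```

With these helpers, lp_id 0203 maps to index 202 and index 0 maps back to lp_id 0001.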

Page 37: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• The first problem we observe is that index 202 is a PWWN that logged in on 6G:1, whereas all four affected devices are mapped to 6G:0.

• The second problem is that index 202 is not logged in to the box correctly. The evidence is below.

• For a normal HBA login, the flags should show as C3 04.

Page 38: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• If the left flag is C2, the following bits are set:

• DefineBit(IR_DVIN_EXISTS , 1), /* Initiator is a known initiator. (Known initiators are those which might have logged in earlier and logged out) */

• DefineBit(IR_ALIVE_INITIATOR , 6), /* Initiator is logged in and executed an IO. */

• DefineBit(IR_ACTIVE_INITIATOR , 7) /* Flag updated in emul_new_ulp_cmd() whenever we receive a cdb from CDI. */

• In that case the HBA is probably logged in to the FA even though the host does not see the devices and the "symmask -sid <SID> list login" command does not report the WWN as logged in. The host needs to be forced to re-initialize its login.
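
The three DefineBit entries above are enough to decode the left flag byte partially. The sketch below assumes that DefineBit(name, n) means bit n with bit 0 as the least significant bit; that reading is consistent with the slide, since bits 1, 6 and 7 together give C2. Bits not listed on the slide are reported as unknown rather than guessed, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

# Bit positions taken from the DefineBit lines on the slide (bit 0 = LSB assumed).
KNOWN_BITS: Dict[str, int] = {
    "IR_DVIN_EXISTS": 1 << 1,
    "IR_ALIVE_INITIATOR": 1 << 6,
    "IR_ACTIVE_INITIATOR": 1 << 7,
}


def decode_left_flag(value: int) -> Tuple[List[str], int]:
    """Return (names of known bits that are set, mask of set bits not on the slide)."""
    names = [name for name, mask in KNOWN_BITS.items() if value & mask]
    known_mask = sum(KNOWN_BITS.values())
    unknown = value & ~known_mask & 0xFF
    return names, unknown
```

Decoding C2 yields all three names with nothing left over, C3 yields the same three plus an undocumented bit 0, and 06 yields only IR_DVIN_EXISTS plus an undocumented bit 2, so under this reading neither the alive nor the active bit is set for 06.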

Page 39: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• With the left flag at 06, the related record cannot be found in FC,NAME, which means this PWWN is not logged in correctly.

Page 40: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• We also cannot check its status with FC,NTBL,<D_ID>,1, nor find it in 8F,,,1.

Page 41: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• (First Attempt) To fix the issue, ENG tried to release the lock by using the SCSI-3 release inline command on the four devices, but it did not work. The lock status was still changing as before.

Page 42: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• (Second Attempt) Because the PRSV command could not fix the issue, ENG issued the regular release inline command on the four devices, but it still did not work. The lock status was still changing.

Page 43: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• (Third Attempt) ENG set the trace and used 8C on the device record, but it did not work, so the change was reverted.

Page 44: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• 8F,PGR,DBCK,<device>,1,RCLR,FORC

• (Final Attempt) ENG removed the reservations from the devices by removing the PGR database records for each registered initiator from SFS. This worked.

Page 45: SCSI-3 PGR Support on Symm

Case Study – TIM RIO NORTE S/A, SR 66293766

• The 1222.D0 and DB0D.43 errors finally stopped.

• We found several entries on April 8th for mapping changes that moved these devices to their apparent current mapping locations, so that might be the root cause.

Page 46: SCSI-3 PGR Support on Symm

THANK YOU