HP IT-Symposium 2006, www.decus.de
Talk 3L04, 18/05/2006: Linux Storage Multipathing Techniques
Nigel Hone, Hitachi Data Systems
16 May 2006
© 2006 Hitachi Data Systems
Linux Multipathing Techniques
● Having more than one path to a disk system is already standard on many systems. On other systems the standards are still emerging.
● This talk discusses current dual-path and multipath issues and solutions, using Linux as the example operating system.
Aims of talk
● At the end of this presentation you will understand the basic issues of multipathing that you can take into consideration when implementing a multipath environment.
Agenda
● Multipath providers.
● Multipath hardware.
● Multipath detection.
● Multipathing algorithms.
● Problem issues.
Multipath Providers
● Three types of multipath software (with examples):
─ In the HBA driver (Qlogic): included in the boot image
─ In the kernel (newest SUSE): included in the boot image
─ ISV (HDS HDLM, HP SecurePath, EMC PowerPath, Veritas DMP, IBM RDAC): loaded after boot
[Diagram: three servers, each with two HBAs connected to controller1 and controller2; multipathing provided by the HBA driver (Qlogic), by Linux multipath, and by HDLM on Linux.]
„Standard“ SAN connections
Normally we have:
● Storage with two controllers
● Two fabrics
● Two separate HBAs
In my configuration here, each HBA sees 3 disks.
[Diagram: Server with two HBAs, two SAN switches, controller1 and controller2.]
Multipath Software on „Standard“ SAN connections
● We need software to combine the multiple physical devices into „emulated“ logical devices: the multipath software, which emulates disks (more later).
[Diagram: Server with multipath software above two HBAs; two SAN switches; controller1 and controller2.]
Variations on the SAN theme: one fabric
● With one fabric like this, the one HBA sees six disks.
● With two paths we can do, for example, nondisruptive controller updates.
[Diagram: Server with one HBA, one SAN switch, controller1 and controller2.]
Variations on this theme: one storage controller on two fabrics, two HBAs
● With one controller, each HBA sees three disks again.
[Diagram: Server with two HBAs, SAN switch, controller1.]
From „Standard“ SAN connections to „Criminal“ SAN connections
● With „interconnected“ fabrics
● Storage with two controllers
● All-to-all connections (my controllers have two ports each)
● Two separate HBAs
In my configuration here, each HBA sees 12 disks.
[Diagram: Server with two HBAs, two SAN switches, controller1 and controller2.]
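The disk counts on the last four slides follow one rule: every LUN appears once for each storage port an HBA can reach. The sketch below reproduces the numbers, assuming the example array exports three LUNs (inferred from "each HBA sees 3 disks" when only one controller port is reachable). The function name is made up for illustration.

```python
def disks_seen_per_hba(luns: int, storage_ports_visible: int) -> int:
    """Each LUN shows up once per storage port the HBA can reach."""
    return luns * storage_ports_visible


LUNS = 3  # assumption: the example array exports three LUNs

# "Standard": two fabrics, each HBA reaches one controller port.
standard = disks_seen_per_hba(LUNS, 1)
# One fabric: the single HBA reaches both controllers.
one_fabric = disks_seen_per_hba(LUNS, 2)
# "Criminal": interconnected fabrics, 2 controllers x 2 ports, all-to-all.
criminal = disks_seen_per_hba(LUNS, 4)

print(standard, one_fabric, criminal)
```

In the „criminal“ case the server as a whole sees 2 HBAs x 12 = 24 device nodes for only three real LUNs, which is why the recap recommends keeping the SAN simple.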
Basic Multipath Software function
Taken from the Hitachi Dynamic Link Manager for Linux Users Guide
[Figure: device configuration „before“ and „after“.]
Drive addressing (example HDLM)
Table 1.4 Using Logical Device File Names: Application Accesses the LU
Host Status / Application Device Logical Device File Name
● Before installing HDLM: The application uses the logical device file name for the SCSI device. Example: sda, sdb
● After installing HDLM: The application uses the logical device file name for the HDLM device. Example: sddlmaa
Germans: please read sddlmaa as sd-dlm-aa and not otherwise ☺
Example OpenVMS disk addressing
─ Multiple paths (2 of them) to $1$DGA345 on host VENUS:
PGA0.5006-0E80-0359-FC10
PGB0.5006-0E80-0359-FC00
─ 0359FC = machine serial number in hex
[Diagram: Host VENUS with two HBAs (PGA0, PGB0), storage ports 5006-0E80-0359-FC10 and 5006-0E80-0359-FC00, one LUN $1$DGA345.]
VENUS_SYSTEM >> show dev dg
Disk $1$DGA345:, device type HITACHI, online, has multiple I/O paths,
I/O paths to device 2
Path PGA0.5006-0E80-035A-0110 (VENUS), primary path, current path.
Path PGB0.5006-0E80-035A-0100 (VENUS).
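The remark that 0359FC is the machine serial number can be checked mechanically. This is a minimal sketch that assumes, as on the slide, that the six hex digits after the 5006-0E80 prefix carry the serial number and that the last two digits identify the port. The helper name is hypothetical.

```python
def array_serial_from_wwpn(wwpn: str) -> str:
    """Return the six hex digits that follow the 5006-0E80 prefix."""
    digits = wwpn.replace("-", "").upper()
    if len(digits) != 16:
        raise ValueError(f"expected 16 hex digits, got {wwpn!r}")
    return digits[8:14]


# The two path names from the slide: HBA name -> storage port WWPN.
paths = {
    "PGA0": "5006-0E80-0359-FC10",
    "PGB0": "5006-0E80-0359-FC00",
}

serials = {hba: array_serial_from_wwpn(wwpn) for hba, wwpn in paths.items()}
# Both paths end on the same storage system if the serial fields agree.
same_array = len(set(serials.values())) == 1
print(serials, same_array)
```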
Hitachi uses a „Virtualising Array“
● Our RAID arrays (Lightning 9970V, 9980V; Thunder 9570V, 9585V) „virtualise“
● There are „pairs“ of „ports“
● Connected to the two fabrics
● JBODs are simulated *
● Each one for a different server
* Virtualisation is based on the WWN of the server
[Diagram: Lightning 9970V with a pair of ports, one on each fabric; separate sets of LUN 0 to LUN 3 presented to OpenVMS and Linux servers.]
Ask vendor specifically about multipath settings
● This is taken from the HDS USP Linux configuration guide
Read The Fine Manuals
Recap: Multipath hardware
● Multipath is for redundancy
─ Not performance
● Keep the SAN simple
─ Best SAN planning is two independent fabrics
─ 8 paths to the storage box is OK, BUT...
─ Optimum performance is two paths to each LUN!
● Understand which devices are emulated and which devices are „real“.
● There is a difference between „working“ and „supported“.
● Linux „support“ is sometimes difficult.
─ The vendor will not supply multipath driver source code to compile.
─ You have to accept the releases that work.
─ Ask your vendor.
How does the multipath software recognise two paths to the same device?
Multipath Addressing 2. Logical Setup
● The multipath software must know that this „physical disk“ here (seen on one path) and this „physical disk“ here (seen on the other path) are really this one disk here.
● It will then simulate this disk here for the operating system.
[Diagram: Server with operating system, multipath software and two HBAs; two SAN switches; controller1 and controller2.]
Disk Identification
● So how does the multipath software know that this „physical disk“ here and this „physical disk“ here are really the same disk, in order to simulate this disk here?
Disk Identification method 1 of 2: SCSI Inquiry
● The multipath software will send a SCSI Inquiry to each „physical disk“.
● Here we see that „HITACHI..0H01050068” from the green path is the same as „HITACHI..0H01050068” from the red path.
[Diagram: multipath software above two HBAs; each path through a SAN switch to controller1 or controller2 returns the SCSI Inquiry data HITACHI..0H01050068.]
SCSI Inquiry Example
D-SRV-L-01:/home/rudi/lsscsi # lsscsi -v
sysfsroot: /sys
....
disk HITACHI DF600F 0000
dir: /sys/bus/scsi/devices/1:0:0:1 [/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.0/0000:02:0c.0/host1/1:0:0:1]
....
disk HITACHI DF600F 0000
dir: /sys/bus/scsi/devices/2:0:0:1 [/sys/devices/pci0000:00/0000:00:02.0/0000:01:00.2/0000:03:0b.0/host2/2:0:0:1]

D-SRV-L-01:/home/rudi/lsscsi # ls /dev/sd* | inqraid 2>/dev/null
.......
/dev/sdc -> CHNO = 0 TID = 0 LUN = 1
[ST] CL1-B Ser = 261 LDEV = [HITACHI ] [DF600F ]
HORC = SMPL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
RAID5[Group 0- 0] SSID = 0x0000
.......
/dev/sdf -> CHNO = 0 TID = 0 LUN = 1
[ST] CL2-B Ser = 261 LDEV = [HITACHI ] [DF600F ]
HORC = SMPL HOMRCF[MU#0 = SMPL MU#1 = SMPL MU#2 = SMPL]
RAID5[Group 0- 0] SSID = 0x0000

D-SRV-L-01:/home/rudi/lsscsi # ls /dev/sd* | inqraid -inqdump 2>/dev/null
..................
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffffca00]0000: 00000302 b3001102 48495441 43484920 ........
[0xffffca10]0010: 44463630 30462020 20202020 20202020
[0xffffca20]0020: 30303030 44363048 30313035 30303639 0000D60H0
[0xffffca30]0030: 00314200 01010101 00000000 00000000 .1B.......................
---ADDR--- -OFF- 0-1-2-3- 4-5-6-7- 8-9-A-B- C-D-E-F- ------CHAR------
[0xffffca00]0000: 00000302 b3001102 48495441 43484920 ........
[0xffffca10]0010: 44463630 30462020 20202020 20202020 DF600F
[0xffffca20]0020: 30303030 44363048 30313035 30303639 0000D60H011500
[0xffffca30]0030: 00324200 01010101 00000000 00000000 .2B.............
Annotations on the output:
● /dev/sdc and /dev/sdf are the same disk.
● /dev/sdc is on HBA „1“ [1:0:0:1]; /dev/sdf is on HBA „2“ [2:0:0:1].
● /dev/sdc is „Ldev“ 105; /dev/sdf is *also* „Ldev“ 105.
● The hex dump shows the SCSI INQ data:
─ It is a HITACHI device.
─ It is a „DF600“ array (DF600F).
─ Array serial number is x„0105“ (261 in decimal, matching Ser = 261 in the inqraid output).
─ „Ldev“ number is x„69“, which is 105 in decimal!
● There is only one Ldev 105 on Hitachi DF600 serial number x„0105“ worldwide!!
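The comparison the multipath software makes can be sketched from the hex bytes in the dump. Vendor (bytes 8 to 15), product (16 to 31) and revision (32 to 35) sit at the standard SCSI INQUIRY offsets. Reading the vendor-specific bytes that follow as a four-character hex serial number, a four-character hex Ldev and a port label is an assumption taken from the slide annotations, not from a Hitachi specification. In the dump, bytes 40 to 43 decode to the ASCII text 0105, which is 261 in decimal, the same value inqraid prints as Ser. The function names are made up for illustration.

```python
from collections import defaultdict

# Hex words copied from the first inqdump; the second dump differs
# only in the byte that reads "1B" versus "2B".
DUMP_SDC = """
00000302 b3001102 48495441 43484920
44463630 30462020 20202020 20202020
30303030 44363048 30313035 30303639
00314200 01010101 00000000 00000000
"""
DUMP_SDF = DUMP_SDC.replace("00314200", "00324200")


def parse_inquiry(hex_dump: str) -> dict:
    """Decode the fields of interest from an INQUIRY hex dump."""
    data = bytes.fromhex("".join(hex_dump.split()))

    def text(start: int, end: int) -> str:
        return data[start:end].decode("ascii").strip()

    return {
        "vendor": text(8, 16),           # standard INQUIRY field
        "product": text(16, 32),         # standard INQUIRY field
        "revision": text(32, 36),        # standard INQUIRY field
        "serial": int(text(40, 44), 16),  # assumption: vendor-specific
        "ldev": int(text(44, 48), 16),    # assumption: vendor-specific
        "port": text(49, 51),             # assumption: vendor-specific
    }


def group_paths(inquiries: dict) -> dict:
    """Group device nodes that report the same array and Ldev."""
    groups = defaultdict(list)
    for node, dump in inquiries.items():
        info = parse_inquiry(dump)
        key = (info["vendor"], info["product"], info["serial"], info["ldev"])
        groups[key].append(node)
    return dict(groups)


groups = group_paths({"/dev/sdc": DUMP_SDC, "/dev/sdf": DUMP_SDF})
print(groups)
```

The port label is deliberately left out of the grouping key: it is the one field that differs between the two paths.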
Hitachi Dynamic Link Manager screenshot
[Screenshot with the columns annotated: HBA, Path, Port, Type & Ser#, Ldev.]
Disk Identification method 2 of 2: WWNN (World Wide Node Name)
● Every „node“ in a SAN has a WWN (World Wide Name, or World Wide Number).
● Each controller has a WWPN (World Wide Port Name).
● The storage system has a single WWNN (World Wide Node Name).
● The HBA driver (Qlogic for example) uses this to correlate paths to disks.
● Only the HBA driver really knows about WWNs.
[Diagram: operating system above the HBA driver, which contains the multipath software; two HBAs; two SAN switches; one WWPN per controller and one WWNN for the storage system.]
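How a driver might correlate paths by WWNN can be sketched as below. This is not the Qlogic driver's code: the `Path` record, the function name and all WWN values are invented for illustration. The only idea taken from the slide is that paths reaching the same WWNN and the same LUN belong to one disk.

```python
from collections import defaultdict
from typing import NamedTuple


class Path(NamedTuple):
    hba: str
    target_wwpn: str  # differs per controller port
    target_wwnn: str  # one value for the whole storage system
    lun: int


def correlate_by_wwnn(paths):
    """Paths with the same (WWNN, LUN) lead to the same disk."""
    devices = defaultdict(list)
    for path in paths:
        devices[(path.target_wwnn, path.lun)].append(path.hba)
    return dict(devices)


# Hypothetical WWNs: two controller ports sharing one node name.
WWNN = "50:00:00:00:00:00:00:a0"
paths = [
    Path("hba0", "50:00:00:00:00:00:00:a1", WWNN, 0),
    Path("hba1", "50:00:00:00:00:00:00:a2", WWNN, 0),
    Path("hba0", "50:00:00:00:00:00:00:a1", WWNN, 1),
    Path("hba1", "50:00:00:00:00:00:00:a2", WWNN, 1),
]

devices = correlate_by_wwnn(paths)
print(devices)
```

If the two controllers reported different node names, the same input would fall apart into four single-path devices, which is why the array has to be set up to report a single WWNN.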
Qlogic HBA Multipathing
[Figure with the WWPN and WWNN fields annotated.]
HDS 9500 Setting for Qlogic Multipathing (WWNN setting)
Recap: Multipath Detection
● SCSI Inquiry
─ Storage has to send correct information
● WWNN
─ Storage has to send correct information
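Pathing Algorithms
● Path failover: if an error is detected, the path is set offline and the other path is used.
● Load balance: I/O is spread over the paths; if an error is detected, the path is set offline.
[Diagram: two servers, each with two HBAs connected to Controller1 and Controller2 of the storage; one labelled Path Failover, one Load Balance, each with an error marked on a path.]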
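The two policies differ only in how the next path is chosen from the ones still online. A minimal sketch, with class and path names made up for illustration:

```python
class PathSet:
    """Chooses a path for each I/O; failed paths are set offline."""

    def __init__(self, names, load_balance=False):
        self.online = {name: True for name in names}
        self.load_balance = load_balance
        self._next = 0

    def mark_failed(self, name):
        # An error was detected: take the path offline.
        self.online[name] = False

    def select(self):
        candidates = [n for n, ok in self.online.items() if ok]
        if not candidates:
            raise IOError("no online path to the device")
        if not self.load_balance:
            # Path failover: always the first online path.
            return candidates[0]
        # Load balance: rotate over the online paths.
        choice = candidates[self._next % len(candidates)]
        self._next += 1
        return choice


failover = PathSet(["hba0", "hba1"])
before = [failover.select() for _ in range(3)]
failover.mark_failed("hba0")
after = [failover.select() for _ in range(3)]

balanced = PathSet(["hba0", "hba1"], load_balance=True)
spread = [balanced.select() for _ in range(4)]
print(before, after, spread)
```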
„Midrange“ Storage multipath issues
● In a „midrange“ storage, we have a „current“ or „owning“ controller.
─ This has all the information for a specific disk.
● The other controller is the „non-owner“.
─ It does not currently have the full information for the specific disk.
● If we do an I/O to the non-owning controller, the disk info must be switched from the owning controller.
● „Load balance“ on a „failover“ storage can produce bad „owner ping-pong“ performance problems.
● Either the multipath software must recognise the storage array correctly*, or you must configure the multipath software correctly.
* For example with the Veritas DMP ASL
[Diagram: Server with two HBAs; storage with Controller1 and Controller2, the disk info held by one of them.]
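The „owner ping-pong“ effect can be made visible by counting ownership switches. The sketch assumes a simplified array that hands ownership over on every I/O arriving at the non-owning controller; a real array may behave differently, for example by switching only after a threshold.

```python
def ownership_switches(io_targets, initial_owner):
    """Count how often disk ownership moves between controllers."""
    owner = initial_owner
    switches = 0
    for controller in io_targets:
        if controller != owner:
            # I/O arrived at the non-owner: disk info must be moved.
            owner = controller
            switches += 1
    return switches


# Eight I/Os to one disk that is initially owned by controller1.
round_robin = ["controller1", "controller2"] * 4
failover_only = ["controller1"] * 8

rr_switches = ownership_switches(round_robin, "controller1")
fo_switches = ownership_switches(failover_only, "controller1")
print(rr_switches, fo_switches)
```

With round robin nearly every I/O forces a switch, while failover-only access to the owning controller forces none.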
A good optimum: path failover and static load balance
● In this example the odd disks (LUN1, LUN3) are accessed by controller2 and the even disks (LUN0, LUN2) by controller1.
[Diagram: Server with two HBAs; storage with Controller1 and Controller2; LUN0 to LUN3.]
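Static load balance with failover comes down to a fixed preferred controller per LUN plus a fallback. A minimal sketch of the even/odd split on this slide, with function names made up for illustration:

```python
CONTROLLERS = ("controller1", "controller2")


def preferred_controller(lun: int) -> str:
    """Even LUNs go to controller1, odd LUNs to controller2."""
    return CONTROLLERS[lun % 2]


def controller_for(lun: int, online: set) -> str:
    """Use the preferred controller, fail over if it is offline."""
    preferred = preferred_controller(lun)
    if preferred in online:
        return preferred
    for other in CONTROLLERS:
        if other in online:
            return other
    raise IOError("no controller online")


all_online = set(CONTROLLERS)
normal = [controller_for(lun, all_online) for lun in range(4)]
degraded = [controller_for(lun, {"controller1"}) for lun in range(4)]
print(normal, degraded)
```

Each LUN stays with one owning controller in normal operation, so both controllers carry load without any ownership switches.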
HDLM Settings
● HDLM has the following optional functionalities:
─ Load Balancing Algorithm (2.1): OFF = path failover; ON = round robin or extended round robin
─ Path Health Checking
─ Auto Failback
─ Intermittent Error Monitor (2.3)
─ Cluster Considerations (2.4)
─ Dynamic Reconfiguration (2.2) (Windows only)
What is „Extended Round Robin“?
● Single path attachment, no prefetch. [Diagram: HBA, cache.]
● Single path attachment, with prefetch. [Diagram: HBA, cache.]
● Round-robin load balance, with or without prefetch. [Diagram: two HBAs, cache.]
─ Note: some arrays *can* recognise „multipath sequential I/O“.
● „Extended Round Robin“ and prefetch. [Diagram: two HBAs, cache.]

Recap: Multipathing Algorithms
● Path Failover
● Load Balance
─ Round-Robin
─ Extended-Round-Robin (HDS)
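The diagrams that answered „What is Extended Round Robin?“ are not preserved in this text, so the sketch below rests on an assumption: extended round robin keeps a run of sequential I/O on the same path, so that the array sees one sequential stream and can prefetch, and it moves to the next path only when the access pattern jumps. Plain round robin alternates paths on every I/O. The class names are made up, and this is not HDLM's actual algorithm.

```python
class RoundRobin:
    """Alternate paths on every I/O."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.index = 0

    def select(self, start_block, length):
        path = self.paths[self.index % len(self.paths)]
        self.index += 1
        return path


class ExtendedRoundRobin:
    """Assumed behaviour: stay on one path while I/O is sequential."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.index = 0
        self.expected_block = None

    def select(self, start_block, length):
        sequential = start_block == self.expected_block
        if self.expected_block is not None and not sequential:
            self.index += 1  # access pattern jumped: use the next path
        self.expected_block = start_block + length
        return self.paths[self.index % len(self.paths)]


# Four sequential 8-block reads followed by one random read.
ios = [(0, 8), (8, 8), (16, 8), (24, 8), (1000, 8)]

rr = RoundRobin(["hba0", "hba1"])
err = ExtendedRoundRobin(["hba0", "hba1"])
rr_paths = [rr.select(*io) for io in ios]
err_paths = [err.select(*io) for io in ios]
print(rr_paths)
print(err_paths)
```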
SAN Boot
● Normally we keep "application data" on the SAN.
● As already mentioned, we do not recommend SAN boot.
─ ISV multipath software: does not work! The software is loaded after boot.
─ Kernel-integrated multipath does work (new SUSE).
─ Internal boot is best.
[Diagram: three servers, each with two HBAs connected to controller1 and controller2, showing where the kernel and the multipath software are located in each case.]
Error detection (and timing): error on the „HBA side“ of the switch
● An I/O is initiated by an „application“ and is passed to the multipath software.
● The multipath software will initiate the I/O through one of the paths.
● The HBA (driver) will initiate the I/O to the storage.
● If an error occurs on the „HBA side“ of the switch:
─ The multipath software will know almost immediately.
─ The alternate path will be used.
[Diagram: Server with application, multipath software and two HBAs; two SAN switches; two controllers; error marked on the „HBA side“ of the switch.]
Error detection (and timing): error on the „storage side“ of the switch
● An I/O is initiated by an „application“ and is passed to the multipath software.
● The multipath software will initiate the I/O through one of the paths.
● The HBA (driver) will initiate the I/O to the storage.
● If an error occurs on the „storage side“ of the switch:
─ The multipath software will NOT know immediately.
─ The HBA driver timer will pop.
─ NOW, if the application timer pops before the I/O returns, the application will die.
● You will have to test this!
[Diagram: Server with application, multipath software and two HBAs; two SAN switches; two controllers; error marked on the „storage side“ of the switch.]
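The timing condition can be written down as a simple inequality: the application survives only if its own timeout is longer than the HBA driver timeout plus the time needed to retry the I/O on the alternate path. The sketch below uses invented timeout values, and the real figures depend on driver, multipath software and application settings, so it does not replace the test the slide asks for.

```python
def application_survives(app_timeout_s: float,
                         hba_driver_timeout_s: float,
                         retry_on_alternate_s: float) -> bool:
    """True if the I/O can fail over before the application gives up."""
    worst_case_s = hba_driver_timeout_s + retry_on_alternate_s
    return worst_case_s < app_timeout_s


# Hypothetical values: driver timer pops after 30 s, retry takes 5 s.
patient_app = application_survives(60, 30, 5)
impatient_app = application_survives(20, 30, 5)
print(patient_app, impatient_app)
```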
AutoFailback and intermittent error detection
● Autofailback will set a failed path back online after a certain time (typically 5 minutes).
─ If you have good technicians and monitoring, set Autofailback off and correct errors manually.
● When a path suffers a specified number of errors within a specified interval, HDLM classifies this path as an “Intermittent Error Path” and places it in Error status. After this, HDLM will not apply auto failback to this path. The path must be placed online manually.
[Diagram: timeline. Monitoring starts at the first error; each error is followed by an auto failback (AFB). On the third error within 30 minutes the path is excluded from AFB.]
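The timeline can be sketched as a small counter. This is a simplified model, not HDLM's implementation: monitoring starts at the first error, and if the error count reaches the limit before the interval has passed, the path is excluded from auto failback. The defaults of three errors in 30 minutes come from the diagram, and the class name is made up.

```python
class IntermittentErrorMonitor:
    """Decides whether a path may still be failed back automatically."""

    def __init__(self, max_errors=3, interval_s=30 * 60):
        self.max_errors = max_errors
        self.interval_s = interval_s
        self.window_start = None
        self.count = 0
        self.excluded = False

    def record_error(self, now_s):
        expired = (self.window_start is not None
                   and now_s - self.window_start > self.interval_s)
        if self.window_start is None or expired:
            # First error, or the old interval passed: start monitoring.
            self.window_start = now_s
            self.count = 0
        self.count += 1
        if self.count >= self.max_errors:
            self.excluded = True  # stays excluded until set online by hand

    def may_auto_failback(self):
        return not self.excluded


flaky = IntermittentErrorMonitor()
for t in (0, 600, 1200):      # three errors within 30 minutes
    flaky.record_error(t)

rare = IntermittentErrorMonitor()
for t in (0, 600, 4000):      # third error after the interval
    rare.record_error(t)

print(flaky.may_auto_failback(), rare.may_auto_failback())
```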
SCSI-Reserves
● Some OSes will issue a „reserve“ (SCSI-2 Reserve) to a disk when it goes online. This prevents another OS from accessing the disk.
● With multipath software this would be a big problem: the reserve leads to errors on the other path.
● The multipath software will convert a SCSI-2 Reserve to a SCSI-3 PGR (Persistent Group Reserve).
[Diagram: two servers and one disk. With a plain SCSI-2 Reserve, the second path of the multipath software gets an error; with SCSI-3 PGR, both paths hold a token and the other server gets the error.]
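Why the conversion helps can be shown with a toy model. It is a strong simplification of the SCSI behaviour: a SCSI-2 reserve is treated as owned by one initiator port, while a SCSI-3 persistent reservation is treated as owned by a key that several ports may register. Real persistent reservations have several types and more rules. All names here are invented for illustration.

```python
class Disk:
    """Toy model of reservation checks on one LUN."""

    def __init__(self):
        self.scsi2_holder = None    # initiator port holding the reserve
        self.pgr_keys = {}          # initiator port -> registered key
        self.pgr_holder_key = None  # key that holds the reservation

    def scsi2_reserve(self, port):
        self.scsi2_holder = port

    def pgr_register(self, port, key):
        self.pgr_keys[port] = key

    def pgr_reserve(self, key):
        self.pgr_holder_key = key

    def io_allowed(self, port):
        if self.scsi2_holder is not None:
            return port == self.scsi2_holder
        if self.pgr_holder_key is not None:
            return self.pgr_keys.get(port) == self.pgr_holder_key
        return True


# Server A has two HBAs (a0, a1); server B has one (b0).
scsi2 = Disk()
scsi2.scsi2_reserve("a0")
scsi2_result = {p: scsi2.io_allowed(p) for p in ("a0", "a1", "b0")}

pgr = Disk()
pgr.pgr_register("a0", "key-A")
pgr.pgr_register("a1", "key-A")
pgr.pgr_reserve("key-A")
pgr_result = {p: pgr.io_allowed(p) for p in ("a0", "a1", "b0")}

print(scsi2_result)
print(pgr_result)
```

In the first case server A locks itself out of its own second path; in the second case both of its paths work and only server B is refused.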
Recap: Problem issues
● SAN Boot
● Error Detection
● Autofailback
● SCSI Reserve

Wrap up and Question time
The agenda was:
● Multipath providers.
● Multipath hardware.
● Multipath detection.
● Multipathing algorithms.
● Problem issues.
Hmm..... do we have time for questions?
THANKS