Glasgow External RAID and Linux Server Presentation to UK HEP SYSMAN meeting 21-22 March 2000 A.J.Flavell Glasgow PPE Group


Glasgow External RAID and Linux Server

Presentation to UK HEP SYSMAN meeting 21-22 March 2000

A.J.Flavell

Glasgow PPE Group


Server

• Viglen XX2 PLUS - 2 x 500MHz P-III Xeon.

• Two onboard SCSI channels

• Separate internal Mylex RAID, but we took the minimum configuration: two 9GB SCSI disks.

• Backup device DLT7000 in HP tape-changer, came with its own PCI SCSI controller - a good thing!


First steps

First attempt to install Linux on the internal RAID was a mess. I scheduled it over a holiday weekend. It did in fact install, but I failed to boot it. In a panic I rushed out to the local PC store, bought an IDE hard disk and installed there. This booted fine. And then I realised the system had installed on the internal RAID after all, but I still couldn't boot from it. I'm still using the IDE disk for booting, although everything else comes off the internal RAID. One day I'll sort that out.


External RAID system (1)

• Target was our budget, rather than a size.

• Looked as if 750GB nominal would be feasible.

• Two kinds of offering:

• external disk modules with RAID controller supported by OS

• external self-contained subsystem, looking, to the OS, like generic SCSI disks (Sweet Valley)

• Vendors reluctant to support Linux: this made the Sweet Valley approach attractive.


External RAID system (2)

• Final system from Sweet Valley had 15 x 50GB nominal. Priced for installation in customer rack. Needs a very deep rack!

• Two boxes with eight disk positions. One box contains RAID controller (we did not take the redundant controller option). Dual PSUs per box.

• One disk slot is empty, one holds a hot spare, rest are configured (we use RAID5).

• Box can be configured:

• from front panel buttons (we do this: it's not too bad)

• from RS232 terminal line (didn't seem vastly better)

• down the SCSI channel from host software (we don’t)
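The disk count above implies a nominal usable capacity, which can be sketched as follows. This is a rough illustration only: it assumes all the remaining disks sit in one RAID5 set (the slides do not say how the logical drives were arranged), and the function name `raid5_usable_gb` is made up for the example.

```python
def raid5_usable_gb(disks_installed, hot_spares, disk_gb):
    """Nominal usable capacity of a single RAID5 set, in GB.

    Assumption: every installed disk that is not a hot spare is a
    member of the same RAID5 set.
    """
    members = disks_installed - hot_spares
    if members < 3:
        raise ValueError("RAID5 needs at least three member disks")
    # RAID5 spends one disk's worth of space on parity.
    return (members - 1) * disk_gb


# 15 disks installed, 1 hot spare, 50GB nominal each:
# 14 members, 13 disks' worth of data.
usable = raid5_usable_gb(15, 1, 50)
print(usable)
```

Under that assumption the 750GB nominal raw figure comes down to 650GB nominal before any filesystem overhead.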


Terminology clash: Mylex RAID, SV RAID, and Unix

• Mylex (internal RAID in our Viglen):

• Disks are aggregated into a Drive Group (a.k.a. pack)

• Packs are divided into System Drive(s)

• System Drives appear in Linux as e.g. /dev/rd/c0d1

• Linux partitions are then e.g. /dev/rd/c0d1p5

• Sweet Valley (Infortrend controller):

• Disks are aggregated into 'Logical Drive(s)'

• Logical Drives are divided into 'Partitions' (!)

• These appear in Linux as e.g. /dev/sdb, divided into Linux partition(s) e.g. /dev/sdb3
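To keep the two naming schemes apart, the device names can be picked apart mechanically. The sketch below is only an illustration: the patterns are inferred from the example names on this slide (controller/drive/partition for the Mylex names, letter/partition for generic SCSI disks), and the function `describe` is hypothetical.

```python
import re

# Patterns inferred from the example device names on the slide:
#   /dev/rd/c<controller>d<system drive>[p<partition>]   (Mylex)
#   /dev/sd<letter(s)>[<partition>]                      (generic SCSI disk)
_MYLEX = re.compile(r"^/dev/rd/c(\d+)d(\d+)(?:p(\d+))?$")
_SCSI = re.compile(r"^/dev/sd([a-z]+)(\d+)?$")


def describe(device):
    """Split a device name into its components (hypothetical helper)."""
    m = _MYLEX.match(device)
    if m:
        ctrl, drive, part = m.groups()
        return {
            "kind": "mylex",
            "controller": int(ctrl),
            "system_drive": int(drive),
            "partition": int(part) if part is not None else None,
        }
    m = _SCSI.match(device)
    if m:
        letter, part = m.groups()
        return {
            "kind": "scsi",
            "disk": letter,
            "partition": int(part) if part is not None else None,
        }
    raise ValueError("unrecognised device name: " + device)


print(describe("/dev/rd/c0d1p5"))
print(describe("/dev/sdb3"))
```

Here /dev/sdb is one Sweet Valley "Partition" as seen by Linux, and the trailing 3 is the Linux partition within it.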


SCSI Channels

• Initially surprised that channel was reported as Ultra2-SE.

• Vendor helpfully suggested CD-ROM drive was the problem.

• Well, CD-ROM was indeed SE, but the real point was misidentification of the SCSI channels on motherboard.

• As the Viglen had been delivered, the non-LVD channel was connected to CD-ROM and external connector, the LVD-capable channel was unconnected and had no cable.

• Ordered SCSI cable, connected LVD-capable channel to external connector, adjusted terminations – SUCCESS!


Sweet Valley box’s Channels

• Sweet Valley box has three channels, one of which has the disks on it (both boxes).

• The other two can be configured as host channels.

• Disks can be made available to one or other host channel, but no sharing.

• Each host SCSI interface has one SCSI ID. We are currently using only one host channel.


Size of "SV Partitions" (system drives)

• Attempting to offer the entire array to Linux did not work!

• Limit seemed to be set by the pseudo geometry: 255 heads x 63 sectors x 65535 cylinders x 512 bytes = 539*10^9 bytes

• So the SV has to be divided up at this level anyway.

• Each such "Partition" (system drive) gets a SCSI LUN.

• SCSI BIOS must be enabled for multiple LUNs on this device ID, and...

• Linux boot parameter max_scsi_luns=n for reasonable n
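The geometry arithmetic above can be checked with a few lines. This is a sketch for illustration: the helper `min_partitions` is hypothetical, and the 650 * 10^9 byte array size is an assumed example figure, not a measured one.

```python
# Pseudo geometry quoted on the slide.
HEADS = 255
SECTORS_PER_TRACK = 63
MAX_CYLINDERS = 65535
BYTES_PER_SECTOR = 512

limit_bytes = HEADS * SECTORS_PER_TRACK * MAX_CYLINDERS * BYTES_PER_SECTOR


def min_partitions(array_bytes, limit=limit_bytes):
    """Smallest number of pieces such that none exceeds the limit."""
    # Ceiling division without floating point.
    return -(-array_bytes // limit)


print(limit_bytes)                     # 539043724800, i.e. about 539*10^9
print(min_partitions(650 * 10**9))     # assumed array size, for illustration
```

This only gives a lower bound on the number of "Partitions" (and hence LUNs); there can be other reasons to divide the array more finely.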


Initial preparation

• When first attempted, the boot-up took ages, with much bleating. But, once the disks had been formatted, this went away. Linux documentation muttered about some errant controllers getting upset by multiple LUN probes, but we haven't seen anything.

• Configuration Example

• /dev/sda reported as 206GB
    /dev/sda1  69204056
    /dev/sda2  69204056
    /dev/sda3  69220152
    makes     207628264

• /dev/sdb reported as 206GB
    /dev/sdb1 103811132
    /dev/sdb2 103819196
    makes     207630328

• /dev/sdc reported as 208GB (not currently formatted)
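The totals in the configuration example can be re-added as a quick sanity check. The sketch below just sums the partition sizes as quoted; the slide does not state the unit, so none is assumed here.

```python
# Partition sizes exactly as quoted in the configuration example
# (unit not stated on the slide).
layout = {
    "/dev/sda": {"sda1": 69204056, "sda2": 69204056, "sda3": 69220152},
    "/dev/sdb": {"sdb1": 103811132, "sdb2": 103819196},
}

totals = {disk: sum(parts.values()) for disk, parts in layout.items()}

for disk, total in totals.items():
    print(disk, total)
```

Both sums agree with the "makes" figures, and the two devices come out within about 2000 units of each other.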


NFS software situation

• As initially installed with RedHat 6.1, the NFS server (with NFSv3 turned off) seemed happy with data transfers, but when more exciting things were done, e.g. massive software builds, it would time out. This was observed with both an RH5.2 system and Digital Unix 4 as clients. If soft mounted, the operation would fail; if hard mounted, there would be a short hang, but then execution would continue without error.

• Investigations were initially confused. There were several relevant bits and pieces on web pages, but it was difficult to piece them together. The RedHat 6.1 distribution was kernel 2.2.12(20), and the various bits and pieces did not fit it. The situation described in a talk at Autumn'99 HEPiX was depressing.


NFS software situation (2)

• Booting the server was too slow and disruptive. Moved to an ordinary desktop PC as playground. Worked-up kernel 2.2.14 and played with that.

• Then found ftp://nfs.sourceforge.net/pub/nfs/dhiggen-2.2.14/ NFS patches specifically for this kernel.

• Incorporated them; the build went OK. Repeated the build on the actual server. Still learning the Linux way of doing things as I go along. Some confusion here about how to manage the installation of kernel modules when two or more kernel variants are wanted.

• Initial tests with Digital Unix not convincing: again "NFS2 server not responding". There are discouraging remarks on the web sites about cross-OS compatibility. Work continues. This is our major priority, as disk space on the old server is very tight and we do not want to procure more when we have this big box.

• Shortly I also want to get Samba server on the road.


Conclusion/summary

• Sweet Valley boxes have been fine; all the problems were elsewhere.

• NFS server sort-of works, but we are not happy with it. The source code situation has improved, but there is still confusion.

• Users clamouring for disk space as well as a full service to replace Digital Unix by autumn.

• No user service on the server yet; one filesystem being exported and mounted on existing systems (Digital Unix and RH5.2), user guinea-pigs.

• Samba server will be installed shortly.

• Still lots of work ahead.