20
TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

Embed Size (px)

Citation preview

Page 1: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Update since last HEPiX/HEPNT meeting

Page 2: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

NETWORKDNS BOOTP DHCPPRINTERS/PLOTTERSLEGACY VMS ALPHA/TRU64 SERVERSPC on X-WINDOWSPC PRINT/DOMAIN SERVERUNIX PRINT SERVERMAIL SERVERWEB SERVER

LINUX/TRU64 AP. SERVERLINUX COMPUTE SERVERUNIX BACKUP SERVERPC BACKUP SERVERExtruded Offsite Subnet

FUTURE DEVELOPMENTS WestGrid

Page 3: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

TRIUMF NETWORKComponents : 100 Mbit Fore Powerhub to UBC (route01) Passport 8003 Routing Switch – 8*1Gb ports

(2Q02) 32 Baystack 450-24T Gigabit Switches FDDI Ring (phased out Oct 1/01)Function: TRIUMF’s Internet connection Site-wide network connections Class B with 26 active subnets ( C ) ~800 active

connections

Page 4: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Page 5: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Module Name: REG/ERICHComponents : Dual DSSI/ Dual Vax 4000 VMS 6.2 128Mb memory each, 172Gbytes diskFunction: Legacy VMS server VMS Mail terminated Dec31/2000 Disable Login Sep 1/2002, Power-off

Jan1/2003

Page 6: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Module Name: TNTSRV00

Components : Dual 1GHz P3, Windows 2000 1GB memory, 3*62GB SCSI disks Old TNTSRV01 acts as backupFunction: Windows Primary Domain Controller Active Directory File Server for Windows Print Server for PC’s/Macs Application Server for Macs

Page 7: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Module Name: TRBACKUPComponents : 667 MHz Pentium III – VALinux- RH6.2 128 Mb memory, 10 Gbytes disk DLT-8000 (40/80GB tapes) ATL PowerStor L200 SDLT220 (8 slot

110/220GB) SDLT Native goals: 160(2002Q1),

320(2003Q4), 640(2005), 1280(2006Q4)Function: Central Backup/Restore Utility(BRU) server

Page 8: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Linux RH 7.2FreeSWAN / I PSEC

142.90.126.3142.90.126.1 142.90.126.2

VPN1

TRIUMF

142.90.100.18Internet

142.90.125.1

TRIUMF Network

The TRIUMF - BC Research Extruded SubnetSecure tunnel between VPN1 and VPN2

BC ResearchVPN2

Linux RH 7.2FreeSWAN / IPSEC

142.90.126.254 207.23.243.188

GATEWAY 142.90.126.254NETMASK 255.255.255.0

BROADCAST 142.90.126.255

207.23.243.254 Router

Router

switch

Recipee: Steve [email protected]

Page 9: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

SUPPORT SUMMARYFAST – FLEXIBLE – MANAGEABLE NETWORKAll servers / network components on UPS/DieselRedHat Linux V6.2 / V7.1 (CERN Library)RedHat Linux V7.2 EXT3 (non-CERN)Migrate X-window terminals Redhat Linux BoxesCentralized Mail/Print/Backup servicesCentralized Linux Updates/Security PatchesMore Wireless (Lucent Orinoco AP1000)Extruded Subnet SupportRedhat Linux Cluster/Grid Research

Page 10: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Works in Progress

Problems Need for small local cluster Need to consolidate disk storage

- More efficient use of disk space- Move to rack-mountable- Add more with little impact- Improved reliability (raid, hot-swap)

Page 11: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Windows NT/2000/9x(SMB)

Linux(NFS)

Mac (atalk)

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

ProLiant DL360

9.1 - GB 10kULTRA2 SCSI

9.1 - GB 10kULTRA2 SCSI

S D12 3 4 5 6 7 8 9 1 0 11 1 2

1 3 14 15 1 6 1 7 18 19 2 0 2 1 2 2 23 2 4

25 2 6 2 7 2 8

BayNe tworks

BaySt ack 450-24T Swi tch

2 3 4 5 6 7 8 9 1 0 1 1 1 2

1 3 1 4 1 5 1 6 17 18 1 9 2 0 2 1 2 2 2 3 2 4

1

U p lin k M o d u le

C o m m Po r t 1 0 0

1 0F Dx

A c tiv i ty

1 0 0

1 0F Dx

A c tiv i ty

SD

142.90.0.0

142.90.100.212 142.90.100.211

SCSI ID 7SCSI ID 6

SCSI ID 0

142.90.100.210

IDE DISK ARRAY ~ 1.4TB withSCSI interconnects

192.168.0.2 192.168.0.1

ttyS0ttyS1Heartbeat(s)

Using IP ALIASING

Mosix Clientnodes

Mosix Masternode

Mosix Secondarynode

TRIUMF Proposed Compute/FileserverCluster

192.168.1.2 192.168.1.1

192.168.1.0

HOT SWAP ~120 GB IDE DRIVES

ARENA INDY-2400

Page 12: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

IDE Box Details

512Mb SDRAM Dual Channel Ultra 160 Hot swappable IDE drives Raid 0, 1, 0+1, 3, or 5 Dual Power Supplies

Page 13: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Status of WestGrid

Federal Funding of $12m approved Waiting for BC/Alberta matching funds Request for Information is in draft mode Request for Proposals upon matching

funding

Page 14: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Grid Storage

Scientific Visualization

Advanced Collaboration

Computational Resources

LEGEND

Participants

Page 15: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Computational Sites

Alberta: University of Alberta, University of Calgary,University of Lethbridge, The Banff Centre,British Columbia: University of British Columbia, Simon Fraser

University, New Media Innovation Centre (NewMIC), TRIUMF

Page 16: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

WestGrid Site Facilities

Univ. of Alberta: A 128-CPU / 64GB memory SMMP machine + 5Tb Disks 25Tb Tape (Fine grained concurrency)

Univ. of Calgary: A 256-node 64bit / 256GB Cluster of Multi-Processors (CluMP) + 4Tb Disks (Medium grained concurrency)

UBC/TRIUMF: 1000/1500 CPU “naturally” parallel commodity cluster, 512Mb/CPU, with 10 TB of (SAN) disk, 70-100 TB of variable tape storage (Coarse grained concurrency)

SFU (Harbour Center): Network storage facility25 TB of disk, 200 TB of tape (Above sites storage, database serving)

Page 17: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

TRIUMF Computing Needs TWIST: 72 TB/yr of data; 20 TB/yr of Monte Carlo

~125 cpus (1 GHz) E949: 100 TB/yr of data (50% at TRIUMF/UBC)

~ 100 cpus (part time) ISAC: Data set not huge, but analysis requires

large- scale parallel computing PET: 3-D image reconstruction parallel

computing Large-scale Monte Carlo: BaBar (230 cpus),

ATLAS (200 cpus), HERMES (100 cpus)

Summary: - 250-300 cpus for data analysis- ~500 cpus for Monte Carlo- 150-200 TB/yr of storage- Access to parallel computing

Page 18: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Management Issues Limit to 3-4 people/site

(hardware/system, not software dev.) SAN management Tape management Security Single logins across WestGrid

Page 19: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

624 * 10/100 ports144 * 10/100/1000 ports

8 x 48 port

5 x 48 port

3*48 Gigabit ports

Page 20: TRIUMF SITE REPORT – Corrie Kost April 15-19 Catania (Italy) Update since last HEPiX/HEPNT meeting

TRIUMF SITE REPORT – Corrie KostApril 15-19 Catania (Italy)

Sample Rack Configuration• 3U blades• 20 Blades / 3U• 280 servers in 42U rack• 512Mbytes memory/server• 9GB disk/server• Two 10/100 Ethernets/server