TRIUMF Site Report for HEPiX, SLAC, October 10-14,2005 TRIUMF SITE REPORT Corrie Kost Update since Hepix Spring 2005


Page 1: TRIUMF   SITE REPORT Corrie Kost


TRIUMF SITE REPORT

Corrie Kost

Update since HEPiX Spring 2005

Page 2: TRIUMF   SITE REPORT Corrie Kost


Google Mini comes to TRIUMF

• $2995 US with 1 yr support

• Indexes up to 100,000 docs

• 220 different file formats

• Two 10/100 Ethernet ports

- 1st for normal operation

- 2nd for setup using cross-over cable

• 120GB Seagate drive

• 2GB memory

• Maintenance via special Google dial-up modem

Read a complete in-depth review at http://www.anandtech.com/IT/showdoc.aspx?i=2523&p=2

Page 3: TRIUMF   SITE REPORT Corrie Kost


Page 4: TRIUMF   SITE REPORT Corrie Kost


Page 5: TRIUMF   SITE REPORT Corrie Kost


The TRIUMF-CERN 1GbE Lightpath(s)

• TRIUMF

• BCNET

• CANARIE

• SURFnet

• CERN

• 1 GbE circuit established April 18th 2005

• 2nd GbE circuit established July 19th 2005

Page 6: TRIUMF   SITE REPORT Corrie Kost


ATLAS Service Challenge

http://grid.triumf.ca/status/sc3.html

Servers

• 3 EM64T systems, each with:

- 2 GB memory

- Hardware RAID: 3ware 9xxx SATA RAID controller

- Seagate Barracuda 7200.8 drives in hardware RAID 5 (8 x 250 GB)

• 1 dual Opteron 246 server with:

- 2 GB memory

- 3ware 9xxx SATA RAID controller

- WD Caviar SE drives in hardware RAID 0 (2 x 250 GB)

- 2 4560-SLX IBM tape libraries (currently each with only 1 SDLT 320 tape drive)

• 1 borrowed EM64T system used temporarily as an FTS server with:

- 1 GB memory

- 2 SATA 80 GB drives for the OS and for Oracle's needs

Storage

• 5.5+ TB disk

• 8+ TB tape
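The "5.5+ TB disk" figure is roughly what the listed arrays add up to. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB), that each RAID 5 set gives up one drive's worth of space to parity, and ignoring filesystem overhead; the function names are illustrative only:

```python
def raid5_usable_gb(n_drives: int, drive_gb: float) -> float:
    """RAID 5 stores one drive's worth of parity: usable = (n - 1) * size."""
    if n_drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (n_drives - 1) * drive_gb


def raid0_usable_gb(n_drives: int, drive_gb: float) -> float:
    """RAID 0 stripes with no redundancy: usable = n * size."""
    return n_drives * drive_gb


# Figures from the slide: 3 servers with 8 x 250 GB in RAID 5,
# plus 1 server with 2 x 250 GB in RAID 0.
raid5_total_gb = 3 * raid5_usable_gb(8, 250)
raid0_total_gb = raid0_usable_gb(2, 250)
total_tb = (raid5_total_gb + raid0_total_gb) / 1000

print(f"RAID 5 servers: {raid5_total_gb / 1000:.2f} TB")
print(f"RAID 0 server:  {raid0_total_gb / 1000:.2f} TB")
print(f"Total usable:   {total_tb:.2f} TB")
```

Under these assumptions the total is 5.75 TB before formatting, which is in line with the quoted 5.5+ TB.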

Page 7: TRIUMF   SITE REPORT Corrie Kost


ATLAS Service Challenge

Page 8: TRIUMF   SITE REPORT Corrie Kost


10 GbE Lightpath to CERN

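[Figure: status of the path segments between TRIUMF and CERN, including the Atlantic crossing, marked with check marks and one X; the mapping of marks to segments is not recoverable]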

Page 9: TRIUMF   SITE REPORT Corrie Kost


10 GbE Lightpath to CERN

• Permanent 10GbE TRIUMF-CERN lightpath ~ year-end 2005

• Foundry BigIron RX-4s at TRIUMF & BCNET

Page 10: TRIUMF   SITE REPORT Corrie Kost


10 GbE Lightpath to CERN

Page 11: TRIUMF   SITE REPORT Corrie Kost


TRIUMF WAN CWDM

[Diagram labels]

• BCNET: 22 km, single pair fiber

• MRV CWDM with SFP 4-port optical mux: 1610 nm, 1590 nm, 1570 nm, 1550 nm

• 4 1GbE channels via Passport 8600: ORAN, WESTGRID, 2x CERN

• 2x GbE TDM

• Potential to add 2 more 1GbE channels

• 10GbE Foundry switch (CERN / Ottawa)

PROBLEM: MRV needs 1550 ± 3 nm but Foundry is 1550 ± 15 nm
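The wavelength mismatch can be written as a simple interval check. This sketch assumes the slide's figures mean that the MRV CWDM mux accepts 1550 ± 3 nm on that channel, while the Foundry optic is only specified to emit somewhere within 1550 ± 15 nm; the class and function names are illustrative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A wavelength window given as centre and +/- tolerance, in nm."""
    centre_nm: float
    tol_nm: float

    @property
    def low(self) -> float:
        return self.centre_nm - self.tol_nm

    @property
    def high(self) -> float:
        return self.centre_nm + self.tol_nm


def guaranteed_to_fit(source: Window, passband: Window) -> bool:
    """True only if every wavelength the source may emit lies in the passband."""
    return passband.low <= source.low and source.high <= passband.high


mrv_channel = Window(1550, 3)     # what the mux needs (from the slide)
foundry_optic = Window(1550, 15)  # what the optic is specified to (from the slide)

print(guaranteed_to_fit(foundry_optic, mrv_channel))  # False
```

An individual optic may still happen to land inside the narrower window; the check only says that the specification does not guarantee it.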

Page 12: TRIUMF   SITE REPORT Corrie Kost


Raid5: Puzzling I/O results

Repeated reads on the same set of files (at 600MB/sec): one or more files will “degrade”, typically after the set of 16 8GB files has been read 1000 times.

Positive: read ~2PB during 50 days, averaging about 600MB/sec

[Plot: 8GB file read time (sec), axis 0 to 20, versus file number (1 to 81, same file every 16th), with a marked TRANSITION]

8 SATA disks on each of a pair of RAID5 RocketRAID 1820A controllers
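The quoted figures can be cross-checked with a little arithmetic. This is only a sketch: it assumes decimal units (1 GB = 10^9 bytes) and a perfectly steady 600 MB/sec, which the real test probably did not sustain; the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400
MB, GB, PB = 1e6, 1e9, 1e15


def seconds_to_read(bytes_total: float, rate_bytes_per_s: float) -> float:
    """Time needed to read bytes_total at a constant rate."""
    return bytes_total / rate_bytes_per_s


rate = 600 * MB            # quoted average read rate
file_size = 8 * GB         # one test file
file_set = 16 * file_size  # the set of 16 files

per_file_s = seconds_to_read(file_size, rate)
thousand_passes_days = seconds_to_read(1000 * file_set, rate) / SECONDS_PER_DAY
fifty_days_pb = rate * 50 * SECONDS_PER_DAY / PB

print(f"One 8 GB file at 600 MB/s: {per_file_s:.1f} s")
print(f"1000 passes over 16 files: {thousand_passes_days:.1f} days")
print(f"50 days at 600 MB/s:       {fifty_days_pb:.2f} PB")
```

About 13 seconds per file sits inside the plot's 0 to 20 second axis, and 1000 passes over the set take roughly 2.5 days at this rate. Fifty uninterrupted days at 600 MB/sec would come to about 2.6 PB, so the quoted ~2PB is either a round figure or reflects some idle time.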

Page 13: TRIUMF   SITE REPORT Corrie Kost


Unix Backups at TRIUMF

• Amanda system

– Dual Opteron 248 2.2 GHz

  • 2G memory

  • 16 x 400G WD disks ~ 6TB (1.5TB present sys ~ 10 day cycle)

  • 2 LSI MegaRAID 8-disk controllers

• Disk based: ~1 month of backups

– At least 2 full backups with daily incrementals

• 26-slot Overland DLT tape library

• SDLT 600 drive, 300G native capacity per tape

• 150 Linux machines (users: home dir, servers: full)
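A rough capacity check for the figures above, assuming decimal units, raw disk space before any RAID overhead, and native (uncompressed) tape capacity. Reading "1.5TB present sys" as the current size of one full backup is an interpretation, not something the slide states:

```python
disk_pool_tb = 16 * 400 / 1000     # 16 x 400 GB disks, raw
tape_library_tb = 26 * 300 / 1000  # 26 slots x 300 GB native per tape
current_full_tb = 1.5              # "1.5TB present sys" (interpreted as one full backup)

# Whole full backups that fit, ignoring incrementals and RAID overhead.
fulls_on_disk = int(disk_pool_tb // current_full_tb)
fulls_on_tape = int(tape_library_tb // current_full_tb)

print(f"Disk pool:    {disk_pool_tb:.1f} TB raw, room for {fulls_on_disk} full backups")
print(f"Tape library: {tape_library_tb:.1f} TB native, room for {fulls_on_tape} full backups")
```

On these assumptions the disk pool holds the two full backups with space left for incrementals, while a fully loaded library holds only a handful of fulls.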

Page 14: TRIUMF   SITE REPORT Corrie Kost


Cheap Hot-Swap Backup

• Promise SuperSwap 1100 Enclosures

• Four 400 GB Seagate SATA drives

• Promise FastTrak S150 SX4 SATA controller

• RAID 5

• Linux 2.4.20-8 RedHat 9

A disk can be removed and replaced at any time. Rebuilds run in the background.

Used to keep live multiple (daily) RSYNC (via DIRVISH) copies of critical servers (for ~ 1 month). See http://www.dirvish.com/
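Dirvish builds its daily images with rsync, hard-linking unchanged files against the previous image so that each image looks complete while only changed files take new space. A minimal sketch of that idea, not dirvish's own configuration: the host name, paths, and 30-day retention are hypothetical, POSIX paths are assumed, and the functions only build the rsync command line and pick images to expire.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import List, Optional


def rsync_snapshot_cmd(source: str, vault: Path, today: date,
                       previous: Optional[date]) -> List[str]:
    """Build an rsync command that writes today's image into the vault."""
    dest = vault / today.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links into the previous image.
        cmd.append(f"--link-dest={vault / previous.isoformat()}")
    cmd += [source, f"{dest}/"]
    return cmd


def expired_images(images: List[date], today: date,
                   keep_days: int = 30) -> List[date]:
    """Return the image dates older than the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in images if d < cutoff)


# Hypothetical host and vault path, for illustration only.
cmd = rsync_snapshot_cmd("server1:/home/", Path("/backup/server1"),
                         date(2005, 10, 10), date(2005, 10, 9))
print(" ".join(cmd))
print(expired_images([date(2005, 9, 1), date(2005, 10, 1)], date(2005, 10, 10)))
```

The command could be handed to `subprocess.run(cmd, check=True)`; it is only printed here so the sketch runs without a remote host.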

Page 15: TRIUMF   SITE REPORT Corrie Kost


VOIP coming to TRIUMF

Page 16: TRIUMF   SITE REPORT Corrie Kost


TRIUMF Ticketing System (Request Tracker)

Page 17: TRIUMF   SITE REPORT Corrie Kost


TRIUMF Ticketing System (Request Tracker)

Page 18: TRIUMF   SITE REPORT Corrie Kost


http://hepix.caspur.it/afs/hepix.org/projects.html

Page 19: TRIUMF   SITE REPORT Corrie Kost


Conclusions / Observations

- Site services (Web, Email, Batch, Windows) all much more stable: new hardware, more memory (typically 4-8GB) in servers

- Quad Opteron SUN I/O, using external SATA, still limited to below 1 GB/sec

- Read 16 8GB files repeatedly, averaging over 600MB/sec for ~2PB

- Site “Backup” services still problematic

  - tape media capacity (outgrow in 2 years)

  - reliability (is SDLT robust?)

- Permanent 10GbE TRIUMF-CERN service by year-end

- ATLAS Service Challenge targets being met for TRIUMF as TIER1

- Started using PLONE as content management for TRIUMF Web Server

- Moving some phones to voice-over-IP

- Scientific Linux (3 & 4) still the preferred Linux OS at TRIUMF

- Moving away from distributed printing to print/scan-to-email/copy stations

Page 20: TRIUMF   SITE REPORT Corrie Kost


TRIUMF Servers – May/2005

[Figure labels: STORM1; STORM2; SUN1; Foundry; LCG STORAGE; WORKER NODES; GPS TIME; MSR; WEB; NAME; DOCUMENTS; CONDORG; WEBSHARE; MAIL; FILE; IBM CLUSTER; FEDORA / SL MIRROR; IBM / SHARE STORAGE; AMANDA BACKUP (VIA DISKS)]

Page 21: TRIUMF   SITE REPORT Corrie Kost


TRIUMF Servers – October/2005