1st July 2004 HEPSYSMAN RAL - Oxford Site Report
Oxford University Particle Physics
Site Report
Pete Gronbech
Systems Manager
Particle Physics Strategy: The Server / Desktop Divide
Servers: General Purpose Unix Server, Group DAQ Systems, Mail Server, Web Server, Windows File Server
Desktops: Win 2K PC (three shown), Win XP PC, Linux System
Approx. 200 Windows 2000 desktop PCs, with Exceed used to access the central Linux systems.
Central Physics Computing Services
E-Mail hubs
– In the last year 7.3M messages were relayed, 73% were rejected and 5% were viruses.
– Anti-virus and anti-spam measures are increasingly important in the email hubs.
– Some spam inevitably leaks through, and clients need to deal with this in a more intelligent way.
Windows Terminal Servers
– Use is still increasing: 250 users in the last three months out of 750 staff/students.
– Now running Win2k and 2003.
– Introduced an 8 CPU server (TermservMP). A much more powerful system, but still awaiting updated versions of some applications which will run properly on the OS.
Web / Database
– New web server (Windows 2003) in service.
– New web applications for lecture lists, computer inventory, admissions and finals.
Exchange Servers
– Running two new servers using Exchange 2003 on Windows Server 2003.
– Much better web interface, support for mobile devices (OMA) and for tunnelling through firewalls.
Desktops
– Windows XP Pro is the default OS for new desktops and laptops.
Linux
– Central Unix systems are Linux based.
– Red Hat Linux 7.3 is the standard.
– Treat Linux as just another Unix and hence a server OS to be managed centrally.
– Wish to avoid badly managed desktop PCs running Linux.
– Linux based file server (April 2002).
– General purpose Linux server installed August 2002.
– Batch farm installed.
[Diagram: Particle Physics Linux systems]
Groups shown: General Purpose Systems, CDF, minos DAQ, Atlas DAQ, cresst DAQ, Grid Development, PBS Batch Farm, Oxford Tier 2 - LCG2.
Hosts shown: pplx1, pplx2, pplx3, pplxfs1, pplxgen, morpheus, matrix, ppcresst1, ppcresst2, ppatlas1, atlassbc, ppminos1, ppminos2, grid, tbwn01, pptb01, pptb02, tblcfg, se, ce.
PBS Batch Farm: 4 * dual 2.4GHz systems (Autumn 2002) and 4 * dual 2.4GHz systems (Autumn 2003).
OS labels: RH7.3 on most hosts, Fermi 7.3.1 on two, 7.3.1 and LCG2 on the remaining nodes. A 1Gb/s link is also marked.
The Linux File Server: pplxfs1
8 * 146GB SCSI disks
Dual 1GHz PIII, 1GB RAM
New Eonstor IDE RAID array added in April 04. 16 * 250GB disks give approx 4TB for around £6k.
This is our second foray into IDE storage. So far so good.
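
The capacity and cost figures quoted for the IDE array can be checked with a few lines of Python. This is only a sketch of the arithmetic: it assumes decimal units (1TB = 1000GB), treats "around £6k" as exactly 6000 GBP, and ignores RAID and filesystem overhead, none of which the slide specifies.

```python
# Figures from the slide: 16 disks of 250GB, bought for roughly 6000 GBP.
DISKS = 16
DISK_GB = 250
COST_GBP = 6000  # assumption: "around £6k" taken as exactly 6000


def raw_capacity_tb(disks: int, disk_gb: int) -> float:
    """Raw capacity in decimal TB, before any RAID or filesystem overhead."""
    return disks * disk_gb / 1000


def cost_per_gb(cost_gbp: float, disks: int, disk_gb: int) -> float:
    """Purchase cost per raw GB."""
    return cost_gbp / (disks * disk_gb)


raw_tb = raw_capacity_tb(DISKS, DISK_GB)
gbp_per_gb = cost_per_gb(COST_GBP, DISKS, DISK_GB)
print(f"raw capacity: {raw_tb:.1f} TB, cost: {gbp_per_gb:.2f} GBP per GB")
```

The usable capacity will be lower than the raw figure once parity disks and any hot spares are taken out; the slide does not say which RAID level is used.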
General Purpose Linux Server: pplxgen
pplxgen is a dual 2.2GHz Pentium 4 Xeon based system with 2GB RAM. It is running Red Hat 7.3. It was brought on line at the end of August 2002.
Provides interactive login facilities for code development and test jobs. Long jobs should be sent to the batch queues.
Memory to be upgraded to 4GB next week.
The PP batch farm, running Red Hat 7.3 with OpenPBS, can be seen below pplxgen.
This service became fully operational in Feb 2003. An additional 4 worker nodes were installed in October 2003. These are 1U servers and are mounted at the top of the rack.
Miscellaneous other nodes bring the total to 21 CPUs available to PBS.
http://www-pnp.physics.ox.ac.uk/ganglia-webfrontend-2.5.4/
CDF Linux Systems
Morpheus is an IBM x370 8 way SMP 700MHz Xeon with 8GB RAM and 1TB Fibre Channel disks. Installed August 2001.
Purchased as part of a JIF grant for the CDF group.
Runs Fermi Red Hat 7.3.1.
Uses CDF software developed at Fermilab and Oxford to process data from the CDF experiment.
Second round of CDF JIF tender: Dell Cluster - MATRIX
10 dual 2.4GHz P4 Xeon servers running Fermi Linux 7.3.1 and SCALI cluster software. Installed December 2002.
Approx 7.5TB of SCSI RAID 5 disks are attached to the master node.
Each shelf holds 14 * 146GB disks.
These are shared via NFS with the worker nodes.
OpenPBS batch queuing software is used.
Plenty of space in the second rack for expansion of the cluster.
An additional disk shelf with 14 * 146GB disks, plus an extra node, was installed in Autumn 2003.
Oxford Tier 2 centre for LHC
Two racks each containing 20 Dell dual 2.8GHz Xeons with SCSI system disks.
1.6TB SCSI disk array in each rack.
Systems will be loaded with LCG2 software.
The SCSI disks and Broadcom Gigabit Ethernet cause some problems with installation. Slow progress is being made.
Problems of Space, Power and Cooling.
Second rack currently temporarily located in theoretical physics computer room.
A proposal for a new purpose built computer room on Level 1 (underground) in progress.
False floor, large Air conditioning units and power for approx 20-30 racks to be provided.
Air cooling is limited to a maximum of 1200W/sq m, whereas a rack full of 1U servers can create 10kW of heat.
Water cooling??
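
The mismatch between the air-cooling limit and the heat output of a full rack is easier to see as floor area. The sketch below uses only the two figures on the slide. Applying them to the proposed 20-30 racks assumes every rack is full of 1U servers, which the slide does not claim, so treat those totals as an upper-bound illustration.

```python
# Figures quoted on the slide.
MAX_AIR_COOLING_W_PER_M2 = 1200  # "1200W/sq m max air cooling"
RACK_HEAT_W = 10_000             # "a rack full of 1U servers can create 10KW"


def floor_area_per_rack_m2(rack_heat_w: float, cooling_w_per_m2: float) -> float:
    """Floor area over which one rack's heat must be spread to stay within the limit."""
    return rack_heat_w / cooling_w_per_m2


area_per_rack = floor_area_per_rack_m2(RACK_HEAT_W, MAX_AIR_COOLING_W_PER_M2)
print(f"{area_per_rack:.1f} sq m of air-cooled floor per full rack")

# Assumption (not from the slide): all 20-30 proposed racks are full of 1U servers.
for racks in (20, 30):
    total_kw = racks * RACK_HEAT_W / 1000
    total_area = racks * area_per_rack
    print(f"{racks} racks: {total_kw:.0f} kW of heat, {total_area:.0f} sq m of floor")
```

At roughly 8.3 sq m per full rack, air cooling alone implies a very sparsely populated room, which is presumably why water cooling is raised as a question.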
OLD Grid development systems. EDG Test bed setup, currently 2.1.13
Tape Backup is provided by a Qualstar TLS4480 tape robot with 80 slots and dual Sony AIT3 drives. Each tape can hold 100GB of data. Installed Jan 2002.
Netvault 7.1 software from BakBone is used, running on morpheus, for backup of both CDF and particle physics systems. Main user disks are backed up every weekday night. Data disks are not generally backed up, BUT weekly backups to the OUCS HFS service provide some security.
Network Access
[Diagram: campus network path between Super Janet 4 and Physics]
Elements shown: Super Janet 4 (2.4Gb/s link), OUCS Firewall, Campus Backbone Router, two Backbone Edge Routers, Physics Firewall, Physics Backbone Router, and other departments ("depts").
Link speeds marked: 1Gb/s and 100Mb/s.
Physics Backbone Upgrade to Gigabit Autumn 2002
[Diagram: Physics backbone]
Elements shown: Physics Firewall, Physics Backbone Router, server switch, Linux Server, Win 2k Server, desktops, and the Particle Physics, Clarendon Lab, Astro, Theory and Atmos networks.
Link speeds marked: 1Gb/s and 100Mb/s.
Network Security
– Constantly under threat from worms and viruses. Boundary firewalls don't solve the problem entirely, as people bring infections in on laptops.
– New firewall based on stateful inspection. Policy is now "default closed". Some teething problems as we learnt what protocols were required, but there has been a very significant improvement in security.
– Main firewall passes an average of 5.8GB/hour (link saturates at peak). It rejects 26,000 connections per hour (7 per second). Mischievous connects are rejected at 1500/hour, roughly one every 2.5 secs. During the Blaster worm this reached 80/sec.
– Additional firewalls installed to protect the Atlas construction area and to protect us from attacks via dialup or VPN.
– Need better control over how laptops access our network. Migrating to a new Network Address Translation system so all portables connect through a managed "gateway".
– Have made it easier to keep anti-virus software (Sophos) up to date by simply connecting to a web page. Important that everyone managing their own machines takes advantage of this. Very useful for both laptops and home systems.
– Keeping OSs patched is a major challenge. Easier when machines are all inside one management domain, but still very time consuming. Must compare this to perhaps one to a few man-months of IT support staff effort to clean out a successful worm from the network.
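
The firewall rates above mix per-hour and per-second units, so a short conversion helps when comparing them. The sketch below redoes the arithmetic from the quoted figures only. It assumes decimal gigabytes (1GB = 10^9 bytes) for the throughput figure, and the function names are illustrative.

```python
SECONDS_PER_HOUR = 3600


def per_second(events_per_hour: float) -> float:
    """Convert an hourly event count to events per second."""
    return events_per_hour / SECONDS_PER_HOUR


def seconds_between(events_per_hour: float) -> float:
    """Average gap in seconds between events at a given hourly rate."""
    return SECONDS_PER_HOUR / events_per_hour


def average_mbit_per_s(gb_per_hour: float) -> float:
    """Average throughput in Mb/s, assuming decimal GB (1e9 bytes)."""
    return gb_per_hour * 1e9 * 8 / SECONDS_PER_HOUR / 1e6


rejected_per_s = per_second(26_000)       # rejected connections
mischief_gap_s = seconds_between(1_500)   # "mischievous" connects
avg_mbit = average_mbit_per_s(5.8)        # traffic passed by the main firewall
blaster_per_hour = 80 * SECONDS_PER_HOUR  # peak rate during the Blaster worm

print(f"rejected: {rejected_per_s:.1f}/s")
print(f"mischievous: one every {mischief_gap_s:.1f} s")
print(f"average throughput: {avg_mbit:.1f} Mb/s")
print(f"Blaster peak: {blaster_per_hour} per hour")
```

The results agree with the slide to within rounding: 26,000 per hour is about 7 per second, and 1500 per hour is one every 2.4 seconds, which the slide rounds to 2.5. The average throughput comes out near 13Mb/s, so the saturation mentioned at peak reflects bursts well above the hourly average.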
Goals for 2004 (Computing)
Continue to improve network security
– Need better tools for OS patch management.
– Need users to help with their private laptops: use automatic updates (e.g. Windows Update) and update antivirus software regularly.
– Segment the network by levels of trust.
– All the above without adding an enormous management overhead!
Reduce number of OSs
– Remove last NT4 machines and Exchange 5.5.
– Digital Unix and VMS very nearly gone.
– Getting closer to standardising on RH 7.3, especially as the EDG software is now heading that way.
Still finding it very hard to support laptops, but now have a standard clone and recommend IBM laptops.
What version of Linux to run? Currently all 7.3, but what next?
Looking into Single Sign On for PP systems.