2-Dec-2005 1
Offline Report
Matthias Schröder
Topics:
• Scientific Linux
• Fatmen
• Monte Carlo Production
SLC3 & OPAL Environment
• In production since January
  – Lxplus on RH7.3 stopped end of August(?)
• Minimal effort in providing tools and environment on SLC3
  – I just provided what I considered the bare minimum…
    • …and waited for somebody to ask for more
  – Few complaints from users
  – We might still discover missing tools (or variables…)
  – No serious certification of ROPE, OD and partners done
SLC3 & The OPAL PCs
• All OPAL desktops migrated to SLC3
  – Most quattorised at the same time
  – Systems with little memory are VERY slow
    • Use a different desktop/window manager?
    • Have a look at http://linux.web.cern.ch/linux/scientific3/docs/gnome-slow.shtml
    • Ask your supervisor for a reasonable computer!
• Still lots of data living on local disks
  – Hopefully also archived
  – I had to replace many failed disks on OPAL desktops recently
SLC4,5,…
• The next versions will come for sure
  – The only question is when
    • Depends on release dates and LHC schedule
  – SLC4 might be skipped if SLC5 comes early enough (for LHC)
• Can expect significant problems with the completely different Fortran compiler
  – 2 bugs in OPAL code detected already
  – Many problems with SAVEing of variables
    • Might be solved by fixing the compiler
• Hope to be able to keep existing compilers as add-on
  – Certification of OS and compilers has been decoupled
  – LHC experiments still need Fortran
    • If you develop for LHC don’t forget to sneak in a few Fortran routines here and there ;)
• Trouble bootstrapping patchy
  – And the old version does not run on SLC4
  – And GPHIGS????
Fatmen Issues
• IT continues to migrate (re-pack) our tapes
  – If registered in fatmen by tape, data becomes inaccessible
    • Fortunately they keep the data available under special castor names
• I have to
  – rename files to match our naming convention
  – update fatmen entries, use only castor name
• Some MC files have been produced and catalogued several times
  – Have to make sure I convert the correct (best?) entry
  – I destroy all info about ‘duplicates’
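The duplicate handling described above could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the `Entry` record, its field names, the example castor paths and, in particular, the rule "keep the most recently created entry" are all assumptions, since the slides leave open what the "best" entry is.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    generic_name: str  # hypothetical catalog name of the dataset
    location: str      # castor path (made-up example values below)
    created: str       # ISO date string, assumed to be available


def pick_entries(entries):
    """Keep one entry per generic name and return the rest as duplicates.

    Assumption: the most recently created entry is the one to keep.
    """
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.generic_name].append(entry)

    kept, dropped = [], []
    for name in sorted(groups):
        # ISO dates sort correctly as plain strings; newest first.
        ranked = sorted(groups[name], key=lambda e: e.created, reverse=True)
        kept.append(ranked[0])
        dropped.extend(ranked[1:])
    return kept, dropped


entries = [
    Entry("mc/run1000", "/castor/example/mc/run1000.a", "2001-03-01"),
    Entry("mc/run1000", "/castor/example/mc/run1000.b", "2002-07-15"),
    Entry("mc/run1001", "/castor/example/mc/run1001.a", "2001-03-02"),
]

kept, dropped = pick_entries(entries)
for entry in kept:
    print("keep", entry.generic_name, entry.location)
for entry in dropped:
    print("drop", entry.generic_name, entry.location)
```

The function returns the dropped entries as well as the kept ones, so the list of ‘duplicates’ can be reviewed before anything is removed from the catalog.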
Fatmen’s Future
• IT will stop fatmen service in 2006
  – Since NA48 still needs fatmen, the deadline is now end of 2006
• All data access will have to be done via CASTOR
• A file catalog would be useful to ensure completeness
  – But no time to migrate fatmen entries & rewrite ROPE
  – Migrate fatmen data to keep meta data available?
    • Investigating the options we have
• MC production system makes heavy use of fatmen for control!
• Trying to change the existing system to decouple it from fatmen is not realistic
• Abandon MC production system and ‘do it by hand’
  – But still centrally to avoid complete mess with code, geometry, run numbers and data access
  – How to arrange for that?
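As an illustration of what "ensuring completeness" with a file catalog could mean in practice, here is a minimal sketch that compares a list of catalogued paths with a list of paths found in storage. The function name, the report keys and the example paths are hypothetical; how the two lists would actually be obtained from fatmen and CASTOR is not covered here.

```python
def completeness_report(catalog_paths, storage_paths):
    """Compare catalogued paths with paths found in storage.

    Both arguments are plain iterables of path strings (an assumption:
    in practice they would come from a catalog dump and a storage listing).
    """
    catalog = set(catalog_paths)
    storage = set(storage_paths)
    return {
        # Catalogued, but no file found in storage.
        "missing_in_storage": sorted(catalog - storage),
        # File exists in storage, but the catalog does not know about it.
        "uncatalogued": sorted(storage - catalog),
    }


# Made-up example paths, for illustration only.
catalog_paths = [
    "/castor/example/data/run0001",
    "/castor/example/data/run0002",
    "/castor/example/data/run0003",
]
storage_paths = [
    "/castor/example/data/run0001",
    "/castor/example/data/run0003",
    "/castor/example/data/run0004",
]

report = completeness_report(catalog_paths, storage_paths)
for key, paths in report.items():
    print(key, paths)
```

Without a catalog, only the storage listing exists, and there is nothing to compare it against.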
Computer Time Allocation
• The OPAL allocation is a tiny fraction of total capacity
  – And a very small fraction of what we used to get
    • Usually OK for the small amount of computing we do
• If you have to run many jobs, please let me know
  – If urgent I can push them through
• Don’t run LHC, ILC, CLIC or XXX jobs under your OPAL account
  – You would use up OPAL quota
  – Because our allocation is so small, the jobs would wait forever
Accounts
• CERN is getting more active with respect to accounts
  – Accounts get blocked automatically when people leave CERN / are not registered anymore
    • …and deleted soon after
• Did a massive exercise of deleting accounts in September
  – List checked by Pippa and me
  – We have been rather conservative
• Unfortunately also deleted accounts we wanted to keep
  – These have been recreated, also creating some confusion
• Do we have important files on accounts likely to be deleted?
  – Make sure that important files are kept
    • OPAL project space is safer than individual accounts
Near Future
• The computer center will be without UPS cover 5th-16th December
  – No issue as long as there are no power cuts…
• Two interventions planned during week of December 12th
  – Provide redundant power for an important router
  – Will affect AFS, tape access,…
  – LXBATCH queues will be stopped for these interventions