
SouthGrid Technical Meeting

Pete Gronbech: May 2005, Birmingham

Present

• Pete Gronbech – Oxford
• Chris Brew – RAL PPD
• Santanu Das – Cambridge
• Lawrie Lowe – Birmingham
• Yves Coppens – Birmingham

SouthGrid Member Institutions

• Oxford
• RAL PPD
• Cambridge
• Birmingham
• Bristol
• HP-Bristol
• Warwick

Monitoring

• http://www.gridpp.ac.uk/ganglia/
• http://map.gridpp.ac.uk/
• http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/lastreport.cgi – configure the view for UKI
• http://www.physics.ox.ac.uk/users/gronbech/gridmon.htm
• Dave Kant's helpful doc in the minutes of a TB-Support meeting links to http://goc.grid-support.ac.uk/gridsite/accounting/tree/gridpp_view.php

GOC Accounting: GridPP View

Status at RAL PPD

• SL3 cluster on LCG 2.4.0
• CPUs: 11 × 2.4 GHz, 33 × 2.8 GHz
– 100% dedicated to LCG
• 0.7 TB storage
– 100% dedicated to LCG
• Configuring 6.4 TB of SATA RAID disks for use by dCache
• Recent power trip on April 27th

Status at Cambridge

• Currently LCG 2.4.0 on SL3
• CPUs: 40 × 2.8 GHz
– 100% dedicated to LCG
• 2 TB storage
– 100% dedicated to LCG
• Condor batch system
• Recent job submission problems
• Lack of Condor support from the LCG teams

Status at Bristol

• Status
– LCG involvement limited ("black dot") for the previous six months due to lack of manpower
– New resources and posts now on the horizon!
• Existing resources
– 80-CPU BaBar farm to be switched to LCG
– ~2 TB of storage resources to be LCG-accessible
– LCG head nodes installed by the SouthGrid support team with 2.3.0
• New resources
– Funding now confirmed for a large University investment in hardware
– Includes CPU, high-quality and scratch disk resources
• Humans
– New system manager post (RG) being filled
– New SouthGrid support / development post (GridPP / HP) being filled
– HP keen to expand industrial collaboration – suggestions?

Status at Birmingham

• Currently SL3 with LCG-2_4_0
• CPUs: 26 × 2.0 GHz Xeon (+48 soon)
– 100% LCG
• 200 GB storage, 2 TB shortly
– 100% LCG
• Air conditioning problems 21st May (3 days)
• BaBar farm moving to SL3

Status at Oxford

• Currently LCG 2.4.0 on SL304
• Only 48 CPUs running due to power limitations (power trip on 10th May)
• CPUs: 80 × 2.8 GHz
– 100% LCG
• 1.5 TB storage – upgrade to 3 TB planned
– 100% LCG

APEL Accounting

• Get the latest rpm from http://goc.grid-support.ac.uk/gridsite/accounting/
• This should fix the cron job problem, or do it by hand: add RGMA_HOME=/opt/edg (see the sketch below)
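For illustration only, this is roughly what the by-hand fix looks like as a cron file; the publisher script path and schedule are assumptions, not details from the meeting:

    # Hypothetical /etc/cron.d entry for the APEL publisher.
    # Script path and timing below are illustrative placeholders.
    RGMA_HOME=/opt/edg
    30 2 * * * root /opt/edg/sbin/apel-publisher >> /var/log/apel.log 2>&1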

Security

• Best practices link: https://www.gridpp.ac.uk/deployment/security/index.html
• Wiki entry: http://goc.grid.sinica.edu.tw/gocwiki/AdministrationFaq
• iptables? Birmingham to share their setup on the SouthGrid web pages (a generic sketch follows below)
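Birmingham's actual rules were not shown at the meeting; as a rough sketch of the kind of host firewall involved (the service ports are the standard LCG ones, everything else is illustrative):

    # Illustrative iptables rules for an LCG service node; adapt as needed.
    iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    iptables -A INPUT -p tcp --dport 22   -j ACCEPT   # ssh
    iptables -A INPUT -p tcp --dport 2119 -j ACCEPT   # Globus gatekeeper
    iptables -A INPUT -p tcp --dport 2811 -j ACCEPT   # GridFTP control
    iptables -A INPUT -j DROP                         # drop everything else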

Kickstart Packages

• Minimum for a worker node?
• glibc-headers required by ATLAS
• The graphics group pulls that in (other options may also)
• But development-tools may be more logical
• ATLAS requires various glibc headers, which come from glibc-devel, and zlib-devel (a quick check is sketched after the package list below)

Kickstart for Worker Nodes

    %packages
    # @ office
    # @ engineering-and-scientific
    @ editors
    @ base-x
    # graphics is needed as it pulls in glibc-headers
    # @ graphics
    @ development-tools
    @ misc-sl
    @ text-internet
    @ gnome-desktop
    @ yum
    @ openafs-client
    # @ sound-and-video
    @ graphical-internet
    kernel
    kernel-smp
    pine
    grub
    gv
    gnupg
    -xchat
    -redhat-config-samba
    -samba
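To confirm a rebuilt worker node actually picked up the headers ATLAS needs, a quick check (just an rpm query; the package names are the ones discussed above):

    # Verify the development packages called out above are installed.
    rpm -q glibc-headers glibc-devel zlib-devel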

GridICE

• Need to open port 2136
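As an illustration of the rule (the port number is from the slide; the syntax is a generic iptables example, not a site-specific config):

    # Illustrative rule opening the GridICE port named above.
    iptables -A INPUT -p tcp --dport 2136 -j ACCEPT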

Action Plan for Bristol

• Plan to visit on June 9th to install an installation server:
– DHCP server
– NFS copies of SL (local mirror)
– PXE boot setup etc. (see the sketch after this list)
• Second visit to reinstall the head nodes with SL304 and LCG-2_4_0, and some worker nodes
• BaBar cluster to go to Birmingham
– Fergus, Chris, Yves to liaise
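No server details were fixed at the meeting; purely as a sketch of the PXE piece (every address and path below is an invented placeholder):

    # Hypothetical dhcpd.conf fragment for PXE-booting node installs.
    # All addresses are illustrative placeholders.
    subnet 192.168.1.0 netmask 255.255.255.0 {
        range 192.168.1.100 192.168.1.200;  # pool for installing nodes
        next-server 192.168.1.1;            # TFTP server holding pxelinux
        filename "pxelinux.0";              # PXE bootloader to fetch
    }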

Chris Brew

• Gave an overview of the HEPiX meeting, including talks about SL versions

• Gave advice on using Maui fairshare and put config files on the SouthGrid web site (an illustrative fragment follows below)

• Gave advice on customizing Ganglia and put files on the SouthGrid web site
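Chris's actual config files are the ones on the SouthGrid site; the fragment below is only a generic illustration of Maui fairshare settings, with invented group names and targets:

    # Illustrative maui.cfg fairshare fragment; groups and numbers
    # are placeholders, not the files Chris posted.
    FSPOLICY        DEDICATEDPS   # charge usage in dedicated processor-seconds
    FSDEPTH         7             # keep seven fairshare windows
    FSINTERVAL      24:00:00      # each window covers one day
    FSDECAY         0.80          # older windows carry less weight
    GROUPCFG[atlas] FSTARGET=40   # example target share for the atlas group
    GROUPCFG[lhcb]  FSTARGET=20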