GridPP CM, ICL, 16 September 2002
Roger Jones
RWL Jones, Lancaster University
EDG Integration
EDG decision to put short-term focus of effort on making the ATLAS DC1 production work on 1.2 and after
Good input from the EDG side. Effort from ATLAS, especially UK, Italy and CERN (notably Frederic Brochu, but also others: Stan Thompson, RJ, and Alvin Tam joining now)
The submission seems to work 'after a fashion' using multiple UIs and one production site
More problems using multiple production sites, esp. data replication:
- Inability to access Castor using Grid tools was a major problem, now fixed
- Large file sizes were also a problem
- Interfaces to catalogues need work
Encouraging effort, expect full production-quality service for demonstrations in November
Note: analysis requires more work – the code is not already 'out there'
Integration with other ATLAS Grid efforts needed
Evaluation session on Thursday at RHUL Software Week
Exercises
Common ATLAS Environment pre-installed on sites
Exercise 1: simulation jobs
- 50 jobs, 5000 events each, 5 submitters
- One-by-one submission because of RB problems
- Running restricted to CERN
- 1 job: 25h real time, 1 GB output
- Output stored on the CERN SE
Exercise 2 with patched RB and massive submission:
- 250 shorter jobs (several hours, 100 MB output)
- CERN, CNAF, CCIN2P3, Karlsruhe, NIKHEF, RAL
- One single user
- Output back to CERN – SE space is an issue
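Jobs in these exercises went to the testbed through the EDG resource broker, described in JDL. A minimal sketch of what such a job description might look like; all file names and values here are invented for illustration, not taken from the actual DC1 scripts:

```
// hypothetical JDL for one simulation job (all names illustrative)
Executable    = "atlsim.sh";
Arguments     = "dc1.simul 00042";
StdOutput     = "atlsim.out";
StdError      = "atlsim.err";
InputSandbox  = {"atlsim.sh"};
OutputSandbox = {"atlsim.out", "atlsim.err"};
```

Such a file would be handed to the broker with something like `edg-job-submit`; the sandbox mechanism is not suited to gigabyte-scale results, which is why the large outputs above go to an SE instead.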
Also have 'fast simulation in a box' running on the EDG testbed, good exercise for analysis/non-installed tasks
Issues
There is no defined 'system software'
- The system managers cannot dictate the shells, compilers etc. to use
- Multiple users will lead to multiple copies of the same tools unless a system advertising what is installed and where is available (PACMAN does this, for instance)
A user service requires binaries to be 'installed' on the remote sites – trust has to work both ways
Software Away from CERN
Several cases:
- Software copy for developers at remote sites
- Software (binary) installation for Grid productions
- Software (binary) download for Grid jobs
- Software (source) download for developers on the Grid
Initially rely on by-hand packaging and installation
True Grid use requires automation to be scalable
The task decomposes into three requirements:
- Relocatable code and environment
- Packaging of the above
- A deployment tool (something more than human+ftp!)
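To make the three requirements concrete, here is a minimal shell sketch of the by-hand baseline that needs automating: a relocatable tree (the setup script finds its own location rather than hard-coding a path), a tarball as the packaging, and plain tar standing in for the deployment tool. All paths and names are invented; assumes bash for BASH_SOURCE.

```shell
#!/usr/bin/env bash
set -e

# 1) Relocatable code and environment: nothing below hard-codes an install path
mkdir -p build/myrelease/bin
cat > build/myrelease/setup.sh <<'EOF'
# derive the install root from this script's own location
export MYREL_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
export PATH="$MYREL_ROOT/bin:$PATH"
EOF
printf '#!/bin/sh\necho hello-from-release\n' > build/myrelease/bin/hello
chmod +x build/myrelease/bin/hello

# 2) Packaging of the above
tar -C build -czf myrelease.tar.gz myrelease

# 3) "Deployment tool" – here just tar, i.e. the human+ftp level to improve on
mkdir -p deploy
tar -C deploy -xzf myrelease.tar.gz

# the run-time environment comes up wherever the tree was unpacked
. deploy/myrelease/setup.sh
hello
```

The same tarball can be unpacked under any prefix and still work, which is the property the following slides call relocatable.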
World Relocatable Code
Six months ago, ATLAS code was far from deployable
Must be able to work with several cases:
- afs for installation
- No afs available
- afs present but not to be used for the installation because of speed (commonplace!)
Significant improvement in this area, with help from John Couchman, John Kennedy, Mike Gardner and others
Big effort in the reduction of package dependencies – work with US colleagues
However, it takes a long time for this knowledge to become the default – central procedures are being improved (Steve O'Neale)
For the non-afs installation the cvsupd mechanism seems to work generally, but patches are needed for e.g. Motif problems
- Appropriate for developers at institutes, but not a good Grid solution in the long term
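For reference, the cvsupd client is driven by a small 'supfile' listing the server and the collections to mirror. A sketch of what one pulling an ATLAS release collection might look like; the host, paths and collection name are all hypothetical:

```
*default host=cvsupd.example.cern.ch
*default base=/opt/atlas/sup
*default prefix=/opt/atlas
*default release=cvs tag=.
*default delete use-rel-suffix compress
atlas-release
```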
An Aside: Installation Method Evaluation
Previous work from Lancaster, now a big effort from John Kennedy
Cvsupd
- Problems with pserver/kserver; makefiles needed editing
- Download took one night
- Problems with CMT paths on the CERN side?
- Problems fixing afs paths into local paths – the previously developed script does not catch all cases
NorduGrid rpms
- Work 'after a fashion'; does not mirror CERN, much fixing by hand
- Much editing of makefiles to do anything real
An Aside: Installation Method Evaluation (II)
Rsync method
- Works reasonably at Lancaster
- Hard to be selective – easy to have huge downloads taking more than a day in the first instance
- Only hurts the first time – better for updates
Official DC1 rpms
- Circularity in the dependencies causes problems with zsh
- Requires root privilege for installation
Installation Tools
To use the Grid, deployable software must be deployed on the Grid fabrics, and the deployable run-time environment established
Installable code and run-time environment/configuration
- Both ATLAS and LHCb use CMT for software management and environment configuration
- CMT knows the package interdependencies and external dependencies; this makes it the obvious tool to prepare the deployable code and to 'expose' the dependencies to the deployment tool (MG testing)
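The dependencies CMT can 'expose' sit in each package's requirements file as `use` statements. A sketch of how a deployment tool might read the direct dependencies out of one; the requirements content below is invented for illustration:

```shell
set -e
cat > requirements <<'EOF'
package MyAnalysis

use AtlasPolicy    AtlasPolicy-01-*
use GaudiInterface GaudiInterface-01-* External
EOF

# direct dependencies = second field of each "use" line
awk '/^use[ \t]/ { print $2 }' requirements
# prints:
#   AtlasPolicy
#   GaudiInterface
```

CMT itself can walk the transitive graph (`cmt show uses`), which is why it is the natural place to generate per-package deployables from.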
Grid-aware tool to deploy the above
PACMAN is a candidate which seems fairly easy to interface with CMT; see the following talk
CMT and deployable code
Christian Arnault and Charles Loomis have a beta release of CMT that will produce package rpms, which is a large step along the way
- Still need to clean up site dependencies
- Need to make the package dependencies explicit
- rpm requires root to install into the system database (but not for a private installation)
Developer and binary installations being prepared; probably also needs binary+headers+source for a single package
A converter making PACMAN cache files from auto-generated rpms seems to work
The way forward?
An ATLAS group exists for deployment tools (RWLJ convening)
LCG have 'decided' to use SCRAM
- Grounds for the decision seemed narrow, with little thought to implications outside of LCG
- If this is to be followed generally, the strategy should be reconsidered
What about using SlashGrid to create a virtual file system?