ARDA Interim Report
to the LCG SC2
L.A.T. Bauerdick/Fermilab
For the RTAG-11/ARDA group
LAT Bauerdick ARDA Interim Report, SC2 Meeting Sep 12, 2003
ARDA Mandate
RTAG on An Architectural Roadmap towards Distributed Analysis (ARDA) 1)
Observation:
• Different LHC experiments have developed packages (AliEn, Ganga, Dirac, Impala, Boss, Grappa, Magda…) that either sit on top, complement, expand or parallel the functionality of the Grid middleware (VDT, EDG…)
• At this time the LCG is coming to grips with the middleware development requirements
• There is an expectation that an OGSA Services Architecture will be the basis for future development.
• The Experiments need to specify in their TDRs baselines, fallback and development strategies
Motivation:
• To agree on requirements as laid out in a first step by recent work within the GAG and identify commonalities within the current projects which might allow the LCG (both in the AA and GTA areas) to provide a focus of effort.
• To provide guidance to the LCG on future Middleware development directions and interfacing work to match the experiment requirements
• To build on the richness of the current technical solutions to avoid duplication of efforts
• To clearly identify the roles and responsibilities of the components/layers/services in the experiment DA planning
• To give guidance to the community on the expected division of work between the experiments, the LCG and the external projects.
1) Arda was the name given by the Elves to their World and all it contained, see www.glyphweb.com/arda/
Mandate for the ARDA RTAG
• To review the current DA activities and to capture their architectures in a consistent way
• To confront these existing projects to the HEPCAL II use cases and the user's potential work environments in order to explore potential shortcomings.
• To consider the interfaces between Grid, LCG and experiment-specific services
– Review the functionality of experiment-specific packages, state of advancement and role in the experiment.
– Identify similar functionalities in the different packages
– Identify functionalities and components that could be integrated in the generic GRID middleware
• To confront the current projects with critical GRID areas
• To develop a roadmap specifying wherever possible the architecture, the components and potential sources of deliverables to guide the medium term (2 year) work of the LCG and the DA planning in the experiments.
Status notes (slide annotations on the mandate items):
• Long list of projects being looked at, analyzing how their components and services would map to the ARDA services, synthesized to provide description of ARDA components
• GAG discussed an initial internal working draft, GAG to follow up
• Both of these are in progress --- will provide a technical annex that documents these
• This is a main thrust of the ARDA roadmap
• Will be part of the technical annex -- e.g. security, auditing etc
• Main deliverable of ARDA, approach to be described in this talk
Schedule and Makeup of ARDA RTAG
The RTAG shall provide a draft report to the SC2 by September 03.
• It should contain initial guidance to the LCG and the experiments to inform the September LHCC manpower review, in particular on the expected responsibilities of
– The experiment projects
– The LCG (Development and interfacing work rather than coordination work)
– The external projects
The final RTAG report is expected for October 03.
The RTAG shall be composed of
• Two members from each experiment
• Representatives of the LCG GTA and AA
• If not included above, the RTAG shall co-opt or invite representatives from the major Distributed Analysis projects and non-LHC running experiments with DA experience.
• Alice: Fons Rademakers and Predrag Buncic
• Atlas: Roger Jones and Rob Gardner
• CMS: Lothar Bauerdick and Lucia Silvestris
• LHCb: Philippe Charpentier and Andrei Tsaregorodtsev
• LCG GTA: David Foster, stand-in Massimo Lamanna
• LCG AA: Torre Wenaus
• GAG: Federico Carminati
• See what we can do -- want to have initial recommendations for that date
• No written draft report today (too late for reviews anyway)
• Instead verbal interim report, with indication of initial guidance to the LCG and experiments
• The report is clearly not finished, but “blueprint” for a roadmap and its waypoints exists (in the heads of the committee members)
• Still talking to experiments and DA projects
ARDA mode of operation
Thank you for an excellent committee -- large expertise, agility and responsiveness, very constructive and open-minded, and sacrificing quite a bit of the summer
Series of weekly meetings July and August, mini-workshop in September
Invited talks from existing experiments' projects:
• Summary of Caltech GAE workshop (Torre)
• PROOF (Fons)
• AliEn (Predrag)
• DIAL (David Adams)
• GAE and Clarens (Conrad Steenberg)
• Ganga (Pere Mato)
• Dirac (Andrei)
Cross-check w/ other projects of emerging ARDA decomposition of services
• Magda, DIAL -- Torre, Rob
• EDG, NorduGrid -- Andrei, Massimo
• SAM, MCRunjob -- Roger, Lothar
• BOSS, MCRunjob -- Lucia, Lothar
• Clarens, GAE -- Lucia, Lothar
• Ganga -- Rob, Torre
• PROOF -- Fons
• AliEn -- Predrag
• DIRAC -- Andrei
• VOX -- Lothar
Initial Picture Distributed Analysis (Torre, Caltech w/s)
[Figure not recoverable from this extraction]
Hepcal-II Analysis Use Cases
Scenarios based on GAG HEPCAL-II report
• Register as a user
• Make sure resources are available
• Perform queries on the Metadata Catalogue(s) to determine Data Sets
– Select event components
• Perform iterative analysis activity looping over event components
Specific requirements from Hepcal-II
• Job traceability, provenance, logbooks
• Also discussed: support for finer-grain access control and for sharing data within physics groups
e.g. Asynchronous Analysis Mode in AliEn
// connect + authenticate to the GRID Service alien as "user"
TGrid *alien = TGrid::Connect("alien","user","","");
// create a new analysis Object ( <unique ID>, <title>, #subjobs)
TAlienAnalysis* analysis = new TAlienAnalysis("pass001","MyAnalysis",10);
// set the program, which executes the Analysis Macro/Script
analysis->Exec("AliRoot.sh","file:/home/peters/test.C"); // script to execute
analysis->Query("2002-10/V3.08.Rev.04/00110/%galice.root?pt>0.2");
analysis->OutputFileAutoMerge(true); // merge all produced .root files
analysis->Split(); // split the task in subjobs
analysis->Run(); // submit all subjobs to the AliEn queue
analysis->GetResults(); // download partial/final results and merge them
analysis->Info(); // display job information
C++ equivalent (A)
ARDA Roadmap “Informed” By DA Implementations
Following SC2 advice, reviewed major existing DA projects
Clearly AliEn today provides the most complete implementation of distributed analysis services that is “fully functional” -- also interfaces to PROOF
• Implements the major Hepcal-II use cases
• Presents a clean API to experiment applications, Web portals, …
• Should address most requirements for upcoming experiments' physics studies
– Existing and fully functional interface to complete analysis package --- ROOT
– Interface to PROOF cluster-based interactive analysis system
– Interfaces to any other system well defined and certainly feasible
• Based on Web-services, with global (federated) database to give state and persistency to the system
ARDA approach:
• Re-factoring AliEn, using the experience of the other projects, to generalize it in an architecture; consider OGSI as a natural foundation for that
• Confront ARDA services with existing projects (notably EDG, SAM, Dirac, etc)
• Synthesize service definition, defining their contracts and behavior
• Blueprint for initial distributed analysis service infrastructure
ARDA services blueprint gains credibility w/ functional prototypical implementation
ARDA Distributed Analysis Services
Distributed Analysis in a Grid Services based architecture
• ARDA Services should be OGSI compliant -- built upon OGSI middleware
• Frameworks and applications use ARDA API with bindings to C++, Java, Python, PERL, …
– interface through UI/API factory -- authentication, persistent “session”
• Fabric Interface to resources through CE, SE services
– job description language, based on Condor ClassAds and matchmaking
• Database(s) through Dbase Proxy provide statefulness and persistence
We arrived at a decomposition into the following key services
• API and User Interface
• Authentication, Authorization, Accounting and Auditing services
• Workload Management and Data Management services
• File and (event) Metadata Catalogues
• Information service
• Grid and Job Monitoring services
• Storage Element and Computing Element services
• Package Manager and Job Provenance services
AliEn (re-factored)
[Diagram: component view of the AliEn services. Components shown: API, User Interface Factory, User Interface, Authentication, Authorisation, Auditing, Accounting, Registry/Lookup/Config, V.O. directory, DBD/RDBMS, DB Proxy, File Catalogue, Metadata Catalogue, Catalogue Optimiser, Task Queue, Job Broker, Job Optimizer, Job Manager, Process Monitor, Transfer Broker, Transfer Optimizer, Transfer Manager, File Transfer, Storage Element, CE, Gatekeeper, Grid Monitoring, Package Manager, Job Provenance. Numbered interactions: 1. lookup, 2. authenticate, 3. register, 4. bind.]
[Diagram: AliEn component view (API, User Interface Factory, brokers, optimizers, managers, catalogues, DB Proxy; interactions 1. lookup, 2. authenticate, 3. register, 4. bind) labelled with the ARDA services: Information Service, Authentication, Authorisation, Auditing, User Interface, Grid Monitoring, Workload Management, Data Management, Storage Element, Computing Element, Job Monitor, Job Provenance, Metadata Catalogue, File Catalogue, Package Manager.]
ARDA Key Services for Distributed Analysis
[Diagram: interaction diagram with steps numbered 1 to 15 between the services: API, User Interface, Information Service, Authentication, Authorisation, Auditing, Accounting, Grid Monitoring, Workload Management, Data Management, Metadata Catalogue, File Catalogue, Computing Element, Storage Element, Job Monitor, Job Provenance, Package Manager, DB Proxy.]
API and User Interface
[Diagram: elements shown are the API (OGSI User Interface Factory) with Authentication, Data Management, Grid Service Management, Job Control, Metadata Management, NewInterface and Posix I/O; SOAP and Grid File Access (from API); Experiment Frameworks (POOL/ROOT/...); Storage Element (POSIX I/O service); Portals; Grid Shells; Grid File System.]
API and User Interface
ARDA services present an API, called by applications like the experiments' frameworks, interactive analysis packages, Grid portals, Grid shells, etc.
• The API allows a wide variety of different applications to be implemented. One example is a command line interface similar to a UNIX file system. Similar functionality can be provided by graphical user interfaces.
• Using these interfaces, it will be possible to access the catalogue, submit jobs and retrieve the output. Web portals can be provided as an alternative user interface, where one can check the status of the current and past jobs, submit new jobs and interact with them.
• Web portals should also offer additional functionality to ‘power’ users: Grid administrators can check the status of all services, monitor, start and stop them, while VO administrators (production users) can submit and manipulate bulk jobs.
The user interface can use the Condor ClassAds as a Job Description Language
• This will maintain compatibility with existing job execution services, in particular LCG-1.
• The JDL defines the executable, its arguments and the software packages or data and the resources that are required by the job
• The Workload Management service can modify the job’s JDL entry by adding or elaborating requirements based on the detailed information it can get from the system like the exact location of the dataset and replicas, client and service capabilities.
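The last point, the Workload Management service elaborating a job's JDL, can be sketched as follows. This is only an illustration under stated assumptions: the job description is reduced to a flat attribute map rather than real ClassAd expressions, and the names `JobDescription`, `elaborateRequirements` and the attribute `other.CloseSE` are hypothetical, not taken from AliEn, Condor or LCG-1.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a ClassAd-style job description:
// a flat set of attribute/value strings (real ClassAds are typed expressions).
using JobDescription = std::map<std::string, std::string>;

// Hypothetical replica lookup result: Storage Elements holding the input data.
using ReplicaSites = std::vector<std::string>;

// Sketch of the Workload Management step: elaborate the job's requirements
// with information the user did not supply, here the sites that hold a
// replica of the input dataset. "other.CloseSE" is an assumed attribute name.
JobDescription elaborateRequirements(JobDescription jdl, const ReplicaSites& sites) {
    // The JDL must at least define the executable.
    assert(jdl.count("Executable") == 1);

    std::string clause;
    for (const auto& site : sites) {
        if (!clause.empty()) clause += " || ";
        clause += "other.CloseSE == \"" + site + "\"";
    }
    if (clause.empty()) return jdl;  // nothing known: leave the JDL untouched

    auto it = jdl.find("Requirements");
    if (it == jdl.end() || it->second.empty()) {
        jdl["Requirements"] = "(" + clause + ")";
    } else {
        // Keep what the user asked for and add the data-location constraint.
        it->second = "(" + it->second + ") && (" + clause + ")";
    }
    return jdl;
}
```

The user's own requirements are preserved and only narrowed, which matches the "adding or elaborating" wording of the slide; a real service would operate on parsed ClassAd expressions instead of strings.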
File Catalogue and Data Management
Input and output associated with any job can be registered in the File Catalogue, a virtual file system in which a logical name is assigned to a file.
Unlike real file systems, the File Catalogue does not own the files; it only keeps an association between the Logical File Name (LFN) and (possibly more than one) Physical File Names (PFN) on a real file or mass storage system. PFNs describe the physical location of the files and include the name of the Storage Element and the path to the local file.
The system should support file replication and caching and will use file location information when it comes to scheduling jobs for execution.
The directories and files in the File Catalogue have privileges for owner, group and the world. This means that every user can have exclusive read and write privileges for his portion of the logical file namespace (home directory).
Etc. pp.
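A minimal sketch of the LFN-to-PFN association described on this slide, assuming a purely in-memory catalogue. The class and member names (`FileCatalogue`, `addReplica`, `hasReplicaAt`) are hypothetical and only illustrate the data model; they are not the AliEn or ARDA API, and privileges are left out.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// A physical replica: the Storage Element name plus the path local to it.
struct PhysicalFileName {
    std::string storageElement;
    std::string localPath;
};

// Hypothetical in-memory File Catalogue: it does not own any files, it only
// records which physical replicas belong to each Logical File Name (LFN).
class FileCatalogue {
public:
    // Register one more replica for an LFN (creates the entry if needed).
    void addReplica(const std::string& lfn, const PhysicalFileName& pfn) {
        assert(!lfn.empty());
        replicas_[lfn].push_back(pfn);
    }

    // All known replicas of an LFN; empty if the LFN is not registered.
    std::vector<PhysicalFileName> lookup(const std::string& lfn) const {
        auto it = replicas_.find(lfn);
        if (it == replicas_.end()) return {};
        return it->second;
    }

    // True if some replica of the LFN sits on the given Storage Element;
    // a scheduler could use this to prefer sites close to the data.
    bool hasReplicaAt(const std::string& lfn, const std::string& se) const {
        for (const auto& pfn : lookup(lfn)) {
            if (pfn.storageElement == se) return true;
        }
        return false;
    }

private:
    std::map<std::string, std::vector<PhysicalFileName>> replicas_;
};
```

Keeping several PFNs per LFN is what makes replication visible to job scheduling: the same logical name can be resolved to whichever Storage Element is closest to the chosen Computing Element.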
Job Provenance service
The File Catalogue is not meant to support only regular files: it is extended to include information about running processes in the system (in analogy with the /proc directory on Linux systems) and to support virtual data services.
Each job sent for execution gets a unique id and a corresponding /proc/id directory where it can register temporary files, standard input and output as well as all job products. In a typical production scenario, only after a separate process has verified the output, the job products will be renamed and registered in their final destination in the File Catalogue. The entries (LFNs) in the File Catalogue have an immutable unique file id attribute that is required to support long references (for instance in ROOT) and symbolic links.
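The /proc/id staging and the rename-after-verification step could look roughly like the sketch below. It assumes a single in-memory namespace that maps LFNs to file ids; `ProvenanceCatalogue` and its methods are hypothetical names chosen for illustration, and the verification itself is left to the caller.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical catalogue namespace: LFN -> immutable unique file id.
class ProvenanceCatalogue {
public:
    // Allocate a unique job id; its working area is "/proc/<id>".
    unsigned long newJob() { return ++lastJobId_; }

    static std::string procDir(unsigned long jobId) {
        return "/proc/" + std::to_string(jobId);
    }

    // Register a job product under the job's /proc directory.
    // Returns the file id, which never changes afterwards.
    unsigned long registerProduct(unsigned long jobId, const std::string& name) {
        assert(jobId != 0 && jobId <= lastJobId_);  // job must have been created here
        const unsigned long fileId = ++lastFileId_;
        entries_[procDir(jobId) + "/" + name] = fileId;
        return fileId;
    }

    // After a separate verification step has accepted the output, move the
    // entry to its final LFN. Only the name changes; the file id is kept.
    bool publish(unsigned long jobId, const std::string& name, const std::string& finalLfn) {
        auto it = entries_.find(procDir(jobId) + "/" + name);
        if (it == entries_.end() || entries_.count(finalLfn) != 0) return false;
        const unsigned long fileId = it->second;
        entries_.erase(it);
        entries_[finalLfn] = fileId;
        return true;
    }

    // File id for an LFN, or 0 if the LFN is unknown.
    unsigned long fileId(const std::string& lfn) const {
        auto it = entries_.find(lfn);
        return it == entries_.end() ? 0UL : it->second;
    }

private:
    unsigned long lastJobId_ = 0;
    unsigned long lastFileId_ = 0;
    std::map<std::string, unsigned long> entries_;
};
```

Because the file id survives the rename, a reference that was taken while the product still lived under /proc/id stays valid after publication, which is the property the slide asks for.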
Package Manager Service
Allows dynamic installation of application software released by the VO (e.g. the experiment or a physics group).
Each VO can provide the Packages and Commands that can be subsequently executed. Once the corresponding files with bundled executables and libraries are published in the File Catalogue and registered, the Package Manager will install them automatically as soon as a job becomes eligible to run on a site whose policy accepts these jobs.
While installing the package in a shared package repository, the Package Manager will resolve the dependencies on other packages and, taking into account package versions, install them as well. This means that old versions of packages can be safely removed from the shared repository and, if these are needed again at some point later, they will be re-installed automatically by the system. This provides a convenient and automated way to distribute the experiment specific software across the Grid and assures accountability in the long term.
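The dependency resolution described above can be illustrated with a depth-first walk over package metadata. This is a sketch under assumptions: packages are identified by hypothetical "name/version" strings, the metadata is an in-memory map, and the dependency graph is taken to be acyclic; the slide does not say how the actual Package Manager resolves dependencies.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical package metadata: "name/version" -> packages it depends on.
using PackageIndex = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that appends each package after its dependencies.
// `done` holds packages already installed or already scheduled, so nothing
// is installed twice. Assumes the dependency graph has no cycles.
void resolve(const PackageIndex& index, const std::string& pkg,
             std::set<std::string>& done, std::vector<std::string>& order) {
    assert(!pkg.empty());
    if (!done.insert(pkg).second) return;  // already installed or scheduled
    auto it = index.find(pkg);
    if (it != index.end()) {
        for (const auto& dep : it->second) resolve(index, dep, done, order);
    }
    order.push_back(pkg);
}

// Packages to install, in order, so that `pkg` can run on a site where
// `installed` is already present in the shared repository.
std::vector<std::string> installPlan(const PackageIndex& index, const std::string& pkg,
                                     std::set<std::string> installed) {
    std::vector<std::string> order;
    resolve(index, pkg, installed, order);
    return order;
}
```

Since the plan is recomputed from whatever is currently installed, removing an old version from the shared repository is harmless: the next job that needs it simply gets it back in its install plan.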
Computing Element
Computing Element is a service representing a computing resource. Its interface should allow submission of a job to be executed on the underlying computing facility, access to the job status information as well as high level job manipulation commands. The interface should also provide access to the dynamic status of the computing resource like its available capacity, load, number of waiting and running jobs.
This service should be available on a per VO basis.
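To make the interface requirements concrete, here is a toy Computing Element with a fixed number of job slots. It is a sketch only: `ComputingElement`, `JobStatus`, `ResourceStatus` and the slot model are assumptions made for illustration, and a real service would forward jobs to a local batch system and would be instantiated per VO.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

enum class JobStatus { Unknown, Waiting, Running, Done };

// Dynamic state a Computing Element is expected to expose.
struct ResourceStatus {
    std::size_t freeSlots;
    std::size_t runningJobs;
    std::size_t waitingJobs;
};

// Hypothetical toy Computing Element with a fixed number of job slots.
class ComputingElement {
public:
    explicit ComputingElement(std::size_t slots) : slots_(slots) {
        assert(slots > 0);
    }

    // Submit a job description; returns a job id local to this CE.
    unsigned long submit(const std::string& jdl) {
        (void)jdl;  // a real CE would hand this to the local batch system
        const JobStatus initial =
            count(JobStatus::Running) < slots_ ? JobStatus::Running : JobStatus::Waiting;
        const unsigned long id = ++lastId_;
        jobs_[id] = initial;
        return id;
    }

    // Job status information; Unknown for ids this CE never issued.
    JobStatus status(unsigned long id) const {
        auto it = jobs_.find(id);
        return it == jobs_.end() ? JobStatus::Unknown : it->second;
    }

    // High level manipulation: mark a running job as done and start the
    // oldest waiting job, if any, in the slot that became free.
    bool finish(unsigned long id) {
        auto it = jobs_.find(id);
        if (it == jobs_.end() || it->second != JobStatus::Running) return false;
        it->second = JobStatus::Done;
        for (auto& job : jobs_) {
            if (job.second == JobStatus::Waiting) {
                job.second = JobStatus::Running;
                break;
            }
        }
        return true;
    }

    ResourceStatus resourceStatus() const {
        const std::size_t running = count(JobStatus::Running);
        return {slots_ - running, running, count(JobStatus::Waiting)};
    }

private:
    std::size_t count(JobStatus s) const {
        std::size_t n = 0;
        for (const auto& job : jobs_) {
            if (job.second == s) ++n;
        }
        return n;
    }

    std::size_t slots_;
    unsigned long lastId_ = 0;
    std::map<unsigned long, JobStatus> jobs_;
};
```

A broker could poll `resourceStatus()` on several such elements and send the next job to the one with free slots, which is the kind of decision the dynamic status information is meant to support.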
Etc. pp
General ARDA Roadmap
Emerging picture of “waypoints” on the ARDA “roadmap”
ARDA RTAG report
• review of existing projects, component decomposition & re-factoring, capturing of common architectures, synthesis of existing approaches
• recommendations for a prototypical architecture and definition of prototypical functionality and a development strategy
Development of a prototype and first release
• Re-factoring AliEn web services, studying the ARDA architecture in an OGSI context, based on existing implementation
• POOL and other LCG components (VO, CE, SE, …) interface to ARDA
• Adaptation of specific ARDA services to experiments' requirements
– E.g. File catalogs, package manager, metadata handling for different data models
• Integration with and deployment on LCG-1 resources and services
Re-engineering of prototypical ARDA services, as required
• Evolving services scaling up and adding functionality, robustness, resilience, etc
Talking Points
System deals with files, not objects
• however, Object Location service can be added if required
• Investigate experiments' metadata/file catalog interaction
Interface to LCG-1 infrastructure
• VDT/EDG interface through CE, SE and the use of JDL
• ARDA VO services should take into account emerging VO management infrastructure
VO system and site security
• Jobs are executed on behalf of VO, however users fully traceable
• How do policies get implemented, e.g. analysis priorities, MoU contributions etc
• Auditing and accounting system, priorities through special “optimizers”
• accounting of site “contributions”, that depend on what resources sites “expose”
Prototype could be based on global database
• Address latency, stability and scalability issues up-front; good experience exists
ARDA Prototype provides a Physics Analysis environment for experiment framework based and ROOT based analysis of distributed experiments' data
• Interfacing to other analysis packages, event displays, etc. can be implemented easily
Major Role for Middleware Engineering
ARDA roadmap based on a well-factored prototype implementation that allows evolutionary development into a complete system that evolves to the full LHC scale
David Foster: “let's recognize that very little work has so far been done on the underlying mechanisms needed to provide the appropriate foundations (message passing structures, fault recovery procedures, component instrumentation etc)”
ARDA prototype would be pretty lightweight
• Stability through basing on global database to which services talk through a database proxy
• “people know how to do large databases” -- well founded principle (see e.g. SAM for RunII), with many possible migration paths
• HEP-specific services, however based on generic OGSI-compliant services
Expect LCG/EGEE middleware effort to play major role to evolve this foundation, concepts and implementation
• re-casting the (HEP-specific event-data analysis oriented) services into more general services, from which the ARDA services would be derived
• addressing major issues like a solid OGSI foundation, robustness, resilience, fault recovery, operation and debugging
Conclusions
ARDA is identifying a services oriented architecture and an initial decomposition of services required for distributed analysis
Recognize a central role for a Grid API which provides a factory of user interfaces for experiment frameworks, applications, portals, etc
ARDA Prototype would provide a distributed physics analysis environment of distributed experimental data
• for experiment framework based analysis
– Cobra, Athena, Gaudi, AliRoot
• for ROOT based analysis
• interfacing to other analysis packages like JAS; event displays like Iguana; grid portals; etc. can be implemented easily