GridPP9 – 5 February 2004 – Data Management
DataGrid is a project funded by the European Union
GridPP is funded by PPARC
GridPP2: Data and Storage Management
Gavin McCance - University of Glasgow
Jens Jensen - RAL
GridPP9, NeSC, Edinburgh
GridPP2 Middleware
Data and Storage Management
Work areas
UK metadata management group
Storage management
Metadata Management
The focus is upon Grid-enabling metadata services for the experiments
Building upon our previous work in this area
Building upon experiments’ existing work in this area
Formation of a UK metadata group with GridPP2
1 generic Grid metadata post @ Glasgow
~1 post per experiment
ATLAS @ Glasgow, LHCb @ Oxford, CMS @ Bristol/IC
US expts, others??
These posts were described yesterday – the UK metadata group should form part of their work
Input from the UK data management support teams
GridPP2 Metadata Group
Purpose will be to:
Take overall responsibility for common experiment metadata technologies in order to Grid-enable the experiments’ metadata
Identify the commonalities and experience across experiments and make sure these are recognized
i.e. technologies, schema: data product navigational problem
Come to agreement and feed this back into the wider ARDA process
Work directly with interested groups forming the ARDA:
EGEE JRA1 Data Management Group (@CERN)
LCG Deployment Teams (@CERN)
LCG Experiments
IT Database group (@CERN)
Metadata Responsibilities
Generic metadata post:
Concentration on the technologies used to create scalable, manageable and fault-tolerant metadata services
The underlying Grid software stack
Emphasis upon the service, not just the product
24/7 supportable production services
Not prescribing things like the schema, or saying the ‘API must look like Spitfire’: prototype interfaces should be based upon experiments’ existing metadata interfaces
Will track, develop and adopt as necessary Grid metadata access standards
Feed into standards to make sure we’re in a position to benefit from the future production products that implement these standards
Feed PPE use-case and experience back into the wider world
Metadata Responsibilities
Experiment metadata posts (~1 per experiment):
Document existing implementations from the experiments and make sure all the experiments’ use-cases are satisfied by the products and the technologies being proposed by the group
Work within the group to ensure that commonalities and experience across experiments are recognized and effort is not wasted
At the technology level – e.g. using the same underlying Grid software stack
At the interface level – e.g. GANGA
Possibly at the schema level…
Feed this understanding and agreement back into the wider ARDA process and back into their own experiments
ARDA terminology:
Dataset metadata: ARDA Metadata service
Data product navigation: ARDA Job Provenance service
Storage Management
Two areas of work (based at RAL)
SRM interface to UK storage sites
Site local data management
SRM interface to UK Storage
Initial deliverable will be to provide an SRM (Storage Resource Manager) v1 interface to the Atlas DataStore at RAL
Subsequent migration to the more advanced features offered by e.g. SRM v2
Perform an analysis of the UK Tier-2 storage sites and how these can be exposed via the common SRM interface
Implementation of SRM interfaces to these storage systems
Deployment on all the Tier-2 sites and support
Contribution to the SRM standardisation process
Work closely with the EGEE JRA1 and LCG deployment groups
Work with support staff for Tier-1 and Tier-2
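One way to picture the analysis step above, exposing differently implemented site storage behind a single interface, is an adapter layer. The sketch below is illustrative only: `StorageBackend`, `stage_in`, `DiskPool`, `TapeStore` and `open_for_job` are hypothetical names chosen for this example and are not the SRM v1 or v2 operations.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical common interface; NOT the real SRM operation set."""

    @abstractmethod
    def stage_in(self, logical_name: str) -> str:
        """Make a file readable and return a site-local path for it."""


class DiskPool(StorageBackend):
    """Stand-in for a disk server: files are always online."""

    def __init__(self, root: str) -> None:
        self.root = root

    def stage_in(self, logical_name: str) -> str:
        return f"{self.root}/{logical_name}"


class TapeStore(StorageBackend):
    """Stand-in for a tape-backed store: files go to a disk buffer first."""

    def __init__(self, buffer_dir: str) -> None:
        self.buffer_dir = buffer_dir
        self.recalled = set()  # names we pretended to recall from tape

    def stage_in(self, logical_name: str) -> str:
        self.recalled.add(logical_name)  # placeholder for a real tape recall
        return f"{self.buffer_dir}/{logical_name}"


def open_for_job(backend: StorageBackend, logical_name: str) -> str:
    """Client code depends only on the common interface."""
    return backend.stage_in(logical_name)
```

The point of the design is that `open_for_job` never needs to know which kind of storage a site runs; only the adapter classes differ per site.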
Site-local Data Management
Management of data and files within a site
How you access the grid storage from the worker nodes
Cleanup of volatile data resources that a job no longer needs (Tier2) – cache management
Evaluation of existing technologies: dCache, SAM, EDG Zambo prototype, Condor, …
Development and deployment of these local data management solutions (@ Tier-2)
Interaction with Tier-2 site managers is vital
Feed back solutions into LCG / EGEE
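The cache-management item above (cleaning up volatile data that a job no longer needs) can be sketched as a small toy model. This is a minimal illustration under stated assumptions, not how dCache, SAM or any other listed product works: the `VolatileCache` class, its pinning scheme and the least-recently-used eviction order are all choices made for this example.

```python
from collections import OrderedDict


class VolatileCache:
    """Toy site cache: files pinned by running jobs are never evicted."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.files = OrderedDict()  # name -> size, least recently used first
        self.pins = {}              # name -> set of job ids using the file

    def used_bytes(self) -> int:
        return sum(self.files.values())

    def add(self, name: str, size: int, job_id: str) -> None:
        """Register a file for a job, then evict if over capacity."""
        self.files[name] = size
        self.files.move_to_end(name)  # mark as most recently used
        self.pins.setdefault(name, set()).add(job_id)
        self.cleanup()

    def release(self, job_id: str) -> None:
        """Called when a job finishes: drop its pins, then clean up."""
        for holders in self.pins.values():
            holders.discard(job_id)
        self.cleanup()

    def cleanup(self) -> list:
        """Evict unpinned files, oldest first, until within capacity."""
        evicted = []
        for name in list(self.files):
            if self.used_bytes() <= self.capacity_bytes:
                break
            if not self.pins.get(name):  # no job still needs this file
                del self.files[name]
                self.pins.pop(name, None)
                evicted.append(name)
        return evicted
```

In this model a file stays in place while any job holds a pin on it, even if the cache is over capacity; space is reclaimed only once the last job using the file has called `release`.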
GridPP2 Support
Data and Storage Management
Data Management Support
UK data management support posts
Aim: to provide first-level support for all DM software
first stop for UK system administrators
Work directly with the development and deployment teams (GridPP2, EGEE and LCG)
Provide hands-on deployment help for data challenge support
Develop how-to portal to collect deployment experience
Feed back sys-admin issues and experience to developers
Site policies, quotas, firewalls – survey sysadmins
Develop site validation tools
Responsible for developing the overall support plan for the data management services beyond GridPP2
Need to fit all this in with the rest of the UK Support Plan