Status and Prospective of EU Data Grid Project

Page 1: Status and Prospective of EU Data Grid Project

DataGrid is a project funded by the European Commission under contract IST-2000-25182

Status and Prospective of EU Data Grid Project

Alessandra Fanfani (University of Bologna)

On behalf of EU DataGrid project

Outline: EU DataGrid project; HEP application experience; Future perspective

http://www.eu-datagrid.org

Page 2: Status and Prospective of EU Data Grid Project

The European DataGrid Project - n° 2 - The 2nd Workshop on HEP GRID – Daegu 22 August 2003

The EU DataGrid Project

9.8 M Euros EU funding over 3 years

90% for middleware and applications (HEP, Earth Observation, Biomedical)

3 year phased developments & demos

Total of 21 partners: research and academic institutes as well as industrial companies

Extensions (time and funds) on the basis of first successful results:

DataTAG (2002-2003) www.datatag.org

CrossGrid (2002-2004) www.crossgrid.org

GridStart (2002-2004) www.gridstart.org

Project started in Jan. 2001

Testbed 0 (early 2001): international test bed 0 infrastructure deployed; Globus 1 only, no EDG middleware

Testbed 1 (early 2002): first release of EU DataGrid software to defined users within the project

Testbed 2 (end 2002): builds on Testbed 1 to extend facilities of DataGrid; focus on stability

Passed 2nd annual EU review, Feb. 2003

Testbed 3 (2003): advanced functionality & scalability; currently being deployed

Project ends in Dec. 2003

Page 3: Status and Prospective of EU Data Grid Project

Related Grid Projects

Through links with sister projects, there is the potential for a truly global scientific applications grid

The main components of the EDG 2.0 release form the basis of the LCG (LHC Computing Grid) middleware: www.cern.ch/lcg

Page 4: Status and Prospective of EU Data Grid Project

EDG Middleware Architecture

[Figure: layered diagram of the EDG middleware (M/W), from applications down to the fabric. Box labels by layer:]

Local Computing: Local Application, Local Database

Grid Application Layer: Data Management, Job Management, Metadata Management

Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler

Underlying Grid Services: Computing Element Services, Authorization Authentication and Accounting, Replica Catalog, Storage Element Services, SQL Database Services

Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management

Other labels: Service Index; Grid; Fabric; APPLICATIONS; GLOBUS, CondorG (via VDT)

Page 5: Status and Prospective of EU Data Grid Project

Workload Management System

The user interacts with the Grid via a Workload Management System (WMS)

The goal of the WMS is distributed scheduling and resource management in a Grid environment

The Resource Broker tries to match user requirements with available resources:

• Software installed at potential sites

• Ensure data locality

• Efficient usage of resources
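
The matchmaking idea on this slide can be illustrated with a minimal Python sketch. It is not the EDG Resource Broker: the `Site` and `Job` classes, their fields and the ranking rule (prefer sites that already hold the input data, then sites with more free CPUs) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical description of a Computing Element and its nearby storage."""
    name: str
    free_cpus: int
    software: set = field(default_factory=set)
    local_files: set = field(default_factory=set)


@dataclass
class Job:
    """Hypothetical job request: required software tags and input files."""
    required_software: set
    input_files: set


def match(job, sites):
    """Return candidate sites, best first.

    A site is a candidate if it has a free CPU and all required software.
    Candidates are ranked by the number of input files already local
    (data locality), then by free CPUs (efficient usage of resources).
    """
    candidates = [
        s for s in sites
        if s.free_cpus > 0 and job.required_software <= s.software
    ]
    return sorted(
        candidates,
        key=lambda s: (len(job.input_files & s.local_files), s.free_cpus),
        reverse=True,
    )


# Illustrative sites only; names and numbers are invented.
sites = [
    Site("site-a", free_cpus=10, software={"cmsim"}),
    Site("site-b", free_cpus=2, software={"cmsim"}, local_files={"lfn:run1"}),
    Site("site-c", free_cpus=50, local_files={"lfn:run1"}),  # software missing
    Site("site-d", free_cpus=0, software={"cmsim"}, local_files={"lfn:run1"}),
]
job = Job(required_software={"cmsim"}, input_files={"lfn:run1"})
ranked = [s.name for s in match(job, sites)]
print(ranked)
```

In this toy ranking the site holding the input file wins even though it has fewer free CPUs; a real broker would weigh such criteria according to the job's own requirements and rank expressions.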

Page 6: Status and Prospective of EU Data Grid Project

Data Management

High level data management on the Grid:

• Location of data

• Replication of data

• Efficient access to data

Provide a basic, consistent interface to disk and mass storage systems (hides the Storage Resource Manager)
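
As a rough illustration of locating and choosing among replicas, the sketch below keeps a mapping from a logical file name to its physical copies and picks the copy with the lowest access cost. The `ReplicaCatalog` class, the `best_replica` helper, the host names and the cost figures are all hypothetical; none of this is the EDG Replica Manager API.

```python
class ReplicaCatalog:
    """Toy catalogue mapping a logical file name (LFN) to physical replicas."""

    def __init__(self):
        self._replicas = {}

    def register(self, lfn, storage_element, path):
        """Record that a copy of `lfn` exists at `path` on `storage_element`."""
        self._replicas.setdefault(lfn, []).append((storage_element, path))

    def locate(self, lfn):
        """Return all known (storage_element, path) replicas of `lfn`."""
        return list(self._replicas.get(lfn, []))


def best_replica(catalog, lfn, cost):
    """Pick the replica on the storage element with the lowest access cost.

    `cost` maps storage element name to an assumed access cost;
    storage elements without an entry are treated as infinitely expensive.
    """
    replicas = catalog.locate(lfn)
    if not replicas:
        raise LookupError(f"no replica registered for {lfn}")
    return min(replicas, key=lambda r: cost.get(r[0], float("inf")))


catalog = ReplicaCatalog()
catalog.register("lfn:events.fz", "se.site-a.example", "/data/events.fz")
catalog.register("lfn:events.fz", "se.site-b.example", "/store/events.fz")

# Invented access costs as seen from the job's execution site.
cost = {"se.site-a.example": 5.0, "se.site-b.example": 1.0}
chosen = best_replica(catalog, "lfn:events.fz", cost)
print(chosen)
```

The point of the indirection is that a job only names the logical file; which physical copy is read can change as replicas are added or removed.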

Page 7: Status and Prospective of EU Data Grid Project

Information & Monitoring

R-GMA Relational implementation of GMA from GGF

Makes use of GLUE schema (inter-operability with US grids)

Interoperable with MDS

Deals with information on:

• The Grid itself: resources and services

• Job status information

• Grid applications
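
The relational view of monitoring data can be sketched with Python's built-in `sqlite3` module standing in for the information system: a producer publishes rows into a table and a consumer retrieves them with an SQL query. This is only an analogy for the relational model; it does not use the R-GMA API, and the `ce_status` table and its columns are invented rather than taken from the GLUE schema.

```python
import sqlite3

# In-memory database as a stand-in for the distributed information system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ce_status (ce_name TEXT, free_cpus INTEGER, running_jobs INTEGER)"
)


def publish(ce_name, free_cpus, running_jobs):
    """Producer side: publish one row of status information."""
    conn.execute(
        "INSERT INTO ce_status VALUES (?, ?, ?)",
        (ce_name, free_cpus, running_jobs),
    )


# Invented status reports from three computing elements.
publish("ce.site-a.example", 12, 30)
publish("ce.site-b.example", 0, 16)
publish("ce.site-c.example", 4, 8)

# Consumer side: ask for computing elements that still have free CPUs.
rows = conn.execute(
    "SELECT ce_name, free_cpus FROM ce_status "
    "WHERE free_cpus > 0 ORDER BY free_cpus DESC"
).fetchall()
print(rows)
```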

Page 8: Status and Prospective of EU Data Grid Project

Grid aspects covered by EDG

VOMS (VO Membership Service): provides certificate with VOs, groups and roles

R-GMA (Information & Monitoring): provides info on resource utilization & performance

User Interface: submit & monitor jobs, retrieve output

Grid Fabric Management: configures, installs & maintains grid software packages and environment

Workload Management System: manages submission of jobs to the Resource Broker, obtains information and retrieves output

Network performance: provides efficient network transport, bandwidth monitoring

Computing Element: gatekeeper to a grid computing resource

Testbed administration: certificate authorities, user registration, usage policy, etc.

Storage Resource Manager: grid-aware storage area

Applications: HEP, EO, Biology

Replica Manager: replicates and locates data

Page 9: Status and Prospective of EU Data Grid Project

Detailed Interplay of EDG Components

Page 10: Status and Prospective of EU Data Grid Project

DataGrid in Numbers

People

>350 registered users

12 Virtual Organisations

16 Certificate Authorities

>300 people trained

278 man-years of effort

100 years funded

Scientific applications

5 Earth Obs institutes

9 bio-informatics apps

6 HEP experiments

Software

50 use cases

18 software releases

Current release 1.4

Release 2.0 being tested

>300K lines of code

Testbeds

>15 regular sites

40 sites using EDG sw (e.g. Taiwan, Korea)

>10,000s of jobs submitted

>1000 CPUs

>15 TeraBytes disk

3 Mass Storage Systems

Page 11: Status and Prospective of EU Data Grid Project

DataGrid Scientific Applications

Earth Observation

• About 100 Gbytes of data per day (ERS 1/2)

• 500 Gbytes for the ENVISAT mission

Bio-informatics

• Data mining on genomic databases (exponential growth)

• Indexing of medical databases (Tb/hospital/year)

Particle Physics

• Simulate and reconstruct complex physics phenomena millions of times

• LHC experiments will generate 6-8 PetaBytes/year

Developing grid middleware to enable large-scale usage by scientific applications

Development on the computing side, but also focus on the real use by the applications!

Page 12: Status and Prospective of EU Data Grid Project

Application Usage of Release 1.4

Positive Signs:

Large increase in users.

Many sites interested in joining.

Pushing real jobs through system.

EDG 1.4 evaluated for review in Feb. 2003

[Figure: bar charts of Release 1.4 usage by virtual organisation (ALICE, ATLAS, BaBar, Bio., CMS, E.O., LHCb, ITeam, Tutor, WP6), covering CEs and SEs. Panels: CPU usage (CPU hours and jobs, logarithmic scale from 1 to 10000), HEP simulation (number of events), and disk usage at CERN (logarithmic scale from 1 MB to 1 TB; total >1.5 TB).]

Successful 2nd annual EU review: funding agencies were pleased with the real use by the applications

Page 13: Status and Prospective of EU Data Grid Project

HEP Applications

Intense usage of application testbed in 2002 and early 2003, in particular by HEP experiments:

ATLAS, CMS, ALICE, LHCb, BaBar, D0 activities within DataGrid documented in detail in deliverable D8.3: https://edms.cern.ch/document/375586/1.2

ATLAS and CMS task forces very active and successful:

• Several hundred ATLAS simulation jobs of length 4-24 hours were executed & data was replicated using grid tools

• CMS generated ~250K events for physics studies with ~10,000 jobs in a 3 week period

Since project review: ALICE and LHCb have been generating physics events

BaBar and D0 performed more basic tests with analysis and Monte-Carlo production jobs

Page 14: Status and Prospective of EU Data Grid Project

Joint evaluation from ATLAS/CMS work on Release 1.4

Results were obtained from focused task-forces of Experiments and EDG people

Good interaction with EDG middleware providers

Fast turnaround in bug fixing and installing new software

Tests were labour intensive since software was developing and the overall system was fragile

There are essential developments needed in:

• Data Management (robustness and functionality)

• Information Systems (robustness and scalability)

• Workload Management (scalability for high rates, batch submissions, stability)

• Mass Storage Support (gridified support due in EDG 2.0)

Release 2.0 should fix the major problems

Page 15: Status and Prospective of EU Data Grid Project

Release 2.0

Major new developments in all middleware areas

Addressing the key shortcomings identified:

• WMS stability and scalability → WMS re-factored

• Replica catalog stability and scalability → Replica Location Service

• Data management usability → DM re-factored

• Information system stability and scalability → R-GMA

• Unified access to MSS → new SE service

• Fabric monitoring infrastructure

Providing new functionalities

Upgrade underlying software

Page 16: Status and Prospective of EU Data Grid Project

HEP experience: the CMS example

Joint effort involving CMS, EDG, EDT and LCG people

CMS/EDG Stress Test goals:

• Verification of the portability of the CMS Production environment into a grid environment

• Verification of the robustness of the European DataGrid middleware in a production environment

• Production of data for the physics studies of CMS

Use as much as possible the high-level Grid functionalities provided by EDG: Workload Management System (Resource Broker), Data Management (Replica Manager and Replica Catalog), MDS (Information Indexes), Virtual Organization Management, etc.

Interface (modify) the CMS Production Tools to the access methods provided by the Grid

Measure performance, efficiency and the reasons for job failures, to give feedback to both CMS and EDG

Page 17: Status and Prospective of EU Data Grid Project

CMS/EDG Middleware and Software

Middleware was EDG, from version 1.3.4 to version 1.4.3:

• Resource Broker server

• Replica Manager and Replica Catalog servers

• MDS and Information Indexes servers

• Computing Elements (CEs) and Storage Elements (SEs)

• User Interfaces (UIs)

• Virtual Organization Management servers (VO) and clients

• EDG monitoring, etc.

CMS software distributed as rpms and installed on the CE

CMS Production tools (IMPALA, BOSS) installed on the User Interface

Monitoring was done through:

Online

• Job monitoring and bookkeeping: BOSS database, EDG Logging & Bookkeeping service

• Resource monitoring: Nagios, web-based tool developed by the DataTAG project

Offline

• EDG monitoring system (MDS based): collected regularly by scripts running as cron jobs and stored for offline analysis

• BOSS database: permanently stored in the MySQL database

• Both sources are processed by a tool (boss2root) that puts the information in a ROOT tree for analysis

Page 18: Status and Prospective of EU Data Grid Project

CMS jobs description

Dataset eg02_BigJets

CMS official jobs for “Production” of results used in physics studies: real-life testing

Production in 2 steps:

1. CMKIN: MC generation of the proton-proton interaction for a physics channel (dataset)

125 events ~ 1 minute ~ 6 MB ntuples (“short” jobs)

2. CMSIM: detailed simulation of the CMS detector

125 events ~ 12 hours ~ 230 MB FZ files (“long” jobs)

Step     size/event   time*/event
CMKIN    ~0.05 MB     ~0.4-0.5 sec
CMSIM    ~1.8 MB      ~6 min

* PIII 1 GHz, 512 MB, 46.8 SI95

[Figure: job flow. The CMKIN job writes its output data (ntuples) to a Grid Storage Element; the CMSIM job reads from a Grid Storage Element and writes its output data (FZ files) to a Grid Storage Element.]
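
The per-job figures follow from the per-event figures quoted on this slide. The short Python check below multiplies them out for a 125-event job; taking the midpoint of the quoted 0.4-0.5 sec range for CMKIN is an assumption made here, and the names are illustrative.

```python
EVENTS_PER_JOB = 125

# Per-event figures from the slide. The CMKIN time is taken as the midpoint
# of the quoted 0.4-0.5 s range (an assumption for this estimate).
STEPS = {
    "CMKIN": {"mb_per_event": 0.05, "sec_per_event": 0.45},
    "CMSIM": {"mb_per_event": 1.8, "sec_per_event": 6 * 60},
}


def job_estimate(step, events=EVENTS_PER_JOB):
    """Return (output size in MB, run time in hours) for one job of `step`."""
    per_event = STEPS[step]
    size_mb = events * per_event["mb_per_event"]
    hours = events * per_event["sec_per_event"] / 3600
    return size_mb, hours


for step in STEPS:
    size_mb, hours = job_estimate(step)
    print(f"{step}: ~{size_mb:.2f} MB, ~{hours:.2f} h per {EVENTS_PER_JOB}-event job")
```

The estimates (about 6 MB and roughly one minute for CMKIN, about 225 MB and 12.5 hours for CMSIM) agree with the rounded per-job values on the slide.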

Page 19: Status and Prospective of EU Data Grid Project

CMS production components interfaced to EDG

•Four submitting UIs: Bologna/CNAF (IT), Ecole Polytechnique (FR), Imperial College (UK), Padova/INFN (IT)

•Several Resource Brokers (WMS), CMS-dedicated and shared with other Applications: one RB for each CMS UI + “backup”

•Replica Catalog at CNAF, MDS (and II) at CERN and CNAF, VO server at NIKHEF

[Figure: CMS production components interfaced to EDG. CMS side: UI with IMPALA/BOSS, BOSS DB, RefDB (parameters). EDG side: Workload Management System (jobs described in JDL), Replica Manager (input data location, data registration), CEs with CMS software and worker nodes (WN), SEs (read/write). Also shown: job output filtering and runtime monitoring. Arrow legend: push data or info; pull info.]

Page 20: Status and Prospective of EU Data Grid Project

EDG hardware resources

Site                        Number of CPUs     Disk space (GB)   MSS available
CERN (CH)                   122                1000* (+100)      yes
CNAF (IT)                   20 + 20*           1000*
RAL (UK)                    16                 360
Lyon (FR)                   shared 120 (400)   200               yes
NIKHEF (NL)                 22                 35
Legnaro (IT)*               50                 1000*
Ecole Polytechnique (FR)*   4                  220
Imperial College (UK)*      16                 450
Padova (IT)*                12                 680
Totals                      402 (400)          3000* + (2245)

* Dedicated to CMS Stress Test

[Figure: map of the sites: CNAF Bologna, Legnaro & Padova, CERN, Ecole Polytechnique, RAL, Imperial College, NIKHEF, Lyon]

Add new (CMS) sites to provide extra resources

Page 21: Status and Prospective of EU Data Grid Project

Statistics of CMS/EDG Stress Test

[Figure: number of jobs per executing Computing Element (distribution of jobs over executing CEs)]

Total EDG Stress Test jobs = 10676, successful = 7196, failed = 3480

Total number of events: CMKIN 592750, CMSIM 268375

Total size of data produced: 500 GB
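
A quick Python check of these totals gives the overall success rate and, under the assumption that every job handled 125 events (the job size quoted for this dataset), the number of jobs' worth of events produced per step. The variable names are illustrative.

```python
# Totals quoted on the slide.
total_jobs, successful, failed = 10676, 7196, 3480
cmkin_events, cmsim_events = 592750, 268375

# Assumption: every job processed 125 events.
EVENTS_PER_JOB = 125

# The successful and failed counts should add up to the total.
counts_consistent = (successful + failed == total_jobs)

success_rate = successful / total_jobs
cmkin_jobs = cmkin_events // EVENTS_PER_JOB
cmsim_jobs = cmsim_events // EVENTS_PER_JOB

print(f"counts consistent: {counts_consistent}")
print(f"success rate: {success_rate:.1%}")
print(f"CMKIN job-equivalents: {cmkin_jobs}, CMSIM job-equivalents: {cmsim_jobs}")
```

Roughly two thirds of the submitted jobs succeeded, and both event totals divide evenly by 125, which is consistent with the fixed job size.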

Page 22: Status and Prospective of EU Data Grid Project

CMS/EDG Production

~260K events produced

~7 sec/event average

~2.5 sec/event peak (12-14 Dec)

[Figure: number of events produced by CMSIM “long” jobs versus time, 30 Nov to 20 Dec, broken down by submitting UI. Annotations: CMS Week; upgrade of MW; hit some limit of implementation (RC, MDS).]
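
The quoted rates can be turned into daily throughput with a few lines of Python. The sketch assumes that "sec/event" means wall-clock time per produced event summed over the whole testbed; that reading is an interpretation, not something the slide states.

```python
SECONDS_PER_DAY = 86400

# Figures quoted on the slide.
events = 260_000
avg_sec_per_event = 7.0
peak_sec_per_event = 2.5

# Assumption: sec/event is wall-clock time per event across the whole testbed.
wall_days = events * avg_sec_per_event / SECONDS_PER_DAY
avg_events_per_day = SECONDS_PER_DAY / avg_sec_per_event
peak_events_per_day = SECONDS_PER_DAY / peak_sec_per_event

print(f"implied production period: ~{wall_days:.1f} days")
print(f"average throughput: ~{avg_events_per_day:.0f} events/day")
print(f"peak throughput: ~{peak_events_per_day:.0f} events/day")
```

Under that reading the average rate implies about three weeks of production, which matches the 30 Nov to 20 Dec window.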

Page 23: Status and Prospective of EU Data Grid Project

Main results and observations

RESULTS

Could distribute and run CMS software in EDG environment

Generated ~250K events for physics with ~10,000 jobs in 3 week period

OBSERVATIONS

Were able to quickly add new sites to provide extra resources

Fast turnaround in bug fixing and installing new software

Test was labour intensive (since software was developing and the overall system was fragile)

WMS: At the start there were serious problems with long jobs; recently improved

Data Management: Replication Tools were difficult to use and not reliable, and the performance of the Replica Catalogue was unsatisfactory

Information system: The Information System based on MDS performed poorly with increasing query rate

The system is sensitive to hardware faults and site/system mis-configuration

The user tools for fault diagnosis are limited

EDG 2.0 should fix the major problems providing a system suitable for full integration in distributed production

Page 24: Status and Prospective of EU Data Grid Project

EU DataGrid Summary and Outlook

The focus of the project on stability has improved the manner in which the software is built and supported

The application testbed has reached the highest level of maturity that can be achieved using the available grid middleware and supporting manpower

Steady increase in the size of the testbed until a peak of approx 1000 CPUs at 15 sites

Intense usage of application testbed (release 1.3 and 1.4) in the past year

Significant achievements in the use of EDG middleware by the experiments:

• Real use is possible but labour intensive

• Results were obtained by task forces, which pointed to areas in the middleware that required development and reconfiguration

The problems in performance encountered by the experiments are addressed in the release EDG 2.0.

There is a strong connection with the LHC Computing Grid: LCG has a new grid service modelled on the EDG testbed that includes EDG 2.0 components

Outlook: a production quality infrastructure is needed → EGEE

Continuous, stable Grid operation is the most ambitious objective of EGEE and requires the largest effort

Page 25: Status and Prospective of EU Data Grid Project

EGEE vision: Enabling Grids for E-science in Europe

Goal

Create a wide European Grid production quality infrastructure on top of present and future EU RN infrastructure

Build on

• EU and EU member states' major investments in Grid Technology

• Exploit international connections (US and AP)

• Several pioneering prototype results

• Large Grid development team (>60 people)

• Requires major EU funding effort

Approach

• Leverage current and planned national and regional Grid programmes (e.g. LCG)

• Work closely with relevant industrial Grid developers, NRENs and US-AP projects

[Figure: EGEE shown between Applications and the Geant network]

http://www.cern.ch/egee

Page 26: Status and Prospective of EU Data Grid Project

EGEE Proposal

Proposal submitted to EU IST 6th framework call on 6th May 2003

Executive summary (exec summary: 10 pages; full proposal: 276 pages)

http://agenda.cern.ch/askArchive.php?base=agenda&categ=a03816&id=a03816s5%2Fdocuments%2FEGEE-executive-summary.pdf

Two-year project conceived as part of a four year programme

9 regional federations covering 70 partners in 26 countries

Page 27: Status and Prospective of EU Data Grid Project

EGEE Activities

Service Activities: deliver production level Grid infrastructure (52% of funding)

• Integration of national and international Grid infrastructures

• Essential elements: manageability, robustness, resilience to failure, consistent security model, scalability to rapidly absorb new resources

Joint Research Activity: engineering development (24% of funding)

• Re-engineering of grid middleware (OGSA environment) to improve the services provided by the Grid infrastructure

Networking Activities: management, dissemination, training and applications (24% of funding)

• The Applications Interface Activity will start with two pilot applications in high energy physics and bio/medical

[Figure: EGEE Operation Management. Core Infrastructure Centre: managing the overall Grid infrastructure. Regional Operations Centre: regional deployment and support of services.]

Page 28: Status and Prospective of EU Data Grid Project

EGEE Status

EGEE proposal passed thresholds at first EU review (June 2003)

Follow-up hearing held in Brussels on 1st July 2003 to answer written questions from the EU reviewers on details of the project

Evaluation Summary Report received from Brussels (17th July 2003)

A number of detailed recommendations made

EU budget estimated at 31.5M€

Negotiate budget details during summer and produce Technical Annex (details of negotiated tasks and budgets)

Informal EGEE/EU meeting held in Brussels 24th July 2003

Foreseen project start date: 1st April 2004

Good match with existing EU DataGrid and related project expected completion

All partners are requested to assign resources already during summer 2003 to start engineering investigations and architecture design work so that project can start on time

Page 29: Status and Prospective of EU Data Grid Project

EGEE Summary

EGEE is a project to develop and establish a reliable infrastructure that provides high quality grid service to a wide range of users

HEP is one of the two pilot application areas selected to guide the implementation and certify the performance and functionality of this evolving European Grid infrastructure

International connections: participation and collaboration with non-EU countries (Russia, US, AP) is desirable and will be pursued