LCG Status, GridPP 8, September 22nd 2003, Tony.Cass@CERN.ch


LCG Status

GridPP 8

September 22nd 2003

Tony.Cass@CERN.ch


LCG/PEB Work areas

Applications

Fabrics

Grid Technology

Grid Deployment


Applications
SPI
POOL
SEAL
PI
Simulation

[Diagram: the LCG Applications Area, comprising Software Process & Infrastructure (SPI), Core Libraries & Services (SEAL), Persistency (POOL), Physicist Interface (PI), Simulation, …, shown alongside other LCG projects in other areas and the LHC experiments]


Applications
SPI
  – Software Infrastructure solidly in place. The different components were covered in depth at GridPP 7.
    » Effort in this area reduced, but incremental improvements being delivered in response to feedback.
POOL
SEAL
PI
Simulation


Applications
SPI
POOL
  – First production release delivered on schedule in June
  – Experiment integration now underway
    » production use for CMS Pre-Challenge Production milestone met at end July
    » completion of first ATLAS integration milestone expected in September.
    » POOL deployment on LCG-1 beginning
    » POOL and SEAL working closely with experiment integrators to resolve bugs and issues exposed in integration. Lots of them, but this was expected!
SEAL
PI
Simulation


Applications
SPI
POOL
SEAL
  – Project on track …
PI
Simulation


Applications
Release, date, status, description (goals):
V 0.1.0, 14/02/03, internal (released 14 & 26/02/03)
  – Establish dependency between POOL and SEAL
  – Dictionary support & generation from header files
V 0.2.0, 31/03/03, public (released 04/04/03)
  – Essential functionality sufficient for the other existing LCG projects (POOL)
  – Foundation library, system abstraction, etc.
  – Plugin management
V 0.3.0, 16/05/03, internal (released 23/05/03)
  – Improve functionality required by POOL
  – Basic framework base classes
V 1.0.0, 30/06/03, public (released 18/07/03)
  – Essential functionality sufficient to be adopted by experiments
  – Collection of basic framework services
  – Scripting support
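As a rough cross-check of the schedule, the slip between each planned date and its release date can be computed directly. This is a minimal sketch, not part of the original slides. It assumes that the four "Released" annotations pair with the four releases in chronological order, and that V 0.1.0 counts as released on the later of its two dates (26/02/03); the slide does not state either pairing explicitly. The names `RELEASES` and `slip_days` are illustrative.

```python
from datetime import date

# (release, planned date, released date).
# ASSUMPTION: "Released" notes are matched to releases in chronological order.
RELEASES = [
    ("V 0.1.0", date(2003, 2, 14), date(2003, 2, 26)),  # later of 14 & 26/02/03
    ("V 0.2.0", date(2003, 3, 31), date(2003, 4, 4)),
    ("V 0.3.0", date(2003, 5, 16), date(2003, 5, 23)),
    ("V 1.0.0", date(2003, 6, 30), date(2003, 7, 18)),
]


def slip_days(planned, released):
    """Number of days between the planned date and the actual release."""
    return (released - planned).days


slips = {name: slip_days(planned, released) for name, planned, released in RELEASES}

for name, days in slips.items():
    print(f"{name}: released {days} days after the planned date")
```

Under these assumptions every release slipped by less than three weeks.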


Applications
SPI
POOL
SEAL
  – Project on track …
  – Waiting for detailed feedback on current functionality from POOL & experiments
  – Planning to develop new requested functionality
    » Object whiteboard (transient datastore)
    » Improvements to scripting: LCG dictionary integration, ROOT integration
    » Complete support for C++ types in the LCG dictionary
PI
Simulation


Applications
SPI
POOL
SEAL
PI
  – Principal initial mandate, a full ROOT implementation of AIDA histograms, recently completed
  – Still a small effort with limited scope, though.
  – Future planning depends on what comes out of the ARDA RTAG
    » Architectural Roadmap towards Distributed Analysis
    » Reviewing DA activities, HEPCAL II use cases, interfaces between Grid, LCG and experiment-specific services.
    » Started in September, scheduled to finish in October.
Simulation


Applications
SPI
POOL
SEAL
PI
Simulation
  – Physics Validation subproject particularly active
    » pion shower profile for ATLAS improved
    » expect extensive round of comparison with testbeam data in autumn.
  – ROSE: (Revised Overall Simulation Environment)
    » Looking at generic framework high level design, implementation approach, software to be reused. Decisions expected in September.


Fabrics

CC Infrastructure

Recosting

Management successes

RH release cycles


Fabrics: CC Infrastructure


Fabrics: Recosting I
Representatives from IT and the 4 LHC experiments reviewed the expected equipment cost for LCG phase 2.
  – Took into account adjusted requirements from the experiments and some slight changes to the overall model.
  – Results published in July.


Fabrics: Recosting II

All units in million CHF.

Resource    2006-08                     2009-10
            Old     New     New-Old     Old     New     New-Old
CPU+LAN     17.7    19.5     1.8         6.3     6.8     0.5
Disk         6.3    11.9     5.6         2.2     2.9     0.7
Tape        22.5    27.8     5.3        19.2    17.6    -1.6
WAN         11.4     6.0    -5.4         6.8     4.0    -2.8
Sysadmin     7.9     3.5    -4.4         6.6     3.0    -3.6
SUM         65.8    68.7     2.9        41.1    34.3    -6.8

Budget: 60.0 (2006-08), 34.0 (2009-10)
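The internal arithmetic of the recosting table can be checked mechanically. The following is a minimal sketch, not part of the original slides: the Old and New figures are transcribed from the table, while the names `COSTS`, `column_sums` and `changes` are chosen here purely for illustration. It recomputes each New-Old difference and each column total from the per-resource figures.

```python
# Equipment cost estimates in million CHF, transcribed from the slide.
# resource -> (old 2006-08, new 2006-08, old 2009-10, new 2009-10)
COSTS = {
    "CPU+LAN": (17.7, 19.5, 6.3, 6.8),
    "Disk": (6.3, 11.9, 2.2, 2.9),
    "Tape": (22.5, 27.8, 19.2, 17.6),
    "WAN": (11.4, 6.0, 6.8, 4.0),
    "Sysadmin": (7.9, 3.5, 6.6, 3.0),
}


def column_sums(costs):
    """Total of each of the four cost columns, rounded to 0.1 million CHF."""
    return tuple(round(sum(row[i] for row in costs.values()), 1) for i in range(4))


def changes(costs):
    """New minus Old for each resource, as (2006-08, 2009-10)."""
    return {
        name: (round(new_a - old_a, 1), round(new_b - old_b, 1))
        for name, (old_a, new_a, old_b, new_b) in costs.items()
    }


sums = column_sums(COSTS)
deltas = changes(COSTS)

print("Column totals:", sums)
for name, (delta_a, delta_b) in deltas.items():
    print(f"{name}: {delta_a:+.1f} (2006-08), {delta_b:+.1f} (2009-10)")
```

The recomputed column totals agree with the SUM row of the table (65.8, 68.7, 41.1 and 34.3 million CHF).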


Fabrics: System Management
Overall management suite christened over the summer: ELFms, with components
  – quattor: EDG/WP4 installation & configuration
  – Lemon: LHC Era monitoring
  – LEAF: LHC Era Advanced Fabrics
quattor thoroughly in control of CERN fabric
  – migration to RH 7.3 managed by quattor in spring.
  – LSF 5 migration took 10 minutes in late August
    » Across 800+ batch nodes. Equivalent migration in 2002 took over 3 weeks with much disruption.
EDG/WP4 OraMon repository in production since September 1st.
State Management System development underway.


Fabrics: RedHat Release Cycles
RedHat are moving to a twin product line
  – Frequent end-user releases with support limited to 1 year, but free.
  – Less frequent business releases with long term support at a cost.
Neither product really adapted to our needs
  – Annual change of system version is too rapid: 18-24 month cycle more realistic.
  – Cost of Enterprise server prohibitive for our farms.
Move to negotiate with RedHat for compromise
  – Major labs club together to pay for limited support (security patches + ?) for the end-user product for, say, 2 years.
  – Discussions at HEPiX in Vancouver
    » Plus visit to RedHat?


Grid Deployment I
Deployment started (with pre-release tag) in July, to original 10 Tier 1 sites
  – CERN, BNL, CNAF, FNAL, FZK, Lyon, Moscow, RAL, Taipei, Tokyo
  – Other sites joined: PIC (Barcelona), Prague, Budapest
Situation today (18/9/03):
  – 10 sites up: CERN, CNAF, RAL, FZK, FNAL, Moscow, Taipei, Tokyo, PIC, Budapest
  – Still working on installation: BNL, Prague, Lyon (situation not clear)
Other sites currently ready to join:
  – Bulgaria, Pakistan, Switzerland, Spanish Tier 2’s, Nikhef, Sweden
Official “certified” LCG-1 release (tag LCG-1.0.0) was available on 1 September at 5pm CET
  – Was installed at CERN, Taiwan, CNAF, Barcelona, Tokyo 24 hours later(!), and several others within a few days
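The site lists on this slide can be reconciled with simple set arithmetic. This is an illustrative sketch only: the site names are taken from the slide, while the variable names are hypothetical. It checks that every site that received the deployment is reported either as up or as still installing.

```python
# Site lists transcribed from the slide (status as of 18/9/03).
ORIGINAL_TIER1 = {"CERN", "BNL", "CNAF", "FNAL", "FZK", "Lyon",
                  "Moscow", "RAL", "Taipei", "Tokyo"}
JOINED_LATER = {"PIC", "Prague", "Budapest"}
UP = {"CERN", "CNAF", "RAL", "FZK", "FNAL", "Moscow",
      "Taipei", "Tokyo", "PIC", "Budapest"}
INSTALLING = {"BNL", "Prague", "Lyon"}

deployed_to = ORIGINAL_TIER1 | JOINED_LATER

# Sites with no reported status, and sites reported in both states.
unaccounted = deployed_to - (UP | INSTALLING)
double_counted = UP & INSTALLING

# Original Tier 1 sites that are not yet up.
tier1_pending = sorted(ORIGINAL_TIER1 & INSTALLING)

print(f"{len(UP)} of {len(deployed_to)} sites up")
print("Original Tier 1 sites still installing:", ", ".join(tier1_pending))
```

With the lists as given, all 13 sites are accounted for exactly once, and two of the original Tier 1 sites (BNL and Lyon) are among those still installing.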


Grid Deployment II
LCG-1 is:
VDT 1.1.8-9 (Globus 2.2.4)
  – Information System (MDS)
Selected software from EDG 2.0:
  – Workload Management System (RB)
  – EDG Data Management (RLS, LRC, …)
GLUE Schema 1.1 + LCG extensions
LCG local modifications/additions/fixes, such as:
  – Special job managers (LCGLSF, LCGPBS, LCGCONDOR) to solve the problem of sharing home directories
  – Gatekeeper enhancements (adding some accounting and auditing features, log rotation, that LCG requires)
  – Number of MDS fixes (also coming from NorduGrid)
  – Number of misc. Globus fixes, most of them included now in the VDT version LCG is using
Some problems remain. Overall, though, impressive improvement in terms of stability.


Grid Deployment III
Starting to get experiments testing LCG-1 now
  – Loose cannons currently running on LCG-1 to verify basic functionality
  – Scheduling now with experiments
    » Initially we need to carefully control who does what
    » we need to monitor the system as the tests run to understand the problems
  – Migrate CMS LCG-0 to LCG-1
  – Atlas, US_Atlas (want to demonstrate interoperability)
  – ALICE: continue with tests started by Loose Cannons
  – LHCb ?
  – We are scheduling these tests now, will commence next week
Once experiments verify their software on LCG-1 we must begin to add resources at each site
  – Currently very basic resources available


Grid Deployment IV
Basics are in place but many tasks to be done at high priority to make a real production system:
  – Experiment sw distribution mechanism
  – Monitors to watch essential system resources on essential services (/tmp, etc)
  – System cleanup procedures
  – System auditing: must ensure procedures are in place
  – Need basic usage accounting in place
  – Need tool independent WN installation procedure, also for UI
  – Integration with MSS (setting up task force)
    » NB sites with MSS will need to implement interfaces
  – Integration with LXBatch (and others)
  – Standard procedures: we will start but needs a team from sites and GOC
    » for setting Runtime Environments
    » Change procedures
    » Operations
    » Incident handling
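To illustrate the kind of monitor meant by "watch essential system resources (/tmp, etc)", here is a minimal sketch using only the Python standard library. It is not the LCG tooling: the function names, the 90% threshold and the list of watched paths are all assumptions made for the example.

```python
import shutil


def usage_fraction(path):
    """Fraction of the filesystem holding `path` that is in use (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:  # some pseudo-filesystems report zero size
        return 0.0
    return usage.used / usage.total


def check_paths(paths, threshold=0.9):
    """Return (path, message) pairs for paths at or above the threshold.

    The default threshold of 0.9 is an arbitrary illustrative value.
    """
    alerts = []
    for path in paths:
        try:
            fraction = usage_fraction(path)
        except OSError as exc:
            # A missing or unreadable path is itself worth reporting.
            alerts.append((path, f"cannot check: {exc}"))
            continue
        if fraction >= threshold:
            alerts.append((path, f"{fraction:.0%} full"))
    return alerts


# Hypothetical watch list; a real service node would list its own paths.
for path, message in check_paths(["/tmp"]):
    print(f"ALERT {path}: {message}")
```

A production monitor would run checks like this periodically and feed the results into the site's monitoring system rather than printing them.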


Summary
In general, good progress
  – Applications area
  – Fabrics, …
Yes, LCG-1 is delayed
  – but don’t forget the vast improvements to the overall system driven by the focus on delivering a production quality environment.

UK contribution to this work is extensive and much appreciated.