CMS-CCS Status and Plans
May 11, 2002 USCMS meeting
David Stickland

Page 1: CMS-CCS Status and Plans


CMS-CCS Status and Plans

May 11, 2002 USCMS meeting

David Stickland

Page 2: CMS-CCS Status and Plans


Outline

Won’t say anything about ORCA, OSCAR, DDD, IGUANA; see talks from Darin, Sarah and Jim.

See Tutorials last week at UCSD

See Paris’s talk to LHCC this Tuesday

Production

LCG and its effect on CMS program

Draft of new schedule

Page 3: CMS-CCS Status and Plans


CMS - Productions and Computing Data Challenges

Already completed:
- 2000-1: single-site production challenges with up to 300 nodes; ~5 million events, pileup for 10^34
- 2000-1: GRID-enabled prototypes demonstrated
- 2001-2: worldwide production infrastructure: 11 Regional Centers comprising 21 computing installations; shared production database, job tracking, sample validation, etc.; 10M min-bias events simulated, reconstructed, and analyzed for calibration studies

Underway now:
- Worldwide production of 10 million events for the DAQ TDR; 1000 CPUs in use
- Production and analysis planned at CERN and offsite

Being scheduled:
- Single-site production challenges: test code performance and computing performance, identify bottlenecks, etc.
- Multi-site production challenges: test infrastructure, GRID prototypes, networks, replication...
- Single- and multi-site analysis challenges: stress local and GRID prototypes under the quite different conditions of analysis

Page 4: CMS-CCS Status and Plans


Production Status

RC              Simulation  ooHit             No-PU digi        2x10^33 PU digi    10^34 PU digi
CERN            96%         Started           Started           Started            Started
INFN            100%        Started           Started
Imperial Coll.  89%         Started           Started           Test in progress
UCSD            95%         Started           Test successful   Started
Moscow          100%        Started           Test successful
FNAL            89%         Started           -                 Started
UFL             100%        Started           Test successful
Wisconsin       97%         Test successful   Test in progress
Caltech         100%        Test in progress
IN2P3           100%        Test in progress
Bristol/RAL     28%         Test in progress
USMOP           0%
Done            88%         61%               81%               5%                 17%

Estimate as of 11/May/02: complete in 17 days (June 1 deadline!)

[Chart: May 11 production status - events requested vs. produced (0 to 7,000,000) for the Simulated, Hit, No PU, 2x10^33, 10^34 and Filtered samples]

Page 5: CMS-CCS Status and Plans


Production 2002, Complexity

Number of Regional Centers: 11
Number of Computing Centers: 21
Number of CPUs: ~1000
Largest local center: 176 CPUs
Number of production passes per dataset (including analysis-group processing done by production): 6-8
Number of files: ~11,000
Data size (not including fz files from simulation): 15 TB
File transfer by GDMP and by perl scripts over scp/bbcp
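A back-of-the-envelope check on the scale these figures imply; the average file size and per-CPU event count below are derived here, not quoted in the talk:

```python
# Rough scale of the 2002 production, from the figures above.
total_files = 11_000
total_data_tb = 15          # TB, excluding fz files from simulation
total_events = 10_000_000   # DAQ TDR sample underway
cpus = 1000

avg_file_gb = total_data_tb * 1024 / total_files   # ~1.4 GB per file
events_per_cpu = total_events / cpus               # average events per CPU

print(f"average file size: {avg_file_gb:.1f} GB")
print(f"events per CPU:    {events_per_cpu:,.0f}")
```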

Page 6: CMS-CCS Status and Plans


LCG Status

Applications Area:
- Persistency Framework: established a roadmap for new software based on ROOT and an RDBMS layer (hybrid solution); project manager appointed, work starting!
- Defined parameters of an LCG SW Infrastructure group, but so far no people! And a big decision between SCRAM/CMT to be made.
- Mathlib: indicates a requirement for skilled mathlib personnel; investigating use of resources assigned to LCG by India.
- MSS: premature for all but ALICE.
- Detector Description Database: about to start; an excellent opportunity for collaboration exists.
- Simulation: waiting for G4-HEPURC (we urgently need this body to start work).
- Next month, start an RTAG on Interactive Analysis: urgent requirement to clarify the focus of this activity (interest in using IGUANA expressed also by some other experiments).

Page 7: CMS-CCS Status and Plans


CMS - Schedule for Challenge Ramp Up

All CMS work to date has been with Objectivity, now being phased out and replaced with LCG software.

Enforced lull in production challenges: no point doing work to optimize a solution that is being replaced (but much was learnt in past challenges to influence the new design). Use challenge time in 2002 to benchmark current performance.

Aim to start testing the new system as it becomes available; target early 2003 for the first realistic tests. Thereafter, return to a roughly exponential complexity ramp-up, reaching 50% complexity in 2005. 20% Data Challenge (50% complexity in 2005 is approximately 20% capacity).

Page 8: CMS-CCS Status and Plans


Objectivity Issues

Bleak: CERN has not renewed the Objectivity maintenance.
- Old licenses are still applicable, but cannot be migrated to new hardware.
- Our understanding is that we can continue to use the product as before, clearly without support any longer.
- Not clear if this applies to any Red Hat version, or for that matter other Linux OSs; recent contradictory statements from IT/DB. (24/4/02: now clear we cannot use it on other OS versions.)

It will become increasingly difficult during this year to find sufficient resources correctly configured for our Objectivity usage.

We are preparing for the demise of our Objectivity-based code by the end of this year:
- CMS is already contributing to the new LCG software
- Aiming to have first prototypes of the catalog layer by July
- Initial release of the CMS prototype ROOT+LCG, September 2002

Page 9: CMS-CCS Status and Plans


[Diagram: one possible mapping to a ROOT implementation (under discussion). Persistency components (PersistencyMgr, StorageMgr, CacheMgr, FileCatalog, MetaDataCatalog, PlacementSvc, DictionarySvc, StreamerSvc, IteratorSvc, SelectorSvc, and their C++ interfaces) mapped onto ROOT classes such as TFile, TDirectory, TSocket, TClass, TBuffer, TMessage, TRef, TKey, TTree, TChain, TEventList, TDSet, TStreamerInfo, and TGrid.]

Page 10: CMS-CCS Status and Plans


CMS Action to Support LCG

We expect >50% of our ATF (Architecture, Frameworks, Toolkits) effort to be directed to LCG in the short/medium term. The first person was assigned full time to the persistency framework the day the work package started; 3-5 more people are ready to join as the task develops.

Initial emphasis:
- Build the catalog layer that is missing from ROOT
- Remove Objectivity from COBRA/ORCA (OSCAR)
- Ensure simple ROOT storage of objects is working
- Aim for basic catalog services by July, and basic COBRA/ORCA/OSCAR using the new persistency scheme by September

Try to get our release tool (SCRAM) adopted by LCG (two possibilities: CMT (LHCb, ATLAS) or SCRAM). SCRAM is a better product! If adopted, we would expect to put extra effort into supporting a wider community; aim to get some extra manpower from LCG.
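A minimal sketch of the kind of catalog layer described above: a mapping from logical filenames to physical replicas at different sites. All class, method, and file names here are illustrative assumptions, not the actual LCG or ROOT design:

```python
# Toy file catalog: logical filename (LFN) -> list of physical replicas (PFNs).
# Illustrative only; real catalogs also handle GUIDs, metadata, and replication.

class FileCatalog:
    def __init__(self):
        self._replicas = {}   # LFN -> list of physical URLs

    def register(self, lfn, pfn):
        """Record a physical replica for a logical filename."""
        self._replicas.setdefault(lfn, []).append(pfn)

    def lookup(self, lfn):
        """Return all known physical replicas, or [] if unregistered."""
        return list(self._replicas.get(lfn, []))

catalog = FileCatalog()
catalog.register("minbias.0001.root", "rfio://cern.ch/cms/minbias.0001.root")
catalog.register("minbias.0001.root", "file://fnal.gov/cms/minbias.0001.root")
print(catalog.lookup("minbias.0001.root"))
```

A production job would ask the catalog for the replica closest to where it runs; the catalog itself stays independent of the storage backend.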

Page 11: CMS-CCS Status and Plans


CMS and the GRID

CMS Grid Implementation Plan for 2002 published. Close collaboration with EDG and GriPhyN/iVDGL, PPDG.

Upcoming CMS GRID/Production workshop (June CMS week):
- File transfers and fabrics: production file-transfer software experiences; production file-transfer hardware status & reports; future evolution of file-transfer tools
- Production tools: Monte Carlo production system architecture; experiences with tools
- Monitoring / deployment planning: experiences with Grid monitoring tools; towards a rational system for tool deployment

Page 12: CMS-CCS Status and Plans


The Computing Model

MONARC: CMS Analysis Process

Hierarchy of processes (Experiment, Analysis Groups, Individuals):
- Reconstruction: RAW data reconstruction, 3000 SI95 sec/event, 1 job per year; re-processing 3 times per year, 3000 SI95 sec/event, 3 jobs per year. Experiment-wide activity (10^9 events), driven by new detector calibrations or understanding.
- Selection: iterative selection, once per month; ~20 groups' activity (10^9 -> 10^7 events); trigger-based and physics-based refinements; 25 SI95 sec/event, ~20 jobs per month.
- Analysis: different physics cuts & MC comparison, ~once per day; ~25 individuals per group active (10^6 - 10^7 events); algorithms applied to data to get results; 10 SI95 sec/event, ~500 jobs per day.
- Monte Carlo: 5000 SI95 sec/event.

CMS Computing Model needs updating:
1. CMS (and ATLAS) refining Trigger/DAQ rates
2. PASTA process re-costing HW and re-extrapolating "Moore's law"
3. Realistic cost constraints

With the above in place, optimize the computing model. Need continued development and refinement of MONARC-like tools to simulate and validate computing models. Realistically this will take most of this year.
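To illustrate what these per-event costs imply, a back-of-the-envelope sketch; the sustained-capacity figure is derived here, not quoted in the talk:

```python
# CPU budget implied by the MONARC re-processing numbers above.
# SI95 is the SPECint95 benchmark unit used in the slide.

events = 1_000_000_000        # experiment-wide activity, 10^9 events
si95_sec_per_event = 3000     # reconstruction/re-processing cost
passes_per_year = 3           # re-processing 3 times per year

total_si95_sec = events * si95_sec_per_event * passes_per_year
seconds_per_year = 365 * 24 * 3600

# Sustained SI95 capacity needed for re-processing alone (~285k SI95):
si95_needed = total_si95_sec / seconds_per_year
print(f"~{si95_needed:,.0f} SI95 sustained for re-processing")
```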

Page 13: CMS-CCS Status and Plans


CCS Manpower

More or less constant over the last year, with a small increase:
- 52 "names" identified working on CCS tasks
- 13 ~full-time "engineers" (all CERN or USA), of which 7 in the ATF group
- ~20 in worldwide production operations or support

The last detailed plan called for ~30 "engineers" this year (next year with the LHC delay). Delays are running at ~4 months of extra delay per elapsed year: OK as long as LHC keeps getting delayed... on the old schedule we would be getting into big trouble by now.

Use LCG to leverage external manpower. But there is no free lunch: CMS probably has the biggest software group and so will need to contribute proportionately more to the work. Make an extra effort to use worldwide manpower; we do this already.

Page 14: CMS-CCS Status and Plans


Draft CPT Schedule Change

[Timeline: CCS milestones shifted against the LHC beam date by fixes of roughly 9, 12, and ~15 months, with the LCG TDR placed on the new schedule.]

Page 15: CMS-CCS Status and Plans


Page 16: CMS-CCS Status and Plans


Summary

Still very hard to make firm plans. Experiment and LCG schedules are being aligned; no big problems so far, but we do not yet know how much we will have to contribute and how much we will get.

We have a slow increase in manpower available, but still most of it is from CERN and the USA. Some major parts of the collaboration are still contributing zero to the CCS effort.

If LHC had not slipped, we would be in trouble defining our baseline (and then getting the Physics TDR underway):
- Persistency (a transition of 18 months was foreseen, and will be needed)
- OSCAR validation: the SW product needs restructuring, and PRS is not available for physics/detector validation until after the DAQ TDR

We are proactively trying to find commonality with other groups to offload work. CMS is a major contributor of LHC software, so there is no free lunch: we still have to do a lot of the work.