CRAB: A tool to enable CMS Distributed Analysis
Marco Corvo – CERN/CNAF
CHEP ’06, Mumbai
CMS (fast) overview
• CMS will produce a large amount of data (events)
– ~2 PB/year (assuming startup luminosity of 2×10³³ cm⁻² s⁻¹)
• All events will be stored in files – O(10^6) files/year
• Files will be grouped into Fileblocks – O(10^3) Fileblocks/year
• Fileblocks will be grouped into Datasets – O(10^3) Datasets (total after 10 years of CMS)
CMS Computing Model
[Diagram: recorded data flows from the online system and the offline farm at the CERN computer centre (Tier 0) to the regional centres (Tier 1: Italy, Fermilab, France, …), on to the Tier 2 centres, and finally to institute workstations (Tier 3).]
The CMS offline computing system is arranged in four Tiers and is geographically distributed.
So what?
• A large amount of data to be analyzed
• A large community of physicists who want to access the data
• Many distributed sites where the data will be stored
Help!
• WLCG, the Worldwide LHC Computing Grid: a distributed computing environment
– Two main flavours: LCG/gLite in Europe, OSG in the US
• CRAB: a Python tool which helps the user build, manage and control analysis jobs over Grid environments
Typical user analysis workflow
• The user writes his/her own analysis code
– Starting from the CMS-specific analysis software
– Builds the executable and libraries
• He/she wants to run that code over a given number of events, splitting the load over many jobs
– But in general he/she is only allowed to access local data
• He/she has to write wrapper scripts and use a local batch system to exploit all the computing power (a sketch of this manual loop follows below)
– Comfortable only as long as the data you are looking for sit right by your side
• Then he/she has to submit everything by hand and check the status and overall progress
• Finally he/she has to collect all the output files and store them somewhere
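To make the pain concrete, here is a minimal sketch of the manual approach, assuming an LSF-like batch system; the executable name, its options and the file names are illustrative assumptions, not taken from CRAB:

    #!/bin/sh
    # Hypothetical manual splitting of 100k events into 10 batch jobs;
    # "myAnalysis", its options and the "bsub" submission are assumptions.
    TOTAL_EVENTS=100000
    EVENTS_PER_JOB=10000
    NJOBS=$((TOTAL_EVENTS / EVENTS_PER_JOB))

    for i in $(seq 1 $NJOBS); do
      FIRST=$(( (i - 1) * EVENTS_PER_JOB + 1 ))
      # write one wrapper script per event slice...
      echo "#!/bin/sh" > job_$i.sh
      echo "./myAnalysis --firstEvent $FIRST --maxEvents $EVENTS_PER_JOB --output output_$i.root" >> job_$i.sh
      chmod +x job_$i.sh
      # ...and submit it to the local batch system (LSF-style here)
      bsub job_$i.sh
    done

Every step of this loop (splitting, bookkeeping, submission, status checks, output collection) is exactly what CRAB automates.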
CRAB main purposes
• Makes it easy to create a large number of user analysis jobs
– Assumes all jobs are the same except for some parameters (event range to be accessed, output file name, …)
• Allows distributed data to be accessed efficiently
– Hiding the WLCG middleware complications: all interactions are transparent to the end user
• Manages job submission, tracking, monitoring and output harvesting
– The user doesn't have to care about the sometimes complicated Grid commands
– Leaves time to get a coffee…
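All of this behaviour is driven by a single user configuration file. Below is a minimal sketch of what such a CRAB-style config could look like; the section and parameter names are modeled on later public CRAB documentation and should be read as illustrative, not as the exact syntax of the 2006 release:

    # crab.cfg - illustrative sketch; names modeled on later CRAB docs
    [CRAB]
    jobtype   = cmssw          ; type of analysis job
    scheduler = glite          ; Grid flavour (LCG/gLite in Europe, OSG in the US)

    [CMSSW]
    datasetpath            = /MyPrimaryDataset/MyProcessing/RECO
    pset                   = myAnalysis_cfg.py   ; the user's analysis config
    total_number_of_events = 100000
    events_per_job         = 10000               ; CRAB derives the job splitting
    output_file            = output.root

    [USER]
    return_data = 1            ; bring the output back to the UI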
CRAB workflow
1) Data location
2) Job preparation
3) Job submission
4) Job status
5) Job output retrieval
[Diagram: from a UI, CRAB first queries the data catalogues (RefDB, evolving into DBS, and PubDB, evolving into DLS, plus the local file catalogues) to locate the data; it then submits the jobs through the LCG/OSG Resource Broker (RB) to the Computing Elements (CEs) of the sites hosting the data; the jobs run on Worker Nodes (WNs), read the data from the local Storage Element (SE), and their log files and output are finally shipped back to the UI or stored on an SE.]
Main CRAB functionalities
• Data discovery
– Data are distributed, so we need to know where they have been sent
• Job creation
– Both a .sh (wrapper script for the real executable) and a .jdl (a script which drives the real job towards the "Grid") are created – see the sketch after this list
– User parameters are passed via a config file (executable name, output file names, specific executable parameters, …)
• Job submission
– The scripts are ready to be sent to the sites which host the data
– BOSS, the job submission and tracking tool, takes care of submitting the jobs to the Resource Broker
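As a concrete illustration, a generated JDL could contain attributes along these lines; the file names and the requirement expression are assumptions, shown only to convey the flavour of LCG/gLite JDL, not the exact files CRAB produces:

    # illustrative JDL sketch - file names and values are assumptions
    Executable    = "crab_wrapper.sh";        # the generated .sh script
    Arguments     = "1";                      # job number within the task
    StdOutput     = "crab_1.out";
    StdError      = "crab_1.err";
    InputSandbox  = {"crab_wrapper.sh", "user_exe.tgz"};
    OutputSandbox = {"crab_1.out", "crab_1.err", "output_1.root"};
    # steer the job towards sites that publish the needed software
    Requirements  = Member("VO-cms-CMSSW", other.GlueHostApplicationSoftwareRunTimeEnvironment);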
Main CRAB functionalities (cont'd)
• CRAB monitors, via BOSS, the status of the whole submission
– The user has to ask for the status of the jobs
• When the jobs finish, CRAB retrieves all the output
– Both the standard output/error and the relevant files produced by the analysis code
– Either the job copies the output to the SE
– Or it brings it back to the UI
(a typical command-line session is sketched below)
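Putting the pieces together, a typical session from the UI might look as follows; the command-line flags are modeled on the public CRAB user guide and should be treated as illustrative:

    # illustrative CRAB session from the UI
    crab -create      # data discovery + creation of the .sh and .jdl scripts
    crab -submit      # hand the jobs to BOSS / the Resource Broker
    crab -status      # ask BOSS for the status of the whole submission
    crab -getoutput   # retrieve stdout/stderr and the analysis output files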
So far (so good?)
• CRAB is currently used to analyze data for the CMS Physics TDR (being written now…)
• Most accessed dataset since last July
D. Spiga: CRAB Usage and jobs-flow Monitoring (DDA-252)
Some statistics
[Plots: most accessed sites since July '05; CRAB jobs so far]
D. Spiga: CRAB Usage and jobs-flow Monitoring (DDA-252)
CRAB usage during CMS SC3
• CRAB has been extensively used to test the CMS Tier-1 sites participating in SC3
• The goal was to stress the computing facilities through the full analysis chain over all distributed data
J. Andreeva: CMS/ARDA activity within the CMS distributed computing system (DDA-237)
CRAB (and CMS computing) evolves
• CRAB needs to evolve to integrate with the new CMS computing components
– New data discovery components (DBS, DLS): under testing
– New Event Data Model
– New computing paradigm
• Integration into a set of services which manage jobs on behalf of the user, allowing him/her to interact only with "light" clients
Conclusions
• CRAB was born in April '05
• A lot of work and effort has gone into making it robust, flexible and reliable
• Users appreciate the tool and are asking for further improvements
• CRAB has been used to analyze data for the CMS Physics TDR
• CRAB is used to continuously test the CMS Tiers to prove the robustness of the whole infrastructure
Pointers
• CRAB web page
– http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/
– Links to documentation, tutorials and mailing lists
• CRAB monitoring
– http://cmsgridweb.pg.infn.it/crab/crabmon.php
• ARDA monitoring for CRAB jobs
– http://www-asap.cern.ch/dashboard