Tutorial: Initiation a l’Utilisation de la Grille EGEE/LCG, June 5-6. N. De Filippis
CMS tools for distributed analysis
N. De Filippis - LLR-Ecole Polytechnique
CMS tools tutorial organization
morning session: overview of CMS tools
concepts of the CMS computing/analysis model
overview of the analysis workflow and tools
overview of the analysis job monitoring system
afternoon session: practical examples
demonstration of data discovery
demonstration of job submission
job monitoring and trouble-shooting
CMS tools: overview session
CMS dataflow and workflow
Data are collected, filtered online, stored, reconstructed with HLT information at Tier-0 and registered in the Dataset Bookkeeping System (DBS) at CERN.
RECO data are moved from Tier-0 to Tier-1 via PhEDEx.
Data are filtered (into a reduced AOD format) at Tier-1 according to the physics analysis group selection for skimming; the skim output is shipped to Tier-2 via PhEDEx.
Data are analysed at Tier-2 and Tier-3 by physics group users and published in a DBS instance dedicated to the physics group.
[Figure: data flow between tiers. Labels: CERN Computer Centre, FermiLab, France Regional Centre (IN2P3), Italy Regional Centre (CNAF), GRIF, LNL, Wisconsin, Rome, LLR; Tier 0, Tier 1, Tier 2, Tier 3; DBS; PhEDEx links between tiers]
CMS analysis model (CTDR)
In the computing model, CMS stated that the analysis resources are:
• Central Analysis Facility (CAF) at CERN:
  • intended for specific varieties of analysis that require low-latency access to the data (data are on the disk pool of CASTOR)
  • CAF is a very large resource, but its users and its use are subject to policy decisions
  • processing at CAF: calibration and alignment, a few physics analyses
• Tier-2 resources: two groups of analysis users are to be supported at Tier-2
  1. support for analysis groups/specific analyses, addressed by using the distributed analysis system
  2. support for local communities, addressed via local batch queues/access and distributed systems
• Tier-3 resources: local and interactive analysis, private resources
Tier-2 resources
A nominal Tier-2 should have 0.9 MSI2k of computing power, 200 TB of disk and 1 Gb/s of WAN. That means several hundred batch slots and enough disk for large skim samples.
A reasonably large fraction of the nominal resources is devoted to analysis group activities and the remainder is assignable to the local community.
The proposal under discussion is:
• ~50% of the nominal processing resources for simulation, ~40% for analysis groups (specific and well-organized analyses), and the remainder for local users
• 10% of the local storage for simulation, 60% for analysis groups and 30% for local communities
• for 2008, a lower guideline of 60 TB of disk and 0.4 MSI2k for analysis groups
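As a worked example, the proposed shares can be applied to the nominal Tier-2 figures quoted above. This is only a sketch of the arithmetic: the percentages are approximate and still under discussion, and the variable and function names are illustrative.

```python
# Nominal Tier-2 figures quoted on this slide.
NOMINAL_CPU_MSI2K = 0.9
NOMINAL_DISK_TB = 200

# Proposed (approximate) shares. The local-user CPU share is "the remainder"
# after simulation and analysis groups.
CPU_SHARES = {"simulation": 0.50, "analysis_groups": 0.40}
CPU_SHARES["local_users"] = 1.0 - sum(CPU_SHARES.values())
DISK_SHARES = {"simulation": 0.10, "analysis_groups": 0.60, "local_users": 0.30}


def split(total, shares):
    """Apply each fractional share to the total, rounded to 3 decimals."""
    return {name: round(total * fraction, 3) for name, fraction in shares.items()}


cpu = split(NOMINAL_CPU_MSI2K, CPU_SHARES)
disk = split(NOMINAL_DISK_TB, DISK_SHARES)

for name in CPU_SHARES:
    print(f"{name}: {cpu[name]} MSI2k, {disk[name]} TB")
```

Under these assumptions a nominal site would give analysis groups about 0.36 MSI2k and 120 TB, and local users about 0.09 MSI2k and 60 TB.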
Distributed analysis in Tier-2
• via virtual organization (VO) authorized access
• via data discovery (DBS)
• via job analysis builder (CRAB)
• via data, job and resources monitoring (DashBoard)
• via automatic storage allocation for physics groups and local user data
• via data movement tools (PhEDEx)
CMS Data Discovery
The Dataset Bookkeeping System (DBS) provides the means to define, discover and use CMS event data.
The main features that DBS provides are:
• Data description: keeps the dataset definition along with attributes characterising the dataset and the type of content resulting from the degree of processing applied to the data (RAW, RECO, etc.). The DBS also provides information regarding the “provenance” of the data it describes.
• Data discovery: stores information about (real and simulated) CMS data in a queryable format. The supported queries allow users to discover the available data and how they are organized (logically) in terms of packaging units (files and file-blocks). Answers the question: “Which data exist?”
• Data location: provides the means to locate replicas of data in the distributed computing system by providing the names of the Storage Elements of the sites hosting the data. Answers the question: “Where do the data exist?”
CMS Data Discovery: DBS
Data discovery page: https://cmsweb.cern.ch/dbs_discovery/_navigator?userMode=user
A sample of data is identified by a string called the datasetpath:
/Primarydataset/Processeddataset/DataTier
Ex: /Njet_2j_80_140-alpgen/CMSSW_1_6_7-CSA07-1200571375/RECO
• Primary dataset: name that describes the physics channel
• Processed dataset: name that describes the kind of processing applied
• Data Tier: describes the kind of event information stored from each step in the simulation and reconstruction chain. Examples: RAW and RECO, and for MC, GEN, SIM and DIGI.
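The three-part structure of a datasetpath can be illustrated with a small parser. This is a minimal sketch, not part of any CMS tool: `DatasetPath` and `parse_datasetpath` are hypothetical names, and the only rule assumed is the `/Primarydataset/Processeddataset/DataTier` layout described above.

```python
from typing import NamedTuple


class DatasetPath(NamedTuple):
    primary_dataset: str    # physics channel
    processed_dataset: str  # kind of processing applied
    data_tier: str          # e.g. RAW, RECO, GEN, SIM, DIGI


def parse_datasetpath(path: str) -> DatasetPath:
    """Split a /Primary/Processed/Tier string into its three components."""
    parts = path.split("/")
    # A valid path starts with "/" and has exactly three non-empty components.
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError(f"not a /Primary/Processed/Tier path: {path!r}")
    return DatasetPath(*parts[1:])


example = parse_datasetpath(
    "/Njet_2j_80_140-alpgen/CMSSW_1_6_7-CSA07-1200571375/RECO"
)
print(example.primary_dataset, example.processed_dataset, example.data_tier)
```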
File-related concepts:
• Logical File Name (LFN): a site-independent name for a file. It contains neither the actual protocol used to read the file nor any site-specific information about where the file is located.
• Physical File Name (PFN): a site-dependent name for a file, allowing local access to the file at a site.
Logical file names are mapped to physical file names via the local trivial file catalog (TFC).
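The idea of a rule-based LFN-to-PFN mapping can be sketched as follows. This is an illustration only: the rule list, the host name `se.example.org`, the storage paths and the example LFN are all hypothetical, and the real TFC format used at a site may differ.

```python
import re

# Hypothetical rules: (protocol, regex matched against the LFN, PFN template).
# The site-specific part lives only in the templates, never in the LFN.
RULES = [
    ("direct", r"/store/(.*)", r"/pnfs/example.org/data/cms/store/\1"),
    ("srm", r"/store/(.*)", r"srm://se.example.org:8443/cms/store/\1"),
]


def lfn_to_pfn(lfn, protocol, rules=RULES):
    """Return the PFN from the first rule matching both protocol and LFN."""
    for rule_protocol, pattern, template in rules:
        if rule_protocol != protocol:
            continue
        match = re.fullmatch(pattern, lfn)
        if match:
            return match.expand(template)
    raise LookupError(f"no rule maps {lfn!r} for protocol {protocol!r}")


lfn = "/store/user/example/file.root"  # hypothetical LFN
print(lfn_to_pfn(lfn, "direct"))
print(lfn_to_pfn(lfn, "srm"))
```

The same LFN yields a different PFN per protocol and per site, which is why jobs and catalogues can refer to files by LFN alone.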
CMS Data Discovery (2)
Data discovery page: https://cmsweb.cern.ch/dbs_discovery/_navigator?userMode=user
[Screenshot of the data discovery page, with annotations: "Site: SE name", "click on this", "Logical file names"]
CRAB (CMS remote analysis builder)
The aim of CRAB is to hide as much of the grid complexity as possible from the user.
CRAB is a user-oriented tool for job preparation, submission and (basic) monitoring of CMS analysis jobs at remote sites, using the GRID infrastructure (EGEE and OSG).
Features:
• user settings provided via a configuration file (dataset, data type)
• data discovery by querying DBS for remote sites
• automatic job splitting (number of jobs or events per job to be provided)
• jobs run where the data are
• GRID details mostly hidden from the user
• status monitoring, job tracking and output management of jobs
• publishing of the output
Use cases supported:
• analysis of official CMS data with official and private code
• but also private production and skimming
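The automatic job splitting mentioned among the features (the user gives either the number of jobs or the events per job) can be sketched as below. This is a simplified illustration, not CRAB's actual algorithm: `split_jobs` and its parameters are hypothetical names, and it ignores file and file-block boundaries.

```python
def split_jobs(total_events, events_per_job=None, number_of_jobs=None):
    """Return a list of (first_event, n_events) tuples covering all events.

    Exactly one of events_per_job or number_of_jobs must be given.
    """
    if (events_per_job is None) == (number_of_jobs is None):
        raise ValueError("give exactly one of events_per_job or number_of_jobs")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if events_per_job is None:
        if number_of_jobs <= 0:
            raise ValueError("number_of_jobs must be positive")
        # Ceiling division; this may yield fewer jobs than requested.
        events_per_job = -(-total_events // number_of_jobs)
    elif events_per_job <= 0:
        raise ValueError("events_per_job must be positive")

    jobs = []
    first = 0
    while first < total_events:
        n_events = min(events_per_job, total_events - first)
        jobs.append((first, n_events))
        first += n_events
    return jobs


print(split_jobs(2500, events_per_job=1000))
print(split_jobs(2500, number_of_jobs=4))
```

With 2500 events, asking for 1000 events per job gives three jobs (the last one shorter), while asking for 4 jobs gives four jobs of 625 events each.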
The user analysis basic model
The user analysis workflow
The user provides:
• Dataset (runs, #events, ..) taken from DBS
• Private CMSSW code
1. CRAB discovers the data and the sites hosting them by querying DBS.
2. CRAB prepares, splits and submits jobs to the Resource Broker/WMS.
3. The RB/WMS sends the jobs to sites hosting the data, provided the CMS software is installed there.
4. CRAB automatically retrieves the logs and the output files of the jobs; the output files can also be stored in an SE (the best solution).
5. CRAB can publish the output of the jobs in DBS to make the output data officially available for subsequent processing.
[Figure: workflow diagram. Labels: UI, CRAB (job submission tool), DataSet Catalogue (DBS), Resource Broker (RB/WMS), Workload Management System, Computing Element, Storage Element, Worker node, CMSSW]
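The site-selection part of this workflow (jobs go to sites that host the data and have the CMS software installed) can be sketched with toy in-memory tables. Everything here is hypothetical: the site names, the two dictionaries standing in for DBS and for the sites' software information, and the round-robin assignment, which merely stands in for the broker's real matchmaking.

```python
# Hypothetical stand-in for DBS: datasetpath -> sites hosting a replica.
DATASET_LOCATIONS = {
    "/Njet_2j_80_140-alpgen/CMSSW_1_6_7-CSA07-1200571375/RECO": ["site_a", "site_b"],
}

# Hypothetical stand-in for the software installed at each site.
INSTALLED_SOFTWARE = {
    "site_a": {"CMSSW_1_6_7"},
    "site_b": {"CMSSW_1_6_0"},
    "site_c": {"CMSSW_1_6_7"},
}


def eligible_sites(datasetpath, cmssw_version):
    """Sites that host the dataset AND have the requested software version."""
    hosting = DATASET_LOCATIONS.get(datasetpath, [])
    return [
        site for site in hosting
        if cmssw_version in INSTALLED_SOFTWARE.get(site, set())
    ]


def plan_submission(datasetpath, cmssw_version, n_jobs):
    """Assign each job to an eligible site (round-robin, for illustration)."""
    sites = eligible_sites(datasetpath, cmssw_version)
    if not sites:
        raise LookupError("no site hosts the data with the required software")
    return [
        {"job_id": i + 1, "site": sites[i % len(sites)]}
        for i in range(n_jobs)
    ]


dataset = "/Njet_2j_80_140-alpgen/CMSSW_1_6_7-CSA07-1200571375/RECO"
print(plan_submission(dataset, "CMSSW_1_6_7", 3))
```

In this toy setup `site_b` hosts the data but lacks the software and `site_c` has the software but not the data, so all jobs land on `site_a`.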
CRAB standalone and server
CRAB documentation and support
CRAB home page: http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab/
CRAB twiki: https://twiki.cern.ch/twiki/bin/view/CMS/CRAB
CRAB frequently asked questions
https://twiki.cern.ch/twiki/bin/view/CMS/CrabFaq
CMS job monitoring: Dashboard
Most of the CMS job submission systems, including CRAB, are instrumented to send monitoring information to the CMS Dashboard.
The CMS Dashboard also collects information from the Grid monitoring systems.
Monitoring data are stored in a central database; a web interface running on top of it allows CMS users to follow the progress of their jobs.
Web interface: http://arda-dashboard.cern.ch/cms/
Monitor of analysis jobs (last month)
[Plots: analysis jobs by site, by dataset and by user]
Monitor of analysis tasks
Choose your identity in the "Select a User" window, then select the time window in which the tasks were submitted. The screen then lists all the tasks you submitted over the chosen time range.
Monitor of site availability for analysis
http://lxarda16.cern.ch/dashboard/request.py/samvisualization
A simple analysis test is run continuously on every site to check its availability for analysis.
Statistics of failures
All Tiers: ~400k jobs, failure rate 20%
Tier-2 only (32%): ~130k jobs, failure rate 20%
Failures are mostly due to storage problems at sites.
https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes
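The rounded figures above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch using the approximate values from the slide; the variable names are illustrative.

```python
# Approximate figures quoted on the slide.
all_jobs = 400_000       # ~400k jobs on all tiers
tier2_fraction = 0.32    # Tier-2 share of the jobs
failure_rate = 0.20      # same failure rate quoted for both samples

tier2_jobs = round(all_jobs * tier2_fraction)    # consistent with "~130k"
failed_all = round(all_jobs * failure_rate)      # failed jobs, all tiers
failed_tier2 = round(tier2_jobs * failure_rate)  # failed jobs, Tier-2 only

print(tier2_jobs, failed_all, failed_tier2)
```

So the quoted rates correspond to roughly 80k failed jobs overall, of which about 26k ran at Tier-2 sites.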
Storage management at Tier-2
CMS users/physics groups can produce and store large quantities of data:
• due to re-processing of skims
• due to common preselection processing and iterated reprocessing
• due to private productions (especially fast simulation)
• due to end-user analysis ROOT-tuples
Care is therefore needed in managing the utilization of the storage.
Tier-2 disk-based storage will be of fixed size, so care is needed to ensure that every site can meet its obligations with respect to the collaboration and that analysis users are treated fairly:
• user quotas and policies are being defined
• physics analysis data namespace (official physics groups)
• user data namespace (private users)
Physics analysis data namespace
User analysis data namespace
CMS data movement
http://cmsdoc.cern.ch/cms/aprom/phedex/
To get a replica of a dataset at a site, request the samples to be transferred.
Requests must be approved by site administrators according to the policies of the physics groups.
Glossary
Demonstration of the CMS analysis workflow in the afternoon