Nurcan Ozturk
University of Texas at Arlington
Grid User Training for Local Community
TUBITAK ULAKBIM, Ankara, Turkey
April 5 - 9, 2010
Overview of ATLAS Distributed Analysis
Nurcan Ozturk
Outline
User’s Work-flow for Data Analysis
ATLAS Distributed Data Analysis System
What Type of Data You Can Run On
What Type of Jobs You Can Run
How to Find Your Input Data
AMI
dq2 end-user tools
ELSSI - Event Level Selection Service Interface
Tips for Submitting Your Jobs
How To Check Status of ATLAS Releases
How to Check Status of Installation of Releases at Sites
Tips for Retrieving Your Outputs
Tips for Debugging Failures
User Support
Pathena Example on 7 TeV Collision Data
User’s Work-flow for Data Analysis
Locate the data
Analyze the results
Setup the analysis job
Submit to the Grid
Retrieve the results
Setup the analysis code
ATLAS Distributed Data Analysis System
3 layers:
Gateways to resources
User interface
Execution infrastructure
What Type of Data You Can Run On
RAW: raw data from the DAQ system
ESD – Event Summary Data: output of the reconstruction of RAW data
AOD – Analysis Object Data: condensed version of the ESD
D1PD – Primary Derived Physics Data: ESD or AOD with:
certain events removed (skimming)
some data objects within an event removed (thinning)
certain data members within an object removed (slimming)
Commissioning DPD and Performance DPD (made from ESDs), Physics DPD (made from AODs)
DnPD – n-th Derived Physics Data (D2PD, D3PD): specific formats defined by physics/performance groups or users
TAG: event tags – event thumbnails with pointers to the full event in the AOD; fast access to specific events (file-based or database-based)
Monte Carlo data:
EVNT – event generator data: output of event generation
HITS – output of the detector simulation (uses EVNT as input)
RDO – Raw Data Object: output of digitization (uses HITS as input); the MC equivalent of raw data
AOD and DPD
You can run on all of the above; however, RAW/HITS/RDO are kept on tape, so you need to request DDM replication to disk storage first.
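The three reduction operations that define a DPD can be illustrated with a toy event store (the event fields, object names and cut values below are hypothetical, chosen only to show the pattern):

```python
# Toy illustration of skimming, thinning and slimming.
# All field names and cut values are hypothetical.

events = [
    {"njets": 1, "jets": [{"pt": 40.0, "eta": 1.2, "ncells": 90}]},
    {"njets": 2, "jets": [{"pt": 25.0, "eta": 0.3, "ncells": 50},
                          {"pt": 80.0, "eta": 2.1, "ncells": 120}]},
]

# Skimming: drop whole events that fail a selection.
skimmed = [ev for ev in events if ev["njets"] >= 2]

# Thinning: within each kept event, drop whole data objects.
for ev in skimmed:
    ev["jets"] = [j for j in ev["jets"] if j["pt"] > 30.0]

# Slimming: within each kept object, drop unneeded data members.
for ev in skimmed:
    ev["jets"] = [{"pt": j["pt"], "eta": j["eta"]} for j in ev["jets"]]
```

Each step reduces the data volume along a different axis: events, objects per event, and members per object.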
What Type of Jobs You Can Run
Athena jobs with official production transformations: Event generation, Simulation, Pileup, Digitization, Reconstruction, Merge
General jobs (non-Athena type analysis): ROOT (CINT, C++, PyROOT)
ARA (AthenaRootAccess)
Python, user's executable, shell script, etc.
Jobs with multiple input streams (e.g. the reco trf): cavern, minimum-bias, beam-halo, beam-gas input
TAG selection jobs
Jobs with nightly builds
Jobs with an arbitrary DBRelease (a database release contains Conditions, Geometry and Trigger data)
More info about DBReleases is in backup slides
etc.
How to Find Your Input Data
AMI
dq2 end-user tools
ELSSI (Event Level Selection Service Interface)
What is AMI
ATLAS Metadata Interface.
A generic cataloging framework: Dataset discovery tool
Tag Collector tool (release management tool)
Where does AMI get its data?
From real data: DAQ data from the Tier-0
From Monte Carlo and reprocessing:
pulled from the Task Request Database: tasks, dataset names, Monte Carlo and reprocessing configuration tags
pulled from the Production Database: finished tasks – files and metadata
From physicists:
Monte Carlo input files needed for event generation
Monte Carlo dataset number info, physics-group owner, …
corrected cross sections and comments
DPD tags
AMI Portal Page - http://ami.in2p3.fr
There is also a read-only server at CERN: http://atlas-ami.cern.ch
7 TeV Datasets
AMI Tutorial Page
AMI Fast Tutorial Page
Simple Search in AMI – search by name
Type here to search for the latest 7 TeV collision datasets: data10_7TeV%physics%MinBias%AOD%
Simple Search in AMI – various useful links
Group by
Apply filter
Simple Search in AMI – DQ2 link
By clicking on the DQ2 link:
Always use the merged (container) datasets, whose names end with '/'
Simple Search in AMI – PANDA link
By clicking on the PANDA link:
Simple Search in AMI - interpretation of tags (1)
Simple Search in AMI – Interpretation of tags (2)
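The dotted structure of a dataset name can be unpacked mechanically. A rough sketch (the field labels follow the usual project.runNumber.stream.prodStep.dataType.tags convention and are an assumption of this illustration, not an official parser):

```python
# Split an ATLAS dataset name into its dotted fields.
# The field labels below follow the common naming convention;
# this is an illustrative sketch, not an official tool.

def parse_dataset_name(name):
    name = name.rstrip("/")  # container dataset names end with '/'
    keys = ["project", "run_number", "stream", "prod_step", "data_type", "tags"]
    parsed = dict(zip(keys, name.split(".")))
    parsed["tags"] = parsed["tags"].split("_")  # "f241_p115" -> ["f241", "p115"]
    return parsed

info = parse_dataset_name(
    "data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/")
```

For the example dataset this yields project data10_7TeV, run number 00152489, stream physics_MinBias, production step merge, data type AOD and the tags f241 and p115.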
Simple Search in AMI – Run Summary
By clicking on the Run_Summary link:
Simple Search in AMI – Run Queries
By clicking on the Run_Query link:
dq2 End-User Tools (1)
Users interact with the DDM system via the dq2 end-user tools: querying, retrieving and creating datasets,
requesting dataset replication, deleting datasets, etc.
How to set up (on lxplus): source /afs/cern.ch/atlas/offline/external/GRID/ddm/DQ2Clients/setup.sh
voms-proxy-init --voms atlas
More info on the setup at: https://twiki.cern.ch/twiki/bin/view/Atlas/RegularComputingTutorial#Getting_ready_to_use_the_Grid
dq2 End-User Tools (2)
How to use:
List the available MinBias datasets in the DDM system (same search as in AMI):
dq2-ls 'data10_7TeV*physics*MinBias*'
Search for merged AODs in container datasets:
dq2-ls 'data10_7TeV*physics*MinBias*merge*AOD*'
Find the locations of a container dataset (a group of datasets, whose name ends with '/'):
dq2-list-dataset-replicas-container data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/
List the files in a container dataset:
dq2-ls -f data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/
Copy one file locally:
dq2-get -n 1 data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/
More info at: https://twiki.cern.ch/twiki/bin/view/Atlas/DQ2ClientsHowTo
https://twiki.cern.ch/twiki/bin/view/Atlas/DQ2Tutorial
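The '*' wildcards in the dq2-ls patterns above follow shell-style globbing, which can be reproduced locally with Python's fnmatch module (the small catalogue below is made up for illustration; only the first entry is a real dataset name taken from these slides):

```python
# Reproduce the dq2-ls wildcard matching locally with fnmatchcase.
# Catalogue entries other than the first are hypothetical.
from fnmatch import fnmatchcase

catalogue = [
    "data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/",
    "data10_7TeV.00152489.physics_MinBias.recon.ESD.f241/",   # hypothetical
    "mc09_7TeV.105001.pythia_minbias.merge.AOD.e517_s764/",   # hypothetical
]

pattern = "data10_7TeV*physics*MinBias*merge*AOD*"
matches = [d for d in catalogue if fnmatchcase(d, pattern)]
```

Only the merged-AOD dataset survives: the ESD entry fails the merge/AOD parts of the pattern and the MC entry fails the data10_7TeV prefix.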
DQ2ClientsHowTo
Extensive info here
https://twiki.cern.ch/twiki/bin/view/Atlas/DQ2ClientsHowTo
DQ2Tutorial https://twiki.cern.ch/twiki/bin/view/Atlas/DQ2Tutorial
ELSSI – Event Level Selection Service Interface
https://voatlas18.cern.ch/tagservices/index.htm
Goal: retrieve a TAG file from the TAG database
Define a query to select runs, streams, data quality, trigger chains, …
Review the query
Execute the query and retrieve the TAG file (a ROOT file) to be used in an Athena job
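The query ELSSI builds can be pictured as a filter over a table of event tags, each carrying a pointer back into the full event data. A minimal sketch, with entirely made-up tag records, trigger names and file names:

```python
# Toy event-level selection in the spirit of a TAG query.
# The tag records, trigger names and file names are all hypothetical.

tags = [
    {"run": 152489, "stream": "physics_MinBias", "triggers": ["L1_MBTS_1"],
     "pointer": ("dsA", "file1.root", 17)},  # (dataset, file, event index)
    {"run": 152489, "stream": "physics_MinBias", "triggers": ["L1_EM3"],
     "pointer": ("dsA", "file2.root", 4)},
]

def select(tags, run, trigger_chain):
    """Return pointers to the events in a run that fired a given trigger."""
    return [t["pointer"] for t in tags
            if t["run"] == run and trigger_chain in t["triggers"]]

hits = select(tags, 152489, "L1_EM3")
```

The returned pointers are what make TAG-based access fast: the Athena job only has to open the files and events they reference.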
Tips for Submitting Your Jobs
Always test your job locally before submitting to the Grid
Use the latest version of pathena/Ganga
Submit your jobs on container datasets (merged)
If your dataset is on tape, you need to request data replication to disk storage first
Do not specify any site name in your job submission, pathena/Ganga will choose the best site available
If you are using a new release that is not meant to be used for user analysis (e.g. cosmic data, Tier-0 reconstruction), it will not be available at all sites. In general you can check the release and installation status at:
http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/
https://atlas-install.roma1.infn.it/atlas_install/
How To Check Status of ATLAS Releases
http://atlas-computing.web.cern.ch/atlas-computing/projects/releases/status/
How to Check Status of Installation of Releases at Sites (1)
https://atlas-install.roma1.infn.it/atlas_install/
Choose here 15.6.8.2
How to Check Status of Installation of Releases at Sites (2)
https://atlas-install.roma1.infn.it/atlas_install/list.php?rel=15.6.8.2
Tips for Retrieving Your Outputs
If everything went fine, you need to retrieve your outputs from the Grid
Decide where and how you will store your output datasets:
Request data replication: your output files stay as a dataset on the Grid. You need to freeze the dataset before requesting replication.
Download to your local disk using dq2-get: by default, user datasets are created on the _SCRATCHDISK area at the site where the jobs ran. All datasets on _SCRATCHDISK are deleted after a certain period (~30 days), so if you expect to use them on the Grid later, consider requesting replication.
No more than 30-40 GB/day can be copied with dq2-get
Details at: https://twiki.cern.ch/twiki/bin/view/Atlas/DQ2ClientsHowTo#AfterCreatingDataset
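The 30-40 GB/day figure makes it easy to estimate whether dq2-get is practical for a given output. A back-of-the-envelope sketch (the dataset size below is invented):

```python
# Rough wall-time estimate for dq2-get, using the conservative end of
# the ~30-40 GB/day limit quoted above. The dataset size is made up.
import math

dataset_size_gb = 120.0    # hypothetical output dataset
rate_gb_per_day = 30.0     # conservative end of the quoted range

days = math.ceil(dataset_size_gb / rate_gb_per_day)
```

A 120 GB dataset would take about four days to pull locally, which is usually a sign that replication to a Grid disk area is the better option.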
Tips for Debugging Failures
Try to understand whether the error is job-related or site-related
The job log files tell you most of the time
If the site went down while your jobs were being executed, you can check its status in the ATLAS Computer Operations Logbook: https://prod-grid-logger.cern.ch/elog/ATLAS+Computer+Operations+Logbook/
If input files are corrupted at a given site, you can exclude the site (until they are re-copied) and notify DAST (the Distributed Analysis Support Team)
Look at the FAQ (Frequently Asked Questions) section of the pathena/Ganga twiki pages for most common problems and their solutions: https://twiki.cern.ch/twiki/bin/view/Atlas/PandaAthena#FAQ
https://twiki.cern.ch/twiki/bin/view/Atlas/DAGangaFAQ
If you cannot find your problem listed there, search the archive of the distributed-analysis-help forum in e-groups: https://groups.cern.ch/group/hn-atlas-dist-analysis-help/default.aspx
If you still need help, send a message to the e-groups, DAST and other users will help you: [email protected]
Does your job require conditions database access?
If you are running at CERN or at Tier1's, you may not even have noticed it: they provide direct access to the Oracle databases. You may have seen occasional overload problems (errors in the job logs).
What is stored in Oracle databases: The geometry and most of the conditions data
LAr calibrations and InDet alignments are too large to be stored effectively in Oracle; they are stored as POOL files and replicated to all Tier1’s (soon to Tier2’s)
If your job runs at Tier2/Tier3's, remote access to the Oracle databases is possible.
Solution for users: provide the conditions DB release together with the job:
A DB release is an extraction of the needed constants into a tar file that is copied to the worker node and accessed locally. Available as a dataset in the DDM system.
--dbRelease NameOfDataset (in pathena/Ganga)
Now in place: user jobs access the Oracle databases through FroNtier/Squid caches.
Jobs can run anywhere without configuration changes.
This solves the latency problems for jobs running at Tier2/Tier3's.
More info about databases at: https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasDBRelease https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaDBAccess
ATLAS Computer Operations Logbook – site/service problems reported
https://prod-grid-logger.cern.ch/elog/ATLAS+Computer+Operations+Logbook/
pathena FAQ
https://twiki.cern.ch/twiki/bin/view/Atlas/PandaAthena#FAQ
Ganga FAQ
https://twiki.cern.ch/twiki/bin/view/Atlas/DAGangaFAQ
User Support (1)
https://twiki.cern.ch/twiki/bin/view/Atlas/AtlasDAST
DAST members
EU time zone            NA time zone
------------------------------------------------
Daniel van der Ster     Nurcan Ozturk
Mark Slater             Alden Stradling
Hurng-Chun Lee          Sergey Panitkin
Bjorn Samset            Kamile Yagci
Christian Kummer        Bill Edson
Maria Shiyakova         Wensheng Deng
Jaroslava Schovancova   Manuj Jha
Karl Harrison
Elena Oliver Garcia
------------------------------------------------
User Support (2)
DAST started in September 2008 for a combined support of pathena and Ganga users
First point of contact for distributed analysis questions
All kinds of problems are discussed, not just pathena/Ganga-related ones:
analysis-tools-related problems
DDM-related problems
and athena-related problems
15-hour coverage with 2 people on shift (one in the NA and one in the EU time zone). The plan is to have 2 people in each time zone.
DAST helps directly by solving the problem or escalating to relevant experts
More shifters and more user-to-user support are needed
Shift work counts to OTSMOU credit (Category-2 shifts currently)
User feedback is extremely useful to debug the distributed analysis tools and explore the features pathena/Ganga have to offer. So feel free to write to this forum
Pathena Example on 7 TeV Collision Data
Everyone needs to complete this tutorial: https://twiki.cern.ch/twiki/bin/view/Atlas/RegularComputingTutorial
Set up CMT and athena as explained above. You can use the latest release (production cache): source cmthome/setup.sh -tag=15.6.8.2,AtlasProduction,setup,32
Use an ntuple (D3PD) making package to run on AODs. Example: the SUSYD3PDMaker package https://twiki.cern.ch/twiki/bin/view/AtlasProtected/SUSYD3PDMaker
Get one file locally to test that athena runs fine: dq2-get -n 1 data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/
Configure the job option file and run athena locally as explained in the SUSYD3PDMaker wiki page: athena SUSYD3PDMaker_topOptions.py
Set up pathena on lxplus, or install it yourself as explained on the pathena wiki page: source /afs/cern.ch/atlas/offline/external/GRID/DA/panda-client/latest/etc/panda/panda_setup.sh https://twiki.cern.ch/twiki/bin/view/Atlas/DAonPanda
Submit your job to the Grid with pathena:
pathena --inDS data10_7TeV.00152489.physics_MinBias.merge.AOD.f241_p115/ --outDS user10.NurcanOzturk.trgrid.test SUSYD3PDMaker_topOptions.py
Monitor your job in the Panda monitor or with pbook (the bookkeeping tool for pathena). Once your job is completed, get your output (root files and log files) to your local machine:
dq2-get user10.NurcanOzturk.trgrid.test
Open ROOT and plot some histograms to see what the real data looks like.
Use analysis codes (D3PD readers) to do more detailed analysis on the ntuples. Codes can be based on C++ (example in the SUSYD3PDMaker wiki page), python (for instance SPyRoot), etc.