Distributed Computing and Ganga Karl Harrison (University of Cambridge) 3rd LHCb-UK Software Course National e-Science Centre, Edinburgh, 8-10 January 2007 http://cern.ch/ganga


Distributed Computing and Ganga

Karl Harrison (University of Cambridge)

3rd LHCb-UK Software Course
National e-Science Centre, Edinburgh, 8-10 January 2007

http://cern.ch/ganga

8 January 2007 2/31

Evolution of CERN computing

1958: Ferranti Mercury
1967: CDC 6400
1976: IBM 370/168

2 years to build
3 months to install
320 kBytes storage
Less computing power than today’s calculators

1988: IBM MM 3090, DEC VAX, Cray X-MP

2001: PC Farm

• The scope and complexity of particle-physics experiments have increased in parallel with increases in computing power
• Massive upsurge in computing requirements in going from LEP to LHC
– Will quantify this


LHCb data rates

Approximate cross sections and rates

                        Cross section   Rate
                        (mbarn)         (MHz)
Bunch crossings                          40.
All interactions        100.             20.
Visible interactions     60.             12.
bb events                 0.5             0.1
cc events                 3.5             0.7
L0 triggers                               1.
HLT                                       0.002

Average luminosity = 2 × 10³² cm⁻² s⁻¹ = 0.2 nbarn⁻¹ s⁻¹

≥ 2 charged tracks in VELO

• Hardware trigger
– High ET particles
– Partial information
• Software trigger
– High ET and IP
– Full information

LHCb:
Event size 35 kByte
Data rate 70 MByte/s, 6.0 TByte/day

LEP data (per experiment):
Event size 20-50 kByte
Total raw data (1989-1998) 2.7-3.5 TByte

Factor 1000 increase in data-processing requirements (storage + CPU) for LHCb compared with LEP experiment
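
The numbers on this slide are linked by simple arithmetic: rate = cross section × luminosity, and data rate = HLT output rate × event size. A minimal Python check is sketched below, using only the values quoted on the slide; the variable and function names are illustrative.

```python
# Values quoted on the slide
LUMINOSITY_PER_NB_PER_S = 0.2   # average luminosity, nbarn^-1 s^-1
NB_PER_MB = 1.0e6               # 1 mbarn = 10^6 nbarn

cross_sections_mb = {
    "All interactions": 100.0,
    "Visible interactions": 60.0,
    "bb events": 0.5,
    "cc events": 3.5,
}


def rate_mhz(cross_section_mb):
    """Event rate in MHz for a cross section given in mbarn."""
    rate_hz = cross_section_mb * NB_PER_MB * LUMINOSITY_PER_NB_PER_S
    return rate_hz / 1.0e6


rates_mhz = {name: rate_mhz(cs) for name, cs in cross_sections_mb.items()}

# Data rate out of the software trigger (HLT)
hlt_rate_hz = 0.002 * 1.0e6     # 0.002 MHz = 2 kHz
event_size_kbyte = 35.0
data_rate_mbyte_s = hlt_rate_hz * event_size_kbyte / 1000.0
data_per_day_tbyte = data_rate_mbyte_s * 86400 / 1.0e6

for name, rate in rates_mhz.items():
    print("%-22s %6.1f MHz" % (name, rate))
print("Data rate: %.0f MByte/s, %.1f TByte/day"
      % (data_rate_mbyte_s, data_per_day_tbyte))
```

The computed rates reproduce the table, and the daily volume follows from 86400 seconds of data taking at the quoted data rate.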


Strategy for processing LHC data

• Majority of data processing (reconstruction/simulation/analysis) for LEP experiments performed at CERN
– About 50% of physics analyses run at collaborating institutes
• Similar approach might have been possible for LHC
– Increase data-processing capacity at CERN
– Take advantage of Moore’s Law increase in CPU power and storage
• LHC Computing Review (CERN/LHCC/2001-004) discouraged LEP-type approach
– Rules out access to funding not available to CERN
– Makes poor use of expertise and resources at collaborating institutes

Require solution for managing distributed data and CPUs: Grid computing

Project for LHC Computing Grid (LCG) started 2002


Grid computing and e-Science

• Ideas behind Grid computing have been around since the 1970s, but became very fashionable around the turn of the century

“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”
Ian Foster and Carl Kesselman, The Grid: Blueprint for a New Computing Infrastructure (1998)

• First release of Globus Toolkit for Grid infrastructures made in 1998
• World Wide Web commercially attractive by late 1990s
– e-Everything suddenly in vogue: e-mail, e-Commerce, e-Science
– Dot-com bubble 1998-2002
• Grid proposed as evolution of World Wide Web: access to resources as well as to information
• Multi-million pound UK e-Science programme launched in 2001
– National e-Science centre and network of regional centres set up
– Support provided for computationally intensive science, including GridPP project for particle physics

[Figure: Nasdaq index]


Grid terminology

• Grid jargon and acronyms can be fairly incomprehensible to the uninitiated

• For help, see Grid Acronym Soup (http://www.gridpp.ac.uk/gas/) on GridPP web site

• Consider here some of the terms most frequently encountered

“You should specify the walltime in your JDL, and submit to an RB that supports your VO”

“And you need a valid GUID to register that LFN in the LFC”


Projects and Middleware

• Software that enables Grid functionality is often termed middleware
– Layer between the user software and the Grid hardware
• A number of Grid projects have produced middleware
– Globus: low-level middleware, used by all other projects
– European DataGrid: higher-level middleware
– LHC Computing Grid: middleware built on EDG
– Enabling Grids for E-sciencE: gLite middleware, built on EDG and LCG
– Open Science Grid in US
– NorduGrid, started in Nordic countries

Relevant for ATLAS and CMS, but not (yet) for LHCb


Authentication and Authorisation

• A Grid certificate is a sort of electronic passport, usually valid for a year
– Need different format depending on usage
• Usually create proxy to access Grid resources
– Carries same information about user identity as certificate, but has validity of days or hours
• LHC user needs to be registered with experiment’s Virtual Organisation (VO) to be authorised to use Grid resources
– User privileges can be limited depending on role within the VO (VO administrator, production manager, user, etc)

“I know who you are, and I’m not letting you in!”

User can be authenticated but not authorised
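
The distinction between being authenticated and being authorised can be illustrated with a toy model. This is only a sketch: the Proxy class, its fields and the two check functions are invented for illustration and do not correspond to any Grid middleware API.

```python
from datetime import datetime, timedelta


class Proxy:
    """Toy stand-in for a Grid proxy: user identity, VO membership, expiry."""

    def __init__(self, identity, vos, lifetime_hours):
        self.identity = identity
        self.vos = set(vos)
        self.expires = datetime.now() + timedelta(hours=lifetime_hours)

    def is_valid(self):
        return datetime.now() < self.expires


def is_authenticated(proxy):
    """Authentication: the resource knows who the user is."""
    return proxy.is_valid()


def is_authorised(proxy, supported_vos):
    """Authorisation: the user also belongs to a VO the resource supports."""
    return is_authenticated(proxy) and bool(proxy.vos & set(supported_vos))


proxy = Proxy("Some User", vos=["lhcb"], lifetime_hours=12)
print(is_authenticated(proxy))                 # identity is established
print(is_authorised(proxy, ["atlas", "cms"]))  # known, but not let in
print(is_authorised(proxy, ["lhcb"]))          # known and let in
```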


CPU access

UI (User Interface): Machine from which user submits processing requests (jobs) specified in Job Definition Language (JDL)

RB (Resource Broker): Machine that decides where jobs should run

CE (Compute Element): Machine that manages batch system at Grid site

WN (Worker Node): Machine that runs user jobs


Data management

LFC (Logical File Catalogue): Database that maps Logical File Names to Physical File Names

LFN (Logical File Name): Alias for any of one or more files (replicas) with identical content

PFN (Physical File Name): Path to a file at a specific site

SE (Storage Element): Mass-storage system at a Grid site
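
The relationship between LFN, PFN and SE can be pictured as a lookup table. The sketch below is a toy catalogue in plain Python, not the LFC interface; all file names, host names and function names are invented for illustration.

```python
# Toy file catalogue: each LFN maps to the PFNs of its replicas.
# All names and paths below are invented for illustration.
catalogue = {
    "/lhcb/data/file_A.dst": [
        "srm://se.site1.example/lhcb/file_A.dst",
        "srm://se.site2.example/lhcb/file_A.dst",
    ],
    "/lhcb/data/file_B.dst": [
        "srm://se.site2.example/lhcb/file_B.dst",
    ],
}


def storage_element(pfn):
    """Extract the host part of a PFN, standing in for the SE name."""
    return pfn.split("://", 1)[1].split("/", 1)[0]


def sites_with_all_files(lfns):
    """Return the SEs that hold a replica of every requested LFN."""
    per_file = [set(storage_element(p) for p in catalogue[lfn]) for lfn in lfns]
    return set.intersection(*per_file) if per_file else set()


one_file = sites_with_all_files(["/lhcb/data/file_A.dst"])
both_files = sites_with_all_files(["/lhcb/data/file_A.dst",
                                   "/lhcb/data/file_B.dst"])
print(sorted(one_file))
print(sorted(both_files))
```

A lookup of this kind is what allows a job to be directed to a site that already holds its input data.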


Distributed computing in LHCb

• Information about Grid resources, and how they’re used in the LHCb computing model, given in presentation by Raja Nandakumar

• Use of Grid resources is simplified for LHCb physicists by two pieces of software implemented in Python: Ganga and DIRAC

• Ganga (Gaudi/Athena and Grid Alliance) is an extensible job-management framework, developed as an ATLAS-LHCb common project
– This is the only tool you should need to know about to run analyses on the Grid
– Hides Grid technicalities, letting you concentrate on the physics
– Allows trivial switching between running locally and running on the Grid

• DIRAC (Distributed Infrastructure with Remote Agent Control) serves as the LHCb Production System, but more generally is a Workload Management System
– Receives jobs from Ganga (and elsewhere) and performs tasks in background
   • Forces Grid jobs to site(s) where input data are available
   • If needed, installs experiment software on worker nodes
   • Monitors job progress, and stores output
– Includes tools for data management not yet incorporated in Ganga
   • See exercises


DIRAC submission to LCG : Pilot Agents

[Diagram: DIRAC components (JobReceiver, Data Optimiser, JobDB, TaskQueue, Matcher, AgentDirector, AgentMonitor) connected to the LFC, the LCG WMS, and Pilot Agents running on a Computing Resource]

Data Optimiser queries Logical File Catalogue to identify sites for job execution

Agent Director submits Pilot Agents for jobs in waiting state

Agent Monitor tracks Agent status, and triggers further submission as needed


DIRAC submission to LCG : Bond Analogy

[Diagram: DIRAC components (JobReceiver, JobDB, TaskQueue, Matcher, AgentDirector, AgentMonitor) connected to the LFC, the LCG WMS, and a Pilot Agent on a Computing Resource]

Data Optimiser queries Logical File Catalogue to identify sites for job execution

Agent Director submits Pilot Agents for jobs in waiting state

Agent Monitor tracks Agent status, and triggers further submission as needed
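
The Pilot Agent idea can be sketched as a toy simulation. The sketch assumes (this is not stated on the slide) that a pilot, once it is running at a site, asks for a waiting job whose input data are available there. Function names, job records and the choice of sites are invented for illustration and are not DIRAC's API.

```python
# Waiting jobs, each tagged with the sites where its input data are available
task_queue = [
    {"id": 1, "sites": {"CERN", "RAL"}},
    {"id": 2, "sites": {"CNAF"}},
    {"id": 3, "sites": {"RAL"}},
]


def match(site):
    """Hand the first waiting job that can run at `site` to a pilot, or None."""
    for index, job in enumerate(task_queue):
        if site in job["sites"]:
            return task_queue.pop(index)
    return None


def run_pilot(site):
    """A pilot starts at a site, then pulls a job if a suitable one is waiting."""
    job = match(site)
    return None if job is None else job["id"]


# Pilots start at whichever sites the Grid gives them
started = [(site, run_pilot(site)) for site in ["RAL", "RAL", "CERN", "CNAF"]]
print(started)
print("jobs still waiting:", len(task_queue))
```

In this toy run the third pilot finds nothing to do and simply exits, while the user jobs are only ever handed to pilots that have actually started.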


Ganga job abstraction

• A job in Ganga is constructed from a set of building blocks, not all required for every job

Job
– Application: What to run
– Backend: Where to run
– Input Dataset: Data read by application
– Output Dataset: Data written by application
– Splitter: Rule for dividing into subjobs
– Merger: Rule for combining outputs
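
The building-block idea can be mimicked in a few lines of plain Python. This is a sketch, not Ganga's Job class: the SketchJob class is invented, strings stand in for the plugin objects, and treating only the application and backend as mandatory is an assumption of the sketch.

```python
class SketchJob:
    """Toy job built from optional building blocks (not Ganga's Job class)."""

    def __init__(self, application, backend, inputdata=None, outputdata=None,
                 splitter=None, merger=None):
        self.application = application  # what to run
        self.backend = backend          # where to run
        self.inputdata = inputdata      # data read by application
        self.outputdata = outputdata    # data written by application
        self.splitter = splitter        # rule for dividing into subjobs
        self.merger = merger            # rule for combining outputs

    def describe(self):
        parts = ["run %s on %s" % (self.application, self.backend)]
        if self.splitter is not None:
            parts.append("split with %s" % self.splitter)
        if self.merger is not None:
            parts.append("merge with %s" % self.merger)
        return ", ".join(parts)


# Same application, different backend: only one building block changes
local_job = SketchJob(application="DaVinci", backend="Local")
grid_job = SketchJob(application="DaVinci", backend="Dirac",
                     splitter="SplitByFiles")
print(local_job.describe())
print(grid_job.describe())
```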


Framework for plugin handling

• Ganga provides a framework for handling different types of Application, Backend, Dataset, Splitter and Merger, implemented as plugin classes
• Each plugin class has its own schema

[Diagram, with labels: User, System, Plugin Interfaces, Example plugins and schemas]

GangaObject
Plugin interfaces: IApplication, IBackend, IDataset, ISplitter, IMerger
Example plugins and schemas:
– DaVinci: package, version, cmt_user_path, masterpackage, optsfile, extraopts, configured
– Dirac: CE, CPUTime, id, status, actualCE
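
One way to picture "each plugin class has its own schema" is a base class that only accepts the attributes its subclass declares. The sketch below is plain Python and is not how Ganga itself implements schemas; the class names are invented, and only a few of the attribute names from the diagram are reused.

```python
class GangaObjectSketch:
    """Toy base class: each subclass declares a schema of allowed attributes."""

    schema = {}

    def __init__(self, **kwargs):
        # Start from the schema defaults, then apply user-supplied values
        for name, default in self.schema.items():
            object.__setattr__(self, name, default)
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        if name not in self.schema:
            raise AttributeError("%s has no schema item '%s'"
                                 % (type(self).__name__, name))
        object.__setattr__(self, name, value)


class DaVinciSketch(GangaObjectSketch):
    schema = {"version": None, "masterpackage": None,
              "optsfile": None, "extraopts": None}


class DiracSketch(GangaObjectSketch):
    schema = {"CPUTime": None}


app = DaVinciSketch(version="v17r6")
app.optsfile = "myOpts.opts"
try:
    app.CPUTime = 3600   # belongs to the backend schema, not the application
    rejected = False
except AttributeError:
    rejected = True
print(app.version, app.optsfile, rejected)
```

Declaring the schema on the class means a mistyped or misplaced property is caught when it is set, rather than being silently ignored.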


Ganga: how the pieces fit together

[Diagram: Ganga components]
– User interface for job definition and management
– Ganga monitoring loop
– Ganga job archives: Local repository, Remote repository
– Applications: LHCb applications, ATLAS applications, Other applications
– Processing systems (backends): Experiment-specific workload-management systems, Local batch systems, Distributed (Grid) systems
– Data storage and retrieval: Metadata catalogues, File catalogues, Tools for data management

• Ganga has built-in support for ATLAS and LHCb
• Component architecture allows customisation for other user groups


Using Ganga

• Command Line Interface in Python (CLIP) provides interactive job definition and submission from an enhanced Python shell (IPython)
– Especially good for trying things out, and seeing how the system works
• Scripts, which may contain any Python/IPython or CLIP commands, allow automation of repetitive tasks
• Scripts included in distribution enable kind of approach traditionally used when submitting jobs to a local batch system
• Graphical User Interface (GUI) allows job management based on mouse selections and field completion
– Lots of configuration possibilities
• Ganga allows users to work in a variety of ways
– All possibilities covered in tutorial


Python commands

• Ganga is developed in Python, making use of IPython extensions
• All Python/IPython commands can be used at the prompt in a Ganga CLIP session, and the syntax for CLIP and Python commands is the same
• Information about Python can be found at: http://www.python.org/
– If you’re new to Python, the on-line tutorial is extremely helpful
• The following are often useful

    # A hash (#) marks the start of a comment
    # A backslash (\) at the end of a line indicates that
    # the following line is a continuation
    dir()           # List currently available objects
    help()          # Give help
    help( item )    # Give help on specified item
    x = 5           # Assign value to variable
    print x         # Print value of variable
    ctrl-D          # Exit from session


IPython commands

• Information about IPython extensions can be found at: http://ipython.scipy.org/

• One useful extension is the possibility to use shell commands from Python, together with both shell variables and Python variables

    # Use ! before shell commands
    # Use $ before Python variables
    # Use $$ before shell variables

    here = 'where the heart is'
    !echo $$HOME is $here

    !ls $$HOME/mySubdir

    !emacs    # Start emacs session, but don't try adding &

    Exit      # Exit from session


Ganga startup and configuration files

• The Ganga environment can be set using:

    GangaEnv

• A Ganga CLIP session is then started by giving the command:

    ganga

– If the user doesn’t have a valid proxy then his/her Grid passphrase is requested

• When Ganga is first run, a configuration file .gangarc is created in the user’s home directory
– The file includes comments on the configuration possibilities
– The latest default configuration file can always be obtained with:

    ganga -g

• Before processing .gangarc Ganga processes, in the order they are specified, any configuration files pointed to by the environment variable GANGA_CONFIG_PATH
– This makes possible the use of group configuration files, but allows settings to be overridden on a user-by-user basis
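
The effect of processing group configuration files before the user's own file can be illustrated with Python's standard configparser module. This is a sketch of the layering idea only, not Ganga's configuration code: the section and option names are hypothetical, the file names are temporary stand-ins, and a colon-separated list of paths is assumed for the path variable.

```python
import configparser
import os
import tempfile

tmpdir = tempfile.mkdtemp()
group_file = os.path.join(tmpdir, "group.ini")
user_file = os.path.join(tmpdir, "user_gangarc")

# Hypothetical section and option names, for illustration only
with open(group_file, "w") as f:
    f.write("[Example]\nbackend = LSF\nqueue = short\n")
with open(user_file, "w") as f:
    f.write("[Example]\nbackend = Dirac\n")

# Assumed format: a colon-separated list of files, processed in order,
# followed by the user's own file
config_path = group_file
files_in_order = [p for p in config_path.split(":") if p] + [user_file]

config = configparser.ConfigParser()
config.read(files_in_order)   # values in later files override earlier ones

print(config.get("Example", "backend"))   # user setting wins
print(config.get("Example", "queue"))     # group setting is inherited
```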


Ganga workspace

• Ganga creates a directory gangadir in your home directory and uses this for storing job-related files and information
– You can’t move this directory, but before running Ganga, you can create ~/gangadir as a link to another location

[Diagram: gangadir directory tree, with labels: gangadir, repository, workspace, gui, templates, Local, Remote, <username>, jobs, 66, 67, input, output]


Ganga CLIP commands (1)

• Ganga commands are explained in the guide Working with Ganga: http://cern.ch/ganga/user/html/GangaIntroduction

• From a CLIP session, available classes, objects and functions may be listed, and help can be requested for each

• Useful commands include the following

    list_plugins( 'type' )         # List plugins of specified type:
                                   # 'applications', 'backends', etc
    j1 = Job( backend = LSF() )    # Create a new job for LSF
    a1 = Executable()              # Create Executable application
    j1.application = a1            # Set value for job's application
    j1.backend = LCG()             # Change job's backend to LCG
    export( j1, 'myJob.py' )       # Write job to specified file
    load( 'myJob.py' )             # Load job(s) from specified file
    j2 = j1.copy()                 # Create j2 as a copy of job j1
    jobs                           # List jobs
    jobs[ i ].subjobs              # List subjobs for split job i


Ganga CLIP commands (2)

• When a job j has been defined, the following methods can be used

    j.submit()    # Submit the job
    j.kill()      # Kill the job (if running)
    j.remove()    # Kill the job and delete associated files
    j.peek()      # List files in job's output directory

• Once a job has been submitted, it can no longer be modified, and it cannot be resubmitted, but the job can be copied and the copy can be modified/submitted

• Ganga supports use of templates, which can be used as the basis of a job definition

    t = JobTemplate()             # Create template
    templates                     # List templates
    j3 = Job( templates[ i ] )    # Create job from template i
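
The rule that a submitted job is frozen, while a copy starts afresh, can be modelled with a toy class. This is plain Python written for illustration; the class name and its status values are invented, and it is not Ganga's Job implementation.

```python
from copy import deepcopy


class FrozenOnSubmitJob:
    """Toy job that becomes read-only once submitted."""

    def __init__(self, application=None, backend=None):
        self.__dict__["status"] = "new"
        self.application = application
        self.backend = backend

    def __setattr__(self, name, value):
        if self.status != "new":
            raise AttributeError("job already submitted; copy it to make changes")
        self.__dict__[name] = value

    def submit(self):
        if self.status != "new":
            raise RuntimeError("job cannot be resubmitted; submit a copy instead")
        self.__dict__["status"] = "submitted"

    def copy(self):
        # The copy carries the same definition but starts in the 'new' state
        return FrozenOnSubmitJob(deepcopy(self.application),
                                 deepcopy(self.backend))


j = FrozenOnSubmitJob(application="Executable", backend="LSF")
j.submit()
j2 = j.copy()        # the copy starts again in the 'new' state
j2.backend = "LCG"   # so it can be modified...
j2.submit()          # ...and submitted
print(j.backend, j.status, j2.backend, j2.status)
```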


CLIP: “Hello World” example

• From a Ganga CLIP session, a job that writes “Hello World” can be created, and then submitted to LCG, as follows

    app = Executable()
    app.exe = '/bin/echo'
    app.env = {}
    app.args = [ 'Hello World' ]
    # Property values set above are in fact the defaults
    # for Executable application
    j = Job( application = app, backend = LCG() )
    j.submit()
    # Check on job progress
    jobs
    # When job has completed, check the output
    j.peek( 'stdout' )


CLIP: DaVinci example

• A job to run DaVinci can be created and submitted to DIRAC, split into subjobs, as follows:

    app = DaVinci( version = 'v17r6' )
    app.masterpackage = 'myMasterPackage v1r0 myProject'
    app.optsfile = '$HOME/myOptsDir/myOpts.opts'
    j = Job( application = app, backend = Dirac() )
    j.splitter = SplitByFiles( filesPerJob = 20 )
    j.submit()
    # Check on job progress
    jobs
    # When job has completed, view histograms produced
    # by first subjob
    j.subjobs[ 0 ].peek( 'myHistos.root' )

• Histogram files specified in the job options are automatically retrieved by Ganga when the job completes
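
The effect of a file-based splitter with 20 files per subjob can be sketched in plain Python. The function below is an illustration only: its name is invented, the input file names are made up, and grouping the files consecutively is an assumption rather than a statement about how SplitByFiles orders its input.

```python
def split_by_files(files, files_per_job):
    """Divide a list of input files into consecutive groups, one per subjob."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    return [files[i:i + files_per_job]
            for i in range(0, len(files), files_per_job)]


# 45 made-up input files, 20 files per subjob
input_files = ["file_%03d.dst" % i for i in range(45)]
subjobs = split_by_files(input_files, 20)
print([len(s) for s in subjobs])
```

With 45 input files this gives three subjobs, the last one holding the 5 files left over.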


Ganga CLIP as a code-building environment

• Ganga includes possibilities for checking-out, and building, packages for LHCb applications

    app = DaVinci( 'v17r6' )
    app.getpack( 'Phys/DaVinci v17r6' )
    app.getpack( 'Tutorial/Analysis v6r2' )
    app.masterpackage = 'DaVinci v17r6 Phys'
    app.make()

• The make() method builds the application’s master package, and any user-owned packages on which it depends


Using Ganga commands from a Linux shell

• Ganga includes scripts that can be used from a Linux shell (i.e. outside of CLIP)

    # Create a job for submitting Gauss to Dirac
    ganga make_job Gauss Dirac test.py
    [ Edit test.py to set Gauss and/or Dirac properties ]
    # Submit job
    ganga submit test.py
    # Query status, triggering output retrieval if job is completed
    ganga query

• Given job name or id as returned by query, also have possibilities such as

    # Kill job
    ganga kill id
    # Remove job from Ganga repository and workspace
    ganga remove id

• Same syntax can be used from inside CLIP, with no overheads for startup


Example Ganga scripts in LHCb release area

• More recent releases of LHCb applications include in their job subdirectory an example Ganga script for running the application
– Example scripts available with all of: Gauss v25r7, Boole v12r10, Brunel v30r14, DaVinci v17r6

• Scripts assume user has checked out own copy of application package, and include commands both for creating and for submitting job, with extensive comments

• Taking Brunel example, script is executed using:

    ganga Brunel_Ganga.py

• Note that script is given directly as argument with no intermediate keyword (e.g. submit)


Ganga Graphical User Interface (GUI)

• GUI consists of central monitoring panel and dockable windows
• Job definition based on mouse selections and field completion
• Highly configurable: choose what to display and how

[Screenshot: GUI panels labelled Job Monitoring, Job details, Logical Folders, Scriptor, Log window, Job builder]


Help with using Ganga

• Ganga documentation can be found in the User Guides section of the Ganga web site: http://cern.ch/ganga/
– Most relevant items are:
   • Installation
   • Working with Ganga (general introduction to functionality)
   • LHCb-specific manual (working with Gaudi applications and Dirac)
   • GUI manual (introduction to graphical interface)

• For problems or feature requests, do any of the following:
– Send e-mail to one of the Ganga developers (listed on web site)
– Send e-mail to [email protected]
– Submit a report via Ganga’s bug-submission page in Savannah: https://savannah.cern.ch/bugs/?func=additem&group=ganga
   • Should either login to Savannah first, or give e-mail address


Hands-on exercises

• Set of 12 exercises attached to course agenda– You should aim to complete at least the first 8

• Exercises 1-4: short exercises dealing with setup for working with Ganga

• Exercises 5-9: different ways of using Ganga (scripts, CLIP, GUI)
– You can choose the way you like best and stick with this, but good to know the possibilities
– GUI exercise can be left for later if you don’t have time in this session
   • Running GUI over the network anyway isn’t ideal

• Exercises 10-11: data-management tools, Python and IPython
– Can be left for later if you don’t have time in this session