
DataGrid

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

(PM9 RELEASE)

Document identifier: DataGrid-01-TEN-0118-0_2

Date: 22/01/2002

Work package: WP1

Partner: Datamat SpA

Document status DRAFT

Deliverable identifier:

Abstract: This note provides the administrator and user guide for the WP1 WMS software delivered for PM9 release.

IST-2000-25182 PUBLIC 1 / 97


WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

(PM9 Release)

Doc. Identifier:

DataGrid-01-TEN-0118-0_2

Date: 22/01/2002

Delivery Slip

Name Partner Date Signature

From Fabrizio Pacini Datamat SpA 14/01/2002

Verified by Stefano Beco Datamat SpA 14/01/2002

Approved by

Document Log

Issue Date Comment Author

0_0 21/12/2001 First draft Fabrizio Pacini

0_1 14/01/2002 Draft Fabrizio Pacini

Document Change Record

Issue  Item               Reason for Change

0_1    General update     Take into account changes in the rpm generation procedure.
                          Add missing info about daemons (RB/JSS/CondorG) starting accounts.
                          Some general corrections.

0_2    Job state changes  Add Cancelling and Cancel Reason information.
                          Add OUTPUTREADY job state.

Files

Software Products User files

Word 97 document.doc

Acrobat Exchange 4.0 DataGrid-01-TEN-0118-0_1-Document.pdf


Content

1. INTRODUCTION

2. EXECUTIVE SUMMARY

3. BUILD PROCEDURE
  3.1. REQUIRED SOFTWARE
  3.2. BUILD INSTRUCTIONS
    3.2.1. Environment Variables
    3.2.2. Compiling the code
  3.3. RPM INSTALLATION

4. INSTALLATION AND CONFIGURATION
  4.1. LOGGING AND BOOKKEEPING SERVICES
    4.1.1. Required software
    4.1.2. RPM installation
    4.1.3. The installation tree structure
    4.1.4. Configuration
    4.1.5. Environment Variables
  4.2. RB AND JSS
    4.2.1. Required software
    4.2.2. RPM installation
    4.2.3. The Installation Tree structure
    4.2.4. Configuration
    4.2.5. Environment variables
  4.3. INFORMATION INDEX
    4.3.1. Required software
    4.3.2. RPM installation
    4.3.3. The Installation tree structure
    4.3.4. Configuration
    4.3.5. Environment Variables
  4.4. USER INTERFACE
    4.4.1. Required software
    4.4.2. RPM installation
    4.4.3. The tree structure
    4.4.4. Configuration
    4.4.5. Environment variables

5. OPERATING THE SYSTEM
  5.1. LB LOCAL-LOGGER
    5.1.1. Starting and stopping daemons
    5.1.2. Troubleshooting
  5.2. LB SERVER
    5.2.1. Starting and stopping daemons
    5.2.2. Purging the LB database
    5.2.3. Troubleshooting
  5.3. RB AND JSS
    5.3.1. Starting PostgreSQL
    5.3.2. Starting Condor-G
    5.3.3. Starting and stopping RB daemons
    5.3.4. Starting and stopping JSS daemons
    5.3.5. RB troubleshooting
    5.3.6. JSS troubleshooting
  5.4. INFORMATION INDEX
    5.4.1. Starting and stopping daemons

6. USER GUIDE
  6.1. USER INTERFACE
    6.1.1. Security
    6.1.2. Common behaviours
    6.1.3. Commands description

    7.5.1. Direct Job Submission
    7.5.2. Job submission without data-access requirements
    7.5.3. Job submission with data-access requirements


1. INTRODUCTION

This document provides a guide to the building, installation and usage of the WP1 WMS software released for PM9.

1.1. OBJECTIVES OF THIS DOCUMENT

The goal of this document is to describe the complete process by which the WP1 WMS software can be installed and configured on the DataGrid test-bed platforms. Guidelines for operating the whole system and accessing the provided functionality are also given.

1.2. APPLICATION AREA

Administrators can use this document as a basis for installing, configuring and operating the WP1 WMS software released for PM9. Users can refer to the User Guide chapter for accessing the provided services through the User Interface.

1.3. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS

Applicable documents

[A1] Job Description Language HowTo, DataGrid-01-TEN-0102-02-Document.pdf, 17/12/2001 (http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-02-Document.pdf)

[A2] DATAGRID WP1 Job Submission User Interface for PM9 (revised presentation), 23/03/2001 (http://www.infn.it/workload-grid/docs/20010320-JS-UI-datamat.pdf)

[A3] WP1 meeting - CESNET presentation in Milan, 20-21/03/2001 (http://www.infn.it/workload-grid/docs/20010320-L_B-matyska.pdf)

[A4] Logging and Bookkeeping Service, 07/05/2001 (http://www.infn.it/workload-grid/docs/20010508-lb_draft-ruda.pdf)

[A5] Results of Meeting on Workload Manager Components Interaction, 09/05/2001 (http://www.infn.it/workload-grid/docs/20010508-WM-Interactions-pacini.pdf)

[A6] Resource Broker Architecture and APIs, 13/06/2001 (http://www.infn.it/workload-grid/docs/20010613-RBArch-2.doc)

[A7] JDL Attributes, DataGrid-01-NOT-0101-0_4, 17/12/2001 (http://www.infn.it/workload-grid/docs/DataGrid-01-NOT-0101-0_2.pdf)

Reference documents

[R1]


1.4. DOCUMENT EVOLUTION PROCEDURE

The content of this document will be subject to modification according to the following events:

- Comments received from DataGrid project members
- Changes, evolutions or additions to the WMS components

1.5. TERMINOLOGY

Definitions

Condor    Condor is a High Throughput Computing (HTC) environment that can manage very large collections of distributively owned workstations.
Globus    The Globus Toolkit is a set of software tools and libraries aimed at the building of computational grids and grid-based applications.

Glossary

class-ad  Classified advertisement
CE        Computing Element
DB        Data Base
FQDN      Fully Qualified Domain Name
GDMP      Grid Data Management Pilot Project
GIS       Grid Information Service, aka MDS
GSI       Grid Security Infrastructure
job-ad    Class-ad describing a job
JDL       Job Description Language
JSS       Job Submission Service
LB        Logging and Bookkeeping Service
LRMS      Local Resource Management System
MDS       Metacomputing Directory Service, aka GIS
MPI       Message Passing Interface
PID       Process Identifier
PM        Project Month
RB        Resource Broker
RC        Replica Catalogue
SE        Storage Element
SI00      Spec Int 2000
SMP       Symmetric Multi Processor
TBC       To Be Confirmed
TBD       To Be Defined
UI        User Interface
UID       User Identifier
WMS       Workload Management System
WP        Work Package


2. EXECUTIVE SUMMARY

This document comprises the following main sections:

Section 3: Build Procedure
Outlines the software required to build the system and the actual process for building it and generating rpms for the WMS components; a step-by-step guide is included.

Section 4: Installation and Configuration
Describes changes that need to be made to the environment and the steps to be performed for installing the WMS software on the test-bed target platforms. The resulting installation tree structure is detailed for each system component.

Section 5: Operating the System
Provides actual procedures for starting and stopping the processes and utilities of the WMS components.

Section 6: User Guide
Describes, in Unix man page style, all User Interface commands that allow the user to access the services provided by the WMS.

Section 7: Annexes
Expands on topics introduced in the User Guide section that help the user to better understand system behaviour.


3. BUILD PROCEDURE

This section gives detailed instructions for building and installing the WP1 WMS software package. Both a source code distribution and a binary distribution are provided, and the installation procedure is explained for each case.

3.1. REQUIRED SOFTWARE

The WP1 software runs and has been tested on platforms running Globus Toolkit 2.0 Beta Release 21 on top of Linux RedHat 6.2. Apart from WP1 software version 1.0 itself, the following software packages must be installed locally on a site in order to build the WP1 WMS there:

- Globus Toolkit 2.0 Beta 21 or higher (download at http://datagrid.in2p3.fr/distribution/globus/beta-21)
- Python 2.1.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)
- Swig 1.3.7 (download at http://datagrid.in2p3.fr/distribution/config/external.html)
- Expat 1.95.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)
- MySQL Version 9.38 Distribution 3.22.32, for pc-linux-gnu (i686) (download at http://datagrid.in2p3.fr/distribution/config/external_services.html)
- Postgresql 7.1.3 (download at http://datagrid.in2p3.fr/distribution/config/external_services.html)
- Classads library
- CondorG 6.3.1 for INTEL-LINUX-GLIBC21
- Perl IO Stty 0.02, Perl IO Tty 0.04 (download at http://datagrid.in2p3.fr/distribution/config/external.html)
- Perl 5 (download at http://datagrid.in2p3.fr/distribution/config/external.html)
- gcc and c++ compilers egcs-2.91.66 or egcs-2.95.2 (mandatory for CondorG)
- GNU make version 3.78.1 or higher
- GNU autoconf version 2.13
- GNU libtool 1.3.5
- GNU automake 1.4
- GNU m4 1.4 or higher
- RPM 3.0.5
- sendmail 8.11.6

3.2. BUILD INSTRUCTIONS

The following instructions deal with building the WMS software and hence apply to the source code distribution.

3.2.1. Environment Variables

Before starting the compilation, some environment variables related to the WMS components can be set, or the corresponding values can be passed to the configure script. This is needed only if the package defaults are not suitable. The variables involved are listed below:

- GLOBUS_LOCATION: base directory of the Globus installation. The default path is /opt/globus.
- MYSQL_INSTALL_PATH: base directory of the MySQL installation. The default path is /usr.
- EXPAT_INSTALL_PATH: base directory of the Expat installation. The default path is /usr.
- GDMP_INSTALL_PATH: base directory of the Gdmp installation. The default path is /opt/edg.
- PGSQL_INSTALL_PATH: base directory of the Pgsql installation. The default path is /usr.
- CLASSAD_INSTALL_PATH: base directory of the Classad library installation. The default path is /opt/classads.
- CONDORG_INSTALL_PATH: base directory of the Condor installation. The default path is /opt/CondorG.


- PYTHON_INSTALL_PATH: base directory of the Python installation. The default path is /usr.
- SWIG_INSTALL_PATH: base directory of the Swig installation. The default path is /usr/local.
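As a minimal sketch of the list above, the following Bourne-shell fragment gives each variable its documented default unless a value is already present in the environment. It is only an illustration and not part of the WP1 distribution; the paths should be adjusted to the local installation.

```shell
# Assign the documented default to each build variable unless it is already set.
# The defaults are the ones listed in this section; change them to match your site.
: "${GLOBUS_LOCATION:=/opt/globus}"
: "${MYSQL_INSTALL_PATH:=/usr}"
: "${EXPAT_INSTALL_PATH:=/usr}"
: "${GDMP_INSTALL_PATH:=/opt/edg}"
: "${PGSQL_INSTALL_PATH:=/usr}"
: "${CLASSAD_INSTALL_PATH:=/opt/classads}"
: "${CONDORG_INSTALL_PATH:=/opt/CondorG}"
: "${PYTHON_INSTALL_PATH:=/usr}"
: "${SWIG_INSTALL_PATH:=/usr/local}"

# Export them so that the configure script can see the values.
export GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH GDMP_INSTALL_PATH
export PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH CONDORG_INSTALL_PATH
export PYTHON_INSTALL_PATH SWIG_INSTALL_PATH
```

Since the package already falls back on these defaults, setting them explicitly is mainly useful as a starting point for a site where one or more packages live elsewhere.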

In order to build the whole WP1 package, all the environment variables in the previous list must be set. To build only the User Interface module, the environment variables that need to be set are the following:

- GLOBUS_LOCATION
- CLASSAD_INSTALL_PATH
- PYTHON_INSTALL_PATH
- SWIG_INSTALL_PATH
- EXPAT_INSTALL_PATH

To build the Job Submission and Resource Broker modules, the variables to set are:

- GLOBUS_LOCATION
- MYSQL_INSTALL_PATH
- EXPAT_INSTALL_PATH
- GDMP_INSTALL_PATH
- PGSQL_INSTALL_PATH
- CLASSAD_INSTALL_PATH
- CONDORG_INSTALL_PATH

The LB server and Local Logger modules need the following environment variables to be built:

- GLOBUS_LOCATION
- MYSQL_INSTALL_PATH
- EXPAT_INSTALL_PATH

Finally, the LB library module needs:

- GLOBUS_LOCATION
- EXPAT_INSTALL_PATH


and the Information Index module only:

- GLOBUS_LOCATION
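The per-module requirements above can be checked before running configure. The sketch below is one possible way to do so in Bourne shell. The function names required_vars and check_build_env are hypothetical, and the module keys are an assumption borrowed from the --enable-<module> options of the configure script (with logging_dev standing for the LB library module).

```shell
# Print the environment variables required to build a given module,
# following the per-module lists in this section.
required_vars() {
    case "$1" in
        all)
            echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH CONDORG_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH" ;;
        userinterface)
            echo "GLOBUS_LOCATION CLASSAD_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        jss_rb)
            echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH CONDORG_INSTALL_PATH" ;;
        lbserver|locallogger)
            echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        logging_dev)
            echo "GLOBUS_LOCATION EXPAT_INSTALL_PATH" ;;
        information)
            echo "GLOBUS_LOCATION" ;;
        *)
            echo "unknown module: $1" >&2
            return 1 ;;
    esac
}

# Report the required variables that are unset or empty.
# Returns non-zero if the module is unknown or any variable is missing.
check_build_env() {
    vars=$(required_vars "$1") || return 1
    missing=""
    for name in $vars; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || missing="$missing $name"
    done
    if [ -n "$missing" ]; then
        echo "missing for $1:$missing" >&2
        return 1
    fi
}
```

For example, `check_build_env userinterface && ./configure --disable-all --enable-userinterface` would only start the configuration when all five User Interface variables are defined.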

3.2.2. Compiling the code

After unpacking the WP1 source distribution tar file, or downloading the code directly from the CVS repository, change your working directory to the WP1 base directory, i.e. the Workload directory, and run the following command:

./recoursive-bootstrap

At this point the configure command can be run. The configure script has to be invoked as follows:

./configure <options>

The list of options that are recognized by configure is reported hereafter:

--help

--prefix=<installation path>

It is used to specify the Workload installation dir. The default installation dir is /opt/edg.

--enable-all

It is used to enable the build of the whole WP1 package. By default this option is turned on.

--enable-userinterface

It is used to enable the build of the User Interface module with Logging/Client, Broker/Client, Broker/Socket++ and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-jss_rb

It is used to enable the build of the Job Submission and Resource Broker modules with Logging/Client, Common, test, and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-lbserver


It is used to enable the build of the LB Server service with Logging/Client, Logging/etc, Logging/Server, Logging/InterLogger/Net, Logging/InterLogger/SSL, Logging/InterLogger/Error, Logging/InterLogger/Lbserver and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-locallogger

It is used to enable the build of the LB Local Logger service with Logging/Client, Logging/InterLogger/Net, Logging/InterLogger/SSL, Logging/InterLogger/Error, Logging/InterLogger/InterLogger, Logging/LocalLogger, man and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-logging_dev

It is used to enable the build of the LB Client Library with Logging/Client and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-information

It is used to enable the build of the Information Index module. By default this option is turned off.

--with-globus-install=<dir>

It allows specifying the Globus installation directory without setting the environment variable GLOBUS_LOCATION.

--with-pgsql-install=<dir>

It allows specifying the Pgsql installation directory without setting the environment variable PGSQL_INSTALL_PATH.

--with-gdmp-install=<dir>

It allows specifying the GDMP installation directory without setting the environment variable GDMP_INSTALL_PATH.

--with-expat-install=<dir>

It allows specifying the Expat installation directory without setting the environment variable EXPAT_INSTALL_PATH.

--with-mysql-install=<dir>

It allows specifying the MySQL installation directory without setting the environment variable MYSQL_INSTALL_PATH.

--with-expat=<option>


It allows either to enable or to disable the Expat installation checking. The default value is 'yes'.

--with-pgsql=<option>

It allows either to enable or to disable Pgsql installation checking. The default value is 'yes'.

--with-mysql=<option>

It allows either to enable or to disable MySQL installation checking. The default value is 'yes'.

--with-gdmp=<option>

It allows either to enable or to disable Gdmp installation checking. The default value is 'yes'.

During the configure step, six spec files (wl-userinterface.spec, wl-locallogger.spec, wl-lbserver.spec, wl-logging_dev.spec, wl-jss_rb.spec and wl-information.spec) are created in the following source sub-directories to produce a flavour-specific version:

- Workload/UserInterface
- Workload/Logging
- Workload/JobSubmission
- Workload/InformIndex

Once the configure script has finished, check that GNU make is in your path and then, still in the Workload source code directory, run:

make

then:

make check

to build the test code. If the two previous steps complete successfully, the installation of the software can be performed. In order to install the package in the installation directory specified either by the --prefix option of the configure script or by the default value (i.e. /opt/edg), you can now issue the command:

make install
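The whole sequence described so far (bootstrap, configure, make, make check, make install) can be collected in a small wrapper that stops at the first failing step. The following is only a sketch: the function names run and build_workload and the DRY_RUN switch are hypothetical helpers introduced for illustration, while the commands themselves are the ones given in this section.

```shell
# Run a command, or just print it when DRY_RUN=1
# (hypothetical switch, useful to preview the steps).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bootstrap, configure, build, test and install from the Workload directory.
# $1 is the source directory; the remaining arguments go to ./configure.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_workload() (
    src_dir="$1"
    shift
    if [ "${DRY_RUN:-0}" != "1" ]; then
        cd "$src_dir" || exit 1
    fi
    # Each step runs only if the previous one succeeded.
    run ./recoursive-bootstrap &&
    run ./configure "$@" &&
    run make &&
    run make check &&
    run make install
)
```

Calling `DRY_RUN=1 build_workload ./Workload --prefix=/opt/edg` only prints the five commands, which is a convenient way to review the configure options before starting a real build.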


It is possible to run "make clean" to remove object files, executable files, library files and all the other files that are created during "make" and "make check". The command:

make -i dist

can be used to produce, in the workload-1.0.0 directory located in the Workload base directory, a binary gzipped tar ball of the Workload distribution. This tar ball can be both transferred to other platforms and used as source for the RPM creation.

To create the RPMs for Workload 1.0 (according to the configure options you have used), make sure that your PATH is set so that the GNU autotools, make and the gcc compiler can be used, and edit the file $HOME/.rpmmacros (create it if it does not exist in your home directory) to set the following entry:

%_topdir <your home dir>/rpm/redhat
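The entry above can also be added from the shell. The helper below is a sketch under the assumption that the macros file holds at most one %_topdir line; the function name set_rpm_topdir is hypothetical. It creates the file when it is missing and leaves an existing %_topdir entry untouched.

```shell
# Add a %_topdir entry to an rpm macros file unless one is already present.
# $1: macros file (defaults to $HOME/.rpmmacros)
# $2: top directory (defaults to $HOME/rpm/redhat)
set_rpm_topdir() {
    macros_file="${1:-$HOME/.rpmmacros}"
    topdir="${2:-$HOME/rpm/redhat}"
    # Create the file if it does not exist yet.
    touch "$macros_file" || return 1
    # Append the entry only when no %_topdir line exists.
    if ! grep -q '^%_topdir' "$macros_file"; then
        printf '%%_topdir %s\n' "$topdir" >> "$macros_file"
    fi
}
```

Running `set_rpm_topdir` with no arguments applies the defaults used in this section; calling it a second time does not change the file.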

Then you can issue the command:

make rpm

that generates the RPMs in $(HOME)/rpm/redhat/RPMS.

For example, if before building the package you ran configure as follows:

./configure --enable-all

then the make rpm command creates the directories:

$(HOME)/rpm/redhat/SOURCES
$(HOME)/rpm/redhat/SPECS
$(HOME)/rpm/redhat/BUILD
$(HOME)/rpm/redhat/RPMS
$(HOME)/rpm/redhat/SRPMS

and copies the previously created tar ball workload-1.0.0/Workload.tar.gz into $(HOME)/rpm/redhat/SOURCES. Moreover it copies the generated spec files:

JobSubmission/wl-jss_rb.spec
UserInterface/wl-userinterface.spec
InformIndex/wl-information.spec
Logging/wl-lbserver.spec
Logging/wl-locallogger.spec
Logging/wl-logging_dev.spec

in $(HOME)/rpm/redhat/SPECS and finally executes the following commands:

rpm -ba wl-userinterface.spec
rpm -ba wl-locallogger.spec
rpm -ba wl-lbserver.spec
rpm -ba wl-logging_dev.spec
rpm -ba wl-jss_rb.spec
rpm -ba wl-information.spec

generating respectively the following rpms in the $(HOME)/rpm/redhat/RPMS directory:

- userinterface-1.0.0-6.i386.rpm
- locallogger-1.0.0-5.i386.rpm
- lbserver-1.0.0-6.i386.rpm
- logging_dev-1.0.0-4.i386.rpm
- jobsubmission-1.0.0-6.i386.rpm
- informationindex-1.0.0-5.i386.rpm

If you have instead built only the User Interface, i.e. used:

./configure --disable-all --enable-userinterface --with-mysql='no' --with-pgsql='no' --with-gdmp='no'

the make rpm command will copy only the file UserInterface/wl-userinterface.spec into $(HOME)/rpm/redhat/SPECS and will create only the User Interface rpm (userinterface-1.0.0-6.i386.rpm).

An alternative procedure can be followed to build the II and Logging packages. To do this, move into the Workload/InformIndex directory and run the following commands:

./bootstrap

./configure [option]

where the recognized options are:

--prefix=<install path>

It is used to specify the Information Index installation dir. The default installation dir is /opt/edg


--with-globus-install=<dir>

It allows specifying the Globus install directory without setting the environment variable GLOBUS_LOCATION.

Then issue:

make
make install

Afterwards move into the Workload/Logging directory and run the following commands:

./bootstrap

./configure [option]

where the recognized options are:

--prefix=<install path>

It is used to specify the Logging installation dir. The default installation dir is /opt/edg

--with-globus-install=<dir>

It allows specifying the Globus install directory without setting the environment variable GLOBUS_LOCATION.

--with-expat-install=<dir>

It allows specifying the Expat install directory without setting the environment variable EXPAT_INSTALL_PATH.

--with-mysql-install=<dir>

It allows specifying the MySQL install directory without setting the environment variable MYSQL_INSTALL_PATH.

--with-expat=<option>

It allows either to enable or to disable Expat install checking. The default value is 'yes'.

--with-mysql=<option>

It allows either to enable or to disable MySQL install checking. The default value is 'yes'.


Then issue:

make
make check
make install

Summarising, in relation to the WMS module you want to build, the configure script has to be run with the following options:

all
./configure

userinterface
./configure --disable-all --enable-userinterface \
    --with-mysql='no' --with-pgsql='no' --with-gdmp='no'

information
./configure --disable-all --enable-information

lbserver
./configure --disable-all --enable-lbserver

locallogger
./configure --disable-all --enable-locallogger

logging for developers
./configure --disable-all --enable-logging_dev \
    --with-mysql='no'

jobsubmission and broker
./configure --disable-all --enable-jss_rb
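The mapping above can be captured in a small helper, sketched here under stated assumptions: the function name wms_configure_args and the short module keys (taken from the --enable-* option names) are hypothetical and not part of the WMS distribution.

```shell
#!/bin/sh
# Sketch only: wms_configure_args is a hypothetical helper, not shipped with WMS.
# It prints the configure options listed in this section for a given module key.
wms_configure_args() {
    case "$1" in
        all)           echo "" ;;
        userinterface) echo "--disable-all --enable-userinterface --with-mysql=no --with-pgsql=no --with-gdmp=no" ;;
        information)   echo "--disable-all --enable-information" ;;
        lbserver)      echo "--disable-all --enable-lbserver" ;;
        locallogger)   echo "--disable-all --enable-locallogger" ;;
        logging_dev)   echo "--disable-all --enable-logging_dev --with-mysql=no" ;;
        jss_rb)        echo "--disable-all --enable-jss_rb" ;;
        *)             echo "unknown module: $1" >&2; return 1 ;;
    esac
}
```

With this helper, building only the LB server could be written as ./configure $(wms_configure_args lbserver).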

3.3. RPM INSTALLATION

In order to install the WP1 RPMs on the target platforms, the following commands have to be executed as root:

rpm -ivh userinterface-1.0.0-6.i386.rpm
rpm -ivh informationindex-1.0.0-5.i386.rpm
rpm -ivh jobsubmission-1.0.0-6.i386.rpm


rpm -ivh locallogger-1.0.0-5.i386.rpm
rpm -ivh lbserver-1.0.0-6.i386.rpm
rpm -ivh logging_dev-1.0.0-4.i386.rpm

By default the rpm installs the software in the /opt/edg directory. If you have installed one of the following rpms:

- userinterface-1.0.0-6.i386.rpm
- informationindex-1.0.0-5.i386.rpm
- jobsubmission-1.0.0-6.i386.rpm

you have to run the /opt/edg/etc/configure_workload script as root, which installs the /opt/edg/etc/workload.sh and /opt/edg/etc/workload.csh scripts under /etc/profile.d. These two scripts set the EDG_LOCATION environment variable to /opt/edg and run the $EDG_LOCATION/etc/workload_{jss,ui}_env.sh scripts. The script workload_ui_env.{sh,csh} sets and updates the following environment variables:

PATH="${EDG_LOCATION}/bin:${PATH}"
LD_LIBRARY_PATH="${EDG_LOCATION}/lib:${LD_LIBRARY_PATH}"
PYTHONPATH="${EDG_LOCATION}/lib:${PYTHONPATH}"

The script workload_jss_env.{sh,csh} instead checks that the condor_master and condor_schedd executables are present on the machine.

Furthermore, the start_JobSubmission and start_Broker script files in /opt/edg/utils can be run as root to start the Job Submission and Broker services. The SXXII script file in /opt/edg/utils can be run as root to start the Information Index service. Finally, the kill_JobSubmission and kill_Broker script files can be run as root to stop the RBserver, jssparser and jssserver processes.

Details on the installation and configuration of each of the listed rpms are provided in section 4 of this document. For further information about RPM please consult the man pages or http://www.rpm.org.
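The kind of check performed for the Condor executables can be sketched as follows. This is an illustration only, not the content of the actual workload_jss_env script; the helper name missing_executables is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not the actual workload_jss_env script):
# prints the name of every given command that cannot be found on PATH.
missing_executables() {
    for exe in "$@"; do
        command -v "$exe" >/dev/null 2>&1 || echo "$exe"
    done
}

# Warn if the two executables named in this section are not available.
missing=$(missing_executables condor_master condor_schedd)
if [ -n "$missing" ]; then
    echo "Executables not found on PATH:" $missing >&2
fi
```

Reporting the missing names, rather than stopping at the first failure, lets the administrator fix the PATH in one pass.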


4. INSTALLATION AND CONFIGURATION

This section deals with the procedures for installing and configuring the WP1 WMS components on the target platforms. For each component, the list of dependencies (i.e. the software that the component requires on the same machine in order to run) is reported first, followed by the installation procedure, which is described through step-by-step examples. A description of the needed configuration items and environment variable settings is also provided.

4.1. LOGGING AND BOOKKEEPING SERVICES

From the installation point of view, LB services can be split into two main components:

- The LB services responsible for accepting messages from their sources and forwarding them to the logging and/or bookkeeping servers, which we will refer to as LB local-logger services.

- The LB services responsible for accepting messages from the LB local-logger services, saving them on their permanent storage and supporting queries generated by the consumer API, which we will refer to as LB server services.

The LB local-logger services must be installed on all the machines hosting processes that push information into the LB system, i.e. the machines running RB and JSS, and the gatekeeper machine of the CE. An exception is the submitting machine (i.e. the machine running the User Interface), on which this component can be installed but is not mandatory. The LB server services need instead to be installed only on a server machine, which usually coincides with the RB server one.

4.1.1. Required software

4.1.1.1. LB local-logger

For the installation of the LB local-logger the only software required is the Globus Toolkit 2.0 (actually only the GSI rpms are needed). Globus 2 rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher). All rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

4.1.1.2. LB Server

For the installation of the LB server the Globus Toolkit 2.0 is required (actually only the GSI rpms are needed). Globus 2 rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher). All rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>


Besides Globus Toolkit 2.0, for the LB server to work properly it is also necessary to install MySQL Distribution 3.22.31 or higher.

Instructions about MySQL installation can be found at the following URL:

http://www.redhat.com/support/resources/faqs/RH-apache-FAQ/MySQL/mysql-install.htm

Packages and more general documentation can be found at:

http://www.mysql.org/listcats3.php?menu=21&page_id=9

The rpm of MySQL Ver 9.38 Distribution 3.22.32, for pc-linux-gnu (i686), is also available at http://datagrid.in2p3.fr/distribution/config/external_services.html.

At least the packages MySQL-3.22.32 and MySQL-client-3.22.32 have to be installed for creating and configuring the LB database.

The LB server stores the logging data in a MySQL database, which must hence be created. The following assumes that the database and the server daemons (bkserver and ileventd) run on the same machine, which is considered to be secure, i.e. no database authentication is used. In a different set-up the procedure has to be adjusted accordingly and a secure database connection (via ssh tunnel etc.) has to be established.

The action list below contains the placeholders DB_NAME and USER_NAME, for which real values have to be substituted. They form the database connection string required on the invocation of some LB daemons. The suggested value for both DB_NAME and USER_NAME is `lbserver'; this value is also the compiled-in default (i.e. when it is used, the database connection string need not be specified at all).

The following steps require MySQL root privileges:

1) Create the database:

mysqladmin -u root -p create DB_NAME

where DB_NAME is the name of the database.

2) Create a dedicated LB database user:

mysql -u root -p -e 'grant create,drop,select,insert,update,delete
    on DB_NAME.* to USER_NAME@localhost'

where USER_NAME is the name of the user running the LB server daemons.

3) Create the database tables:

mysql -u USER_NAME DB_NAME < server.sql

where server.sql is a file containing the sql commands for creating the needed tables. server.sql can be found in the directory “<install path>/etc” created by the LB server rpm installation.
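As a convenience, the three steps can be previewed with the placeholders already substituted. The following is a minimal sketch: lb_db_setup_commands is a hypothetical helper that only prints the commands and executes nothing, so the administrator can review them before running them by hand.

```shell
#!/bin/sh
# Sketch: print the three set-up commands with DB_NAME and USER_NAME substituted.
# lb_db_setup_commands is a hypothetical helper; it does not contact MySQL.
lb_db_setup_commands() {
    db_name=${1:-lbserver}      # suggested (and compiled-in) default
    user_name=${2:-lbserver}    # suggested (and compiled-in) default
    sql_file=${3:-server.sql}   # found under <install path>/etc
    echo "mysqladmin -u root -p create $db_name"
    echo "mysql -u root -p -e 'grant create,drop,select,insert,update,delete on $db_name.* to $user_name@localhost'"
    echo "mysql -u $user_name $db_name < $sql_file"
}

# Preview the commands for the recommended set-up.
lb_db_setup_commands
```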

4.1.2. RPM installation

In order to install the LB local-logger and the LB server services, the following commands have to be issued, respectively, with root privileges:

rpm -ivh [--prefix <installdir>] locallogger-1.0.0-5.i386.rpm


rpm -ivh [--prefix <installdir>] lbserver-1.0.0-6.i386.rpm

By default the rpm installs the software in the “/opt/edg” directory. Using the --prefix directive, it is possible to install the software in a different location (i.e. in the <installdir> directory). The --relocate <oldpath>=<newpath> directive can instead be used to relocate an installation from <oldpath> to <newpath>.

4.1.3. The installation tree structure

4.1.3.1. LB local-logger

When the LB local-logger RPM is installed, the following directory tree is created:

<install-path>/info
<install-path>/infointerlogger.info
<install-path>/lib (empty dir)
<install-path>/man
<install-path>/man1/interlogger.1
<install-path>/man3 (empty dir)
<install-path>/sbin
<install-path>/sbin/dglogd
<install-path>/sbin/interlogger
<install-path>/sbin/locallogger

The sbin directory contains all the LB local-logger daemon executables and the script locallogger to be used for starting the daemons. The man directory contains the man page for the inter-logger daemon.

After having installed the locallogger package, the administrator shall create in the directory “/etc/rc.d/init.d” a symbolic link to <install-path>/sbin/locallogger, using as root the following commands:

cd /etc/rc.d/init.d
ln -s <install-path>/sbin/locallogger locallogger

4.1.3.2. LB Server

When the LB server RPM package is installed, the following directory tree is created:

<install-path>/sbin
<install-path>/lib (empty dir)
<install-path>/sbin/bkpurge
<install-path>/sbin/bkserver
<install-path>/sbin/ileventd
<install-path>/sbin/lbserver


<install-path>/etc/server.sql
<install-path>/share/doc
<install-path>/share/doc/DataGrid-01-TEN-0118-0_0.pdf

where the sbin directory contains all the LB server daemon executables and the script lbserver to be used for starting the daemons.

After having installed the lbserver package, the administrator shall create in the directory “/etc/rc.d/init.d” a symbolic link to <install-path>/sbin/lbserver, using as root the following commands:

cd /etc/rc.d/init.d
ln -s <install-path>/sbin/lbserver lbserver

4.1.4. Configuration

Neither the LB local-logger nor the LB server has configuration files, so no action is needed for this task.

4.1.5. Environment Variables

All LB components need the following environment variables to be set:

X509_USER_KEY     the user private key file path
X509_USER_CERT    the user certificate file path
X509_CERT_DIR     the trusted certificate directory and ca-signing-policy directory
X509_USER_PROXY   the user proxy certificate file path

as required by GSI. However, in the case of LB daemons, the recommended way of specifying the locations of the security files is to use the --cert, --key and --CAdir options explicitly.

The Logging library, i.e. the library that is linked into UI, RB, JSS and Jobmanager, reads its immediate logging destination from the variable DGLOG_DEST. It defaults to “x-dglog://localhost:15830”, which is the correct value, hence it normally does not need to be set except on the submitting machine. The correct format for this variable is:

DGLOG_DEST=x-dglog://HOST:PORT

where, as already mentioned, HOST defaults to localhost and PORT defaults to 15830.

On the submitting machine, if the variable is not set, it is dynamically assigned by the UI with the value:

DGLOG_DEST=x-dglog://<LB_CONTACT>:15830

where LB_CONTACT is the hostname of the machine running the LB server currently associated with the RB used for submitting jobs.

Finally there is LBDB, the environment variable needed by the LB Server daemons (ileventd, bkserver and bkpurge). LBDB represents the MySQL database connect-string; it defaults to “lbserver/@localhost:lbserver” and in the recommended set-up (see section 4.1.1.2) does not need to be set. Otherwise it should be set as follows:

LBDB=USER_NAME/PASSWORD@DB_HOSTNAME:DB_NAME

where

- USER_NAME is the name of the database user,
- PASSWORD is the user password for the database,
- DB_HOSTNAME is the hostname of the host where the database is located,
- DB_NAME is the name of the database.
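The two string formats described above can be composed as in the following minimal sketch. The helper names dglog_dest and lbdb_string are hypothetical; the defaults are the ones stated in this section.

```shell
#!/bin/sh
# Hypothetical helpers for composing the two connection strings described above.

# dglog_dest [HOST] [PORT]: HOST defaults to localhost, PORT to 15830.
dglog_dest() {
    echo "x-dglog://${1:-localhost}:${2:-15830}"
}

# lbdb_string USER_NAME PASSWORD DB_HOSTNAME DB_NAME
lbdb_string() {
    echo "$1/$2@$3:$4"
}

# Reproduce the documented defaults (empty password, local database).
DGLOG_DEST=$(dglog_dest)
LBDB=$(lbdb_string lbserver "" localhost lbserver)
export DGLOG_DEST LBDB
```

In a set-up where the database lives on another host, only the arguments of lbdb_string would change.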


4.2. RB AND JSS

The Resource Broker and the Job Submission Services are the WMS components allowing the submission of jobs to the CEs. They are dealt with together since they always reside on the same host and consequently are distributed by means of a single rpm.

4.2.1. Required software

For the installation of RB and JSS the Globus Toolkit 2.0 rpms available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher) are required to be installed on the target platform. All needed rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

The Globus gridftp server package must also be installed and configured on the same host (see http://marianne.in2p3.fr/datagrid/documentation/EDG-Install-HOWTO.html for details).

It is important to recall that the Globus grid-mapfile located in /etc/grid-security on the RB server machine must be filled with the certificate subjects of all the users allowed to use the Resource Broker functionalities. Moreover, the following products are expected to be installed on the same platform:

- LB local-logger services (see section 4.1.1.1)
- PostgreSQL (RB and JSS)
- Condor-G (JSS)
- ClassAd library (RB and JSS)
- ReplicaCatalog from the WP2 distribution (RB)

4.2.1.1. PostgreSQL installation and configuration

Both RB and JSS use a PostgreSQL database for implementing the internal job queue. The installation kit and the documentation for PostgreSQL can be found at the following URL:

http://www3.us.postgresql.org/sites.html

The required PostgreSQL version is 7.1.3 or higher. The following packages need to be installed (respecting the order in which they are listed): postgresql-libs, postgresql-devel, postgresql, postgresql-server, postgresql-tcl, postgresql-tk and postgresql-docs.

PostgreSQL also needs the packages cyrus-sasl-1-5-11 (or higher), openssl-0.9.5a and openssl-devel-0.9.5a (or higher). All of them can be found at the following URL:

http://datagrid.in2p3.fr/distribution/external/RPMS

The configuration options that must be used when installing the package are reported hereafter:

--with-CXX
--with-tcl
--enable-odbc


PostgreSQL 7.1.3 is also available in rpm format (to be installed as root) at the URL:

http://datagrid.in2p3.fr/distribution/external/RPMS

Once PostgreSQL has been installed, you need, as root, to create a new system account dguser (i.e. using option -r of the adduser command) and to follow the steps reported below to create an empty database for JSS:

su - postgres              # become the postgres user
createuser -d -A dguser    # create the new database user dguser
su - dguser                # become the user dguser
createdb <DBNAME>          # create the new database for JSS

The name of the created database must be the same as the one assigned to the Database_name attribute in the file jss.conf (see section 4.2.4.2 for more details), otherwise JSS will use the "template1" database as default. Avoiding use of the template database is anyway strongly recommended.

The RB server instead uses another database named "rb", which is created by RB itself.
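Whether a given jss.conf would make JSS fall back to the template database can be checked with a sketch like the one below. The helper names are hypothetical, and the parsing assumes the attribute appears on its own line in the form Database_name = "name"; as shown in section 4.2.4.2.

```shell
#!/bin/sh
# Sketch: report the database name JSS would use, given a jss.conf file.
# jss_database_name is a hypothetical helper; it assumes one attribute per line.
jss_database_name() {
    name=$(sed -n 's/^[[:space:]]*Database_name[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1)
    # Fall back to the documented default when the attribute is absent.
    echo "${name:-template1}"
}

# warn_if_template FILE: print a warning when the template database would be used.
warn_if_template() {
    if [ "$(jss_database_name "$1")" = "template1" ]; then
        echo "warning: $1 selects the template1 database" >&2
    fi
}
```

A typical call would be warn_if_template <install-path>/etc/jss.conf, run after createdb and before starting JSS.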

4.2.1.2. Condor-G installation and configuration

The Condor-G release required by JSS is CondorG 6.3.1 for INTEL-LINUX-GLIBC21. The Condor-G installation toolkit can be found at the following URL:

http://www.cs.wisc.edu/condor/downloads/condorg.license.html

It is also available in rpm format (to be installed as root) at:

http://datagrid.in2p3.fr/distribution/external/RPMS

Installation and configuration are quite straightforward; for details the reader can refer to the README file included in the Condor-G package. The main steps to be performed after having unpacked the package as root are:

- become dguser (su - dguser)
- make sure the directory where you are going to install CondorG is owned by dguser
- make sure the Globus Toolkit 2.0 has been installed on the platform
- run the /opt/CondorG/setup.sh installation script
- remove the link ~dguser/.globus/certificates created by the installation script

Moreover, some additional configuration steps have to be performed in the Condor configuration file pointed to by the CONDOR_CONFIG environment variable set during installation. In the $CONDOR_CONFIG file the following attributes need to be modified:

RELEASE_DIR = $(CONDORG_INSTALL_PATH)
CONDOR_ADMIN = <a valid e-mail address of the Condor-G administrator>
UID_DOMAIN = <the domain of the machine (e.g. pd.infn.it)>
FILESYSTEM_DOMAIN = <the domain of the machine (e.g. pd.infn.it)>
HOSTALLOW_WRITE = *
CRED_MIN_TIME_LEFT = 0


GLOBUSRUN = $(GLOBUS_LOCATION)/bin/globusrun

and the following entries need to be added:

SKIP_AUTHENTICATION = YES
AUTHENTICATION_METHODS = CLAIMTOBE
DISABLE_AUTH_NEGOTIATION = TRUE
GRIDMANAGER_CHECKPROXY_INTERVAL = 600
GRIDMANAGER_MINIMUM_PROXY_TIME = 180

The environment variable CONDORG_INSTALL_PATH is also set during installation and points to the path where the Condor-G package has been installed.
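Adding the extra entries by hand is straightforward, but it can also be scripted so that repeated runs do not duplicate them. The following is a sketch under stated assumptions: ensure_entry and add_wms_condor_entries are hypothetical helpers, and the file is assumed to use plain "NAME = value" lines.

```shell
#!/bin/sh
# ensure_entry FILE KEY VALUE: append "KEY = VALUE" unless KEY is already assigned.
# Hypothetical helper; it never rewrites an existing assignment.
ensure_entry() {
    if ! grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        printf '%s = %s\n' "$2" "$3" >> "$1"
    fi
}

# add_wms_condor_entries FILE: add the five entries listed in this section.
add_wms_condor_entries() {
    ensure_entry "$1" SKIP_AUTHENTICATION YES
    ensure_entry "$1" AUTHENTICATION_METHODS CLAIMTOBE
    ensure_entry "$1" DISABLE_AUTH_NEGOTIATION TRUE
    ensure_entry "$1" GRIDMANAGER_CHECKPROXY_INTERVAL 600
    ensure_entry "$1" GRIDMANAGER_MINIMUM_PROXY_TIME 180
}
```

It would be invoked as add_wms_condor_entries "$CONDOR_CONFIG"; taking a backup copy of the configuration file first is advisable.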

4.2.1.3. ClassAd installation and configuration

The ClassAd release required by JSS and RB is classads-0.0 (or higher). The ClassAd library documentation can be found at the following URL:

http://www.cs.wisc.edu/condor/classad

The library is available in rpm format (to be installed as root) at:

http://datagrid.in2p3.fr/distribution/external/RPMS

4.2.1.4. ReplicaCatalog installation and configuration

The ReplicaCatalog release required by RB is ReplicaCatalogue-gcc32dbg-2.0 (or higher), which is available in rpm format (to be installed as root) at:

http://datagrid.in2p3.fr/distribution/wp2/RPMS

4.2.2. RPM installation

In order to install the Resource Broker and the Job Submission services, the following command has to be issued with root privileges:

rpm -ivh [--prefix <installdir>] jobsubmission-1.0.0-6.i386.rpm

By default the rpm installs the software in the “/opt/edg” directory. Using the --prefix directive, it is possible to install the software in a different location (i.e. in the <installdir> directory). The --relocate <oldpath>=<newpath> directive can instead be used to relocate an installation from <oldpath> to <newpath>.

4.2.3. The Installation Tree structure

When the jobsubmission rpm has been installed, the following directory tree is created:

<install-path>/bin
<install-path>/bin/Rbserver


<install-path>/bin/jssparser
<install-path>/bin/jssserver
<install-path>/etc
<install-path>/etc/jss.conf
<install-path>/etc/rb.conf
<install-path>/etc/workload.csh
<install-path>/etc/workload.sh
<install-path>/etc/workload_jss_env.csh
<install-path>/etc/workload_jss_env.sh
<install-path>/lib (empty dir)
<install-path>/sbin
<install-path>/sbin/broker
<install-path>/sbin/jobsubmission

The directory bin contains all the RB and JSS server process executables: Rbserver, jssserver and jssparser. The configuration files are stored in etc (see sections 4.2.4.1 and 4.2.4.2 below), while sbin contains the scripts to start and stop the RB and JSS processes.

4.2.4. Configuration

Once the rpm has been installed, the RB and JSS services must be properly configured. This can be done by editing the two files rb.conf and jss.conf that are stored in <install-path>/etc. The actions to be performed to configure the Resource Broker and the Job Submission Service are described in the following two sections.

4.2.4.1. RB configuration

Configuration of the Resource Broker is accomplished by editing the file “<install-path>/etc/rb.conf” to set the attributes it contains appropriately. They are listed hereafter, grouped according to the functionality they relate to:

MDS_contact, MDS_port and MDS_timeout refer to the II service and respectively represent the hostname where this service is running, the port number, and the timeout in seconds when the RB queries the II. E.g.:

MDS_contact = "grid001f.cnaf.infn.it";
MDS_port = 2170;
MDS_timeout = 60;

MDS_gris_port refers to the port to be used by RB to contact GRISes. E.g.:

MDS_gris_port = 2135;


MDS_multi_attributes defines the list of the attributes that are multi-valued in the MDS (i.e. that can assume multiple values). It is recommended not to modify the default value for this parameter, which is currently:

MDS_multi_attributes = {
    "AuthorizedUser",
    "RunTimeEnvironment",
    "CloseCE"
};

MDS_basedn defines the basedn, which represents the distinguished name (DN) to use as a starting place for searches in the information index. It is recommended not to modify the default value for this parameter, which is currently set to:

MDS_basedn = "o=Grid"

LB_contact and LB_port refer to the LB Server service and represent respectively the hostname and port where the LB server is listening for connections. E.g.:

LB_contact = "grid004f.cnaf.infn.it";
LB_port = 7846;

The Logging library, i.e. the library providing APIs for logging job events to the LB (which is linked into RB), reads its immediate logging destination from the environment variable DGLOG_DEST (see section 4.1.5), hence it is not dealt with in the configuration file. DGLOG_DEST defaults to “x-dglog://localhost:15830”, which is the correct value; it therefore normally does not need to be set, which implies that the LB local-logger services should normally run on the same host as the RB server.

JSS_contact and JSS_server_port refer to the JSS and represent respectively the hostname (which must be the same host as the RB server) and the port number (which must match the RB_client_port parameter in the jss.conf file, see section 4.2.4.2) where the JSS server is listening. Moreover, JSS_client_port represents the port used by RB to listen for JSS communications. The value of the latter parameter must match the JSS_server_port parameter in the jss.conf file (see section 4.2.4.2). An example for these parameters is reported hereafter:

JSS_contact = "grid004f.cnaf.infn.it";
JSS_client_port = 8881;
JSS_server_port = 9991;

JSS_backlog and UI_backlog define the maximum number of simultaneous connections from JSS and UI supported by the socket. Default values are:

JSS_backlog = 5;
UI_backlog = 5;


UI_server_port is the port used by the RB server to listen for requests coming from the User Interface. Default value for this parameter is:

UI_server_port = 7771;

RB_pool_size represents the maximum number of requests managed simultaneously by the RB server. Default value for this parameter is:

RB_pool_size = 16;

RB_purge_threshold defines the threshold age in seconds for RBRegistry information. The RB purges all the information of a job and frees its storage space (input/output sandboxes) when more than RB_purge_threshold seconds have elapsed since the last update of the internal information database. Default value for this parameter is about one week:

RB_purge_threshold = 600000;

RB_cleanup_threshold represents the span of time (expressed in seconds) between two consecutive cleanups of the job registry. During the registry cleanup the RB removes all the entries of those jobs classified as ABORTED. At the end of the cleanup, the purging of the registry is performed as well, if needed (see RB_purge_threshold). The default value for this configuration parameter is:

RB_cleanup_threshold = 3600;

Finally, there is RB_sandbox_path, which represents the pathname of the root sandbox directory, i.e. the complete pathname of the directory where the RB creates both the input and output sandbox directories and stores the .Brokerinfo file. Default value for this parameter is the temporary directory:

RB_sandbox_path = "/tmp"

The administrator must anyway tailor this value according to the estimated amount of job input/output sandbox files in the given period, so as not to fill up the RB machine disk space.

No semicolon has to be put at the end of the last field in the rb.conf file.
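When choosing a value for RB_sandbox_path, it can help to look at the space available on the candidate file system. The following is a minimal sketch; sandbox_free_kb is a hypothetical helper and is not part of the RB.

```shell
#!/bin/sh
# Sketch: report the free space, in kilobytes, on the file system that holds
# the sandbox root. sandbox_free_kb is a hypothetical helper; /tmp is the
# default RB_sandbox_path given above.
sandbox_free_kb() {
    # df -P prints one line per file system in the portable format;
    # the fourth column of the second line is the available space.
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}
```

The result can be compared with the expected sandbox volume for one RB_purge_threshold period before the RB is started.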

4.2.4.2. JSS configuration

Configuration of the Job Submission Service is accomplished by editing the file “<install-path>/etc/jss.conf” to set the parameters it contains appropriately. They are listed hereafter together with their meanings:


Condor_submit_file_prefix defines the prefix for the CondorG submission file. The job identifier dg_jobId is then appended to this prefix to build the actual submission file name. Default value for this parameter is:

Condor_submit_file_prefix = "/var/tmp/CondorG.sub";

Condor_log_file defines the absolute path name of the CondorG log file, i.e. the file where the events for the submitted jobs are recorded. Default value for this parameter is:

Condor_log_file = "/var/tmp/CondorG.log";

Condor_stdoe_dir defines the directory where the standard output and standard error files of CondorG are temporarily saved. Default value is:

Condor_stdoe_dir = "/var/tmp";

Job_wrapper_file_prefix is the prefix for the Job Wrapper file name (i.e. the script wrapping the actual job which is submitted on the CE). As before the job identifier dg_jobId is appended to this prefix to build the actual file name. Default value for this parameter is:

Job_wrapper_file_prefix = "/var/tmp/Job_wrapper.sh";

Database_name is the name of the Postgres database where JSS registers information about submitted jobs. This name must correspond to an existing database (how to create it is briefly described in section 4.2.1.1). Default value for the database name is the one of the database automatically created when installing Postgres, i.e.:

Database_name = "template1";

Database_table_name is the name of the table in the previous database. This table is created by the JSS itself if not found. Default value for this parameter is:

Database_table_name = "condor_submit";

JSS_server_port and RB_client_port represent respectively the port used by JSS to listen for RB communication and to communicate to the RB server (e.g. for sending notifications). The two mentioned parameters have to match respectively with the JSS_client_port and JSS_server_port parameters in the rb.conf file (see section 4.2.4.1). Default values are:

JSS_server_port = 8881;
RB_client_port = 9991;


Condor_log_file_size indicates the size in bytes at which the CondorG.log log file has to be split. Default value is:

Condor_log_file_size = 64000;
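Since the port parameters in rb.conf and jss.conf have to match pairwise, a quick consistency check can be sketched as follows. The helper names conf_value and ports_consistent are hypothetical, and the parsing assumes one "name = value;" assignment per line in both files.

```shell
#!/bin/sh
# conf_value FILE KEY: print the value assigned to KEY (first match only).
# Hypothetical helper; assumes one "KEY = value;" assignment per line.
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^;]*\).*/\1/p" "$1" |
        sed 's/[[:space:]]*$//' | head -n 1
}

# ports_consistent RB_CONF JSS_CONF: succeed only if both port pairs match.
ports_consistent() {
    rb_client=$(conf_value "$1" JSS_client_port)    # rb.conf
    rb_server=$(conf_value "$1" JSS_server_port)    # rb.conf
    jss_server=$(conf_value "$2" JSS_server_port)   # jss.conf
    jss_client=$(conf_value "$2" RB_client_port)    # jss.conf
    [ -n "$rb_client" ] && [ -n "$rb_server" ] &&
        [ "$rb_client" = "$jss_server" ] && [ "$rb_server" = "$jss_client" ]
}
```

With the default values shown in sections 4.2.4.1 and 4.2.4.2 (8881 and 9991) the check succeeds; it fails if either file is edited without updating the other.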

4.2.5. Environment variables

4.2.5.1. RB

Environment variables that have to be set for the RB are listed hereafter:

PGSQL_INSTALL_PATH   the Postgres database installation path. Default value is /usr/local/pgsql
PGDATA               the path where the Postgres database data files are stored. Default value is /usr/local/pgsql/data
GDMP_INSTALL_PATH    the gdmp installation path. Default value is /opt/edg.

Setting of PGSQL_INSTALL_PATH and PGDATA is only needed if the installation is not performed from rpm. Moreover $GDMP_INSTALL_PATH/lib has to be added to LD_LIBRARY_PATH. Finally, there are other environment variables needed at run-time by RB. They are:

EDG_WL_RB_CONFIG_DIR   the RB configuration directory
X509_HOST_CERT         the user certificate file path
X509_HOST_KEY          the user private key file path
X509_USER_PROXY        the user proxy certificate file path
GRIDMAP                location of the Globus grid-mapfile that translates X509 certificate subjects into local Unix usernames. The default is /etc/grid-security/grid-mapfile.

Anyway, all variables in the latter group are set by the start_Broker script located in <install-path>/utils.

4.2.5.2. JSS

Environment variables that have to be set for the JSS are listed hereafter:

PGSQL_INSTALL_PATH     the Postgres database installation path. Default value is /usr/local/pgsql
PGDATA                 the path where the Postgres database data files are stored. Default value is /usr/local/pgsql/data
CONDOR_CONFIG          the CondorG configuration file path. Default value is /usr/local/CondorG/etc/condor_config
CONDORG_INSTALL_PATH   the CondorG installation path. Default value is /usr/local/CondorG


Setting PGSQL_INSTALL_PATH and PGDATA is only needed if the installation is not performed from rpm. Moreover, the following directories must be included in the PATH environment variable:

- $CONDORG_INSTALL_PATH/bin
- $CONDORG_INSTALL_PATH/sbin
- $PGSQL_INSTALL_PATH/bin (only if installation is not performed from rpm)

and the following ones have to be added to LD_LIBRARY_PATH:

- $CONDORG_INSTALL_PATH/lib
- $PGSQL_INSTALL_PATH/lib (only if installation is not performed from rpm)

Finally, there are other environment variables needed at run-time by the JSS. They are:

- EDG_WL_JSS_CONFIG_DIR: the JSS configuration directory
- X509_HOST_CERT: the user certificate file path
- X509_HOST_KEY: the user private key file path
- X509_USER_PROXY: the user proxy certificate file path
- GRIDMAP: location of the Globus grid-mapfile that translates X509 certificate subjects into local Unix usernames. The default is /etc/grid-security/grid-mapfile.

All the variables in the latter group are set by the start_JobSubmission script located in <install-path>/utils.
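As an illustration of the settings listed above, the following minimal sketch (modern Python 3, not part of the WMS distribution) checks which JSS run-time variables are missing and prepends directories to a colon-separated search path such as PATH or LD_LIBRARY_PATH. The function names are hypothetical; only the variable names and default paths come from this section.

```python
# Names of the run-time variables are taken from the JSS list above;
# the helper functions themselves are illustrative assumptions.
JSS_RUNTIME_VARS = [
    "EDG_WL_JSS_CONFIG_DIR",
    "X509_HOST_CERT",
    "X509_HOST_KEY",
    "X509_USER_PROXY",
    "GRIDMAP",
]


def missing_variables(environ, names=JSS_RUNTIME_VARS):
    """Return the names that are unset or empty in the given mapping."""
    return [name for name in names if not environ.get(name)]


def prepend_path(current, new_dirs):
    """Prepend directories to a colon-separated path, skipping duplicates."""
    parts = [p for p in current.split(":") if p]
    for directory in reversed(new_dirs):
        if directory not in parts:
            parts.insert(0, directory)
    return ":".join(parts)
```

For example, `prepend_path(os.environ.get("PATH", ""), ["/usr/local/CondorG/bin", "/usr/local/CondorG/sbin"])` would yield a PATH value with the CondorG default directories in front.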


4.3. INFORMATION INDEX

The Information Index (II) is the service queried by the Resource Broker to get information about resources for the submitted jobs during the matchmaking process. An II must hence be deployed for each RB/JSS instance.

This section describes the steps to be performed to install and configure the Information Index service.

4.3.1. Required software

For installing the II, apart from the informationindex rpm (see section 4.3.2 for details), the following Globus Toolkit 2.0 rpms are needed:

globus_libtool-gcc32dbg_rtl-1.4.i386.rpm
globus_openldap-gcc32dbg_pgm-2.0.14.i386.rpm
globus_openldap-gcc32dbg_rtl-2.0.14.i386.rpm
globus_gss_assist-gcc32dbg_rtl-2.0.i386.rpm
globus_openldap-gcc32dbgpthr_rtl-2.0.14.i386.rpm
globus_openssl-gcc32dbg_rtl-0.9.6b.i386.rpm
globus_cyrus_sasl-gcc32dbgpthr_rtl-1.5.27.i386.rpm
globus_ssl_utils-gcc32dbg_rtl-2.1.i386.rpm
globus_openssl-gcc32dbgpthr_rtl-0.9.6b.i386.rpm
globus_mds_back_giis-gcc32dbg_pgm-0.2.i386.rpm
globus_libtool-gcc32dbgpthr_rtl-1.4.i386.rpm
globus_cyrus_sasl-gcc32dbg_rtl-1.5.27.i386.rpm
globus_gssapi_gsi-gcc32dbg_rtl-2.0.i386.rpm

The above listed rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher). All the needed packages can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

4.3.2. RPM installation

In order to install the Information Index service, the following command has to be issued with root privileges:

rpm -ivh [--prefix <installdir>] informationindex.1.0.0-5.i386.rpm


By default the rpm installs the software in the “/opt/edg” directory. Using the --prefix directive, it is possible to install the software in a different location (i.e. in the <installdir> directory). Alternatively, the --relocate <oldpath>=<newpath> directive can be used to relocate an installation from <oldpath> to <newpath>.

4.3.3. The Installation tree structure

When the informationindex rpm has been installed, the following directory tree is created:

<install-path>/etc
<install-path>/etc/configure_workload
<install-path>/etc/grid-info-site-giis.conf
<install-path>/etc/grid-info-slapd-giis.conf
<install-path>/etc/workload.csh
<install-path>/etc/workload.sh
<install-path>/schema
<install-path>/schema/core.schema
<install-path>/schema/grid.ce.schema
<install-path>/schema/grid.globusversion.schema
<install-path>/schema/grid.gramscheduler.schema
<install-path>/schema/grid.infohost.schema
<install-path>/schema/grid.se.schema
<install-path>/schema/my.grid.common.schema
<install-path>/utils
<install-path>/utils/SXXII
<install-path>/var (empty dir)

The schema directory contains the schema files, utils contains the init.d-style start-up script SXXII and etc holds the configuration files. The var directory (initially empty) is used by the II to store the files created at start-up, which contain the arguments and the pid of the II process.

4.3.4. Configuration

The II has two configuration files that are located in <install-path>/etc and are named:

grid-info-slapd-giis.conf
grid-info-site-giis.conf

The grid-info-slapd-giis.conf file specifies the schema file locations and the database type, whilst grid-info-site-giis.conf lists the entries for the GRISes that are registered to this II. Each entry has the following format:

dn: service=register, dc=mi, dc=infn, dc=it, o=grid
objectclass: GlobusTop
objectclass: GlobusDaemon
objectclass: GlobusService
objectclass: GlobusServiceMDSResource
Mds-Service-type: ldap
Mds-Service-hn: bbq.mi.infn.it
Mds-Service-port: 2135
Mds-Service-Ldap-sizelimit: 20
Mds-Service-Ldap-ttl: 200
Mds-Service-Ldap-cachettl: 50
Mds-Service-Ldap-timeout: 30
Mds-Service-Ldap-suffix: o=grid

The field Mds-Service-hn specifies the GRIS address and Mds-Service-port specifies the GRIS port (2135 is strongly recommended), whilst the other entries are related to the ldap size limit, time-to-live, cache time-to-live and timeout. To add a new GRIS to the given II, it suffices to add a new entry like the one just shown to the grid-info-site-giis.conf file.
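The registration of an additional GRIS can be sketched as follows. This is an illustrative Python 3 helper, not a tool shipped with the informationindex rpm: the function name and its parameters are assumptions, while the attribute names and default values are copied from the sample entry in this section.

```python
def gris_entry(hostname, base_dn, port=2135, sizelimit=20, ttl=200,
               cachettl=50, timeout=30, suffix="o=grid"):
    """Render one GRIS registration entry in the format shown in this section.

    hostname -- address of the GRIS (Mds-Service-hn)
    base_dn  -- part of the dn following "service=register, " (site specific)
    """
    lines = [
        f"dn: service=register, {base_dn}",
        "objectclass: GlobusTop",
        "objectclass: GlobusDaemon",
        "objectclass: GlobusService",
        "objectclass: GlobusServiceMDSResource",
        "Mds-Service-type: ldap",
        f"Mds-Service-hn: {hostname}",
        f"Mds-Service-port: {port}",
        f"Mds-Service-Ldap-sizelimit: {sizelimit}",
        f"Mds-Service-Ldap-ttl: {ttl}",
        f"Mds-Service-Ldap-cachettl: {cachettl}",
        f"Mds-Service-Ldap-timeout: {timeout}",
        f"Mds-Service-Ldap-suffix: {suffix}",
    ]
    return "\n".join(lines) + "\n"
```

Calling `gris_entry("bbq.mi.infn.it", "dc=mi, dc=infn, dc=it, o=grid")` reproduces the sample entry; the resulting text would then be appended to grid-info-site-giis.conf by the administrator.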

Another file that can be used to configure the II is the start-up script <install-path>/utils/SXXII. This file specifies the port on which the II listens for requests; the default is 2170. This value can be changed to make the II listen on another port, provided that it matches the value of the MDS_port attribute in the RB configuration file rb.conf (see section 4.2.4.1).

4.3.5. Environment Variables

The only environment variable needed by the II to run is the Globus installation path GLOBUS_LOCATION, which is in any case set by the start-up script SXXII.


4.4. USER INTERFACE

This section describes the steps needed to install and configure the User Interface, which is the software module of the WMS allowing the user to access the main services made available by the components of the scheduling sub-layer.

4.4.1. Required software

For installing the UI, apart from the userinterface rpm (see section 4.4.2 for details), the following Globus Toolkit 2.0 rpms available at http://datagrid.in2p3.fr/distribution/globus are needed:

globus_gss_assist-gcc32dbgpthr_rtl-2.0-21
globus_gssapi_gsi-gcc32dbgpthr_rtl-2.0-21
globus_ssl_utils-gcc32dbgpthr_rtl-2.1-21
globus_gass_transfer-gcc32dbg_rtl-2.0-21
globus_openssl-gcc32dbgpthr_rtl-0.9.6b-21
globus_ftp_control-gcc32dbg_rtl-1.0-21
globus_user_env-noflavor_data-2.1-21
globus_gss_assist-gcc32dbg_rtl-2.0-21
globus_gssapi_gsi-gcc32dbg_rtl-2.0-21
globus_ftp_client-gcc32dbg_rtl-1.1-21
globus_ssl_utils-gcc32dbg_rtl-2.1-21
globus_ssl_utils-gcc32dbg_pgm-2.1-21
globus_gass_copy-gcc32dbg_rtl-2.0-21
globus_gsincftp-gcc32dbg_pgm-0.1-21
globus_openssl-gcc32dbg_rtl-0.9.6b-21
globus_common-gcc32dbg_rtl-2.0-21
globus_profile-edgconfig-0.9-1
globus_io-gcc32dbg_rtl-2.0-21
globus_core-edgconfig-0.6-2
obj-globus-1.0-4.edg
globus_cyrus_sasl-gcc32dbgpthr_rtl-1.5.27-21
globus_libtool-gcc32dbgpthr_rtl-1.4-21
globus_mds_common-gcc32dbg_pgm-2.2-21
globus_openldap-gcc32dbg_pgm-2.0.14-21
globus_openldap-gcc32dbgpthr_rtl-2.0.14-21
globus_core-gcc32dbg_pgm-2.1-21


Moreover, the Python interpreter, version 2.1.1, has to be installed on the submitting machine (this package can be found at www.python.org). The rpm for this package is available at http://datagrid.in2p3.fr/distribution/external/RPMS as:

python-2.1.1-3.i386.rpm

All the needed packages can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

4.4.2. RPM installation

In order to install the User Interface, the following command has to be issued with root privileges:

rpm -ivh [--prefix <installdir>] userinterface-1.0.0-6.i386.rpm

By default the rpm installs the software in the “/opt/edg” directory. Using the --prefix directive, it is possible to install the software in a different location (i.e. in the <installdir> directory). Alternatively, the --relocate <oldpath>=<newpath> directive can be used to relocate an installation from <oldpath> to <newpath>.


4.4.3. The tree structure

After the userinterface rpm has been installed, the following directory tree is created:

<install-path>/bin
<install-path>/bin/JobAdv.py
<install-path>/bin/JobAdv.pyc
<install-path>/bin/UIchecks.py
<install-path>/bin/UIchecks.pyc
<install-path>/bin/UIutils.py
<install-path>/bin/UIutils.pyc
<install-path>/bin/dg-job-cancel
<install-path>/bin/dg-job-get-logging-info
<install-path>/bin/dg-job-get-output
<install-path>/bin/dg-job-id-info
<install-path>/bin/dg-job-list-match
<install-path>/bin/dg-job-status
<install-path>/bin/dg-job-submit
<install-path>/bin/libRBapi.py
<install-path>/bin/libRBapi.pyc
<install-path>/etc
<install-path>/etc/UI_ConfigENV.cfg
<install-path>/etc/UI_Errors.cfg
<install-path>/etc/UI_Help.cfg
<install-path>/etc/configure_workload
<install-path>/etc/job_template.tpl
<install-path>/etc/workload.csh
<install-path>/etc/workload.sh
<install-path>/etc/workload_us_env.csh
<install-path>/etc/workload_ui_env.sh
<install-path>/lib
<install-path>/lib/libLBapi.a
<install-path>/lib/libLBapi.la
<install-path>/lib/libLBapi.so
<install-path>/lib/libLBapi.so.0
<install-path>/lib/libLBapi.so.0.0.0
<install-path>/lib/libLOGapi.a
<install-path>/lib/libLOGapi.la
<install-path>/lib/libLOGapi.so
<install-path>/lib/libLOGapi.so.0
<install-path>/lib/libLOGapi.so.0.0.0
<install-path>/lib/libRBapic.a
<install-path>/lib/libRBapic.la
<install-path>/lib/libRBapic.so
<install-path>/lib/libRBapic.so.0
<install-path>/lib/libRBapic.so.0.0.0
<install-path>/utils
<install-path>/utils/set_python

The bin directory contains all the UI python scripts, including the commands made available to the user. The lib directory holds the shared libraries wrapping the APIs, while etc contains the configuration and error files (UI_ConfigENV.cfg and UI_Errors.cfg), the help file (UI_Help.cfg) and a template of a job description in JDL (job_template.tpl). Finally, the <install-path>/utils/set_python script can be sourced to set the PYTHONPATH environment variable, which represents the library search path for python.

4.4.4. Configuration

Configuration of the User Interface is accomplished by editing the file “<install-path>/etc/UI_ConfigENV.cfg” to set the contained parameters appropriately. They are listed hereafter together with their meanings:

DEFAULT_STORAGE_AREA_IN defines the path of the directory where files coming from the RB (i.e. the job Output Sandbox files) are stored if not specified by the user through command options. The default value for this parameter is:

DEFAULT_STORAGE_AREA_IN = /tmp

requirements, rank and RetryCount represent the values that are assigned by the UI to the corresponding job attributes (mandatory attributes) if these have not been provided by the user in the JDL file describing the job. Default values are:

requirements = TRUE
rank = - other.EstimatedTraversalTime
RetryCount = 3

ErrorStorage represents the path of the location where the UI creates log files. The default location is:

ErrorStorage = /tmp

RetryCountLB and RetryCountJobId are the numbers of UI retries on fatal errors, respectively when opening a connection with an LB and when querying the LB for information about a given job. Default values for these parameters are:

RetryCountLB = 1
RetryCountJobId = 1

Moreover, there are two sections reserved for the addresses of the LBs and RBs that are accessible to the UI from the machine where it is installed. Special markers (e.g. %%beginLB%%), which must not be modified, indicate the beginning and the end of each section. An example of the two sections is reported hereafter:

%%beginLB%%
https://grid013g.cnaf.infn.it:7846
https://grid004f.cnaf.infn.it:7846
https://skurut.cesnet.cz:7846
%%endLB%%

%%beginRB%%
grid013g.cnaf.infn.it:7771
grid004f.cnaf.infn.it:7771
%%endRB%%

LB addresses must be in the format:

[<protocol>://]<hostname>:<port>

where, if not provided, the default for <protocol> is “https”, whilst RB addresses must be in the format:

<hostname>:<port>
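The address sections and formats described above can be read with a few lines of code. The sketch below is written in modern Python 3 and is not the parser used by the UI; it assumes that each marker and each address sits on its own line, and all function names are hypothetical.

```python
def read_section(text, name):
    """Return the non-empty lines between %%begin<name>%% and %%end<name>%%.

    Assumes (not stated in the guide) that markers occupy a line of their own.
    """
    begin, end = f"%%begin{name}%%", f"%%end{name}%%"
    inside = False
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line == begin:
            inside = True
        elif line == end:
            inside = False
        elif inside and line:
            entries.append(line)
    return entries


def normalize_lb(address):
    """Apply the documented default protocol (https) to an LB address."""
    return address if "://" in address else "https://" + address


def split_rb(address):
    """Split an RB address of the form <hostname>:<port>."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"not in <hostname>:<port> format: {address!r}")
    return host, int(port)
```

With the sample sections above, `read_section(config_text, "RB")` would return the two RB addresses, and `split_rb` applied to the first one gives the host name and the port 7771.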

4.4.5. Environment variables

Environment variables that have to be set for the User Interface, as required by GSI, are listed hereafter:

- X509_USER_KEY: the user private key file path. Default value is $HOME/.globus/userkey.pem
- X509_USER_CERT: the user certificate file path. Default value is $HOME/.globus/usercert.pem
- X509_CERT_DIR: the trusted certificate directory and ca-signing-policy directory. Default value is /etc/grid-security/certificates
- X509_USER_PROXY: the user proxy certificate file path. Default value is /tmp/x509up_u<UID>, where UID is the user identifier on the machine

Moreover there are:

- PYTHONPATH: Python modules import search path. It has to be set to <install path>/lib.
- EDG_WL_UI_INSTALL_PATH: UI install path. It has to be set only if the installation has been made in a non-default location. It defaults to /opt/edg
- EDG_WL_UI_CONFIG_PATH: non-standard location of the UI configuration file UI_ConfigENV.cfg. This variable points to the absolute path of the file.
- GLOBUS_LOCATION: the Globus rpms installation path.
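The defaults listed for the GSI variables can be summarised in a short sketch. It is written in modern Python 3 for illustration only; the function name and the returned dictionary keys are assumptions, while the variable names and default paths are those given in this section.

```python
import posixpath


def resolve_credentials(environ, home, uid):
    """Return the credential locations, applying the documented defaults.

    environ -- mapping of environment variables (e.g. os.environ)
    home    -- the user's home directory ($HOME)
    uid     -- the numeric user identifier on the machine
    """
    globus_dir = posixpath.join(home, ".globus")
    return {
        "key": environ.get("X509_USER_KEY")
        or posixpath.join(globus_dir, "userkey.pem"),
        "cert": environ.get("X509_USER_CERT")
        or posixpath.join(globus_dir, "usercert.pem"),
        "cert_dir": environ.get("X509_CERT_DIR")
        or "/etc/grid-security/certificates",
        "proxy": environ.get("X509_USER_PROXY") or f"/tmp/x509up_u{uid}",
    }
```

A typical call would be `resolve_credentials(os.environ, os.path.expanduser("~"), os.getuid())`.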


The Logging library, i.e. the library that is linked into the UI for logging the job transfer events, reads its immediate logging destination from the variable DGLOG_DEST. The correct format for this variable is:

DGLOG_DEST=x-dglog://HOST:PORT

where HOST defaults to localhost and PORT defaults to 15830. On the submitting machine, if the variable is not set, it is dynamically assigned by the UI with the value:

DGLOG_DEST=x-dglog://<LB_CONTACT>:15830

where LB_CONTACT is the hostname of the machine running the LB server currently associated with the RB used for submitting jobs.
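The DGLOG_DEST format can be illustrated with a small parser. This is a sketch in modern Python 3, not the code of the Logging library; in particular, the way an omitted HOST or PORT is written (an empty host, or a missing ":PORT" part) is an assumption, and the function names are hypothetical.

```python
DGLOG_PREFIX = "x-dglog://"
DEFAULT_HOST = "localhost"   # documented default for HOST
DEFAULT_PORT = 15830         # documented default for PORT


def parse_dglog_dest(value):
    """Split a DGLOG_DEST value into (host, port), applying the defaults."""
    if not value.startswith(DGLOG_PREFIX):
        raise ValueError(f"expected {DGLOG_PREFIX}HOST:PORT, got {value!r}")
    host, _, port = value[len(DGLOG_PREFIX):].partition(":")
    return host or DEFAULT_HOST, int(port) if port else DEFAULT_PORT


def default_dglog_dest(lb_contact):
    """Value assigned by the UI when DGLOG_DEST is not set."""
    return f"{DGLOG_PREFIX}{lb_contact}:{DEFAULT_PORT}"
```

For instance, `default_dglog_dest("skurut.cesnet.cz")` builds the value the UI would assign when the LB server runs on that host.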


5. OPERATING THE SYSTEM

5.1. LB LOCAL-LOGGER

5.1.1. Starting and stopping daemons

To run the LB local-logger services, both the dglogd and the interlogger processes must be started. This can be done by issuing the following commands:

<install path>/sbin/dglogd <options>
<install path>/sbin/interlogger <options>

Both daemons recognize a common set of options:

--key=<keyfile>: host certificate private key file (this option overrides the value of the environment variable X509_USER_KEY). Example of usage: --key=/etc/grid-security/hostkey.pem

--cert=<certfile>: host certificate file (this option overrides the value of the environment variable X509_USER_CERT). Example of usage: --cert=/etc/grid-security/hostcert.pem

--CAdir=<certdir>: trusted certificate and ca-signing-policy directory (this option overrides the value of the environment variable X509_CERT_DIR). Example of usage: --CAdir=/etc/grid-security/certificates

--file-prefix=<file path>: absolute path of the file where the logged events are stored locally. The default value is /tmp/dglog, which can result in a risk of data loss in case of reboot. Note that the same value must be specified for dglogd and interlogger.

--debug: makes the process run in the foreground to produce diagnostics

Using the options explicitly is recommended rather than relying on the corresponding environment variables.

The LB local-logger services can be stopped by killing the dglogd and interlogger daemons with the “kill -TERM <PID>” OS command.
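Since dglogd and interlogger must receive the same --file-prefix value, it can be convenient to build both command lines from one set of options. The following Python 3 sketch only assembles the argument lists; the function name is hypothetical and the flags are the ones documented above.

```python
def local_logger_commands(install_path, key, cert, ca_dir, file_prefix):
    """Build the dglogd and interlogger argument lists with shared options.

    Both daemons get identical option values, so the --file-prefix
    requirement stated above is satisfied by construction.
    """
    shared = [
        f"--key={key}",
        f"--cert={cert}",
        f"--CAdir={ca_dir}",
        f"--file-prefix={file_prefix}",
    ]
    return [
        [f"{install_path}/sbin/dglogd"] + shared,
        [f"{install_path}/sbin/interlogger"] + shared,
    ]
```

The resulting lists could be passed to `subprocess.Popen`; the file prefix used in any real deployment should be a location that survives a reboot, unlike the /tmp/dglog default.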


5.1.2. Troubleshooting

If the LB local-logger services are started in debug mode (i.e. using the --debug option), the daemons log fatal failures with syslog().


5.2. LB SERVER

5.2.1. Starting and stopping daemons

To run the LB server services, both the bkserver and the ileventd processes must be started. This can be done by issuing the following commands:

<install path>/sbin/ileventd <options>
<install path>/sbin/bkserver <options>

Both daemons recognize a common set of options:

--key=<keyfile>: host certificate private key file (this option overrides the value of the environment variable X509_USER_KEY). Example of usage: --key=/etc/grid-security/hostkey.pem

--cert=<certfile>: host certificate file (this option overrides the value of the environment variable X509_USER_CERT). Example of usage: --cert=/etc/grid-security/hostcert.pem

--CAdir=<certdir>: trusted certificate and ca-signing-policy directory (this option overrides the value of the environment variable X509_CERT_DIR). Example of usage: --CAdir=/etc/grid-security/certificates

--debug: makes the process run in the foreground to produce diagnostics

Using the options explicitly is recommended rather than relying on the corresponding environment variables.

The LB server services can be stopped by killing the ileventd and bkserver daemons with the “kill -TERM <PID>” OS command.

5.2.2. Purging the LB database

The bkpurge process, whose executable is installed in <install path>/sbin, is not a daemon but a utility that should be run periodically (e.g. using a cron job) in order to remove inactive jobs (i.e. those that entered the Cleared status more than a given amount of time ago) from the LB database. This utility recognizes the following set of options:

--log: data being purged from the database are dumped to stdout

--outfile=<file>: data being purged from the database are dumped to the file named <file>

--mysql=<database>: name of the database to be purged. It must be the same used by bkserver (this option is not required in the standard set-up)

--timeout=<timeout>[smhd]: removes data for all jobs that entered the “Cleared” status more than <timeout> [seconds/minutes/hours/days] ago

--debug: prints diagnostics to stderr

--nopurge: dry run mode. It does not really purge (useful for debugging purposes)

--aborted, -a: also deletes from the database the data for jobs that have entered the “Aborted” status

If --log is specified, the data are dumped in ULM format to stdout (or to <file> when --outfile is given). Normally information is appended to the file. The file is locked with flock (_LOCK_EX) to prevent race conditions, e.g. with log rotation.

An example of usage of this utility is to issue once a day, using a cron job, a bkpurge command like:

bkpurge --log --outfile=/var/log/dglb-data.log --timeout=14d
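To make the --timeout=<timeout>[smhd] notation concrete, the sketch below converts such a value into seconds. It is an illustrative Python 3 helper and not part of bkpurge; it deliberately rejects values without a unit suffix, because the guide does not say which unit bkpurge assumes in that case.

```python
import re

# Seconds per unit letter, following the [seconds/minutes/hours/days] legend.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_seconds(spec):
    """Convert a value such as '14d' or '36h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", spec)
    if match is None:
        raise ValueError(f"expected <number>[smhd], got {spec!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
```

With the example above, `timeout_seconds("14d")` evaluates to 1209600, i.e. jobs cleared more than two weeks ago are purged.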

5.2.3. Troubleshooting

If the LB server services are started in debug mode (that is, using the --debug option), the daemons log fatal failures with syslog().


5.3. RB AND JSS

5.3.1. Starting PostgreSQL

To start PostgreSQL the following commands have to be issued:

su - postgres
postgres> nohup $PGSQL_INSTALL_PATH/bin/postmaster -i -D $PGDATA &
postgres> exit

To check if the database is running, issue:

ps -awux | grep postmaster

and check that the postmaster process is listed.

5.3.2. Starting Condor-G

To start Condor-G, the following command has to be issued as dguser (see section 4.2.1.1 for details on the creation of this account):

condor_master

As before, to check if Condor-G is running use the command:

ps -auwx | grep condor

and check that the condor_master and condor_schedd processes are listed.

The script that starts the JSS (see section 5.3.4) also checks whether Condor-G is running, and starts it if necessary. It is in any case important to remember that RB, JSS and Condor-G have to be started from the same account.
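The manual check of the ps output can also be scripted. The sketch below, in modern Python 3, inspects text captured from ps and ignores the line of the grep command itself, which would otherwise always match; the function name is hypothetical and the sample lines in the usage note are invented for illustration.

```python
def is_listed(ps_output, process_name):
    """Return True if a process with the given name appears in ps output.

    Lines belonging to the 'grep' command used for filtering are skipped,
    since they contain the searched name without the process running.
    """
    for line in ps_output.splitlines():
        if process_name in line and "grep" not in line:
            return True
    return False
```

Given the captured output of `ps -auwx`, `is_listed(output, "condor_master") and is_listed(output, "condor_schedd")` expresses the Condor-G check described above, and `is_listed(output, "postmaster")` the PostgreSQL one.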

5.3.3. Starting and stopping RB daemons

The RB server can be started through a shell script (broker) by issuing the following command as user dguser (see section 4.2.1.1 for details on the creation of this account):

<install path>/sbin/broker start


For starting the RB as dguser (recommended choice), key and certificate files must be available to the dguser user account and pointed to by the X509_HOST_KEY and X509_HOST_CERT environment variables. In this case the broker script also creates a symbolic link .gridmap in $HOME pointing to /etc/grid-security/grid-mapfile, which is needed by the RB. Otherwise, if the “default” key and certificate files are used (i.e. respectively /etc/grid-security/hostkey.pem and /etc/grid-security/hostcert.pem), the RB must be started as root.

It is possible to choose the location where the RB log file (default path name: /var/tmp/RBserver.log) is created by customizing the following two lines of start_RB:

mv /var/tmp/RBserver.log /var/tmp/RBserver.log.old
/opt/edg/bin/RBserver > /var/tmp/RBserver.log 2>&1 &

The RB can be stopped by invoking the script:

<install path>/sbin/broker stop

It is important to remember that RB, JSS and Condor-G have to be started from the same account.

5.3.4. Starting and stopping JSS daemons

The Job Submission Service daemon is started by launching the jobsubmission shell script as user dguser (see section 4.2.1.1 for details on the creation of this account):

<install path>/sbin/jobsubmission start

For starting the JSS as dguser (recommended choice), key and certificate files must be available to the dguser user account and pointed to by the X509_HOST_KEY and X509_HOST_CERT environment variables. Otherwise, if the “default” key and certificate files are used (i.e. respectively /etc/grid-security/hostkey.pem and /etc/grid-security/hostcert.pem), the JSS must be started as root.

It is possible to choose the location where the JSS log files (default paths are: /var/tmp/JSSserver.log and /var/tmp/JSSparser.log) are created by customizing the following lines of start_JobSubmission:

mv /var/tmp/JSSserver.log /var/tmp/JSSserver.log.old
mv /var/tmp/JSSparser.log /var/tmp/JSSparser.log.old
/opt/edg/bin/JSSserver > /var/tmp/JSSserver.log 2>&1 &
/opt/edg/bin/JSSparser > /var/tmp/JSSparser.log 2>&1 &

The JSS can be stopped by invoking the script:


<install path>/sbin/jobsubmission stop

It is important to remember that RB, JSS and Condor-G have to be started from the same account.

5.3.5. RB troubleshooting

As reported in section 5.3.3, the script responsible for starting the RB includes the definition of a log file where the RB logs the various events; this file can therefore be used to debug abnormal behaviour of the RB. The default pathname for this file is: /var/tmp/RBserver.log

5.3.6. JSS troubleshooting

As reported in section 5.3.4, the script responsible for starting the JSS includes the definition of the JSS log files; these files can therefore be checked to analyse the JSS behaviour. The default pathnames for these files are: /var/tmp/JSSserver.log and /var/tmp/JSSparser.log.

The CondorG log files can also help to debug problems related to the JSS. Moreover, there are some temporary files created by the JSS that can be checked if something goes wrong; their locations are given by the values of the variables Condor_submit_file_prefix, Job_wrapper_file_prefix and Condor_stdoe_dir defined in the jss.conf file.

5.4. INFORMATION INDEX

5.4.1. Starting and stopping daemons

To start/stop the II, the following command has to be used:

/opt/edg/utils/SXXII {start | stop}


6. USER GUIDE

The software module of the WMS allowing the user to access the main services made available by the components of the scheduling sub-layer is the User Interface, which hence represents the entry point to the whole system. Sections 6.1.1 and 6.1.2 provide a general description of the UI, dealing with security management, common behaviours, environment variables to be set etc. Section 6.1.3 instead describes the Job Submission User Interface commands in a Unix man-page style.

6.1. USER INTERFACE

The Job Submission UI is the module of the WMS allowing the user to access the main services made available by the components of the scheduling sub-layer. For PM9, the user interacts with the system by means of a JDL and a command-driven user interface providing commands to perform a certain set of basic operations. The main operations made possible by the UI are:

- Submit a job for execution on a remote Computing Element, also encompassing:
  - automatic resource discovery and selection
  - staging of the application sandbox (input sandbox)
- Find the list of resources suitable to run a specific job
- Cancel one or more submitted jobs
- Retrieve the output files of a completed job (output sandbox)
- Retrieve and display bookkeeping information about submitted jobs
- Retrieve and display logging information about submitted jobs.

The User Interface depends on two other Workload Management System components:

- the Resource Broker, which provides support for the job control functionality
- the Logging and Bookkeeping Service, which provides support for the job monitoring functionality.

6.1.1. Security For the DataGrid to be an effective framework for largely distributed computation, users, user processes and grid services must work in a secure environmentDue to this, all interactions between WMS components, especially those that are network-separated, will be mutually authenticated: depending on the specific interaction, an entity au-thenticates itself to the other peer using either its own credential or a delegated user creden-tial or both. For example when the User Interface passes a job to the Resource Broker, the UI authenticates using a delegated user credential (a proxy certificate) whereas the RB uses its own service credential. The same happens when the UI interacts with the Logging and Bookkeeping service. The UI uses a delegated user credential to limit the risk of compromis-ing the original credential in the hands of the user.The user or service identity and their public key are included in a X.509 certificate signed by a DataGrid trusted Certification Authority (CA), whose purpose is to guarantee the associa-tion between that public key and its ownerAccording to what just premised, to take advantage of UI commands the user has to possess a valid X.509 certificate on the submitting machine, consisting of two files: the certificate file and the private key file. The location of the two mentioned files is assumed to be either pointed to respectively by “$X509_USER_CERT” and “$X509_USER_KEY” or by “$HOME/.globus/usercert.pem” and “$HOME/.globus/userkey.pem” if the X509 environment variables are not set. The user certificate and private key files are needed for the creation of

IST-2000-25182 PUBLIC 50 / 97

Page 51: WMS SW Admin and User Guideserver11.infn.it/workload-grid/docs/aDataGrid-01-TEN... · Web viewTitle WMS SW Admin and User Guide Subject Installation, Configuration and Usage of WMS

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

(PM9 Release)

Doc. Identifier:

DataGrid-01-TEN-0118-0_2

Date: 22/01/2002

the delegated user credentials and for reading the user certificate subject that acts as an identifier for the submitter. All UI commands, when started, check for the existence and expiration date of a user proxy certificate in the location pointed to by “$X509_USER_PROXY” or in “/tmp/x509up_u<UID>” (<UID> is the user identifier in the submitting machine OS) if the X509 environment variable is not set. If the proxy certificate does not exist or has expired a new one with default duration of 24 hours is automatically created by the UI using the GSI services (grid-proxy-init and grid-proxy-info). The user proxy certificate is created either as “$X509_USER_PROXY” or as “/tmp/x509up_u<UID>”. Once a job has been submitted by the UI, it passes through several components of the WMS (e.g. the RB, the JSS etc.) before it completes its execution. At each step operations that are related with the job could require authentication by a certificate. For example during the scheduling phase, the RB needs to get some information about the user who wants to sched-ule a job. An authentication by a certificate of the user could be needed to access this infor-mation. Similarly, a valid user’s certificate is needed by JSS to submit a job to CE. JSS has to be able to repeat this process e.g. in case of crashing the CE which the job is running on, therefore, a valid user’s certificate is needed for all the job lifetime.A job gets a valid proxy certificate when it is submitted by the UI to RB. Validity of such a cer-tificate is usually set to 24 hours, hence problems could occur if the job spends on CE (in a queue or running) more time than lifetime of its proxy certificate.Since for PM9 release no mechanism for periodic credential renewal will be provided, the UI dg-job-submit command (see description later in this document) supplies an option (-hours H) allowing the specification of the duration in hours of the proxy certificate that is created on behalf of the user. 
Because of this, although the search paths for the certificate files remain as before, the proxy checking mechanism for this command differs slightly from that of the other commands:

- If the “-hours H” option has not been specified, the proxy certificate check is done as explained before.
- If the “-hours H” option has been specified, a new proxy certificate with a duration of H hours is created both when no existing proxy is found and when the existing proxy lifetime is less than H. In the latter case the existing proxy certificate is destroyed before creating the new one.
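The decision described by the two cases above can be sketched as a small function. This is only an illustrative sketch, not the UI implementation: the function name proxy_action, its return labels and the use of hours as numeric values are assumptions introduced here. The text does not state what happens when the remaining lifetime equals H exactly; the sketch reuses the proxy in that case.

```python
from typing import Optional


def proxy_action(remaining_hours: Optional[float],
                 requested_hours: Optional[float] = None) -> str:
    """Return 'reuse', 'create' or 'replace' for the proxy check.

    remaining_hours is None when no proxy file exists; a value <= 0 means
    the proxy has expired. requested_hours is the -hours H value, or None
    when the option was not given. (Hypothetical helper, for illustration.)
    """
    if requested_hours is None:
        # Standard check: only a missing or expired proxy triggers creation.
        if remaining_hours is None or remaining_hours <= 0:
            return "create"
        return "reuse"
    if remaining_hours is None:
        # No proxy found: create one lasting H hours.
        return "create"
    if remaining_hours < requested_hours:
        # The existing proxy is destroyed before the new one is created.
        return "replace"
    return "reuse"
```

For instance, a proxy with 10 hours left is reused by an ordinary command, but is replaced when dg-job-submit is called with -hours 48.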

6.1.2. Common behaviours

A User Interface installation mainly consists of three directories, bin, lib and etc, that are created under the UI installation path, which is usually pointed to by the EDG_WL_UI_INSTALL_PATH environment variable. If this variable is not set or its value is not correct, the default value “/opt/edg” is assumed.

bin contains the command executables, hence it is recommended to add it to the user PATH environment variable so that UI commands can be used from any location. lib contains the shared libraries (wrappers of the RB/LB APIs) implementing the functionality for accessing the RB and LB services, whereas etc is the UI configuration area. The shared libraries stored in lib are imported at run-time by the UI commands, which search for them in the locations pointed to by the PYTHONPATH environment variable. By default PYTHONPATH encompasses only some standard Python locations, hence “$EDG_WL_UI_INSTALL_PATH/lib” has to be added to the search path.


The UI configuration area etc contains the job description template file job_template.jdl, the file containing the mapping between error codes and error messages UI_Errors.cfg, and the actual configuration file UI_ConfigEnv.cfg. The latter is the only file that may need to be edited and tailored to the user/platform characteristics and needs. It contains the following information, which is read by the commands and influences their behaviour (see section 4.4.4 for details):

- address and port of accessible RBs ordered by priority,
- address and port of accessible LBs ordered by priority,
- default location of the local storage areas for the Input/Output sandbox files,
- default values for the JDL mandatory attributes,
- default number of retrials on fatal errors when connecting to the LB.

When started, UI commands first check whether EDG_WL_UI_INSTALL_PATH is set and then search for the etc directory containing the configuration files in the following locations, in order of precedence: “$EDG_WL_UI_INSTALL_PATH”, “/”, “/usr/local” and “/opt/edg”. If none of the locations contains the needed files, an error is returned to the user.

Since several users on the same machine can use a single installation of the UI, people concurrently issuing UI commands share the same configuration files. Users (or groups of users) with particular needs can nevertheless “customise” the UI configuration through the -config option supported by each UI command. Every command launched with “-config path_name” reads its configuration settings from the file “path_name” instead of the default configuration file. Hence the user only needs to create such a file according to her/his needs and use the -config option to work under “private” settings.

Moreover, if the user wants to make this change permanent without passing the -config option to every command, she/he can set the environment variable EDG_WL_UI_CONFIG_PATH to point to the non-standard path of the configuration file. If that variable is set, commands read their settings from the file “$EDG_WL_UI_CONFIG_PATH”. The -config option, however, takes precedence over all other settings.

The options common to all UI commands (with the exception of dg-job-id-info, which is a local utility) are listed below:

- -config path_name
- -noint
- -debug
- -version
- -help
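The precedence rules for locating the configuration file described in this section (the -config option first, then EDG_WL_UI_CONFIG_PATH, then the etc directory under the search locations) can be sketched as follows. The function name resolve_config and the injectable exists check are assumptions made for this illustration, and the sketch does not verify that a path given through -config or the environment variable actually exists.

```python
import os
import posixpath
from typing import Callable, Mapping, Optional

# Fixed search locations tried after $EDG_WL_UI_INSTALL_PATH, in order.
SEARCH_ROOTS = ["/", "/usr/local", "/opt/edg"]
CONFIG_NAME = "UI_ConfigEnv.cfg"


def resolve_config(cli_config: Optional[str],
                   env: Mapping[str, str],
                   exists: Callable[[str], bool] = os.path.isfile) -> str:
    """Return the configuration file path a command would read (sketch)."""
    if cli_config:
        # -config takes precedence over all other settings.
        return cli_config
    if env.get("EDG_WL_UI_CONFIG_PATH"):
        return env["EDG_WL_UI_CONFIG_PATH"]
    roots = []
    if env.get("EDG_WL_UI_INSTALL_PATH"):
        roots.append(env["EDG_WL_UI_INSTALL_PATH"])
    roots.extend(SEARCH_ROOTS)
    for root in roots:
        candidate = posixpath.join(root, "etc", CONFIG_NAME)
        if exists(candidate):
            return candidate
    raise FileNotFoundError("no %s found under: %s" % (CONFIG_NAME, ", ".join(roots)))
```

A caller would typically pass os.environ as env; the exists parameter is there only so that the lookup order can be exercised without touching the file system.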

The -noint option skips all interactive questions to the user and goes ahead with the command execution. All warning messages and errors (if any) are written to the file <command_name>_<UID>_<PID>.log in the “/tmp” directory instead of the standard output. It is important to note that when -noint is specified some checks on “dangerous actions” are skipped. For example, if job cancellation is requested with this option, the action is performed without asking the user for confirmation. The same applies if the command output would overwrite an existing file, so it is recommended to use the -noint option only in a safe context.
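The log file naming convention mentioned above can be illustrated with a short helper. The function name log_file_path is hypothetical; only the <command_name>_<UID>_<PID>.log pattern and the “/tmp” default come from the text.

```python
import posixpath


def log_file_path(command_name: str, uid: int, pid: int,
                  log_dir: str = "/tmp") -> str:
    """Build the <command_name>_<UID>_<PID>.log path (illustrative helper)."""
    return posixpath.join(log_dir, "%s_%d_%d.log" % (command_name, uid, pid))
```

On a POSIX system the two numeric arguments would normally be obtained with os.getuid() and os.getpid().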


The -debug option is mainly intended for testing and debugging purposes: it makes the commands print additional information while running. Every time an external API function call is encountered during the command execution, the values of the parameters passed to the API are printed. The info messages are displayed on the standard output and are also written, together with possible errors, to the <command_name>_<UID>_<PID>.log file in the /tmp directory. An example of the debug message format is as follows:

    #### Debug API #### - The function 'dgLBJobStatus' has been called with the following parameter(s):
    >>Struct 'dgLBContext':
    -> 0
    -> 0
    >>Struct 'dgJobId':
    -> lx01.hep.ph.ic.ac.uk/124445102160554
    -> grid004f.cnaf.infn.it
    -> 7846
    -> grid013g.cnaf.infn.it:7771
    >> 0

If the -noint option is specified together with the -debug option, the debug messages are not printed on the standard output.

The -version and -help options make the commands display the current UI version and the command usage respectively.

Two further options that are common to almost all commands are -input and -output. The latter makes the commands redirect the outcome to the file specified as option argument, whilst the former reads a list of input items from the file given as option argument. The only exception is the dg-job-list-match command, which does not have the -input option.

For all commands, the file given as argument to the -input option shall contain a list of job identifiers in the following format: one dg_jobId per line, with comment lines beginning with a “#” or a “*” character. If the input file contains only one dg_jobId (see the description of the dg-job-submit command later in this document for details about the dg_jobId format), the request is directly submitted taking that dg_jobId as input; otherwise a menu is displayed to the user listing all the contained items, i.e. something like:

    ----------------------------------------------------------------------
    1 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133711137156527?grid013g.cnaf.infn.it:7781
    2 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133747137833158?grid013g.cnaf.infn.it:7781
    3 : https://grid004f.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133957138124219?grid004f.cnaf.infn.it:7771
    4 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/134030138239274?grid013g.cnaf.infn.it:7771
    5 : https://grid001f.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/140706140477638?grid013g.cnaf.infn.it:7771
    a : all
    q : quit
    ----------------------------------------------------------------------
    Choose one or more dg_jobId(s) in the list - [1-10]all:

The user can choose one or more jobs from the list entering the corresponding numbers. E.g.:


- 2 makes the command take the second listed dg_jobId as input
- 1,4 makes the command take the first and the fourth listed dg_jobIds as input
- 2-5 makes the command take the listed dg_jobIds from 2 to 5 (ends included) as input
- all makes the command take all listed dg_jobIds as input
- q makes the command quit

The default value for the choice is all. If the -input option is used together with -noint, all dg_jobIds contained in the input file are taken into account by the command.

The only command whose -input behaviour differs from the one just described is dg-job-submit. First of all, the input file contains in this case CEIds instead of dg_jobIds; moreover, only one CE at a time can be the target of a submission, hence the user is allowed to choose one and only one CEId. The default value for the choice is “1”, i.e. the first CEId in the list. This is also the choice automatically made by the command when the -input option is used together with -noint.
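How an answer to the dg_jobId selection menu could be interpreted is sketched below. The function parse_selection is a hypothetical helper, not part of the UI; accepting both “a” and “all”, treating an empty answer as the default all, and allowing mixed forms such as “1,3-4” are assumptions of this sketch.

```python
from typing import List, Optional


def parse_selection(answer: str, count: int) -> Optional[List[int]]:
    """Turn a menu answer into 1-based item numbers; None means quit."""
    text = answer.strip().lower()
    if text in ("", "a", "all"):
        # An empty answer selects the default choice, i.e. all items.
        return list(range(1, count + 1))
    if text in ("q", "quit"):
        return None
    chosen: List[int] = []
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            # Range such as 2-5, ends included.
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            low = high = int(part)
        if not 1 <= low <= high <= count:
            raise ValueError("selection out of range: %r" % part)
        for index in range(low, high + 1):
            if index not in chosen:
                chosen.append(index)
    return chosen
```

Malformed answers raise ValueError, so a caller can simply ask the question again.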


6.1.3. Commands description

This section describes the syntax and behaviour of the commands made available by the UI to allow job submission, monitoring and control.

In the command synopses the mandatory arguments are shown between angle brackets (<arg>) whilst the optional ones are shown between square brackets ([arg]). The pipe character “|” between options/arguments indicates mutually exclusive options/arguments.

– dg-job-submit

Allows the user to submit a job for execution on remote resources in a grid.

SYNOPSIS

dg-job-submit [-help]
dg-job-submit [-version]
dg-job-submit [-template]
dg-job-submit <job_description_file> [-input input_file | -resource ce_id] [-notify e_mail_address] [-config path_name] [-output out_file] [-hours H] [-nomsg] [-noint] [-debug]

DESCRIPTION

dg-job-submit is the command for submitting jobs to the DataGrid and hence allows the user to run a job at one or several remote resources. dg-job-submit requires as input a job description file in which job characteristics and requirements are expressed by means of Condor class-ad-like expressions. While the order of the other arguments does not matter, the job description file has to be the first argument of this command.

The job description file given as input to this command is syntactically checked, and default values are assigned to some of the mandatory attributes that were not provided, in order to create a meaningful class-ad. The resulting job-ad is sent to the Resource Broker, which finds the resource best matching the job (match-making) and submits the job to it. The match-making algorithm is described in detail in Annex 7.5.

Upon successful completion this command returns to the user the submitted job identifier dg_jobId (a string that unambiguously identifies the job in the whole DataGrid), generated by the User Interface, which can later be used as a handle to perform monitoring and control operations on the job (e.g. see dg-job-status described later in this document). The format of the dg_jobId is as follows:

    <LBname>/<UIaddress>/<time><PID><RND>?<RBname>

where:

- LBname is the LB server name and port
- UIaddress is the UI machine IP address (or FQDN)
- time is the current UTC time on the submitting machine in hhmmss format
- PID is the command process identifier
- RND is a random number generated at each job submission
- RBname is the RB server hostname and port
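As an illustration of this layout, a dg_jobId can be split into its parts as sketched below. The DgJobId type and the parse_dg_jobid function are hypothetical names used only for this sketch. Because PID and RND have no fixed width, the sketch keeps <time><PID><RND> as a single string; only its leading six digits can be read back as the hhmmss time.

```python
from typing import NamedTuple


class DgJobId(NamedTuple):
    lb_name: str      # LB server name and port, e.g. https://host:7846
    ui_address: str   # IP address or FQDN of the submitting UI machine
    unique: str       # the <time><PID><RND> digits, kept as one string
    rb_name: str      # RB server hostname and port


def parse_dg_jobid(job_id: str) -> DgJobId:
    """Split <LBname>/<UIaddress>/<time><PID><RND>?<RBname> into fields."""
    head, separator, rb_name = job_id.partition("?")
    if not separator or not rb_name:
        raise ValueError("missing '?<RBname>' part: %r" % job_id)
    # Split from the right, since LBname may itself contain '/' (https://...).
    # Unpacking raises ValueError when fewer than three fields are present.
    lb_name, ui_address, unique = head.rsplit("/", 2)
    if not unique.isdigit() or len(unique) < 6:
        raise ValueError("unexpected unique string: %r" % unique)
    return DgJobId(lb_name, ui_address, unique, rb_name)
```

Applied to the identifier https://grid004f.cnaf.infn.it:7846/155.198.211.205/161251122764136?grid004f.cnaf.infn.it:7771, the unique field is 161251122764136 and its first six digits give the submission time 16:12:51 UTC.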

The structure of the dg_jobId, which may appear complex and not easily readable, has been conceived to ensure uniqueness and at the same time to contain the information needed by the components of the WMS to fulfil user requests.

The -resource option can be used to target the job submission to a specific known resource identified by the provided Computing Element identifier ce_id (returned by dg-job-list-match, described later in this document). For PM9 a resource will be either a queue of an underlying LRMS, assuming that this queue represents a set of “homogeneous” resources, or a “single” node. The CE identifier is a string, assigned by WP4 and published in the GIS (the CEId field), that uniquely identifies a resource belonging to the Grid. CEId is obtained by “combining” the GlobusResourceContactString and QueueName attributes, e.g. if lxde01.pd.infn.it:2119 is the Globus resource contact string and grid01 is the queue name, then it looks like lxde01.pd.infn.it:2119/jobmanager-lsf-grid01.

When the -resource option is specified, the Resource Broker skips the match-making process and directly submits the job to the requested CE.

It is also possible to specify the target CE for the submission using the -input option. With the -input option an input_file must be supplied containing a list of target CE Ids. In this case the dg-job-submit command parses the input_file and displays on the standard output the list of CE Ids written in it. The user is then asked to choose one CEId among the listed ones. The command then behaves exactly as already explained for the -resource option. The basic idea is to use as input_file the output file generated by the dg-job-list-match command when used with the -output option (see dg-job-list-match), which contains the list of CE Ids (if any) matching the requirements specified in the jobad.jdl file.
An example of a possible sequence of commands is:

    $> dg-job-list-match jobad.jdl -output CEList.out
    $> dg-job-submit jobad.jdl -input CEList.out

If CEList.out contains more than one CEId, the user is prompted to choose one Id from the list.

When dg-job-submit is used with the -notify option, the following scheme is used to notify the user about job status changes:

- an e-mail notification is sent to the specified e_mail_address when the match-making process has finished and the job is ready to be submitted to JSS (READY status)
- an e-mail notification is sent to the specified e_mail_address when the job starts running on the CE (RUNNING status)
- an e-mail notification is sent to the specified e_mail_address when the job has finished (ABORTED or DONE status).

The notification message contains basic information about the job such as the job identifier, the Id of the assigned CE and a brief description of its status.

It is possible to redirect the returned dg_jobId to an output file using the -output option. If the file already exists, a check is performed: if the file was previously created by the dg-job-submit command (i.e. it contains a well-defined header), the returned dg_jobId is appended to the existing file every time the command is launched. If the file was not created by the dg-job-submit command, the user is prompted to choose whether to overwrite the file or not. If the answer is no, the command aborts.

The dg-job-submit command has a particular behaviour when the job description file contains the InputSandbox attribute, whose value is a list of file paths on the UI machine local disk. The purpose of the InputSandbox attribute is to stage, from the UI to the CE, files that are not available in any SE and are not published in any Replica Catalogue.

To better understand, suppose a job needs for its execution a certain set of small files that are available on the submitting machine. Suppose also that for performance reasons it is preferable not to go through the WP2 data transfer services for staging these files to the executing node. Then the user can use the InputSandbox attribute to specify the files that have to be staged from the submitting machine to the executing CE. All of them are transferred at job submission time, together with the job class-ad, to the RB, which stores them temporarily on its local disk. The JSS then performs the staging of these files to the executing node. The size of the files to be transferred to the RB should be small, since once the RB local storage is full no more jobs of this type can be submitted.

This mechanism can also be used to stage a job executable available locally on the UI machine to the executing CE. In this case the user has to include this file in the InputSandbox list (specifying its absolute path in the file system of the UI machine) and has to specify only the file name as the Executable attribute value. On the contrary, if the executable is already available in the file system of the executing machine, the user has to specify as Executable an absolute path name for this file (if necessary using environment variables).
The same argument applies to the standard input file, which is specified through the StdInput JDL attribute.

For the standard output and error of the job the user shall instead always specify just file names (without any directory path) through the StdOutput and StdError JDL attributes. To have them staged back to the UI machine it suffices to list them in the OutputSandbox and, after job completion, use the dg-job-get-output command described later in this document.

The list of data specification JDL attributes is completed by the InputData attribute, which refers to data used as input by the job that are not subject to staging and are stored in one or more storage elements and published in replica catalogues. Because of this, when the user specifies the InputData attribute she/he also has to provide the name of the replica catalogue (ReplicaCatalog attribute) where these data are published and the protocol her/his application is able to “speak” for accessing data (DataAccessProtocol attribute). The InputData attribute should normally contain a list of logical and/or physical file names. If InputData only contains PFNs, the ReplicaCatalog attribute is no longer mandatory.

Since the InputSandbox expression can consist of a great number of file names, wildcards and environment variables may be used to specify the value of this attribute. Syntax and allowed wildcards are described in Annex 7.4.

The -hours option allows the user to specify the user proxy duration H, in hours, needed for submitting the job. This option has to be used for long-lasting jobs: a submitted job needs to be accompanied by a valid proxy certificate during its whole lifetime, and the default duration of the user proxy created by UI commands is 24 hours, which in some cases may not be enough.

Lastly, the -nomsg option makes the command display neither messages nor errors on the standard output. Only the dg_jobId assigned to the job is printed if the command was successful; otherwise the location of the generated log file containing error messages is printed on the standard output. This option has been provided to make it easier to use the dg-job-submit command inside scripts, as an alternative to the -output option.

JOB DESCRIPTION FILE

A job description file contains a description of job characteristics and constraints in a class-ad style. Details on the class-ad language are reported in document [A1], also available at the following URL:
http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-Document.pdf

The job description file must be edited by the user to insert relevant information about the job that is later needed by the RB to perform the match-making. A template of the job description file, containing a basic set of attributes, can be obtained by calling the dg-job-submit command with the -template option.

Job description file entries are strings having the format attribute = expression, terminated by the semicolon character. If an entry spans more than one line, the end of each continued line has to be indicated with a backslash (\) character. Comments must be preceded by a sharp character (#) at the beginning of each line.

Since class-ad is an extensible language, there is no fixed set of admitted attributes, i.e. the user can insert in the job description file whatever attribute she/he believes meaningful to describe her/his jobs. However, only the attributes that can be related in some way to the resource attributes published in the GIS are taken into account by the Resource Broker for the match-making process. Unrelated attributes are simply ignored, except when they are used to build the Requirements expression. In the latter case they are evaluated and could affect the match-making result. The attributes taken into account by the RB, together with their meaning, are reported in document [A7].

There is a small subset of class-ad attributes that are compulsory, i.e. that have to be present in a job class-ad before it is sent to the Resource Broker in order to make the match-making process possible. They can be grouped in two categories: some of them must be provided by the user, whilst others, if not provided, are filled in by the UI with configurable default values. Table 1 summarises this.

Attribute            Mandatory                                 Mandatory with default value (default value)

Executable           yes
Requirements                                                   yes (TRUE)
Rank                                                           yes (-EstimatedTraversalTime)
RetryCount                                                     yes (3)
ReplicaCatalog       yes (only if at least one LFN has been
                     specified in the InputData attribute)
DataAccessProtocol   yes (only if the InputData attribute
                     has been specified)

Table 1 Mandatory Attributes

In Table 1 the default values for Requirements and Rank can be interpreted respectively as follows:

- If the user has not provided job constraints, Requirements is set to TRUE, i.e. the characteristics of the computing element where the job is executed do not matter: the RB takes into account all sites where the user is authorised to run her/his application.
- Since in the JDL a greater value of Rank means a better match, if no expression for Rank has been provided, the resources where the job waits the shortest time to pass from the SCHEDULED to the RUNNING status are preferred.

The RetryCount attribute reported in Table 1 can be used to specify the number of times the JSS must resubmit a job to a CE in case the submission fails (e.g. because the resource is temporarily unavailable). The job is aborted only when all requested re-submissions fail. If the user does not provide a value for RetryCount, the UI assigns it a default value (currently set to 3). The default values for the Requirements, Rank and RetryCount attributes can be set in the UI_ConfigEnv.cfg file.
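The entry format of the job description file and the filling of default values can be sketched together as follows. This is a simplified illustration, not the UI parser: the names parse_jdl, with_defaults and UI_DEFAULTS are hypothetical, expressions are kept as raw strings without evaluation, and a semicolon inside a quoted string is not handled. The default values are those listed in Table 1; in a real installation they would be read from UI_ConfigEnv.cfg.

```python
from typing import Dict

# Defaults from Table 1 (configurable in UI_ConfigEnv.cfg in the real UI).
UI_DEFAULTS = {
    "Requirements": "TRUE",
    "Rank": "-EstimatedTraversalTime",
    "RetryCount": "3",
}


def parse_jdl(text: str) -> Dict[str, str]:
    """Parse 'attribute = expression;' entries into raw strings (sketch)."""
    pieces = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment line
        if stripped.endswith("\\"):
            stripped = stripped[:-1].strip()  # drop the continuation mark
        pieces.append(stripped)
    entries: Dict[str, str] = {}
    # Simplification: a ';' inside a quoted string would split wrongly here.
    for entry in " ".join(pieces).split(";"):
        entry = entry.strip()
        if not entry:
            continue
        name, separator, expression = entry.partition("=")
        if not separator or not name.strip():
            raise ValueError("not an 'attribute = expression' entry: %r" % entry)
        entries[name.strip()] = expression.strip()
    return entries


def with_defaults(job_ad: Dict[str, str]) -> Dict[str, str]:
    """Check user-mandatory attributes and fill in the defaulted ones."""
    if "Executable" not in job_ad:
        raise ValueError("Executable must be provided by the user")
    if "InputData" in job_ad and "DataAccessProtocol" not in job_ad:
        raise ValueError("DataAccessProtocol is required with InputData")
    completed = dict(UI_DEFAULTS)
    completed.update(job_ad)  # user-provided values win over defaults
    return completed
```

The ReplicaCatalog condition is left out of the sketch because checking it requires telling logical from physical file names, which this section does not specify.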

OPTIONS

-help
    displays command usage.

-version
    displays UI version.

-resource ce_id
-r ce_id
    if the command is launched with this option, the job-ad sent to the RB contains a line of the type CEId = ce_id and the job is submitted by the Resource Broker to the resource identified by ce_id without going through the match-making process.

-input input_file
-i input_file
    if this option is specified, the user is asked to choose a CEId from the list of CEs contained in input_file. Once a CEId is selected, the command behaves as explained for the -resource option. If this option is used together with -noint and the input file contains more than one CEId, the first CEId in the list is used for submitting the job.

-notify e_mail_address
-n e_mail_address
    when a job is submitted with this option, an e-mail message containing basic information on the job identification and status is sent to the specified e_mail_address when the job enters one of the following statuses:
    - READY
    - RUNNING
    - ABORTED or DONE

-config path_name
-c path_name
    if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.


-output out_file
-o out_file
    writes the dg_jobId assigned to the submitted job to the file specified by out_file. out_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file out_file is created in the current working directory.

-hours H
-h H
    allows the user to specify the user proxy duration H, in hours, needed for submitting the job. When used with this option the dg-job-submit command behaves as follows:
    - the command checks for user proxy existence and, if the proxy does not exist, a new proxy with a duration of H hours is created
    - if the proxy exists, its duration is checked against the value specified with the -hours option. If the proxy duration is greater than H hours the job is submitted with the existing proxy, otherwise the old proxy is destroyed and a new one with a duration of H hours is created and used for submitting the job.
    This mechanism allows the user to create before submission a proxy with a suitable duration for her/his job; moreover the user is not obliged to enter the PEM pass-phrase at each submission, i.e. in all those cases where the existing proxy has a validity long enough for the job.

-nomsg
    this option makes the command print on the standard output only the dg_jobId generated for the job if the submission was successful; otherwise the location of the log file containing messages and diagnostics is printed.

-noint
    if this option is specified every interactive question to the user is skipped; moreover only the dg_jobId is returned on the standard output. All warning messages and errors (if any) are written to the file dg-job-submit_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

-debug
    when this option is specified, information about the parameters used for the API function calls inside the command is displayed on the standard output and is also written to the dg-job-submit_<UID>_<PID>.log file under the /tmp directory. The log file location is configurable.

job_description_file
    this is the file containing the class-ad describing the job to be submitted. It must be the first argument of the command.


EXIT STATUS

dg-job-submit exits with a status value of 0 (zero) upon success, and 1 (one) upon failure.


EXAMPLES

1. $> dg-job-submit myjob1.jdl

where myjob1.jdl is as follows:

    ##############################################
    #
    # -------- Job description file ----------
    #
    ##############################################
    Executable = "$(CMS)/fpacini/exe/sum.exe";
    InputData = "LF:testbed0-00019";
    ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
    DataAccessProtocol = "gridftp";
    RetryCount = 10;
    Rank = other.MaxCpuTime;
    Requirements = other.LRMSType == "Condor" && \
        other.Architecture == "INTEL" && other.OpSys == "LINUX" && \
        other.FreeCpus >= 4;

submits sum.exe to a resource (supposed to contain the executable file) whose LRMS is Condor. The command returns the following output to the user, containing the job handle (dg_jobId):

    ================= dg-job-submit Success =================
    The job has been successfully submitted to the Resource Broker.
    Your job is identified by (dg_jobId):
    https://grid004f.cnaf.infn.it:7846/155.198.211.205/161251122764136?grid004f.cnaf.infn.it:7771

    Use dg-job-status command to display current job status.
    =========================================================

2. $> dg-job-submit myjob2.jdl -notify [email protected]

submits the job described by myjob2.jdl, returns the same output as above to the user and sends a notification by e-mail at well-defined job status changes to [email protected].

SEE ALSO

[A1], [A2], dg-job-list-match.


– dg-job-get-output

This command requests from the RB the job output files (specified by the OutputSandbox attribute of the job-ad) and stores them on the local disk of the submitting machine.

SYNOPSIS

dg-job-get-output [-help]
dg-job-get-output [-version]
dg-job-get-output <dg_jobId1 ... dg_jobIdn | -input input_file> [-dir directory_path] [-config path_name] [-noint] [-debug]

DESCRIPTION

The dg-job-get-output command can be used to retrieve the output files of a job that has been submitted through the dg-job-submit command with a job description file including the OutputSandbox attribute. After the submission, when the job has terminated its execution, the user can download the files generated by the job and temporarily stored on the RB machine, as specified by the OutputSandbox attribute, by issuing dg-job-get-output with the dg_jobId returned by dg-job-submit as input. It is also possible to specify a list of job identifiers when calling this command, or an input file containing dg_jobIds by means of the -input option. When -input is used, the user is requested to choose all, one or a subset of the job identifiers contained in the input file.

The user can choose the local directory path on the UI machine where these files are stored by means of the -dir option; otherwise the retrieved files are put in a default location specified in the UI_ConfigEnv.cfg configuration file (DEFAULT_STORAGE_AREA_IN parameter). In both cases a sub-directory is added to the supplied path. The name of this sub-directory is the “<time><PID><RND>” unique number of the dg_jobId identifier (see the dg-job-submit command for details on the dg_jobId structure).

If the user wants to use her/his “private” configuration file, this can be done using the option -config path_name. As a consequence the dg-job-get-output command looks for the file “path_name” instead of the standard configuration file. If this file does not exist, the user is notified with an error message and the command is aborted.
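The way the storage location is derived from the job identifier can be illustrated with a short sketch. The function name output_directory is hypothetical, and base_dir stands for either the -dir argument or the configured default storage area.

```python
import posixpath


def output_directory(job_id: str, base_dir: str) -> str:
    """Return base_dir/<time><PID><RND> for the given dg_jobId (sketch)."""
    head = job_id.split("?", 1)[0]        # drop the ?<RBname> part
    unique = head.rstrip("/").rsplit("/", 1)[-1]
    if not unique.isdigit():
        raise ValueError("no unique number found in %r" % job_id)
    return posixpath.join(base_dir, unique)
```

With the identifier from the dg-job-submit example and a base directory of /home/user/out, the files would be stored under /home/user/out/161251122764136.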

OPTIONS

-help
displays command usage.

-version
displays UI version.

-dir directory_path
retrieved files (previously listed by the user through the OutputSandbox attribute of the job description file) are stored in the location indicated by directory_path/<dg_jobId unique string>.

-config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

-noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-get-output_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

-debug
when this option is specified, information about the parameters used for the API function calls inside the command is displayed on the standard output and is also written to the file dg-get_job_output_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

dg_jobId
job identifier returned by dg-job-submit. If a list of job identifiers is specified, the dg_jobIds must be separated by a blank. Job identifiers must be the first argument of the command.

-input input_file
-i input_file
this option makes the command retrieve the OutputSandbox files for each dg_jobId contained in input_file. This option cannot be used if one or more dg_jobIds have already been specified. The input file must contain one dg_jobId per line; comment lines must begin with a “#” or a “*” character. If used, this option must be provided as the first argument of the command.
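The input file format described for -input can be illustrated with a short Python sketch. It is not part of the UI software: the function name read_job_ids is hypothetical, and skipping blank lines is an assumption that the option description does not state.

```python
def read_job_ids(text):
    """Return the dg_jobIds found in the contents of an -input style file.

    Format taken from the option description: one dg_jobId per line,
    comment lines start with '#' or '*'.
    Assumption of this sketch: blank lines are ignored.
    """
    job_ids = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line[0] in "#*":
            continue  # skip blank lines and comments
        job_ids.append(line)
    return job_ids


sample = """\
# jobs submitted on Monday
https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577
* a comment in the second allowed style
https://grid004f.cnaf.infn.it:7846/155.198.211.205/085936117861491?grid004f.cnaf.infn.it:7771
"""
for job_id in read_job_ids(sample):
    print(job_id)
```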

EXIT STATUS
dg-job-get-output exits with a status value of 0 (zero) upon success, >0 upon failure and <0 upon partial failure. An example of partial failure is when more than one job identifier has been specified and the OutputSandbox could be retrieved only for some of them.
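A script that wraps the command may want to translate the exit status into one of the three outcomes above. The following Python sketch shows one possible way. It assumes a POSIX system, where the calling process only sees the low 8 bits of the exit value, so that a negative value such as -1 is observed as 255. Reading every value in the range 128..255 as negative is a choice of this sketch, not something this guide specifies, and the function name is hypothetical.

```python
def classify_exit_status(raw_status):
    """Map an exit status to 'success', 'failure' or 'partial failure'.

    Assumption: values 128..255 are negative exit values that were
    truncated to 8 bits by the operating system (e.g. -1 seen as 255).
    """
    signed = raw_status - 256 if raw_status > 127 else raw_status
    if signed == 0:
        return "success"
    if signed > 0:
        return "failure"
    return "partial failure"


for status in (0, 1, 255):
    print(status, "->", classify_exit_status(status))
```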

EXAMPLES
Let us consider the following command:

$> dg-job-get-output https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577 -dir /home/data

It retrieves from the RB the files listed in the OutputSandbox attribute of the job identified by https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577 and stores them locally in /home/data/12354732109721.
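The relation between a dg_jobId and the local storage directory can be sketched in Python as follows. The sketch assumes, based only on the example above, that the unique string is the last path component of the identifier (the part before the “?”). The function name local_output_dir is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def local_output_dir(dg_job_id, directory_path):
    """Return directory_path/<dg_jobId unique string>.

    Assumption: the unique string is the last component of the URL path,
    as in the example given in this section.
    """
    unique_string = posixpath.basename(urlsplit(dg_job_id).path)
    return posixpath.join(directory_path, unique_string)


job_id = ("https://grid004.it:2234/124.75.74.12/12354732109721"
          "?firefox.esrin.esa.it:4577")
print(local_output_dir(job_id, "/home/data"))  # prints /home/data/12354732109721
```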

– dg-job-list-match

Returns the list of resources fulfilling job requirements.

SYNOPSIS
dg-job-list-match [-help]
dg-job-list-match [-version]
dg-job-list-match <job_description_file> [-verbose] [-config path_name] [-output output_file] [-noint] [-debug]

DESCRIPTION
dg-job-list-match displays the list of identifiers of the resources that are accessible by the user and satisfy the job requirements included in the job description file. The CE identifiers are returned either on the standard output or in a file, according to the chosen command options, and are strings uniquely identifying the CEs published in the GIS.

dg-job-list-match requires a job description file in which job characteristics and requirements are expressed by means of a Condor class-ad. The job description file is passed as the main command-line argument to dg-job-list-match and is syntactically checked before being used. The Resource Broker is only contacted to find job-compatible resources; the job is never submitted. See the dg-job-submit section, and in particular Table 1, for general rules for building the job description file.

If the user wants to use a “private” configuration file, this can be done with the option -config path_name. The -verbose option of the dg-job-list-match command can be used to display on the standard output the class-ad that is generated from the job description and sent to the RB.

The -output option makes the command save the list of compatible resources into the specified file. If the provided file name is not an absolute path, the output file is created in the current working directory.
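The rule for locating the output file can be written down as a small Python sketch. It only restates the behaviour described above; the function name resolve_output_file is hypothetical and POSIX-style paths are assumed.

```python
import posixpath


def resolve_output_file(output_file, working_dir):
    """Return the path where the output file would be created.

    An absolute path is used as given; a simple name is placed in the
    current working directory, as described for the -output option.
    """
    if posixpath.isabs(output_file):
        return output_file
    return posixpath.join(working_dir, output_file)


print(resolve_output_file("ce_list.txt", "/home/fpacini"))
print(resolve_output_file("/tmp/ce_list.txt", "/home/fpacini"))
```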

JOB DESCRIPTION FILE
See dg-job-submit for details.

OPTIONS

-help
displays command usage.

-version
displays UI version.

-verbose
-v
displays on the standard output the job class-ad that is generated from the job description file and sent to the Resource Broker. This differs from the content of the job description file, since the UI adds to it some attributes that cannot be directly inserted by the user (e.g. CertificateSubject, and defaults for RetryCount, Rank and Requirements if not provided).

-config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

-output output_file
-o output_file
writes the list of CEIds to the file specified by output_file. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

-noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-list-match_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

-debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-list-match_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

job_description_file

this is the file containing the classad describing the job to be submitted. It must be the first argument of the command.

EXIT STATUS
dg-job-list-match exits with a status value of 0 (zero) upon success, and a non-zero value upon failure.

EXAMPLES
Let us consider the following command:

$> dg-job-list-match myjob.jdl

where the job description file myjob.jdl looks like:

    #########################################
    #
    #  ---- Sample Job Description File ----
    #
    #########################################
    Executable = "sum.exe";
    StdInput = "data.in";
    InputSandbox = {"/home_firefox/fpacini/exe/sum.exe","/home1/data.in"};
    OutputSandbox = {"data.out","sum.err"};
    RetryCount = 4;
    Rank = other.MaxCpuTime;
    Requirements = other.LRMSType == "Condor" && other.Architecture == "INTEL" && other.OpSys == "LINUX" && other.FreeCpus >= 2;

In this case the job requires CEs that are Condor pools of INTEL LINUX machines with at least 2 free CPUs. Moreover, the Rank expression states that queues with a higher maximum CPU time allowed for jobs are preferred. The response to such a command looks like the following:

    *********************************************************************************
    Computing Element IDs LIST
    The following CE(s) matching your job requirements have been found:
    - bbq.mi.infn.it:2119/jobmanager-pbs-dque
    - skurut.cesnet.cz:2119/jobmanager-pbs-wp1
    *********************************************************************************
    $>
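When the list of matching CEs has to be processed by a script, the CE identifiers can be extracted from the command output. The Python sketch below assumes that the output has the layout of the example above, with one identifier per line introduced by “- ”. The real output may differ, and the function name parse_ce_ids is hypothetical.

```python
def parse_ce_ids(command_output):
    """Extract CE identifiers from dg-job-list-match style output.

    Assumption: each identifier is on its own line and is introduced
    by '- ', as in the example shown in this section.
    """
    ce_ids = []
    for raw_line in command_output.splitlines():
        line = raw_line.strip()
        if line.startswith("- "):
            ce_ids.append(line[2:].strip())
    return ce_ids


example_output = """\
*********************************************************************
Computing Element IDs LIST
The following CE(s) matching your job requirements have been found:
- bbq.mi.infn.it:2119/jobmanager-pbs-dque
- skurut.cesnet.cz:2119/jobmanager-pbs-wp1
*********************************************************************
"""
print(parse_ce_ids(example_output))
```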

SEE ALSO

[A1],[A2], dg-job-submit.

– dg-job-cancel

Cancels one or more submitted jobs.

SYNOPSIS
dg-job-cancel [-help]
dg-job-cancel [-version]
dg-job-cancel < dg_jobId1 ... dg_jobIdn | -input input_file | -all > [-notify e_mail_address] [-config path_name] [-output output_file] [-noint] [-debug]

DESCRIPTION
This command cancels a job previously submitted using dg-job-submit. Before cancellation, it prompts the user for confirmation. The cancel request is sent to the Resource Broker, which forwards it to the JSS, which fulfils it.

dg-job-cancel can remove one or more jobs: the jobs to be removed are identified by their job identifiers (dg_jobIds returned by dg-job-submit), provided as arguments to the command and separated by a blank space. The result of the cancel operation is reported to the user for each specified dg_jobId.

If the -all option is specified, all the jobs owned by the user submitting the command are removed. When the command is launched with the -all option, no dg_jobId can be specified. Note that for PM9 only the owner of a job can remove it. When the -all option is specified, the dg-job-cancel command contacts every Resource Broker listed in the UI_ConfigENV.cfg file and asks for the cancellation of all jobs owned by the user identified by her/his certificate subject. If the user wants to use a “private” configuration file, this can be done with the option -config path_name.

The -input option allows the user to specify a file (input_file) that contains the dg_jobIds to be removed. The file must contain one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option the user is asked to choose all, one or a subset of the listed job identifiers. If input_file is not an absolute path, the file is searched for in the current working directory.

Possible job cancellation notifications are:

- Cancel Success: the job has been successfully cancelled.
- Cancel Not Allowed: the job has already entered the status Done or Aborted, is being deleted by another UI, or the user issuing the cancel request is not the job owner.
- Cancel Failure: the cancellation request has reached the JSS but has failed for some reason on the CE.

The -notify option can be used to receive job cancellation notifications by e-mail. When this option is used, the UI does not wait for the cancel notifications from the RB and returns control to the user immediately after the RB has accepted the cancellation request. This can be useful when a large number of jobs to cancel have been specified and the user wants to be able to perform other operations without waiting for the command results.
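The constraints on the command line (exactly one of a job identifier list, -input or -all, placed before the other options) can be illustrated by a Python sketch that assembles the argument list for dg-job-cancel. Only option names from the SYNOPSIS are used. The function build_cancel_command and its parameter names are hypothetical, and the sketch does not run the command.

```python
def build_cancel_command(job_ids=(), input_file=None, cancel_all=False,
                         notify=None, config=None, output=None,
                         noint=False, debug=False):
    """Build an argument list for dg-job-cancel (illustrative only)."""
    selectors = [bool(job_ids), input_file is not None, cancel_all]
    if sum(selectors) != 1:
        raise ValueError("give exactly one of: job ids, input_file, cancel_all")

    argv = ["dg-job-cancel"]
    # The job selection must come first on the command line.
    if cancel_all:
        argv.append("-all")
    elif input_file is not None:
        argv += ["-input", input_file]
    else:
        argv += list(job_ids)

    if notify is not None:
        argv += ["-notify", notify]
    if config is not None:
        argv += ["-config", config]
    if output is not None:
        argv += ["-output", output]
    if noint:
        argv.append("-noint")
    if debug:
        argv.append("-debug")
    return argv


print(" ".join(build_cancel_command(job_ids=["dg_jobId1", "dg_jobId2"],
                                    output="cancel.txt")))
```

The resulting list could be passed to a process-launching function; keeping it as a list avoids quoting problems with the “?” character contained in dg_jobIds.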

OPTIONS

-help
displays command usage.

-version
displays UI version.

-all
cancels all jobs owned by the user submitting the command. This option cannot be used if one or more dg_jobIds have been specified, either explicitly or with the -input option. If used, this option must be provided as the first argument of the command.

-input input_file
-i input_file
cancels the dg_jobIds contained in input_file. This option cannot be used if one or more dg_jobIds have been specified or together with the -all option. If used, this option must be provided as the first argument of the command.

-notify e_mail_address
-n e_mail_address
when a cancel request is submitted with this option, an e-mail message is sent to the specified e_mail_address. The message reports the success or failure of the cancellation of the job specified as input. When the -all option has been specified, or the cancellation involves more than one job, an e-mail message is sent to the user for each RB that has performed cancellations on behalf of the UI.

-config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

-output output_file
-o output_file
writes the cancel results to the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

-noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-cancel_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

-debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-cancel_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

dg_jobId
job identifier returned by dg-job-submit. The job identifier list must be the first argument of this command.

EXIT STATUS
dg-job-cancel exits with a status value of 0 if all the specified jobs were cancelled successfully, >0 if errors occurred for every specified job id and <0 in case of partial failure. An example of partial failure is when more than one job has been specified and only some of the jobs could be removed.
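The exit status rule can be made concrete with a short Python sketch that derives a status from per-job results. Only the sign of the value follows from the text above; the specific values 1 and -1 and the function name cancel_exit_status are assumptions of this sketch.

```python
def cancel_exit_status(results):
    """Derive an exit status from per-job cancellation results.

    `results` maps each dg_jobId to True (cancelled) or False (not cancelled).
    Returns 0 if every job was cancelled, a positive value if none was,
    and a negative value if only some were (partial failure).
    """
    if not results:
        raise ValueError("at least one job result is required")
    outcomes = list(results.values())
    if all(outcomes):
        return 0
    if not any(outcomes):
        return 1   # assumed value; the guide only states ">0"
    return -1      # assumed value; the guide only states "<0"


print(cancel_exit_status({"dg_jobId1": True, "dg_jobId2": False}))  # prints -1
```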

EXAMPLES
1. $> dg-job-cancel dg_jobId1 dg_jobId2

displays the following confirmation message:

    Are you sure you want to remove all jobs specified? [y/n]n: y

    **********************************************
    JOBS CANCEL OUTCOME
    **********************************************
    Cancel Success for job:
    - dg_jobId1
    Cancel Failure for job:
    - dg_jobId2
    **********************************************
    $>

In this case the command exit code is -1.

2. $> dg-job-cancel -all

displays the following confirmation message:

    Are you sure you want to remove all jobs owned by user Fabrizio Pacini? [y/n]n: y

    **********************************************
    JOBS CANCEL OUTCOME
    Cancel Success for job:
    - dg_jobId1
    Cancel Success for job:
    - dg_jobId2
    **********************************************
    $>

The exit code in this case is 0.

SEE ALSO
[A2], dg-job-submit.

– dg-job-status

Displays bookkeeping information about submitted jobs.

SYNOPSIS
dg-job-status [-help]
dg-job-status [-version]
dg-job-status < dg_jobId1 ... dg_jobIdn | -input input_file | -all > [-full] [-config path_name] [-output output_file] [-noint] [-debug]

DESCRIPTION
This command prints the status of a job previously submitted using dg-job-submit. The job status request is sent to the LB, which provides the requested information. This can be done during the whole life of the job.

dg-job-status can monitor one or more jobs: the jobs to be checked are identified by one or more job identifiers (dg_jobIds returned by dg-job-submit), provided as arguments to the command and separated by a blank space.

If the -all option is specified, information about all the jobs owned by the user submitting the command is printed on the standard output. When the command is launched with the -all option, neither a dg_jobId nor the -input option can be specified.

The -input option allows the user to specify a file (input_file) that contains the dg_jobIds to monitor. The file must contain one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option the user is asked to choose all, one or a subset of the listed job identifiers. If input_file is not an absolute path, it is searched for in the current working directory.

If the user wants to use a “private” configuration file, this can be done with the option -config path_name.

The job information displayed to the user (bookkeeping information) comprises:

- dg_jobId (the unique job identifier)
- Status (the current job status)
- Job Owner (user certificate subject)
- Location (id of RB, JSS or CE)
- Destination (id of the CE where the job will be transferred to)
- Status Enter Time (when the job entered its current state)
- Last Update Time (last known event timestamp)
- Status Reason (reason for being in this state)

If the -full option is specified, dg-job-status displays a long description of the queried jobs by printing in addition the following information:

- CE Node (id of cluster(s) node where the job is running)

- JssId (job identifier in the JSS)
- GlobusId (job identifier in the Globus job-manager)
- LocalId (id in the CE queue (PBS, LSF, ..))
- Job Description (JDL) (complete JDL description of the job)
- JSS Job Description (JDL) (complete JDL job description as sent to the JSS)
- Job Description (job description for Condor-G built from the JDL one)
- Moving (intermediate state: JobTransfer has been logged, but neither JobAccepted nor JobRefused has been logged yet; in this case ‘state’ and ‘location’ refer to the source of the job transfer)
- Cancelling (whether job cancellation is in progress)
- Cancel Reason (cancellation status message)

Information fields that are not available (i.e. not returned by the LB) are not printed at all.

The possible values of the job Status are reported in Annex 7.2. Details on the Job Status Diagram can be found in [A4].

OPTIONS

-help
displays command usage.

-version
displays UI version.

-all
displays status information about all jobs owned by the user submitting the command. This option cannot be used if one or more dg_jobIds have been specified or if the -input option has been specified. All LBs listed in the UI configuration file UI_ConfigENV.cfg are contacted to fulfil this request. If used, this option must be provided as the first argument of the command.

-input input_file
-i input_file
displays bookkeeping information about the dg_jobIds contained in input_file. When using this option the user is asked to choose all, one or a subset of the listed job identifiers. This option cannot be used if one or more dg_jobIds have been specified or if the -all option has been specified. If used, this option must be provided as the first argument of the command.

-full
displays a long description of the queried jobs.

-config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

-output output_file
-o output_file
writes the bookkeeping information to the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

-noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-status_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

-debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-status_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

dg_jobId
job identifier returned by dg-job-submit. Job identifiers must always be provided as the first arguments of the command.

EXIT STATUS
dg-job-status exits with a value of 0 if the status of all the specified jobs is retrieved correctly, >0 if errors occurred for every specified job id and <0 in case of partial failure. An example of partial failure is when more than one job is specified and status information could be retrieved only for some of them.

EXAMPLES
$> dg-job-status dg_jobId2

displays the following lines:

    ********************************************************************
    BOOKKEEPING INFORMATION
    Printing status for the job: https://grid004f.cnaf.infn.it:7846/155.198.211.205/085936117861491?grid004f.cnaf.infn.it:7771
    ---
    dg_JobId = https://grid004f.cnaf.infn.it:7846/155.198.211.205/085936117861491?grid004f.cnaf.infn.it:7771
    Job Owner = /C=IT/O=ESA/OU=ESRIN/CN=Fabrizio Pacini/[email protected]
    Status = Ready
    Location = grid004f.cnaf.infn.it
    Job Destination = skurut.cesnet.cz:2119/jobmanager-pbs-wp1
    Status Enter Time = Wed Sep 19 08:20:53 2001
    Last Update Time = Wed Sep 19 08:35:13 2001
    ********************************************************************
    $>
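A script that needs individual bookkeeping fields can split the “name = value” lines of the output into a dictionary. The Python sketch below assumes that the output has the layout shown in the example above, with each field on its own line; the function name parse_status_fields is hypothetical.

```python
def parse_status_fields(command_output):
    """Collect 'name = value' lines of dg-job-status style output.

    Assumption: each field is printed on its own line as 'name = value';
    lines without ' = ' (separators, titles, the prompt) are ignored.
    """
    fields = {}
    for raw_line in command_output.splitlines():
        name, separator, value = raw_line.partition(" = ")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


example_output = """\
BOOKKEEPING INFORMATION
---
Status = Ready
Location = grid004f.cnaf.infn.it
Job Destination = skurut.cesnet.cz:2119/jobmanager-pbs-wp1
Status Enter Time = Wed Sep 19 08:20:53 2001
"""
fields = parse_status_fields(example_output)
print(fields["Status"], "->", fields["Job Destination"])
```

Splitting on the first “ = ” only keeps values that themselves contain “=” characters, such as certificate subjects, intact.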

SEE ALSO
[A1], [A2], [A4], dg-job-submit.

– dg-job-get-logging-info

Displays logging information about submitted jobs.

SYNOPSIS
dg-job-get-logging-info [-help]
dg-job-get-logging-info [-version]
dg-job-get-logging-info < dg_jobId1 ... dg_jobIdn | -input input_file | -all > [-from T1] [-to T2] [-full] [-level] [-config path_name] [-output output_file] [-noint] [-debug]

DESCRIPTION
This command queries the LB persistent DB for logging information about jobs previously submitted using dg-job-submit. Job logging information is stored permanently by the LB service and can be retrieved even after the job has completed its life cycle, unlike bookkeeping information, which is in some way “consumed” by the user during the job's existence. The dg-job-get-logging-info request is sent to the LB service, which queries the DB and returns the retrieved information. The logging information contains:

- Event Type (possible event types are listed in Annex 7.3)
- dg_jobId
- Logging Level
- Date (UTC)
- Job Transfer Destination
- Host Name
- Job Run Node
- Source Program
- Job Owner

If the command is issued with the -full option, additional information is printed: the job description in JDL, the one for Condor-G, or both, depending on the WMS component that logged the event.

Data on several jobs can be queried by specifying a list of job identifiers, separated by a blank space, as arguments of the command. Moreover, the -input option allows the user to specify a file (input_file) that contains the dg_jobIds whose information is requested. The file must contain one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option the user is asked to choose all, one or a subset of the listed job identifiers. If input_file is not an absolute path, it is searched for in the current working directory.

If the -all option is used, logging information about all the jobs owned by the user submitting the command is printed on the standard output. When the command is launched with the -all option, neither a dg_jobId nor the -input option can be specified.

To perform more complex queries, the user can specify the time range of interest by using the -from (T1) and -to (T2) options. These options take as input timestamps in the format hhmmssDDMMYYYY (UTC) and make the command retrieve job logging information only for the specified time interval. If these options are not specified, the default values are the Unix Epoch time (for T1) and the current time, i.e. the time at which the command is issued (for T2).

Each event logged in the LB has an associated log level according to the “Universal Format for Logger Messages” (see draft-abela-ulm-05.txt, available at http://www-didc.lbl.gov/NetLogger/draft-abela-ulm-05.txt). The default log level used by WMS components is System; however, in special situations where problems need to be investigated, additional events are logged with the Debug log level. The -level option of the dg-job-get-logging-info command makes the LB also return events having a Debug log level. If the -level option is not used, only events with the System log level are returned.

The -output option can be used to have the retrieved information written to the file identified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

If the user wants to use a “private” configuration file, this can be done with the option -config path_name.
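The hhmmssDDMMYYYY timestamp format used by -from and -to can be produced and read back with a few lines of Python. This is a sketch, not part of the UI software: the function names are hypothetical, and the default to the current day for a six-digit value follows the description of the -from and -to options.

```python
from datetime import date, datetime, timedelta, timezone


def to_lb_timestamp(moment):
    """Format a timezone-aware datetime as hhmmssDDMMYYYY in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    return moment.astimezone(timezone.utc).strftime("%H%M%S%d%m%Y")


def parse_lb_timestamp(text, today):
    """Parse hhmmss[DDMMYYYY]; a missing date part defaults to `today`."""
    if len(text) == 6:
        text += today.strftime("%d%m%Y")
    if len(text) != 14 or not text.isdigit():
        raise ValueError("expected hhmmss[DDMMYYYY], got %r" % text)
    hour, minute, second = int(text[0:2]), int(text[2:4]), int(text[4:6])
    day, month, year = int(text[6:8]), int(text[8:10]), int(text[10:14])
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


# A fixed UTC+2 offset, chosen only to show the conversion to UTC.
utc_plus_2 = timezone(timedelta(hours=2))
print(to_lb_timestamp(datetime(2001, 5, 5, 14, 15, 0, tzinfo=utc_plus_2)))  # prints 12150005052001
print(parse_lb_timestamp("113500", date(2001, 9, 4)).isoformat())
```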

OPTIONS

-help
displays command usage.

-version
displays UI version.

-all
retrieves logging information about all jobs owned by the user submitting the command. If used, this option must be provided as the first command argument.

-input input_file
-i input_file
retrieves logging information for all dg_jobIds contained in input_file. This option cannot be used if one or more dg_jobIds are specified or if the -all option is used. If used, this option must be provided as the first command argument.

-from T1
retrieves job events logged from the specified time T1 onwards. T1 must be in the form hhmmss[DDMMYYYY]. If DDMMYYYY is not provided, the time is taken to be on the current day.

-to T2
retrieves job events logged up to the specified time T2. T2 must be in the form hhmmss[DDMMYYYY]. If DDMMYYYY is not provided, the time is taken to be on the current day.

-full
makes the command display additional job information fields (i.e. the job description in JDL and/or the one for Condor-G).

-level
makes the command retrieve job information for events having a log level of either System or Debug. Otherwise only events with the System log level are returned.

-config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

-output output_file
-o output_file
writes the logging information to the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

-noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-logging_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

-debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-logging_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

dg_jobId
job identifier returned by dg-job-submit. Job identifiers must always be provided as the first arguments of this command.

EXIT STATUS
dg-job-get-logging-info exits with a value of 0 if the logging information of all the specified jobs is retrieved correctly, >0 if errors occurred for every specified job and <0 in case of partial failure. An example of partial failure is when more than one job is specified and logging information could be retrieved only for some of them.

EXAMPLES
1. $> dg-job-get-logging-info -all -from 12150005052001 -to 10000006052001 -output mylog.txt

writes to the file mylog.txt in the current working directory the logging information about my jobs for the period from 12:15 on 5 May 2001 to 10:00 on 6 May 2001.

2. $> dg-job-get-logging-info dg_jobId1 -from 113500

where dg_jobId1 = https://grid004f.cnaf.infn.it:7846/131.154.99.104/14010479391529?grid004f.cnaf.infn.it:7771

displays the following output:

*******************************************************************
LOGGING INFORMATION:

Printing info for the Job : dg_jobId1
For the Event : JobTransfer
---
Event Type = JobTransfer
dg_jobId = dg_jobId1
Logging Level = System
Date(UTC) = Tue Sep 4 16:12:56 2001
Job Destination = ResourceBroker/grid004f.cnaf.infn.it:7771
Host Name = lx01
Source Program = UserInterface
Job Owner = /O=Grid/O=UKHEP/OU=hep.ph.ic.ac.uk/CN=Fabrizio Pacini
Job Descr (JDL) = [
 - InputSandboxPath = "/tmp/datamat_161251122764136"
 - CertificateSubject = "/O=Grid/O=UKHEP/OU=hep.ph.ic.ac.uk/CN=Fabrizio Pacini"
 - Rank = other.FreeCPUs * other.AverageSI00 - other.EstimatedTraversalTime
 - Executable = "WP1testA"
 - retrycount = 3
 - UserContact = "[email protected]"
 - StdInput = "sim.dat"
 - InputSandbox = {"/home/datamat/HandsOn-0409/file*","/home/datamat/DATA/*"}
 - StdOutput = "sim.out"
 - StdError = "sim.err"
 - requirements = other.OpSys == "Linux RH 6.1" || other.OpSys == "Linux RH 6.2"
 - OutputSandbox = {"/tmp/sim.err","/tmp/test.out"}
 - dg_jobId = dg_jobId1
 ]
********************************************************************
$>
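The 14-digit time arguments in Example 1 appear to follow an hhmmssDDMMYYYY layout (12150005052001 for 12:15:00 on 5 May 2001). Assuming that layout, which is inferred from the example rather than taken from a format specification, the following Python sketch builds and parses such values. The helper names are hypothetical, and the shorter form used in Example 2 (113500) is not covered.

```python
from datetime import datetime

# Assumed layout, inferred from Example 1: hhmmssDDMMYYYY.
DG_TIME_FORMAT = "%H%M%S%d%m%Y"


def to_dg_time(moment: datetime) -> str:
    """Format a datetime as a 14-digit -from/-to argument (hypothetical helper)."""
    return moment.strftime(DG_TIME_FORMAT)


def from_dg_time(text: str) -> datetime:
    """Parse a 14-digit -from/-to argument back into a datetime (hypothetical helper)."""
    return datetime.strptime(text, DG_TIME_FORMAT)


# Rebuild the command line of Example 1 from two datetime values.
start = to_dg_time(datetime(2001, 5, 5, 12, 15, 0))
end = to_dg_time(datetime(2001, 5, 6, 10, 0, 0))
print(f"dg-job-get-logging-info -all -from {start} -to {end} -output mylog.txt")
```

Generating the arguments from datetime values avoids mistakes in the digit order, which is easy to get wrong when the day and the month are both small numbers.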

SEE ALSO

[A2], [A4], dg-job-submit.

- dg-job-id-info

This is a simple utility for the user: it parses the dg_jobId string and displays the information contained in the job identifier in a formatted way. It is a “local” command, since it requires no interaction between the UI and the other WMS components.

SYNOPSIS

dg-job-id-info [-help]
dg-job-id-info [-version]
dg-job-id-info <dg_jobId1 ... dg_jobIdn | -input input_file> [-output output_file]

DESCRIPTION

This command displays formatted information extracted from the dg_jobId of a previously submitted job. One or more dg_jobIds can be supplied as input to the command. Alternatively, the dg_jobIds listed in a file can be parsed using the -input option. The parsed information is printed on the standard output; it can be redirected to a file through the -output option.

Since this command involves no interaction between the UI and other external components, it does not need any certificate to work.

OPTIONS

-help

displays command usage.

-version

displays UI version.

-input input_file
-i input_file

parses the dg_jobIds listed in input_file. This option cannot be combined with dg_jobIds given on the command line. If used, it must be the first argument of the command.

-output output_file
-o output_file

writes the formatted information in the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the first case the file output_file is created in the current working directory.

dg_jobId

job identifier returned by dg-job-submit. Job identifiers must be the first arguments of this command.

EXIT STATUS

dg-job-id-info exits with a value of 0 if no error occurs, >0 if errors occurred for each specified job identifier, and <0 in case of partial failure.

EXAMPLES

$> dg-job-id-info https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234?http://grid004f.cnaf.infn.it:4577

displays the following output:

********************************************************************
JOB ID INFO
Printing info for the Job ID :
https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234?http://grid004f.cnaf.infn.it:4577

Logging and Bookkeeping Server Address = https://grid001f.pd.infn.it
Logging and Bookkeeping Server Port = 2234
Resource Broker Server Address = http://grid004f.cnaf.infn.it
Resource Broker Server Port = 4577
Submission Time (hh:mm:ss) = 13:45:34 (UTC)
User Interface Machine IP Address = 124.75.74.12
User Interface Process Identifier = 53453
Randomly Generated Number (0000-9999) = 4234
********************************************************************
$>
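The fields printed above can be recovered from the identifier with plain string handling. The Python sketch below is illustrative only and is not the dg-job-id-info implementation: it assumes, from this single example, that the numeric part of the identifier is the submission time (hhmmss), followed by the UI process identifier, followed by a four-digit random number. The function name and the dictionary keys are hypothetical.

```python
from urllib.parse import urlsplit


def parse_dg_job_id(job_id: str) -> dict:
    """Split a dg_jobId into the fields listed by dg-job-id-info (sketch)."""
    # The part before "?" addresses the LB server, the part after it the RB.
    lb_part, _, rb_part = job_id.partition("?")
    lb = urlsplit(lb_part)
    # Assumed path layout: /<UI IP address>/<hhmmss><PID><4-digit random number>
    ui_ip, unique = lb.path.strip("/").split("/")
    rb_address, _, rb_port = rb_part.rpartition(":")
    return {
        "lb_address": f"{lb.scheme}://{lb.hostname}",
        "lb_port": lb.port,
        "rb_address": rb_address,
        "rb_port": int(rb_port),
        "submission_time": f"{unique[0:2]}:{unique[2:4]}:{unique[4:6]}",
        "ui_ip": ui_ip,
        "ui_pid": int(unique[6:-4]),
        "random_number": unique[-4:],
    }


info = parse_dg_job_id(
    "https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234"
    "?http://grid004f.cnaf.infn.it:4577"
)
for name, value in info.items():
    print(f"{name} = {value}")
```

Like the command itself, the sketch works locally on the string and contacts neither the LB server nor the RB.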

7. ANNEXES

7.1. JDL ATTRIBUTES

The JDL is a fully extensible language (i.e. it does not rely on a fixed schema), hence the user can use any attribute to describe a job without incurring errors. However, only a certain set of attributes (referred to here as “supported” attributes) is taken into account by the WMS components when scheduling a submitted job. Indeed, to be actually used for selecting a resource, an attribute in a job class-ad needs to correlate with some characteristic of the resources published in the GIS (aka MDS).

The “supported” attributes, their meaning and the way to use them to describe a job are covered in detail in document [A7], also available at the following URL:
http://www.infn.it/workload-grid/docs/DataGrid-01-NOT-0101-0_4.pdf

7.2. JOB STATUS DIAGRAM

Figure 1 shows the states that a job can assume during its life cycle.

Figure 1 Job Life Cycle

The job states shown in Figure 1 are briefly described below (see [A4] for further details):

STATUS:

- SUBMITTED: job is submitted but not yet received by the RB (i.e. it is waiting in the UI).
- WAITING: job is waiting in the queue in the RB for various reasons (e.g. no appropriate CE (cluster) found, required dataset not available, dependency on another job, etc.).
- READY: appropriate CE found; job is transferred to the CE.
- SCHEDULED: job is waiting in the queue on the CE.
- RUNNING: job is running.
- CHKPT: job is check-pointed and is waiting for restart; this is a system checkpointing of jobs running on a CE, independent of Application Checkpointing.
- DONE: job exited.
- OUTPUTREADY: job exited and RB is ready to return output sandbox.
- ABORTED: job was aborted for various reasons (e.g. waiting in the queue in RB, JSS or CE for too long, over-use of quotas, expiration of user credentials, etc.).
- CLEARED: output files were transferred to the user; job is removed from the bookkeeping database.

7.3. JOB EVENT TYPES

The job event types that can be returned to the user by the dg-job-get-logging-info command are listed below. They are organized in several categories:

Events concerning a job transfer between components:

- JobTransfer: A component generates this event when it tries to transfer a job to some other component. This event contains the identification of the receiver and possibly the job description expressed in the language accepted by the receiver. The result of the transfer, i.e. success or failure, as seen by the sender is also included.
- JobAccept: A component generates this event when it receives a job from another WMS component. This event also contains the locally assigned job identifier.
- JobRefuse: A component generates this event when the transfer of a job to some other component fails. The source of this event, which also includes the reason for the failure, can be either the receiver or the sender, e.g. when the receiver is not available.

Events concerning a job state change during processing within a component:

- JobAbort: The job processing is stopped due to system conditions or user request. The reason is included in the event.
- JobRun: The job is started on a CE.
- JobChkpt: The job is check-pointed on a CE. The reason is included in the event.
- JobDone: The job has completed. The process exit status is included in the event.
- JobClear: The user has successfully retrieved the job results, e.g. the output files specified in the output sandbox (see Section 6.1.3); the job will be removed from the bookkeeping database in the near future.
- JobScheduled: The job has been successfully submitted to the appropriate CE, i.e. passed to the LRMS.
- JobFail: The job failed during its execution on the CE.

Events associated with the Resource Broker only:

- JobMatch: An appropriate match between a job and a Computing Element has been found. The event contains the identifier of the selected CE.
- JobPending: A match between a job and a suitable Computing Element was not found, so the job is kept pending by the RB. The event contains the reason why no match was found.

- JobStatus: It contains information about resources consumed by the job. This event is generated periodically by the CE and eliminates the need for direct communication from the LB Service to the CE. Two types of information should be considered: cumulative information (e.g. CPU time) and non-cumulative information (e.g. memory consumption). For cumulative properties only the most recent value is kept in the database. For non-cumulative properties, on the other hand, all the values are stored in order to allow for example.

More details on job event types can be found in [A4].

7.4. WILDCARD PATTERNS

The wildcard patterns that can be included in the InputSandbox attribute expression are used by the UI to perform file name “globbing” in a fashion similar to the UNIX csh shell. The result of the “globbing” is a list of the files whose names match any of the specified patterns. The admitted special characters together with their meaning are listed hereafter:

- *        wildcard for any string
- ?        wildcard for any single character
- [chars]  delimits a wildcard matching any of the enclosed characters. If chars contains a sequence of the form a-b then any character between a and b (inclusive) will match. Such an expression can be negated by means of the special character “!” ([!chars] matches any character not in chars).

EXAMPLES

Consider a directory where “ls -F” gives:

1file    a1.f     apple.o  bob.o    h4374.f  john.o
2files   ab       apps/    foo.c    h4374.o  mydir/
ABS      ab.f     bob      foo.f    john     stuff/
a1       apple.f  bob.f    gh       john.f

That is to say some files and directories. The examples below show the way the mentioned wildcards are expanded (the notation => indicates the result of typing the command).

1) Every two-character file name:
echo ?? => a1 ab gh

2) Every two-character name starting with “a”:
echo a? => a1 ab

3) Every file starting with j, o, h, or n:
echo [john]* => h4374.f h4374.o john john.f john.o

4) Include a range, e.g. everything starting with an upper case letter or a digit:
echo [A-Z0-9]* => 1file 2files ABS

5) Negate a range:
echo [!john]*.f => a1.f ab.f apple.f bob.f foo.f

6) Every file starting with “a” and ending in .f:
echo a*.f => a1.f ab.f apple.f
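The six expansions above can be reproduced with Python's standard fnmatch module, which supports the same four constructs (*, ?, [chars] and [!chars]). This is only a stand-in for experimenting with patterns before putting them in an InputSandbox expression; it is not the globbing code used by the UI, and the helper name glob_names is hypothetical.

```python
from fnmatch import fnmatchcase

# Names from the example directory (the trailing "/" of directories is dropped).
NAMES = [
    "1file", "2files", "ABS", "a1", "a1.f", "ab", "ab.f", "apple.f",
    "apple.o", "apps", "bob", "bob.f", "bob.o", "foo.c", "foo.f", "gh",
    "h4374.f", "h4374.o", "john", "john.f", "john.o", "mydir", "stuff",
]


def glob_names(pattern, names=NAMES):
    """Return the names matching a wildcard pattern, sorted (case-sensitive)."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Print each pattern with its expansion, in the notation used by the examples.
for pattern in ["??", "a?", "[john]*", "[A-Z0-9]*", "[!john]*.f", "a*.f"]:
    print(f"echo {pattern} => {' '.join(glob_names(pattern))}")
```

fnmatchcase is used instead of fnmatch so that matching stays case-sensitive on every platform, as it is in a UNIX shell.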

7.5. THE MATCH MAKING ALGORITHM

The main task performed by the RB is to find the most suitable Computing Element on which to execute the job. To accomplish this task the RB interacts with the other WMS components. More precisely, the Replica Catalogue (RC) and the Information Index (II) are the two main WMS components that supply the RB with the information required to resolve the matches between job requirements and Computing Element capabilities (i.e. runtime environments, data access features, processing resources etc.).

The following sections describe the matchmaking algorithm performed by the RB. To this end it is worth distinguishing three scenarios, which are dealt with separately: direct job submission, job submission without data-access requirements, and job submission with data-access requirements.

7.5.1. Direct Job Submission

The simplest scenario is the one where the JDL submitted by the UI contains a link to the resource to which the job should be submitted, i.e. the Computing Element identifier (CEId). In this case the RB does not perform any matchmaking at all, but simply delegates the submission request to the JSS for the actual submission.

Figure 2 - Submission with CEId known

It should be pointed out that, if the CEId is specified, the RB neither checks whether the user who owns the job is authorised to access the given CE, nor interacts with the RC to resolve file requirements, if any. The only check performed by the RB is on the JDL syntax, while converting the JDL into a ClassAd.

7.5.2. Job submission without data-access requirements

Let us take a small step forward and consider the scenario where the user specifies a job with given execution requirements, but without data file requirements. Once the JDL has been received by the RB and successfully converted into a ClassAd (job-ad), the RB starts the actual matchmaking algorithm to find out whether the characteristics and status of Grid resources match the job requirements. The matchmaking algorithm consists of two phases: the requirements check and the rank computation.

During the requirements check phase the RB contacts the II in order to create a set of the CEs suitable to execute the job, i.e. those compliant with the user requirements and with the user certificate subject as well. Since all the CE attributes involved in the JDL requirements (defined by the user to express his/her needs) are almost constant in time (i.e. it is improbable that a CE changes its operating system or its runtime environment in the very short term, e.g. every half an hour), the information cached in the II represents a good source for testing matches between job requirements and CE features. Using it is clearly more efficient than contacting each CE to find out the same information.

Figure 3 - Requirements checking phase

Once the RB has created the set of suitable CEs where the job can be executed, it performs the second phase of the matchmaking algorithm, which allows the RB to acquire information about the “quality” of the suitable CEs just found. If, on the other hand, no suitable CE has been found, the RB sends an e-mail notification to the user address specified in the JDL.

In the ranking phase the RB directly contacts the LDAP server of each CE involved to obtain the values of the attributes that appear in the rank expression of the received JDL. In contrast to the previous phase, it is better to contact each suitable CE rather than use the II as the source of information, since the rank attributes are variables that change very frequently over time (e.g. FreeCPUs, FreeMemory).

Currently, if all the suitable CEs are assigned the same rank value, the RB performs a “random” choice, i.e. the first CE in the list of suitable ones receives the job for execution. Clearly a more sophisticated method should be adopted by the RB for equally ranked CEs, relieving the user of the need to define significant rank expressions. One possibility could be a post-ranking selection which, depending on performance factors yet to be defined, supplies the user with the optimal CE choice for the actual submission of a given job. Rank computation is depicted in Figure 4.

Figure 4 – Rank computation phase
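The two phases described in this section can be summarised in a few lines of Python. This is a minimal sketch and not the RB code: CEs are modelled as plain dictionaries, the CE identifiers and attribute values are invented for illustration, and it is assumed that the CE with the highest rank value is preferred. Ties are resolved in favour of the first CE in the list, as described above.

```python
def match(ces, requirements, rank):
    """Return the selected CE, or None if no CE satisfies the requirements."""
    # Phase 1: requirements check (done by the RB against data cached in the II).
    suitable = [ce for ce in ces if requirements(ce)]
    if not suitable:
        return None  # here the RB would notify the user by e-mail
    # Phase 2: rank computation. max() returns the first maximal element,
    # so equally ranked CEs are resolved in favour of the first one listed.
    return max(suitable, key=rank)


# Hypothetical CEs; OpSys and FreeCPUs are attribute names used in this guide.
ces = [
    {"CEId": "ce-a", "OpSys": "Linux RH 6.1", "FreeCPUs": 4},
    {"CEId": "ce-b", "OpSys": "Linux RH 7.1", "FreeCPUs": 16},
    {"CEId": "ce-c", "OpSys": "Linux RH 6.2", "FreeCPUs": 8},
    {"CEId": "ce-d", "OpSys": "Linux RH 6.2", "FreeCPUs": 8},
]


def requirements(ce):
    # Mirrors: other.OpSys == "Linux RH 6.1" || other.OpSys == "Linux RH 6.2"
    return ce["OpSys"] in ("Linux RH 6.1", "Linux RH 6.2")


def rank(ce):
    # Mirrors a rank expression of the form: other.FreeCPUs
    return ce["FreeCPUs"]


best = match(ces, requirements, rank)
print(best["CEId"])
```

In this example ce-b is discarded by the requirements check despite having the most free CPUs, and ce-c is chosen over the equally ranked ce-d only because it comes first in the list.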

7.5.3. Job submission with data-access requirements

The Resource Broker interacts with the Replica Management services in order to find the most suitable Computing Element, taking into account the Storage Elements where the input data sets are physically stored and where the output data sets should be staged on completion of job execution. Before describing the actions taken by the RB upon receiving a JDL where both data-access and computing requirements are present, it is worth recalling the JDL attributes that represent a data requirement on the RB side: OutputSE, InputData and DataAccessProtocol. They represent, respectively, the Storage Element (SE) where the output file should be staged, the input files (LFN, PFN) required for the actual job execution, and the protocol “spoken” by the application to access such files.

The two main phases of the matchmaking algorithm performed by the RB remain unchanged, but the RB executes the requirements check and ranking for each class of CEs satisfying the data-access requirements. Additionally, the RB performs a pre-match processing step to find and classify the CEs satisfying both data-access and user authorisation requirements.

During the pre-match processing phase the RB contacts the RC (the one specified through the ReplicaCatalog JDL attribute) in order to resolve logical file names and collect all the information about SEs containing at least one input data file. This information is used to write the broker-info-file, which is sent to the JSS within the input sandbox for the actual submission. At this point the RB is ready to start the CE classification procedure, during which it contacts the II in order to find the CEs that satisfy the authorization requirements and have the OutputSE within their own LAN (CloseSE). Using the information retrieved during the file name resolution, the RB classifies those CEs according to the number of input files stored in storage element(s) which is (are) close to the CE itself and speak at least one of the protocols specified in the DataAccessProtocol attribute within the JDL.

Figure 5 - CEs classification procedure

Upon completion of the CE data classification, the RB is ready for the actual matchmaking and starts the requirements checking phase for each CE belonging to the first non-empty class of CEs, i.e. the class whose members can access the highest number of distinct files. If a CE does not satisfy the user requirements it is removed from its class. The requirements checking phase is repeated until at least one CE matching the user requirements is found.

Once the requirements checking phase is completed, either the RB knows a set of CEs that satisfy both data-access and computing requirements and have access to the maximum number of distinct input files, or no suitable CE matching such requirements exists. In the first case the RB starts the ranking phase in order to find the best CE to which to submit the job. In the second case the RB sends an e-mail notification to the user address specified in the JDL.

Figure 6 - Match-Making algorithm
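The classification and class-by-class requirements check described in this section can be sketched as follows. This is one possible reading of the text, not the RB implementation: the data structures, function names, identifiers and protocol names are all hypothetical, the authorisation and OutputSE checks are left out, and the classes are assumed to be visited in decreasing order of accessible files until one yields a matching CE.

```python
from collections import defaultdict


def classify_ces(close_ses, se_protocols, file_locations, protocols):
    """Group CE ids by the number of distinct input files they can access.

    close_ses:      {CE id: set of SE ids close to that CE}
    se_protocols:   {SE id: set of protocols spoken by that SE}
    file_locations: {logical file name: set of SE ids holding a replica}
    protocols:      protocols listed in the DataAccessProtocol attribute
    """
    wanted = set(protocols)
    classes = defaultdict(list)
    for ce_id, ses in close_ses.items():
        # Keep only the close SEs speaking at least one requested protocol.
        usable = {se for se in ses if se_protocols[se] & wanted}
        count = sum(1 for holders in file_locations.values() if holders & usable)
        classes[count].append(ce_id)
    return classes


def select_candidates(classes, satisfies_requirements):
    """Return (file count, matching CE ids) for the best class with a match."""
    for count in sorted(classes, reverse=True):
        matching = [ce for ce in classes[count] if satisfies_requirements(ce)]
        if matching:
            return count, matching  # these CEs go on to the ranking phase
    return None  # no suitable CE: the RB would notify the user by e-mail


close_ses = {"ce-a": {"se-1", "se-2"}, "ce-b": {"se-3"}, "ce-c": {"se-2"}}
se_protocols = {"se-1": {"gridftp"}, "se-2": {"file"}, "se-3": {"gridftp", "rfio"}}
file_locations = {"lfn-1": {"se-1", "se-3"}, "lfn-2": {"se-1"}, "lfn-3": {"se-2"}}

classes = classify_ces(close_ses, se_protocols, file_locations, ["gridftp"])
print(dict(classes))
print(select_candidates(classes, lambda ce: ce != "ce-a"))
```

Here ce-a can reach two input files through se-1, ce-b one through se-3, and ce-c none because its only close SE does not speak the requested protocol. Since ce-a fails the (invented) requirements check, the selection falls back to the next class and returns ce-b.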
