DataGrid

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

Document identifier: DataGrid-01-TEN-0118-0_7

Date: 20/07/2002

Work package: WP1

Partner: Datamat SpA

Document status

Deliverable identifier:

Abstract: This note provides the administrator and user guide for the WP1 WMS software.

IST-2000-25182 PUBLIC 1 / 116

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

Doc. Identifier:

DataGrid-01-TEN-0118-0_7

Date: 19/07/2002

Delivery Slip

Name Partner Date Signature

From Fabrizio Pacini Datamat SpA 19/07/2002

Verified by Stefano Beco Datamat SpA 19/07/2002

Approved by

Document Log

Issue Date Comment Author

0_0 21/12/2001 First draft Fabrizio Pacini

0_1 14/01/2002 Draft Fabrizio Pacini

0_2 24/01/2002 Draft Fabrizio Pacini

0_3 05/02/2002 Draft Fabrizio Pacini

0_4 15/02/2002 Draft Fabrizio Pacini

0_5 08/04/2002 Draft Fabrizio Pacini

0_6 13/05/2002 Fabrizio Pacini

0_7 19/07/2002 Fabrizio Pacini

Document Change Record

Issue Item Reason for Change

0_1 General update
- Take into account changes in the rpm generation procedure.
- Add missing info about daemons (RB/JSS/CondorG) starting accounts.
- Some general corrections.

0_2 General update
- Add Cancelling and Cancel Reason information.
- Add OUTPUTREADY job state.
- Add new profile rpms.
- Remove /etc/workload* shell scripts.
- Add summary map table (user / daemon).
- Add CEId format check.
- Add new job cancel notification.

0_3 General update
- Modified RB/JSS start-up procedure.
- Add gridmap-file users/groups issues.
- Add proxy certificate usage by daemons.
- Job attribute CEId changed to SubmitTo.
- Add DGLOG_TIMEOUT setting.
- Add workload-profile and userinterface-profile rpms.

0_4 General update
- Add configure option --enable-wl for system configuration files.
- Add installation checking option --with-globus for Globus to the Workload configure.
- Add new Information Index configure options.
- Remove edg-profile and edg-user-env rpms from II and UI dependencies.
- Add security configuration rpms for all the Certificate Authorities to UI dependencies.
- Add new parameters to RB configuration file.
- Add new Job Exit Code field to the returned job status info.
- Remove dependency on SWIG in the userinterface binary rpm.

0_5 General update
- Modify command options syntax (getopt-like style).
- Add MyProxy server and client package installation/utilisation.
- Modify job cancel notification.
- Add Userguide rpm.

0_6 General update
- Modify configure options for the various components.
- UI commands modified to use python2 executable.
- Clarify myproxy usage.
- Explain how RB/LB addresses in the UI config file are used by the commands.
- Add --logfile option to the UI commands.

0_7 General update
- Modify configure options for the various components.
- Clarify UI commands --notify option usage.
- Add make test target for UI.

Files

Software Products User files

Word 97 document.doc

Acrobat Exchange 4.0 DataGrid-01-TEN-0118-0_7-Document.pdf


Content

1. Introduction
  1.1. OBJECTIVES OF THIS DOCUMENT
  1.2. APPLICATION AREA
  1.3. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS
  1.4. DOCUMENT EVOLUTION PROCEDURE
  1.5. TERMINOLOGY

2. EXECUTIVE SUMMARY

3. BUILD PROCEDURE
  3.1. REQUIRED SOFTWARE
  3.2. BUILD INSTRUCTIONS
    3.2.1. Environment Variables
    3.2.2. Compiling the code
  3.3. RPM INSTALLATION

4. INSTALLATION AND CONFIGURATION
  4.1. LOGGING AND BOOKKEEPING SERVICES
    4.1.1. Required software
      4.1.1.1. LB local-logger
      4.1.1.2. LB Server
    4.1.2. RPM installation
    4.1.3. The installation tree structure
      4.1.3.1. LB local-logger
      4.1.3.2. LB Server
    4.1.4. Configuration
    4.1.5. Environment Variables
  4.2. RB AND JSS
    4.2.1. Required software
      4.2.1.1. PostgreSQL installation and configuration
      4.2.1.2. Condor-G installation and configuration
      4.2.1.3. ClassAd installation and configuration
      4.2.1.4. ReplicaCatalog installation and configuration
    4.2.2. RPM installation
    4.2.3. The Installation Tree structure
    4.2.4. Configuration
      4.2.4.1. RB configuration
      4.2.4.2. JSS configuration
    4.2.5. Environment variables
      4.2.5.1. RB
      4.2.5.2. JSS
  4.3. INFORMATION INDEX
    4.3.1. Required software
    4.3.2. RPM installation
    4.3.3. The Installation tree structure
    4.3.4. Configuration
    4.3.5. Environment Variables
  4.4. USER INTERFACE
    4.4.1. Required software
    4.4.2. RPM installation
    4.4.3. The tree structure
    4.4.4. Configuration
    4.4.5. Environment variables
  4.5. DOCUMENTATION

5. OPERATING THE SYSTEM
  5.1. LB LOCAL-LOGGER
    5.1.1. Starting and stopping daemons
    5.1.2. Troubleshooting
  5.2. LB SERVER
    5.2.1. Starting and stopping daemons
    5.2.2. Purging the LB database
    5.2.3. Troubleshooting
  5.3. RB AND JSS
    5.3.1. Starting PostgreSQL
    5.3.2. Starting and stopping JSS and RB daemons
    5.3.3. RB troubleshooting
    5.3.4. JSS troubleshooting
  5.4. INFORMATION INDEX
    5.4.1. Starting and stopping daemons

6. USER GUIDE
  6.1. USER INTERFACE
    6.1.1. Security
      6.1.1.1. MyProxy
    6.1.2. Common behaviours
    6.1.3. Commands description

7. ANNEXES
  7.1. JDL ATTRIBUTES
  7.2. JOB STATUS DIAGRAM
  7.3. JOB EVENT TYPES
  7.4. WILDCARD PATTERNS
  7.5. THE MATCH MAKING ALGORITHM
    7.5.1. Direct Job Submission
    7.5.2. Job submission without data-access requirements
    7.5.3. Job submission with data-access requirements
  7.6. Process/User Mapping Table


1. INTRODUCTION

This document provides a guide to building, installing and using the WP1 WMS software released within the DataGrid project.

1.1. OBJECTIVES OF THIS DOCUMENT

The goal of this document is to describe the complete process by which the WP1 WMS software can be installed and configured on the DataGrid test-bed platforms. Guidelines for operating the whole system and accessing the provided functionality are also given.

1.2. APPLICATION AREA

Administrators can use this document as a basis for installing, configuring and operating the WP1 WMS software. Users can refer to the User Guide chapter for accessing the provided services through the User Interface.

1.3. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS

Applicable documents

[A1] Job Description Language HowTo – DataGrid-01-TEN-0102-02 – 17/12/2001 (http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-0_2.pdf)

[A2] DATAGRID WP1 Job Submission User Interface for PM9 (revised presentation) – 23/03/2001 (http://www.infn.it/workload-grid/docs/20010320-JS-UI-datamat.pdf)

[A3] WP1 meeting - CESNET presentation in Milan – 20-21/03/2001 (http://www.infn.it/workload-grid/docs/20010320-L_B-matyska.pdf)

[A4] Logging and Bookkeeping Service – 07/05/2001 (http://www.infn.it/workload-grid/docs/20010508-lb_draft-ruda.pdf)

[A5] Results of Meeting on Workload Manager Components Interaction – 09/05/2001 (http://www.infn.it/workload-grid/docs/20010508-WM-Interactions-pacini.pdf)

[A6] Resource Broker Architecture and APIs – 13/06/2001 (http://www.infn.it/workload-grid/docs/20010613-RBArch-2.doc)

[A7] JDL Attributes - DataGrid-01-NOT-0101-0_6 – 04/02/2002 (http://www.infn.it/workload-grid/docs/DataGrid-01-NOT-0101-0_6.pdf)

Reference documents

[R1]


1.4. DOCUMENT EVOLUTION PROCEDURE

The content of this document will be subject to modification according to the following events:

- Comments received from DataGrid project members,
- Changes/evolutions/additions to the WMS components.

1.5. TERMINOLOGY

Definitions

Condor    Condor is a High Throughput Computing (HTC) environment that can manage very large collections of distributively owned workstations.

Globus    The Globus Toolkit is a set of software tools and libraries aimed at the building of computational grids and grid-based applications.

Glossary

class-ad  Classified advertisement
CE        Computing Element
DB        Data Base
FQDN      Fully Qualified Domain Name
GDMP      Grid Data Management Pilot Project
GIS       Grid Information Service, aka MDS
GSI       Grid Security Infrastructure
job-ad    Class-ad describing a job
JDL       Job Description Language
JSS       Job Submission Service
LB        Logging and Bookkeeping Service
LRMS      Local Resource Management System
MDS       Metacomputing Directory Service, aka GIS
MPI       Message Passing Interface
PID       Process Identifier
PM        Project Month
RB        Resource Broker
RC        Replica Catalogue
SE        Storage Element
SI00      Spec Int 2000
SMP       Symmetric Multi Processor
TBC       To Be Confirmed
TBD       To Be Defined
UI        User Interface
UID       User Identifier
WMS       Workload Management System
WP        Work Package


2. EXECUTIVE SUMMARY

This document comprises the following main sections:

Section 3: Build Procedure
Outlines the software required to build the system and the actual process for building it and generating rpms for the WMS components; a step-by-step guide is included.

Section 4: Installation and Configuration
Describes the changes that need to be made to the environment and the steps to be performed for installing the WMS software on the test-bed target platforms. The resulting installation tree structure is detailed for each system component.

Section 5: Operating the System
Provides the actual procedures for starting and stopping the processes and utilities of the WMS components.

Section 6: User Guide
Describes, in Unix man page style, all the User Interface commands that allow the user to access the services provided by the WMS.

Section 7: Annexes
Expands on topics introduced in the User Guide section that help the user better understand system behaviour.


3. BUILD PROCEDURE

In the following section we give detailed instructions for the installation of the WP1 WMS software package. We provide a source code distribution as well as a binary distribution and explain the installation procedure for both cases.

3.1. REQUIRED SOFTWARE

The WP1 software runs and has been tested on platforms running Globus Toolkit 2.0 Beta Release 21 on top of Linux RedHat 6.2. The software packages listed below, apart from WP1 software version 1.0 itself, must be installed locally on a site in order to build the WP1 WMS there:

Globus Toolkit 2.0 Beta 21 or higher (download at http://datagrid.in2p3.fr/distribution/globus/beta-21)

Python 2.1.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

Swig 1.3.9 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

Expat 1.95.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

Expat-devel 1.95.1 (download at http://datagrid.in2p3.fr/distribution/config/external.html)

MySQL Version 9.38 Distribution 3.22.32, for pc-linux-gnu (i686) (download at http://datagrid.in2p3.fr/distribution/config/external_services.html)

Postgresql 7.1.3 (http://datagrid.in2p3.fr/distribution/config/external_services.html)

Classads library (download at http://datagrid.in2p3.fr/distribution/external/RPMS/classads-0.0-edg2.i386.rpm)

CondorG 6.3.1 for INTEL-LINUX-GLIBC21 (download at http://datagrid.in2p3.fr/distribution/external/RPMS/CondorG-6.3.1-edg5.i386.rpm)

Perl IO Stty 0.02, Perl IO Tty 0.04 (download at http://datagrid.in2p3.fr/distribution/config/external.html )

MyProxy-0.4.4 (download at http://datagrid.in2p3.fr/distribution/external/RPMS/myproxy-0.4.4-edg1.i386.rpm )

Perl 5 (download at http://datagrid.in2p3.fr/distribution/config/external.html)


gcc version 2.95.2

GNU make version 3.78.1 or higher

GNU autoconf version 2.13

GNU libtool 1.3.5

GNU automake 1.4

GNU m4 1.4 or higher

RPM 3.0.5

sendmail 8.11.6

3.2. BUILD INSTRUCTIONS

The following instructions deal with the building of the WMS software and hence apply to the source code distribution.

3.2.1. Environment Variables

Before starting the compilation, some environment variables related to the WMS components can be set, or configured by means of the configure script. This is needed only if the package defaults are not suitable. The variables involved are listed below:

- GLOBUS_LOCATION base directory of the Globus installation. The default path is /opt/globus.

- MYSQL_INSTALL_PATH base directory of the MySQL installation. The default path is /usr.

- EXPAT_INSTALL_PATH base directory of the Expat installation. The default path is /usr.

- GDMP_INSTALL_PATH base directory of the Gdmp installation. The default path is /opt/edg.

- PGSQL_INSTALL_PATH base directory of the Pgsql installation. The default path is /usr.

- CLASSAD_INSTALL_PATH base directory of the Classad library installation. The default path is /opt/classads.

- CONDORG_INSTALL_PATH base directory of the Condor installation. The default path is /opt/CondorG.

- PYTHON_INSTALL_PATH base directory of the Python installation. The default path is /usr.

- SWIG_INSTALL_PATH base directory of the Swig installation. The default path is /usr/local.

- MYPROXY_INSTALL_PATH base directory of the MyProxy installation. The default path is /usr/local.
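As an illustration of the list above, the following sketch assigns each variable its documented default unless a value is already present in the environment. It is a convenience for a build shell only, not part of the WP1 distribution; the use of the ":=" parameter expansion is a choice made for this example.

```shell
#!/bin/sh
# Sketch: give each WMS build variable its documented default unless the
# caller has already exported a site-specific value.
: "${GLOBUS_LOCATION:=/opt/globus}"
: "${MYSQL_INSTALL_PATH:=/usr}"
: "${EXPAT_INSTALL_PATH:=/usr}"
: "${GDMP_INSTALL_PATH:=/opt/edg}"
: "${PGSQL_INSTALL_PATH:=/usr}"
: "${CLASSAD_INSTALL_PATH:=/opt/classads}"
: "${CONDORG_INSTALL_PATH:=/opt/CondorG}"
: "${PYTHON_INSTALL_PATH:=/usr}"
: "${SWIG_INSTALL_PATH:=/usr/local}"
: "${MYPROXY_INSTALL_PATH:=/usr/local}"

# Export them so that configure and make can see the values.
export GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH \
       GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH \
       CONDORG_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH \
       MYPROXY_INSTALL_PATH
```

A site whose packages live elsewhere would export its own values before sourcing such a file, and those values would then be left untouched.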

In order to build the whole WP1 package, all the environment variables in the previous list must be set. To build only the User Interface module, the environment variables that need to be set are the following:

- GLOBUS_LOCATION
- CLASSAD_INSTALL_PATH
- PYTHON_INSTALL_PATH
- SWIG_INSTALL_PATH
- EXPAT_INSTALL_PATH

If you plan to build the Job Submission and Resource Broker modules, the variables to set are:

- GLOBUS_LOCATION
- MYSQL_INSTALL_PATH
- EXPAT_INSTALL_PATH
- GDMP_INSTALL_PATH
- PGSQL_INSTALL_PATH
- CLASSAD_INSTALL_PATH
- CONDORG_INSTALL_PATH

If you plan to build the Proxy module, the variables to set are:

- GLOBUS_LOCATION
- MYPROXY_INSTALL_PATH

The LB Server and Local Logger modules need the following environment variables in order to be built:

- GLOBUS_LOCATION
- MYSQL_INSTALL_PATH
- EXPAT_INSTALL_PATH

Finally, the LB library module needs:

- GLOBUS_LOCATION
- EXPAT_INSTALL_PATH

and the Information Index module only:

- GLOBUS_LOCATION
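The per-module requirements above can be checked before running configure. The sketch below is one possible helper: the module keys (ui, jss_rb, proxy, lb, lblib, ii) and the function names are hypothetical shorthand introduced for this example and are not names used by the WP1 build system.

```shell
#!/bin/sh
# required_vars MODULE: print the variables the guide lists for MODULE.
# Module keys are hypothetical shorthand chosen for this sketch.
required_vars() {
    case "$1" in
        ui)     echo "GLOBUS_LOCATION CLASSAD_INSTALL_PATH PYTHON_INSTALL_PATH SWIG_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        jss_rb) echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH GDMP_INSTALL_PATH PGSQL_INSTALL_PATH CLASSAD_INSTALL_PATH CONDORG_INSTALL_PATH" ;;
        proxy)  echo "GLOBUS_LOCATION MYPROXY_INSTALL_PATH" ;;
        lb)     echo "GLOBUS_LOCATION MYSQL_INSTALL_PATH EXPAT_INSTALL_PATH" ;;
        lblib)  echo "GLOBUS_LOCATION EXPAT_INSTALL_PATH" ;;
        ii)     echo "GLOBUS_LOCATION" ;;
        *)      echo "unknown module: $1" >&2; return 1 ;;
    esac
}

# missing_vars MODULE: print each required variable that is unset or empty.
missing_vars() {
    vars=$(required_vars "$1") || return 1
    for name in $vars; do
        # Indirect lookup: read the value of the variable named in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}
```

For instance, "missing_vars jss_rb" prints nothing when all seven variables for the Job Submission and Resource Broker modules are set, and otherwise prints one name per line.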

3.2.2. Compiling the code

After having unpacked the WP1 source distribution tar file, or having downloaded the code directly from the CVS repository, change your working directory to the WP1 base directory, i.e. the Workload directory, and run the following command:

./recursive-autogen.sh

At this point the configure command can be run. The configure script has to be invoked as follows:

./configure <options>

The list of options that are recognized by configure is reported hereafter:

---help

--prefix=<installation path> It is used to specify the Workload installation dir. The default installation dir is /opt/edg.

--enable-all

IST-2000-25182 PUBLIC 14 / 116

Page 15: server11.infn.itserver11.infn.it/workload-grid/docs/DataGrid-01... · Web viewDataGrid. WP1 - WMS Software Administrator and User Guide. Document identifier: DataGrid-01-TEN-0118-0_7

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

Doc. Identifier:

DataGrid-01-TEN-0118-0_7

Date: 19/07/2002

It is used to enable the build of the whole WP1 package. By default this option is turned on.

--enable-userinterface It is used to enable the build of the User Interface module with Logging/Client, Broker/Client, Broker/Socket++ and ThirdParty/trio/src sub modules. By default this option is turned off.

--enable-userinterface_profile

It is used to enable the installation of the User Interface profile. By default this option is turned off.

--enable-jss_rbIt is used to enable the build of the Job Submission and Resource Broker modules with Logging/Client, Common, test, Proxy/Dgpr, and ThirdParty/trio/src submodules. By default this option is turned off.

--enable-jss_profile

It is used to enable the installation of the Job Submission and Resource Broker profile with JobSubmission/utils, and Broker/utils sub modules. By default this option is turned off.

--enable-lbserverIt is used to enable the build of the LB Server service with Logging/Client, Logging/etc, Logging/Server, Logging/InterLogger/Net, Logging/InterLogger/SSL, Logging/InterLogger/Error, Logging/InterLogger/Lbserver and ThirdParty/trio/src sub modules. By default this option is turned off.

--enable-localloggerIt is used to enable the build of the LB Local Logger service with Logging/Client, Logging/InterLogger/Net, Logging/InterLogger/SSL, Logging/InterLogger/Error, Logging/InterLogger/InterLogger, Logging/LocalLogger, man and ThirdParty/trio/src sub modules. By default this option is turned off.

--enable-locallogger_profileIt is used to enable the installation of the LB LocalLogger profile. By default this option is turned off.

--enable-logging_devIt is used to enable the build of the LB Client Library with Logging/Client and ThirdParty/trio/src sub modules. By default this option is turned off.

IST-2000-25182 PUBLIC 15 / 116

Page 16: server11.infn.itserver11.infn.it/workload-grid/docs/DataGrid-01... · Web viewDataGrid. WP1 - WMS Software Administrator and User Guide. Document identifier: DataGrid-01-TEN-0118-0_7

WP1 - WMS SOFTWARE ADMINISTRATOR AND USER GUIDE

Doc. Identifier:

DataGrid-01-TEN-0118-0_7

Date: 19/07/2002

--enable-information

It is used to enable the build of the Information Index module. By default this option is turned off.

--enable-information_profile

It is used to enable the installation of the Information Index profile with the InformIndex/utils sub module. By default this option is turned off.

--enable-wl

It is used to enable the installation of the system configuration files that are in the Workload/etc directory. By default this option is turned off.

--enable-proxy

It is used to enable the build of the Proxy module. By default this option is turned off.

--with-globus-install=<dir>

It allows specifying the Globus installation directory without setting the environment variable GLOBUS_LOCATION.

--with-pgsql-install=<dir>

It allows specifying the PostgreSQL installation directory without setting the environment variable PGSQL_INSTALL_PATH.

--with-gdmp-install=<dir>

It allows specifying the GDMP installation directory without setting the environment variable GDMP_INSTALL_PATH.

--with-expat-install=<dir>

It allows specifying the Expat installation directory without setting the environment variable EXPAT_INSTALL_PATH.

--with-mysql-install=<dir>

It allows specifying the MySQL installation directory without setting the environment variable MYSQL_INSTALL_PATH.

--with-myproxy-install=<dir>

It allows specifying the MyProxy installation directory without setting the environment variable MYPROXY_INSTALL_PATH.
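Each --with-<name>-install option listed above has an environment variable as its counterpart. As a minimal sketch (the function name missing_install_opts is hypothetical and not part of the Workload distribution), the following helper prints the configure option still needed for every dependency whose variable is not set:

```shell
#!/bin/sh
# Hypothetical helper: for every dependency whose environment variable is
# unset or empty, print the configure option that has to be given instead.
missing_install_opts() {
    for pair in \
        globus:GLOBUS_LOCATION \
        pgsql:PGSQL_INSTALL_PATH \
        gdmp:GDMP_INSTALL_PATH \
        expat:EXPAT_INSTALL_PATH \
        mysql:MYSQL_INSTALL_PATH \
        myproxy:MYPROXY_INSTALL_PATH
    do
        name=${pair%%:*}          # option stem, e.g. "globus"
        var=${pair#*:}            # variable name, e.g. "GLOBUS_LOCATION"
        eval "value=\${$var:-}"   # read the variable named in $var
        if [ -z "$value" ]; then
            echo "--with-${name}-install=<dir>"
        fi
    done
}
```

Calling missing_install_opts before running configure gives a quick list of the directories that still have to be passed on the command line.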

During the configure step, 12 spec files (i.e. wl-userinterface.spec, wl-locallogger.spec, wl-lbserver.spec, wl-logging_dev.spec, wl-jss_rb.spec, wl-information.spec, wl-userinterface-profile.spec, wl-jss_rb-profile.spec, wl-information-profile.spec, wl-lbserver-profile.spec, wl-locallogger-profile.spec and wl-workload-profile.spec) are created in the following source sub-directories to produce a flavour-specific version:

- Workload/UserInterface
- Workload/Proxy
- Workload/Logging
- Workload/JobSubmission
- Workload/InformIndex
- Workload

Once the configure script has terminated, check that the make from the GNU distribution is in your path and then, still in the Workload source code directory, run:

make

then:

make apidoc

and then:

make check

to build the test code. If the two previous steps complete successfully, the installation of the software can be performed. In order to install the package in the installation directory specified either by the --prefix option of the configure script or by the default value (i.e. /opt/edg), you can now issue the command:

make install
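The sequence above can be wrapped in a small script that stops at the first failing step. This is only a sketch: the function name build_workload and the MAKE_CMD variable are assumptions introduced here, the latter so that a different make binary or a dry run (MAKE_CMD="echo make") can be selected.

```shell
#!/bin/sh
# Run the documented build targets in order, stopping at the first failure.
# MAKE_CMD (an assumption of this sketch) selects the make program to use.
build_workload() {
    make_cmd=${MAKE_CMD:-make}
    $make_cmd         || return 1   # build the software
    $make_cmd apidoc  || return 1   # the apidoc target
    $make_cmd check   || return 1   # build the test code
    $make_cmd install               # install under the configured prefix
}
```

With MAKE_CMD="echo make" the function only prints the four commands, which is a convenient way to review them before a real build.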

It is possible to run "make clean" to remove object files, executable files, library files and all the other files that are created during "make" and "make check". The command:

make -i dist

can be used to produce in the workload-X.Y.Z directory, located in the Workload's base directory, a binary gzipped tar ball of the Workload distribution. This tar ball can be both transferred to other platforms and used as source for the RPM creation.

For creating the RPMs for Workload 1.0 (according to the configure options you have used), make sure that your PATH is set in such a way that the GNU autotools, make and the gcc compiler can be used, and edit the file $HOME/.rpmmacros (creating it if it does not exist in your home directory) to set the following entry:

%_topdir <your home dir>/rpm/redhat
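As an illustration, the entry can be added without opening an editor. The helper below is a sketch only; the name ensure_topdir is hypothetical, and it deliberately leaves an existing %_topdir line untouched.

```shell
#!/bin/sh
# Hypothetical helper: make sure an rpm macros file defines %_topdir.
# Usage: ensure_topdir <macros-file> <topdir>
ensure_topdir() {
    macros_file=$1
    topdir=$2
    # Create the file if it does not exist yet.
    [ -f "$macros_file" ] || : > "$macros_file"
    # Append the entry only if no %_topdir line is already present.
    if ! grep -q '^%_topdir[[:space:]]' "$macros_file"; then
        printf '%%_topdir %s\n' "$topdir" >> "$macros_file"
    fi
}
```

A typical call would be ensure_topdir "$HOME/.rpmmacros" "$HOME/rpm/redhat".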

Then you can issue the command:

make rpm

that generates the RPMs in $(HOME)/rpm/redhat/RPMS.

For example, if before building the package you have used the configure as follows:

./configure --enable-all

then the make rpm command creates the directories:

$(HOME)/rpm/redhat/SOURCES
$(HOME)/rpm/redhat/SPECS
$(HOME)/rpm/redhat/BUILD
$(HOME)/rpm/redhat/RPMS
$(HOME)/rpm/redhat/SRPMS

and copies the previously created tar ball workload-X.Y.Z/Workload.tar.gz in $(HOME)/rpm/redhat/SOURCES. Moreover it copies the generated spec files:

JobSubmission/wl-jss_rb.spec
JobSubmission/wl-jss_rb-profile.spec
UserInterface/wl-userinterface.spec
UserInterface/wl-userinterface-profile.spec
InformIndex/wl-information.spec
InformIndex/wl-informationpthr.spec
InformIndex/wl-information-profile.spec
Logging/wl-lbserver.spec
Logging/wl-lbserver-profile.spec
Logging/wl-locallogger.spec
Logging/wl-locallogger-profile.spec
Logging/wl-logging_dev.spec
Proxy/wl-proxy.spec
Workload/wl-workload-profile.spec
Workload/wl-userguide.spec

in $(HOME)/rpm/redhat/SPECS and finally executes the following commands:

rpm -ba wl-userinterface.spec
rpm -ba wl-userinterface-profile.spec
rpm -ba wl-locallogger.spec
rpm -ba wl-locallogger-profile.spec
rpm -ba wl-lbserver.spec
rpm -ba wl-lbserver-profile.spec
rpm -ba wl-logging_dev.spec
rpm -ba wl-jss_rb.spec
rpm -ba wl-jss_rb-profile.spec
rpm -ba wl-information.spec
rpm -ba wl-informationpthr.spec
rpm -ba wl-information-profile.spec
rpm -ba wl-proxy.spec
rpm -ba wl-workload-profile.spec
rpm -ba wl-userguide.spec

generating respectively the following rpms in the $(HOME)/rpm/redhat/RPMS directory:

- userinterface-X.Y.Z-K.i386.rpm
- userinterface-profile-X.Y.Z-K.i386.rpm
- locallogger-X.Y.Z-K.i386.rpm
- locallogger-profile-X.Y.Z-K.i386.rpm
- lbserver-X.Y.Z-K.i386.rpm
- lbserver-profile-X.Y.Z-K.i386.rpm
- logging_dev-X.Y.Z-K.i386.rpm
- jobsubmission-X.Y.Z-K.i386.rpm
- jobsubmission-profile-X.Y.Z-K.i386.rpm
- informationindex-X.Y.Z-K.i386.rpm
- informationindexpthr-X.Y.Z-K.i386.rpm
- informationindex-profile-X.Y.Z-K.i386.rpm
- proxy-X.Y.Z-K.i386.rpm
- workload-profile-X.Y.Z-K.i386.rpm
- userguide-X.Y.Z-K.i386.rpm

where X.Y.Z-K indicates the rpms release. This document is issued together with the rpms in version 1.1.1-1.

If you have instead built only the User Interface, i.e. used:

./configure --disable-all --enable-userinterface

the make rpm command will copy only the files UserInterface/wl-userinterface.spec and UserInterface/wl-userinterface-profile.spec in $(HOME)/rpm/redhat/SPECS and will create only the User Interface rpms (userinterface-X.Y.Z-K.i386.rpm and userinterface-profile-X.Y.Z-K.i386.rpm).

The User Interface has an additional make target that installs the userinterface test suite, which allows unit tests to be performed (i.e. without contacting any external component). You have to run the following commands in Workload/UserInterface:

./autogen.sh

./configure --disable-all --enable-tests

make tests

and you will find the commands ready to run together with the test files in Workload/UserInterface/test.

An alternative procedure can be followed to build the II and Logging packages. To do this, move to the Workload/InformIndex directory and run the following commands:

./autogen.sh

./configure [option]

where the recognised options are:

--prefix=<install path>

It is used to specify the Information Index installation dir. The default installation dir is /opt/edg.

--with-globus-install=<dir>

It allows specifying the Globus install directory without setting the environment variable GLOBUS_LOCATION.

Then issue:

make

make install

Afterwards move into the Workload/Logging directory and run the following commands:

./autogen.sh

./configure [option]

where the recognised options are:

--enable-all

It is used to enable the build of the Logging and Bookkeeping package. By default this option is turned on.

--enable-userinterface

It is used to enable the build of the Client sub module. By default this option is turned off.

--enable-graphical_userinterface

It is used to enable the build of the Client sub module. By default this option is turned off.

--enable-jss_rb

It is used to enable the build of the Client sub module. By default this option is turned off.

--enable-lbserver

It is used to enable the build of the Logging And Bookkeeping Server service with the Client, etc, Server, InterLogger/Net, InterLogger/SSL, InterLogger/Error, InterLogger/Lbserver and ThirdParty/trio/src sub modules. By default this option is turned off.

--enable-lbserver_profile

It is used to enable the installation of the LB Server profile with the Logging/utils sub module. By default this option is turned off.

--enable-locallogger

It is used to enable the build of the Logging And Bookkeeping Local Logger service with the Client, InterLogger/Net, InterLogger/SSL, InterLogger/Error, InterLogger/InterLogger, LocalLogger, Apidoc and ThirdParty/trio/src sub modules. By default this option is turned off.

--enable-logging_dev

It is used to enable the build of the Logging And Bookkeeping Client Library with the Client and ThirdParty/trio/src sub modules. By default this option is turned off.

--prefix=<install path>

It is used to specify the Logging installation dir. The default installation dir is /opt/edg.

--with-globus-install=<dir>

It allows specifying the Globus install directory without setting the environment variable GLOBUS_LOCATION.

--with-expat-install=<dir>

It allows specifying the Expat install directory without setting the environment variable EXPAT_INSTALL_PATH.

--with-mysql-install=<dir>

It allows specifying the MySQL install directory without setting the environment variable MYSQL_INSTALL_PATH.

Then issue:

make

make apidoc

make check

make install

Summarising, depending on the WMS module you want to build, the configure script has to be run with the following options:

- all
  ./configure

- userinterface
  ./configure --disable-all --enable-userinterface

- information
  ./configure --disable-all --enable-information

- lbserver
  ./configure --disable-all --enable-lbserver

- locallogger
  ./configure --disable-all --enable-locallogger

- logging for developers
  ./configure --disable-all --enable-logging_dev

- jobsubmission and broker
  ./configure --disable-all --enable-jss_rb

- wl
  ./configure --disable-all --enable-wl

- proxy
  ./configure --disable-all --enable-proxy

- userinterface profile
  ./configure --disable-all --enable-userinterface_profile

- information profile
  ./configure --disable-all --enable-information_profile

- information pthread
  ./configure --disable-all --enable-information --with-globus-flavor=gcc32dbgpthr

- lbserver profile
  ./configure --disable-all --enable-lbserver_profile

- locallogger profile
  ./configure --disable-all --enable-locallogger_profile

- jobsubmission and broker profile
  ./configure --disable-all --enable-jss_profile
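Most entries of the summary above follow one pattern: --disable-all followed by a single --enable-<name> option. A minimal sketch of that pattern is given below; the function name configure_cmd_for is hypothetical, the option name is passed through without being validated, and the pthread flavour (which needs an extra --with-globus-flavor option) is not covered.

```shell
#!/bin/sh
# Hypothetical helper: print the configure command for one option name.
# "all" maps to a plain ./configure; any other name is passed through
# as --enable-<name> after --disable-all (no validation is performed).
configure_cmd_for() {
    case $1 in
        all) echo "./configure" ;;
        "")  echo "usage: configure_cmd_for <name>" >&2; return 1 ;;
        *)   echo "./configure --disable-all --enable-$1" ;;
    esac
}
```

For instance, configure_cmd_for lbserver prints the command listed above for the lbserver module.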

3.3. RPM INSTALLATION

In order to install the WP1 RPMs on the target platforms, the following commands have to be executed as root:

rpm -ivh workload-profile-X.Y.Z-K.i386.rpm
rpm -ivh userinterface-X.Y.Z-K.i386.rpm
rpm -ivh userinterface-profile-X.Y.Z-K.i386.rpm
rpm -ivh informationindex-X.Y.Z-K.i386.rpm
rpm -ivh informationindexpthr-X.Y.Z-K.i386.rpm
rpm -ivh informationindex-profile-X.Y.Z-K.i386.rpm
rpm -ivh jobsubmission-X.Y.Z-K.i386.rpm
rpm -ivh jobsubmission-profile-X.Y.Z-K.i386.rpm
rpm -ivh locallogger-X.Y.Z-K.i386.rpm
rpm -ivh locallogger-profile-X.Y.Z-K.i386.rpm
rpm -ivh lbserver-X.Y.Z-K.i386.rpm
rpm -ivh lbserver-profile-X.Y.Z-K.i386.rpm
rpm -ivh logging_dev-X.Y.Z-K.i386.rpm
rpm -ivh proxy-X.Y.Z-K.i386.rpm
rpm -ivh userguide-X.Y.Z-K.i386.rpm

By default all the rpms install the software in the /opt/edg directory, except the profile rpms (i.e. informationindex-profile, jobsubmission-profile, locallogger-profile and lbserver-profile), which install instead in /etc/rc.d/init.d. If you install one of the following rpms:

- jobsubmission-X.Y.Z-K.i386.rpm
- locallogger-X.Y.Z-K.i386.rpm
- lbserver-X.Y.Z-K.i386.rpm
- informationindex-X.Y.Z-K.i386.rpm
- informationindexpthr-X.Y.Z-K.i386.rpm

all the needed files are installed in /opt/edg, and the configuration and start-up files must also be installed in /etc/rc.d/init.d by additionally installing the corresponding profile rpms. Namely, using the rpms:

- jobsubmission-profile-X.Y.Z-K.i386.rpm
- locallogger-profile-X.Y.Z-K.i386.rpm
- lbserver-profile-X.Y.Z-K.i386.rpm
- informationindex-profile-X.Y.Z-K.i386.rpm

the following scripts are respectively installed in /etc/rc.d/init.d:

- broker and jobsubmission
- locallogger
- lbserver
- information_index

The administrator (with root privileges) then has to issue from /etc/rc.d/init.d the command:

$ <script> start

to start the desired component. All start-up scripts accept the start, stop, restart and status options, except information_index, which only supports start and stop.

The workload-profile-X.Y.Z-K.i386.rpm installs some scripts common to all services of the workload management:

/etc/sysconfig/edg_workload
/etc/sysconfig/edg_workload.csh
<install-path>/etc/workload.sh
<install-path>/etc/workload.csh

They are needed to define and export some variables for the start-up script environment, above all the PATH and the LD_LIBRARY_PATH needed to correctly run all the software.

The jobsubmission-profile-X.Y.Z-K.i386.rpm, as mentioned above, additionally installs the wl-jss_rb-env.sh configuration file in /opt/edg/etc, which is read by the broker and jobsubmission start-up files when they are launched as root. The /opt/edg/etc/wl-jss_rb-env.sh file contains settings for the following variables:

- CONDORG_INSTALL_PATH: the CondorG installation path. The default value is /home/dguser/CondorG.
- CONDOR_IDS: needed by Condor to know under which user it has to run. The value of this variable has to be set in the format uid.gid, where uid is the user identifier and gid is the group identifier. This value has to be set by the system administrator.
- JSSRB_USER: the user running the RB and JSS processes. Generally the value of this variable is the user name corresponding to the uid.gid set for the CONDOR_IDS variable.
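As an illustration of the uid.gid format, the sketch below derives the CONDOR_IDS value from a local account name and prints the three settings for review. The function names are hypothetical, and the NAME=value layout of the printed lines is an assumption about the syntax of wl-jss_rb-env.sh, which should be checked against the installed file.

```shell
#!/bin/sh
# Hypothetical helper: print the uid.gid pair of a local account,
# i.e. the format expected for CONDOR_IDS.
condor_ids_for() {
    uid=$(id -u "$1") || return 1
    gid=$(id -g "$1") || return 1
    echo "${uid}.${gid}"
}

# Print the three settings for the given account so that they can be
# compared with the contents of /opt/edg/etc/wl-jss_rb-env.sh.
print_jss_rb_env() {
    ids=$(condor_ids_for "$1") || return 1
    echo "CONDORG_INSTALL_PATH=/home/dguser/CondorG"   # documented default
    echo "CONDOR_IDS=$ids"
    echo "JSSRB_USER=$1"
}
```

On the RB machine the natural call would be print_jss_rb_env dguser.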

Details on the installation and configuration of each of the listed rpms are provided in section 4 of this document. For further information about RPM please consult the man pages or http://www.rpm.org.

4. INSTALLATION AND CONFIGURATION

This section deals with the procedures for installing and configuring the WP1 WMS components on the target platforms. For each of them, the installation procedure, which is described through step-by-step examples, is preceded by the list of dependencies, i.e. the software that the component requires on the same machine in order to run. A description of the needed configuration items and environment variable settings is also provided. It is important to remark that, since the rpms are generated using gcc 2.95.2 and RPM 3.0.5, the same configuration is expected on the target platforms.

4.1. LOGGING AND BOOKKEEPING SERVICES

From the installation point of view LB services can be split into two main components:

- The LB services responsible for accepting messages from their sources and forwarding them to the logging and/or bookkeeping servers, which we will refer to as LB local-logger services.

- The LB services responsible for accepting messages from the LB local-logger services, saving them on their permanent storage and supporting queries generated by the consumer API, which we will refer to as LB server services.

The LB local-logger services must be installed on all the machines hosting processes pushing information into the LB system, i.e. the machines running RB and JSS, and the gatekeeper machine of the CE. An exception is the submitting machine (i.e. the machine running the User Interface), on which this component can be installed but is not mandatory. The LB server services need instead to be installed only on a server machine, which usually coincides with the RB server one.

4.1.1. Required software

4.1.1.1. LB local-logger

For the installation of the LB local-logger the only software required is the Globus Toolkit 2.0 (actually only the GSI rpms are needed). Globus 2 rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher). All rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

4.1.1.2. LB Server

For the installation of the LB server the Globus Toolkit 2.0 is required (actually only the GSI rpms are needed). Globus 2 rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher). All rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

Besides the Globus Toolkit 2.0, for the LB server to work properly it is also necessary to install MySQL Distribution 3.22.31 or higher.

Instructions about MySQL installation can be found at the following URL:

http://www.redhat.com/support/resources/faqs/RH-apache-FAQ/MySQL/mysql-install.htm

Packages and more general documentation can be found at:

http://www.mysql.org/listcats3.php?menu=21&page_id=9

The rpm of MySQL Ver 9.38 Distribution 3.22.32, for pc-linux-gnu (i686), is in any case available at http://datagrid.in2p3.fr/distribution/config/external_services.html.

At least the packages MySQL-3.22.32 and MySQL-client-3.22.32 have to be installed for creating and configuring the LB database.

The LB server stores the logging data in a MySQL database that must hence be created. The following assumes that the database and the server daemons (bkserver and ileventd) run on the same machine, which is considered to be secure, i.e. no database authentication is used. In a different set-up the procedure has to be adjusted accordingly, and a secure database connection (via ssh tunnel etc.) has to be established.

The action list below contains the placeholders DB_NAME and USER_NAME, for which real values have to be substituted. They form the database connection string required on the invocation of some LB daemons. The suggested value for both DB_NAME and USER_NAME is `lbserver'; this value is also the compiled-in default (i.e. when it is used, the database connection string need not be specified at all).

The following steps require MySQL root privileges:

1) Create the database:

mysqladmin -u root -p create DB_NAME

where DB_NAME is the name of the database.

2) Create a dedicated LB database user:

mysql -u root -p -e 'grant create,drop,select,insert,update,delete on DB_NAME.* to USER_NAME@localhost'

where USER_NAME is the name of the user running the LB server daemons.

3) Create the database tables:

mysql -u USER_NAME DB_NAME < server.sql

where server.sql is a file containing the sql commands for creating the needed tables. server.sql can be found in the directory "<install path>/etc" created by the LB server rpm installation.
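The three steps can be previewed with the placeholders already substituted. The following is a dry-run sketch only: the function name lb_db_commands is hypothetical, and it prints the commands rather than executing them, so that the administrator can review them first.

```shell
#!/bin/sh
# Print (without executing) the three database set-up commands.
# Usage: lb_db_commands [DB_NAME] [USER_NAME] [SQL_FILE]
lb_db_commands() {
    db=${1:-lbserver}        # suggested value and compiled-in default
    user=${2:-lbserver}      # suggested value and compiled-in default
    sql=${3:-server.sql}     # schema file shipped in <install path>/etc
    echo "mysqladmin -u root -p create $db"
    echo "mysql -u root -p -e 'grant create,drop,select,insert,update,delete on $db.* to $user@localhost'"
    echo "mysql -u $user $db < $sql"
}
```

Called without arguments, the function prints the commands for the recommended lbserver set-up.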

4.1.2. RPM installation

In order to install the LB local-logger and the LB server services, the following commands have to be respectively issued with root privileges:

rpm -ivh workload-profile-X.Y.Z-K.i386.rpm
rpm -ivh locallogger-X.Y.Z-K.i386.rpm
rpm -ivh locallogger-profile-X.Y.Z-K.i386.rpm
rpm -ivh lbserver-X.Y.Z-K.i386.rpm
rpm -ivh lbserver-profile-X.Y.Z-K.i386.rpm

By default the locallogger-X.Y.Z-K.i386.rpm and lbserver-X.Y.Z-K.i386.rpm rpms install the software in the “/opt/edg” directory whilst the remaining two in “/etc/rc.d/init.d”.

4.1.3. The installation tree structure

4.1.3.1. LB local-logger

When the LB local-logger RPMs are installed, the following directory tree is created:

<install-path>/info
<install-path>/info/interlogger.info
<install-path>/lib
<install-path>/man
<install-path>/man/man1
<install-path>/man/man1/interlogger.1
<install-path>/man/man3
<install-path>/man/man3/_dgLBJobStat.3
<install-path>/man/man3/_dgLBQueryRec.3
<install-path>/man/man3/dgLBEvent.3
<install-path>/man/man3/dglbevents.3
<install-path>/man/man3/dglog.3
<install-path>/man/man3/dgssl.3
<install-path>/man/man3/dgxferlog.3
<install-path>/man/man3/escape.3
<install-path>/man/man3/lbapi.3
<install-path>/sbin
<install-path>/sbin/dglogd
<install-path>/sbin/interlogger
<install-path>/sbin/locallogger
<install-path>/share
<install-path>/share/doc
<install-path>/share/doc/Workload
<install-path>/share/doc/Workload/Logging
<install-path>/share/doc/Workload/Logging/html
<install-path>/share/doc/Workload/Logging/html/annotated.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBJobStat-include.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBJobStat-members.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBJobStat.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBQueryRec-include.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBQueryRec-members.html
<install-path>/share/doc/Workload/Logging/html/class__dgLBQueryRec.html
<install-path>/share/doc/Workload/Logging/html/class_dgLBEvent-include.html
<install-path>/share/doc/Workload/Logging/html/class_dgLBEvent.html
<install-path>/share/doc/Workload/Logging/html/doxygen.gif
<install-path>/share/doc/Workload/Logging/html/files.html
<install-path>/share/doc/Workload/Logging/html/functions.html
<install-path>/share/doc/Workload/Logging/html/globals.html
<install-path>/share/doc/Workload/Logging/html/headers.html
<install-path>/share/doc/Workload/Logging/html/index.html
<install-path>/share/doc/Workload/Logging/html/null.gif
<install-path>/share/doc/Workload/Logging/refman.ps

/etc/rc.d/init.d
/etc/rc.d/init.d/locallogger

The sbin directory contains all the executables of the LB local-logger daemons. The script locallogger contained in "/etc/rc.d/init.d" has to be used for starting the daemons. The man directory contains the man page for the inter-logger daemon.

4.1.3.2. LB Server

When the LB server RPMs are installed, the following directory tree is created:

<install-path>/etc
<install-path>/etc/server.sql
<install-path>/lib
<install-path>/sbin
<install-path>/sbin/bkpurge
<install-path>/sbin/bkserver
<install-path>/sbin/ileventd
<install-path>/sbin/lbserver

/etc/rc.d/init.d
/etc/rc.d/init.d/lbserver

where the sbin directory contains all the executables of the LB server daemons. The script lbserver contained in "/etc/rc.d/init.d" has to be used for starting the daemons.

4.1.4. Configuration

Neither the LB local-logger nor the LB server has configuration files, so no action is needed for this task.

4.1.5. Environment Variables

All LB components need the following environment variables to be set:

- X509_USER_KEY: the user private key file path
- X509_USER_CERT: the user certificate file path
- X509_CERT_DIR: the trusted certificate directory and ca-signing-policy directory
- X509_USER_PROXY: the user proxy certificate file path

as required by GSI. However, in the case of the LB daemons, the recommended way of specifying the locations of the security files is to use the --cert, --key and --CAdir options explicitly.

The Logging library, i.e. the library that is linked into the UI, RB, JSS and Jobmanager, reads its immediate logging destination from the variable DGLOG_DEST. It defaults to "x-dglog://localhost:15830", which is the correct value, hence it normally does not need to be set except on the submitting machine. The correct format for this variable is:

DGLOG_DEST=x-dglog://HOST:PORT

where, as already mentioned, HOST defaults to localhost and PORT defaults to 15830.

On the submitting machine, if the variable is not set, it is dynamically assigned by the UI with the value:

DGLOG_DEST=x-dglog://<LB_CONTACT>:15830

where LB_CONTACT is the hostname of the machine running the LB server currently associated to the RB used for submitting jobs.

The timeout of the Logging library functions is read from the environment variable DGLOG_TIMEOUT. It defaults to 2 seconds, which is the correct value for local logging. On the submitting machine the value of this variable is set dynamically by the UI to 10 seconds (the recommended value for non-local logging is 10 to 15 seconds), and it is in any case configurable through the UI configuration.

Finally there is LBDB, the environment variable needed by the LB Server daemons (ileventd, bkserver and bkpurge). LBDB represents the MySQL database connect-string; it defaults to "lbserver/@localhost:lbserver" and in the recommended set-up (see section 4.1.1.2) does not need to be set. Otherwise it should be set as follows:

LBDB=USER_NAME/PASSWORD@DB_HOSTNAME:DB_NAME

where

- USER_NAME is the name of the database user,
- PASSWORD is the user password for the database,
- DB_HOSTNAME is the hostname of the host where the database is located,
- DB_NAME is the name of the database.
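The two value formats described above can be sketched as small shell functions. The function names are hypothetical and only illustrate how the documented defaults fit into the DGLOG_DEST and LBDB syntax; the parser assumes that the password contains no "@" and the host name no ":".

```shell
#!/bin/sh
# Build a DGLOG_DEST value; HOST and PORT fall back to the documented defaults.
dglog_dest() {
    echo "x-dglog://${1:-localhost}:${2:-15830}"
}

# Build an LBDB connect string: USER_NAME/PASSWORD@DB_HOSTNAME:DB_NAME.
lbdb_string() {
    echo "${1:-lbserver}/${2:-}@${3:-localhost}:${4:-lbserver}"
}

# Split an LBDB connect string into its four fields, one per line
# (user, password, host, database).
lbdb_fields() {
    user=${1%%/*}
    rest=${1#*/}
    password=${rest%%@*}
    rest=${rest#*@}
    host=${rest%%:*}
    db=${rest#*:}
    printf '%s\n' "$user" "$password" "$host" "$db"
}
```

For example, export DGLOG_DEST=$(dglog_dest lb.example.org) would point the Logging library at a (hypothetical) host lb.example.org on the default port.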

4.2. RB AND JSS

The Resource Broker and the Job Submission Services are the WMS components allowing the submission of jobs to the CEs. They are dealt with together since they always reside on the same host and consequently are distributed by means of a single rpm.

4.2.1. Required software

For the installation of RB and JSS, the Globus Toolkit 2.0 rpms available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher) are required to be installed on the target platform. All needed rpms can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

The Globus gridftp server package must also be installed and configured on the same host (see http://marianne.in2p3.fr/datagrid/documentation/EDG-Install-HOWTO.html for details).

It is important to recall that the Globus grid-mapfile located in /etc/grid-security on the RB server machine must be filled with the certificate subjects of all the users allowed to use the Resource Broker functionalities. Users mapped in the grid-mapfile have to belong to a group having the same name as the user itself. At the same time the dedicated user dguser has to belong to all these groups.

Moreover on the same platform the following products are expected to be installed:

- LB local-logger services (see section 4.1.1.1)
- PostgreSQL (RB and JSS)
- Condor-G (JSS)
- ClassAd library (RB and JSS)
- ReplicaCatalog from the WP2 distribution (RB)
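The group requirement on dguser described above for the grid-mapfile users can be checked with a few lines of shell. This is a sketch under stated assumptions: the function name users_missing_from is hypothetical, the list of mapped local users has to be supplied by the administrator, and only the membership of dguser is verified (not that each user belongs to its own group).

```shell
#!/bin/sh
# Hypothetical helper: given the space-separated list of groups that dguser
# belongs to, print each mapped user with no same-named group in that list.
# Usage: users_missing_from "<dguser groups>" user1 user2 ...
users_missing_from() {
    group_list=" $1 "
    shift
    for user in "$@"; do
        case $group_list in
            *" $user "*) ;;          # dguser is in the group named after the user
            *) echo "$user" ;;       # dguser still has to be added to this group
        esac
    done
}
```

On the RB machine a call such as users_missing_from "$(id -Gn dguser)" alice bob would list the users (alice and bob are placeholder names) whose group dguser has not yet joined.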

4.2.1.1. PostgreSQL installation and configuration

Both RB and JSS use a PostgreSQL database for implementing the internal job queue. The installation kit and the documentation for PostgreSQL can be found at the following URL:

http://www3.us.postgresql.org/sites.html

The required PostgreSQL version is 7.1.3 or higher. The following packages need to be installed (respecting the order in which they are listed): postgresql-libs, postgresql-devel, postgresql, postgresql-server, postgresql-tcl, postgresql-tk and postgresql-docs.

PostgreSQL also needs the packages cyrus-sasl-1-5-11 (or higher), openssl-0.9.5a and openssl-devel-0.9.5a (or higher). All of them can be found at the following URL:

http://datagrid.in2p3.fr/distribution/external/RPMS

The configuration options that must be used when installing the package are reported below:

--with-CXX
--with-tcl
--enable-odbc

PostgreSQL 7.1.3 is also available in rpm format (to be installed as root) at the URL:

http://datagrid.in2p3.fr/distribution/external/RPMS

Once PostgreSQL has been installed, you need, as root, to create a new system account dguser using the (RH specific) command

adduser -r -m dguser

This command creates a system account having a home directory. Then follow the steps reported below to create an empty database for JSS:

su - postgres (become the postgres user)
createuser -d -A dguser (create the new database user dguser)
su - dguser (become the user dguser)
createdb <DBNAME> (create the new database for JSS)

The name of the created database must be the same as the one assigned to the Database_name attribute in the file jss.conf (see section 4.2.4.2 for more details), otherwise JSS will use the "template1" database as default. Avoiding the use of the template database is in any case strongly recommended.

The RB server uses instead another database named "rb", which is created by RB itself.

4.2.1.1.1. Upgrading from a previous version

On upgrading from version 1.1.x to version 1.2.y, administrators must remember to completely remove the table containing the old version database registry. This is because the 1.2.y JSS uses a new field inside the PostgreSQL database to store the proxy file path.

The commands that have to be issued as root are:

psql template1 postgres (to connect to the database)

replacing template1 with the database name contained in the jss.conf file.

Once inside the psql client do:

DROP TABLE condor_submit (to remove the table)

replacing condor_submit with the table name contained in the jss.conf file.

4.2.1.2. Condor-G installation and configurationCondor-G release required by JSS is CondorG 6.3.1 for INTEL-LINUX-GLIBC21. The Condor-G installation toolkit can be found at the following URL:

http://www.cs.wisc.edu/condor/downloads/condorg.license.html

whilst it is available in rpm format (to be installed as root) at:
http://datagrid.in2p3.fr/distribution/external/RPMS

Installation and configuration are quite straightforward; for details the reader can refer to the README file included in the Condor-G package. The main steps to be performed after having unpacked the package as root are:

- become dguser (su - dguser)
- make sure the directory where you are going to install CondorG is owned by dguser
- make sure the Globus Toolkit 2.0 has been installed on the platform
- run the /opt/CondorG/setup.sh installation script
- remove the link ~dguser/.globus/certificates created by the installation script

Moreover, some additional configuration steps have to be performed in the Condor configuration file pointed to by the CONDOR_CONFIG environment variable set during installation. In the $CONDOR_CONFIG file the following attributes need to be modified:

RELEASE_DIR = $(CONDORG_INSTALL_PATH)
CONDOR_ADMIN = <a valid e-mail address of the Condor-G administrator>
UID_DOMAIN = <the domain of the machine (e.g. pd.infn.it)>
FILESYSTEM_DOMAIN = <the domain of the machine (e.g. pd.infn.it)>
HOSTALLOW_WRITE = *
CRED_MIN_TIME_LEFT = 0
GLOBUSRUN = $(GLOBUS_LOCATION)/bin/globusrun

and the following entries need to be added:

SKIP_AUTHENTICATION = YES
AUTHENTICATION_METHODS = CLAIMTOBE
DISABLE_AUTH_NEGOTIATION = TRUE
GRIDMANAGER_CHECKPROXY_INTERVAL = 600
GRIDMANAGER_MINIMUM_PROXY_TIME = 180

The environment variable CONDORG_INSTALL_PATH is also set during installation and points to the path where the Condor-G package has been installed.

To work properly, the current version of Condor-G requires the file /etc/grid-security/certificates/ca-signing-policy.conf. This file has been removed from the Globus Toolkit 2.0 distribution and must hence be created by the administrator. This need will be removed with the next release of Condor-G, which will be fully Globus Toolkit 2.0 compliant.
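The fixed-value Condor configuration entries listed in this section can be checked mechanically. The following is a minimal Python sketch, not part of Condor-G or of the WP1 software: the function names and the simple "NAME = value" parsing are assumptions made for illustration, and real Condor configuration files support further syntax (e.g. line continuations) that is not handled here.

```python
# Hypothetical helper, not part of Condor-G or the WP1 software.
# The expected values are the fixed ones listed in this section.
REQUIRED_ENTRIES = {
    "HOSTALLOW_WRITE": "*",
    "CRED_MIN_TIME_LEFT": "0",
    "SKIP_AUTHENTICATION": "YES",
    "AUTHENTICATION_METHODS": "CLAIMTOBE",
    "DISABLE_AUTH_NEGOTIATION": "TRUE",
    "GRIDMANAGER_CHECKPROXY_INTERVAL": "600",
    "GRIDMANAGER_MINIMUM_PROXY_TIME": "180",
}


def parse_simple_config(text):
    """Parse 'NAME = value' lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Names are normalised to upper case (an assumption of this sketch).
        settings[name.strip().upper()] = value.strip()
    return settings


def missing_or_wrong(text):
    """Return {name: (expected, found)} for entries that differ or are absent."""
    settings = parse_simple_config(text)
    problems = {}
    for name, expected in REQUIRED_ENTRIES.items():
        found = settings.get(name)
        if found != expected:
            problems[name] = (expected, found)
    return problems
```

Passing the content of the file pointed to by CONDOR_CONFIG to missing_or_wrong returns an empty dictionary when all the fixed-value entries are present. Site-specific attributes such as CONDOR_ADMIN or UID_DOMAIN are deliberately left out of the check.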

4.2.1.3. ClassAd installation and configuration

The ClassAd release required by JSS and RB is classads-0.9 (or higher). The ClassAd library documentation can be found at the following URL:

http://www.cs.wisc.edu/condor/classad

whilst it is available in rpm format (to be installed as root) at:
http://datagrid.in2p3.fr/distribution/external/RPMS

4.2.1.4. ReplicaCatalog installation and configuration

The ReplicaCatalog release required by RB is ReplicaCatalogue-gcc32dbg-2.0 (or higher), which is available in rpm format (to be installed as root) at:
http://datagrid.in2p3.fr/distribution/wp2/RPMS

4.2.2. RPM installation

In order to install the Resource Broker and the Job Submission services, the following commands have to be issued with root privileges:

rpm -ivh workload-profile.X.Y.Z-K.i386.rpm
rpm -ivh proxy-X.Y.Z-K.i386.rpm
rpm -ivh jobsubmission-X.Y.Z-K.i386.rpm
rpm -ivh jobsubmission-profile-X.Y.Z-K.i386.rpm

By default the jobsubmission-X.Y.Z-K.i386.rpm and proxy-X.Y.Z-K.i386.rpm rpms install the software in the “/opt/edg” directory, whilst jobsubmission-profile-X.Y.Z-K.i386.rpm installs in “/etc/rc.d/init.d” and “/etc/sysconfig”.

4.2.3. The Installation Tree structure

When the jobsubmission rpms have been installed, the following directory tree is created:

<install-path>/bin
<install-path>/bin/RBserver
<install-path>/bin/jssparser
<install-path>/bin/jssserver
<install-path>/etc
<install-path>/etc/jss.conf
<install-path>/etc/rb.conf
<install-path>/etc/wl-jss_rb-env.sh
<install-path>/lib
<install-path>/man
<install-path>/man/man3
<install-path>/man/man3/BROKER_INFOstruct.3
<install-path>/man/man3/CannotConfigure.3
<install-path>/man/man3/CannotReadFile.3
<install-path>/man/man3/ConfSchema.3

<install-path>/man/man3/DeletePointer.3
<install-path>/man/man3/GDMP_ReplicaCatalog.3
<install-path>/man/man3/InvalidURL.3
<install-path>/man/man3/JSSConfiguration.3
<install-path>/man/man3/JobWrapper.3
<install-path>/man/man3/JssClient.3
<install-path>/man/man3/LDAPConnection.3
<install-path>/man/man3/LDAPSynchConnection.3
<install-path>/man/man3/LogManager.3
<install-path>/man/man3/MalformedFile.3
<install-path>/man/man3/RBJobRegistry.3
<install-path>/man/man3/RBMaster.3
<install-path>/man/man3/RBReplicaCatalog.3
<install-path>/man/man3/RBReplicaCatalogEx.3
<install-path>/man/man3/RBjob.3
<install-path>/man/man3/URL.3
<install-path>/man/man3/brokerinfo.3
<install-path>/man/man3/do_CloseSEs_supply_CE_with_nfiles.3
<install-path>/man/man3/jsscommon.3
<install-path>/man/man3/jssthreads.3
<install-path>/man/man3/matchmaking.3
<install-path>/man/man3/rbargs_t.3
<install-path>/man/man3/rbhandlers.3
<install-path>/man/man3/rbthreads.3
<install-path>/man/man3/select_CE_on_files.3
<install-path>/sbin
<install-path>/sbin/broker
<install-path>/sbin/jobsubmission
<install-path>/share
<install-path>/share/doc
<install-path>/share/doc/Workload
<install-path>/share/doc/Workload/Broker
<install-path>/share/doc/Workload/Broker/COPYING
<install-path>/share/doc/Workload/Broker/NEWS
<install-path>/share/doc/Workload/Broker/README
<install-path>/share/doc/Workload/Broker/html
<install-path>/share/doc/Workload/Broker/html/annotated.html
<install-path>/share/doc/Workload/Broker/html/class_BROKER_INFOstruct-include.html

<install-path>/share/doc/Workload/Broker/html/class_BROKER_INFOstruct-members.html
<install-path>/share/doc/Workload/Broker/html/class_BROKER_INFOstruct.html
<install-path>/share/doc/Workload/Broker/html/class_ConfSchema-include.html
<install-path>/share/doc/Workload/Broker/html/class_ConfSchema-members.html
<install-path>/share/doc/Workload/Broker/html/class_ConfSchema.html
<install-path>/share/doc/Workload/Broker/html/class_GDMP_ReplicaCatalog-include.html
<install-path>/share/doc/Workload/Broker/html/class_GDMP_ReplicaCatalog-members.html
<install-path>/share/doc/Workload/Broker/html/class_GDMP_ReplicaCatalog.gif
<install-path>/share/doc/Workload/Broker/html/class_GDMP_ReplicaCatalog.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPConnection-include.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPConnection-members.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPConnection.gif
<install-path>/share/doc/Workload/Broker/html/class_LDAPConnection.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPSynchConnection-include.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPSynchConnection-members.html
<install-path>/share/doc/Workload/Broker/html/class_LDAPSynchConnection.gif
<install-path>/share/doc/Workload/Broker/html/class_LDAPSynchConnection.html
<install-path>/share/doc/Workload/Broker/html/class_RBJobRegistry-include.html
<install-path>/share/doc/Workload/Broker/html/class_RBJobRegistry-members.html
<install-path>/share/doc/Workload/Broker/html/class_RBJobRegistry.html
<install-path>/share/doc/Workload/Broker/html/class_RBMaster-include.html
<install-path>/share/doc/Workload/Broker/html/class_RBMaster-members.html
<install-path>/share/doc/Workload/Broker/html/class_RBMaster.html
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalog-include.html
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalog-members.html
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalog.gif
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalog.html
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalogEx-include.html
<install-path>/share/doc/Workload/Broker/html/class_RBReplicaCatalogEx.html
<install-path>/share/doc/Workload/Broker/html/class_RBjob-include.html
<install-path>/share/doc/Workload/Broker/html/class_RBjob-members.html
<install-path>/share/doc/Workload/Broker/html/class_RBjob.html
<install-path>/share/doc/Workload/Broker/html/class_do_CloseSEs_supply_CE_with_nfiles-include.html
<install-path>/share/doc/Workload/Broker/html/class_do_CloseSEs_supply_CE_with_nfiles-members.html
<install-path>/share/doc/Workload/Broker/html/class_do_CloseSEs_supply_CE_with_nfiles.gif

<install-path>/share/doc/Workload/Broker/html/class_do_CloseSEs_supply_CE_with_nfiles.html
<install-path>/share/doc/Workload/Broker/html/class_select_CE_on_files-include.html
<install-path>/share/doc/Workload/Broker/html/class_select_CE_on_files-members.html
<install-path>/share/doc/Workload/Broker/html/class_select_CE_on_files.gif
<install-path>/share/doc/Workload/Broker/html/class_select_CE_on_files.html
<install-path>/share/doc/Workload/Broker/html/doxygen.gif
<install-path>/share/doc/Workload/Broker/html/files.html
<install-path>/share/doc/Workload/Broker/html/functions.html
<install-path>/share/doc/Workload/Broker/html/globals.html
<install-path>/share/doc/Workload/Broker/html/group_ReplicaCatalog.html
<install-path>/share/doc/Workload/Broker/html/headers.html
<install-path>/share/doc/Workload/Broker/html/hierarchy.html
<install-path>/share/doc/Workload/Broker/html/index.html
<install-path>/share/doc/Workload/Broker/html/modules.html
<install-path>/share/doc/Workload/Broker/html/null.gif
<install-path>/share/doc/Workload/Broker/refman.ps
<install-path>/share/doc/Workload/Common
<install-path>/share/doc/Workload/Common/html
<install-path>/share/doc/Workload/Common/html/annotated.html
<install-path>/share/doc/Workload/Common/html/class_DeletePointer-include.html
<install-path>/share/doc/Workload/Common/html/class_DeletePointer-members.html
<install-path>/share/doc/Workload/Common/html/class_DeletePointer.html
<install-path>/share/doc/Workload/Common/html/class_InvalidURL-include.html
<install-path>/share/doc/Workload/Common/html/class_InvalidURL.html
<install-path>/share/doc/Workload/Common/html/class_URL-include.html
<install-path>/share/doc/Workload/Common/html/class_URL-members.html
<install-path>/share/doc/Workload/Common/html/class_URL.html
<install-path>/share/doc/Workload/Common/html/doxygen.gif
<install-path>/share/doc/Workload/Common/html/files.html
<install-path>/share/doc/Workload/Common/html/functions.html
<install-path>/share/doc/Workload/Common/html/group_Common.html
<install-path>/share/doc/Workload/Common/html/headers.html
<install-path>/share/doc/Workload/Common/html/index.html
<install-path>/share/doc/Workload/Common/html/modules.html
<install-path>/share/doc/Workload/Common/html/null.gif
<install-path>/share/doc/Workload/Common/refman.ps
<install-path>/share/doc/Workload/JobSubmission

<install-path>/share/doc/Workload/JobSubmission/AUTHORS
<install-path>/share/doc/Workload/JobSubmission/COPYING
<install-path>/share/doc/Workload/JobSubmission/NEWS
<install-path>/share/doc/Workload/JobSubmission/README
<install-path>/share/doc/Workload/JobSubmission/html
<install-path>/share/doc/Workload/JobSubmission/html/annotated.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotConfigure-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotConfigure-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotConfigure.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotReadFile-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotReadFile-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_CannotReadFile.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JSSConfiguration-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JSSConfiguration-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JSSConfiguration.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JobWrapper-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JobWrapper-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JobWrapper.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JssClient-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JssClient-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_JssClient.html
<install-path>/share/doc/Workload/JobSubmission/html/class_LogManager-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_LogManager-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_LogManager.html
<install-path>/share/doc/Workload/JobSubmission/html/class_MalformedFile-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_MalformedFile-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_MalformedFile.html
<install-path>/share/doc/Workload/JobSubmission/html/class_rbargs_t-include.html
<install-path>/share/doc/Workload/JobSubmission/html/class_rbargs_t-members.html
<install-path>/share/doc/Workload/JobSubmission/html/class_rbargs_t.gif
<install-path>/share/doc/Workload/JobSubmission/html/class_rbargs_t.html
<install-path>/share/doc/Workload/JobSubmission/html/doxygen.gif
<install-path>/share/doc/Workload/JobSubmission/html/files.html
<install-path>/share/doc/Workload/JobSubmission/html/functions.html
<install-path>/share/doc/Workload/JobSubmission/html/globals.html
<install-path>/share/doc/Workload/JobSubmission/html/group_JobWrapper.html

<install-path>/share/doc/Workload/JobSubmission/html/group_JssClient.html
<install-path>/share/doc/Workload/JobSubmission/html/group_JssConfigure.html
<install-path>/share/doc/Workload/JobSubmission/html/group_JssError.html
<install-path>/share/doc/Workload/JobSubmission/html/group_JssParser.html
<install-path>/share/doc/Workload/JobSubmission/html/group_JssThreads.html
<install-path>/share/doc/Workload/JobSubmission/html/headers.html
<install-path>/share/doc/Workload/JobSubmission/html/hierarchy.html
<install-path>/share/doc/Workload/JobSubmission/html/index.html
<install-path>/share/doc/Workload/JobSubmission/html/modules.html
<install-path>/share/doc/Workload/JobSubmission/html/null.gif
<install-path>/share/doc/Workload/JobSubmission/refman.ps

/etc/rc.d/init.d
/etc/rc.d/init.d/broker
/etc/rc.d/init.d/jobsubmission

The directory bin contains all the RB and JSS server process executables: RBserver, jssserver and jssparser. The configuration files are stored in etc (see sections 4.2.4.1 and 4.2.4.2 below). The scripts to start and stop the RB and JSS processes are contained in “/etc/rc.d/init.d”.

4.2.4. Configuration

Once the rpms have been installed, the RB and JSS services must be properly configured. This can be done by editing the two files rb.conf and jss.conf that are stored in <install-path>/etc. The actions to be performed to configure the Resource Broker and the Job Submission Service are described in the following two sections.

4.2.4.1. RB configuration

Configuration of the Resource Broker is accomplished by editing the file “<install-path>/etc/rb.conf” to set the attributes it contains appropriately. They are listed hereafter, grouped according to the functionality they relate to:

MDS_contact, MDS_port and MDS_timeout refer to the II service and represent respectively the hostname where this service is running, the port number, and the timeout in seconds when the RB queries the II. E.g.:

MDS_contact = "grid001f.cnaf.infn.it";
MDS_port = 2170;
MDS_timeout = 60;

MDS_gris_port refers to the port to be used by the RB to contact GRISes. E.g.:

MDS_gris_port = 2135;

MDS_multi_attributes defines the list of the attributes that are multi-valued in the MDS (i.e. that can assume multiple values). It is recommended not to modify the default value for this parameter, which is currently:

MDS_multi_attributes = {"AuthorizedUser","RunTimeEnvironment","CloseCE"};

MDS_basedn defines the base DN, i.e. the distinguished name (DN) to use as the starting place for searches in the information index. It is recommended not to modify the default value for this parameter, which is currently set to:

MDS_basedn = "o=Grid"

LB_contact and LB_port refer to the LB Server service and represent respectively the hostname and the port where the LB server is listening for connections. E.g.:

LB_contact = "grid004f.cnaf.infn.it";
LB_port = 7846;

The logging library, i.e. the library providing the APIs for logging job events to the LB (which is linked into the RB), reads its immediate logging destination from the environment variable DGLOG_DEST (see section 4.1.5); hence it is not dealt with in the configuration file. DGLOG_DEST defaults to “x-dglog://localhost:15830”, which is the correct value when the LB local-logger services run on the same host as the RB server, as they normally should; hence it normally does not need to be set. The logging function timeout is instead read from the environment variable DGLOG_TIMEOUT, which defaults to 2 seconds.
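The way these two environment variables and their defaults combine can be illustrated with a short Python sketch. This is not the logging library's own code: the function name is hypothetical, and treating the timeout as an integer number of seconds is an assumption made for illustration.

```python
import os
from urllib.parse import urlsplit

# Defaults quoted in this section.
DEFAULT_DGLOG_DEST = "x-dglog://localhost:15830"
DEFAULT_DGLOG_TIMEOUT = 2  # seconds


def logging_destination(env=None):
    """Return (host, port, timeout) derived from an environment mapping.

    Hypothetical helper: it only mirrors the defaults described in the text.
    """
    env = os.environ if env is None else env
    dest = urlsplit(env.get("DGLOG_DEST", DEFAULT_DGLOG_DEST))
    timeout = int(env.get("DGLOG_TIMEOUT", DEFAULT_DGLOG_TIMEOUT))
    return dest.hostname, dest.port, timeout
```

With neither variable set, the function returns the local host, port 15830 and a 2 second timeout, which corresponds to an LB local-logger running on the same host as the RB server.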

JSS_contact and JSS_server_port refer to the JSS and represent respectively the hostname (which must be the same host as the RB server) and the port number (which must match the RB_client_port parameter in the jss.conf file, see section 4.2.4.2) where the JSS server is listening. Moreover, JSS_client_port represents the port used by the RB to listen for JSS communications. The value of the latter parameter must match the JSS_server_port parameter in the jss.conf file (see section 4.2.4.2). An example for these parameters is reported hereafter:

JSS_contact = "grid004f.cnaf.infn.it";
JSS_client_port = 8881;
JSS_server_port = 9991;

JSS_backlog and UI_backlog define the maximum number of simultaneous connections from JSS and UI supported by the socket. Default values are:

JSS_backlog = 5;
UI_backlog = 5;

UI_server_port is the port used by the RB server to listen for requests coming from the User Interface. Default value for this parameter is:

UI_server_port = 7771;

RB_pool_size represents the maximum number of requests managed simultaneously by the RB server. Default value for this parameter is:

RB_pool_size = 16;

RB_purge_threshold defines the threshold age in seconds for RBRegistry information. The RB purges all the information of a job and frees its storage space (input/output sandboxes) when more than RB_purge_threshold seconds have elapsed since the last update of the internal information database. Default value for this parameter is about one week:

RB_purge_threshold = 600000;
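Since the threshold is expressed in seconds, a small conversion helps when choosing a different value. The following Python sketch is only an illustration (the function names are hypothetical); it shows that the default of 600000 seconds is slightly less than seven days.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def seconds_to_days(seconds):
    """Convert a threshold expressed in seconds to (fractional) days."""
    return seconds / SECONDS_PER_DAY


def days_to_seconds(days):
    """Convert a number of days to a whole number of seconds."""
    return int(days * SECONDS_PER_DAY)


# The default threshold is a little under one week:
# 600000 / 86400 is approximately 6.94 days,
# whereas exactly seven days correspond to 604800 seconds.
default_days = seconds_to_days(600000)
one_week = days_to_seconds(7)
```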

RB_cleanup_threshold represents the span of time (expressed in seconds) between two consecutive cleanups of the job registry. During the registry cleanup the RB removes all the entries of those jobs classified as ABORTED. At the end of the cleanup, the purging of the registry is performed as well, if needed (see RB_purge_threshold). The default value for this configuration parameter is:

RB_cleanup_threshold = 3600;

In any case, the administrator must tailor this value according to the estimated amount of job sandbox files in the given period, so as not to fill up the disk space of the RB machine.

RB_sandbox_path represents the pathname of the root sandbox directory, i.e. the complete pathname of the directory where the RB creates both the input and output sandbox directories and stores the “.Brokerinfo” file. Default value for this parameter is the temporary directory:

RB_sandbox_path = "/tmp";

RB_logfile defines the name of the file used by the RB for recording its various events. The default value for this parameter is:

RB_logfile = "/var/tmp/RBserver.log";

RB_logfile_size. This parameter limits the size of the RB log file to the specified size: each time the file grows beyond this maximum, the RB moves its content to a new file with the same name as the original but having .old as extension. The size is expressed in bytes. Default value for this parameter is:

RB_logfile_size = 5120000;
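The size-based rotation described above can be sketched in a few lines of Python. This is not the RB implementation: the function name is hypothetical, and it is assumed here that ".old" is appended to the full file name (e.g. RBserver.log.old) and that any previous .old file is overwritten.

```python
import os


def rotate_if_too_big(path, max_size):
    """Move 'path' to 'path.old' once it exceeds max_size bytes.

    Returns True if the file was rotated, False otherwise.
    Hypothetical sketch of the behaviour described in the text.
    """
    if not os.path.exists(path) or os.path.getsize(path) <= max_size:
        return False
    # os.replace overwrites an existing destination file.
    os.replace(path, path + ".old")
    return True
```

With the default setting, a call such as rotate_if_too_big("/var/tmp/RBserver.log", 5120000) would rotate the log only once it has grown beyond 5120000 bytes, i.e. 5000 KiB.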

RB_logfile_level. This parameter allows the user to specify the verbosity of the information the RB records in its log file. Possible values are: 0 (none), 1 (verylow), 2 (low), 3 (medium), 4 (high), 5 (veryhigh) and 6 (ugly). The default value for this configuration parameter is:

RB_logfile_level = 3;

RB_submission_retries. This parameter allows the user to specify the number of times the RB has to re-submit the job to JSS in case the submission to the CE fails (e.g. globus down on the CE, network problem etc.). The default value for this configuration parameter is:

RB_submission_retries = 3;

MyProxyServer. This parameter allows the user to specify the server host name of the MyProxy credential repository system to be contacted for periodic credential renewal. An example for this configuration parameter is provided hereafter:

MyProxyServer = "skurut.cesnet.cz"

No semicolon must be put at the end of the last field in the rb.conf file.

4.2.4.2. JSS configuration

Configuration of the Job Submission Service is accomplished by editing the file “<install-path>/etc/jss.conf” to set the parameters it contains appropriately. They are listed hereafter together with their meanings:

Condor_submit_file_prefix defines the prefix for the CondorG submission file (the job identifier dg_jobId is then appended to this prefix to build the actual submission file name). Default value for this parameter is:

Condor_submit_file_prefix = "/var/tmp/CondorG.sub";

Condor_log_file defines the absolute path name of the CondorG log file, i.e. the file where the events for the submitted jobs are recorded. Default value for this parameter is:

Condor_log_file = "/var/tmp/CondorG.log";

Condor_stdoe_dir defines the directory where the standard output and standard error files of CondorG are temporarily saved. Default value is:

Condor_stdoe_dir = "/var/tmp";

Job_wrapper_file_prefix is the prefix for the Job Wrapper file name (i.e. the script wrapping the actual job which is submitted on the CE). As before the job identifier dg_jobId is appended to this prefix to build the actual file name. Default value for this parameter is:

Job_wrapper_file_prefix = "/var/tmp/Job_wrapper.sh";

Database_name is the name of the Postgres database where JSS registers information about submitted jobs. This name must correspond to an existing database (how to create it is briefly described in section 4.2.1.1). Default value for the database name is the one of the database automatically created when installing Postgres, i.e.:

Database_name = "template1";
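Because use of the template database is discouraged (see section 4.2.1.1), it can be useful to check which database a given jss.conf actually selects. The following Python sketch is an illustration only: the function names are hypothetical, the parsing handles only the quoted-string form shown in this section, and "jssdb" in the usage note is an invented example name.

```python
import re

# Matches: Database_name = "some_name"
_DB_NAME = re.compile(r'\bDatabase_name\s*=\s*"([^"]*)"')


def database_name(jss_conf_text, default="template1"):
    """Return the Database_name set in the text, or the documented default."""
    match = _DB_NAME.search(jss_conf_text)
    return match.group(1) if match else default


def uses_template_database(jss_conf_text):
    """True if JSS would end up using the template1 database."""
    return database_name(jss_conf_text) == "template1"
```

For a configuration containing Database_name = "jssdb"; the second function returns False, whereas it returns True both when the attribute is missing and when it is explicitly set to "template1".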

Database_table_name is the name of the table in the previous database. This table is created by the JSS itself if not found. Default value for this parameter is:

Database_table_name = "condor_submit";

JSS_server_port and RB_client_port represent respectively the port used by JSS to listen for RB communication and to communicate to the RB server (e.g. for sending notifications). The two mentioned parameters have to match respectively with the JSS_client_port and JSS_server_port parameters in the rb.conf file (see section 4.2.4.1). Default values are:

JSS_server_port = 8881;
RB_client_port = 9991;
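The cross-file constraint on the port numbers is easy to get wrong, so a consistency check can be sketched as follows. This Python fragment is not part of the WP1 software: the function names are hypothetical and the regular expression only recognises simple "name = integer;" assignments, which is an assumption about the file syntax based on the examples in this guide.

```python
import re

# Matches simple integer assignments such as: JSS_server_port = 8881;
_INT_ATTR = re.compile(r"(\w+)\s*=\s*(\d+)\s*(?:;|$)", re.MULTILINE)

# (name in rb.conf, name in jss.conf) pairs that must carry the same value.
PORT_PAIRS = (
    ("JSS_client_port", "JSS_server_port"),
    ("JSS_server_port", "RB_client_port"),
)


def integer_attributes(conf_text):
    """Return a dict of all integer-valued attributes found in the text."""
    return {name: int(value) for name, value in _INT_ATTR.findall(conf_text)}


def port_mismatches(rb_conf_text, jss_conf_text):
    """Return (rb_name, rb_value, jss_name, jss_value) for each broken pair."""
    rb = integer_attributes(rb_conf_text)
    jss = integer_attributes(jss_conf_text)
    problems = []
    for rb_name, jss_name in PORT_PAIRS:
        rb_value, jss_value = rb.get(rb_name), jss.get(jss_name)
        # A missing attribute on either side is reported as a mismatch too.
        if rb_value is None or rb_value != jss_value:
            problems.append((rb_name, rb_value, jss_name, jss_value))
    return problems
```

Applied to the example values given in sections 4.2.4.1 and 4.2.4.2 (8881 and 9991 on both sides), the check returns an empty list.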

Condor_log_file_size indicates the size in bytes at which the CondorG.log log file has to be split. Default value is:

Condor_log_file_size = 64000;

4.2.5. Environment variables

4.2.5.1. RB

Environment variables that have to be set for the RB are listed hereafter:

PGSQL_INSTALL_PATH    the Postgres database installation path. Default value is /usr/local/pgsql
PGDATA                the path where the Postgres database data files are stored. Default value is /usr/local/pgsql/data
GDMP_INSTALL_PATH     the gdmp installation path. Default value is /opt/edg.

Setting of PGSQL_INSTALL_PATH and PGDATA is only needed if the installation is not performed from rpm. Moreover, $GDMP_INSTALL_PATH/lib has to be added to LD_LIBRARY_PATH. Finally, there are other environment variables needed at run-time by the RB. They are:

EDG_WL_RB_CONFIG_DIR  the RB configuration directory
X509_HOST_CERT        the user certificate file path
X509_HOST_KEY         the user private key file path
X509_USER_PROXY       the user proxy certificate file path
GRIDMAP               location of the Globus grid-mapfile that translates X509 certificate subjects into local Unix usernames. The default is /etc/grid-security/grid-mapfile.

In any case, all the variables in the latter group are set by the broker start-up script.

4.2.5.2. JSS

Environment variables that have to be set for the JSS are listed hereafter:

PGSQL_INSTALL_PATH    the Postgres database installation path. Default value is /usr/local/pgsql
PGDATA                the path where the Postgres database data files are stored. Default value is /usr/local/pgsql/data
PGUSER                the user that has been used to start the postgres services. Default value is postgres
CONDOR_CONFIG         the CondorG configuration file path. Default value is /home/dguser/CondorG/etc/condor_config
CONDORG_INSTALL_PATH  the CondorG installation path. Default value is /home/dguser/CondorG

Setting the above variables is only needed if the installation is not performed from rpms. However, do not forget to check them in the file /opt/edg/etc/wl-jss_rb-env.sh when you install from rpms. Moreover:

$CONDORG_INSTALL_PATH/bin

$CONDORG_INSTALL_PATH/sbin
$PGSQL_INSTALL_PATH/bin (only if installation is not performed from rpm)

must be included in the PATH environment variable, and $CONDORG_INSTALL_PATH/lib and $PGSQL_INSTALL_PATH/lib (only if installation is not performed from rpm) have to be added to LD_LIBRARY_PATH. Finally, there are other environment variables needed at run-time by JSS. They are:

EDG_WL_JSS_CONFIG_DIR the JSS configuration directory
X509_HOST_CERT        the user certificate file path
X509_HOST_KEY         the user private key file path
X509_USER_PROXY       the user proxy certificate file path
GRIDMAP               location of the Globus grid-mapfile that translates X509 certificate subjects into local Unix usernames. The default is /etc/grid-security/grid-mapfile.

In any case, all the variables in the latter group are set by the jobsubmission start-up script.
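The environment requirements listed in this section for the JSS can be verified before starting the service. The following Python sketch is a hypothetical pre-flight check, not part of the WP1 software; it assumes colon-separated search paths and deliberately skips the $PGSQL_INSTALL_PATH entries, which are only required when the installation is not performed from rpm.

```python
# Hypothetical pre-flight check; not part of the WP1 software.
REQUIRED_VARS = (
    "PGSQL_INSTALL_PATH",
    "PGDATA",
    "PGUSER",
    "CONDOR_CONFIG",
    "CONDORG_INSTALL_PATH",
)


def _entries(value):
    """Split a colon-separated search path, dropping trailing slashes."""
    return [entry.rstrip("/") for entry in value.split(":") if entry]


def check_jss_environment(env):
    """Return a list of human-readable problems found in the mapping 'env'."""
    problems = [name + " is not set" for name in REQUIRED_VARS if not env.get(name)]
    path = _entries(env.get("PATH", ""))
    lib_path = _entries(env.get("LD_LIBRARY_PATH", ""))
    condorg = env.get("CONDORG_INSTALL_PATH", "").rstrip("/")
    if condorg:
        for sub in ("bin", "sbin"):
            if condorg + "/" + sub not in path:
                problems.append(condorg + "/" + sub + " is missing from PATH")
        if condorg + "/lib" not in lib_path:
            problems.append(condorg + "/lib is missing from LD_LIBRARY_PATH")
    return problems
```

The function takes any mapping, so it can be applied to a copy of the current process environment (for instance dict(os.environ) after importing os) in a shell where /opt/edg/etc/wl-jss_rb-env.sh has been sourced; an empty list means that no problem was detected.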

4.3. INFORMATION INDEX

The Information Index (II) is the service queried by the Resource Broker to get information about resources for the submitted jobs during the matchmaking process. An II must hence be deployed for each RB/JSS instance.

This section describes the steps to be performed to install and configure the Information Index service.

4.3.1. Required software

For installing the II, apart from the informationindex and the informationindex-profile rpms (see section 4.3.2 for details), the following Globus Toolkit 2.0 and Datagrid rpms are needed:

globus_ssl_utils-gcc32dbg_rtl         version >= 2.1
globus_gram_reporter-noflavor_data    version >= 2.0
globus_gss_assist-gcc32dbg_rtl        version >= 2.0
globus_libtool-gcc32dbgpthr_rtl       version >= 1.4
globus_openssl-gcc32dbg_rtl           version >= 0.9.6b
globus_openldap-gcc32dbg_pgm          version >= 2.0.14
globus_libtool-gcc32dbg_rtl           version >= 1.4
globus_openssl-gcc32dbgpthr_rtl       version >= 0.9.6b
globus_openldap-gcc32dbg_rtl          version >= 2.0.14
globus_mds_back_giis-gcc32dbg_pgm     version >= 0.3
globus_mds_gris-noflavor_data         version >= 2.2
globus_cyrus_sasl-gcc32dbg_rtl        version >= 1.5.27
globus_cyrus_sasl-gcc32dbgpthr_rtl    version >= 1.5.27
globus_gssapi_gsi-gcc32dbg_rtl        version >= 2.0
globus_openldap-gcc32dbgpthr_rtl      version >= 2.0.14
edg-info-main                         version >= 1.0.0

The above listed rpms are available at http://datagrid.in2p3.fr/distribution/globus under the directory beta-xx/RPMS (recommended beta is 21 or higher) and at http://datagrid.in2p3.fr/distribution/datagrid/wp6. All the needed packages can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>

4.3.2. RPM installation

In order to install the Information Index service, the following commands have to be issued with root privileges:

rpm -ivh workload-profile.X.Y.Z-K.i386.rpm
rpm -ivh informationindex.X.Y.Z-K.i386.rpm
rpm -ivh informationindex-profile.X.Y.Z-K.i386.rpm

By default the informationindex rpm installs the software in the “/opt/edg” directory, whilst the informationindex-profile rpm installs the start-up script in “/etc/rc.d/init.d”.

4.3.3. The Installation tree structure

When the informationindex rpms have been installed, the following directory tree is created:

<install-path>/etc
<install-path>/etc/grid-info-site-giis.conf
<install-path>/etc/grid-info-slapd-giis.conf
<install-path>/sbin
<install-path>/sbin/information_index
<install-path>/share
<install-path>/share/doc
<install-path>/share/doc/Workload
<install-path>/share/doc/Workload/InformIndex
<install-path>/share/doc/Workload/InformIndex/COPYING
<install-path>/share/doc/Workload/InformIndex/NEWS
<install-path>/share/doc/Workload/InformIndex/README
<install-path>/var

/etc/rc.d/init.d
/etc/rc.d/init.d/information_index

The configuration files are stored in etc under the installation path, while var (initially empty) is used by the II to store the files created at start-up, which contain the arguments and the pid of the II process. The information_index script can be run either from /etc/rc.d/init.d or from <install-path>/sbin to start the II.

4.3.4. Configuration

The II has two configuration files that are located in <install-path>/etc and are named:

grid-info-slapd-giis.conf

grid-info-site-giis.conf

The schema file locations and the database type are specified in grid-info-slapd-giis.conf, whilst grid-info-site-giis.conf lists the entries for the GRISes that are registered to this II. Each entry has the following format:

dn: service=register, dc=mi, dc=infn, dc=it, o=grid
objectclass: GlobusTop
objectclass: GlobusDaemon
objectclass: GlobusService
objectclass: GlobusServiceMDSResource
Mds-Service-type: ldap
Mds-Service-hn: bbq.mi.infn.it
Mds-Service-port: 2135
Mds-Service-Ldap-sizelimit: 20
Mds-Service-Ldap-ttl: 200
Mds-Service-Ldap-cachettl: 50
Mds-Service-Ldap-timeout: 30
Mds-Service-Ldap-suffix: o=grid

The field Mds-Service-hn specifies the GRIS address and Mds-Service-port specifies the GRIS port (2135 is strongly recommended), whilst the other entries are ldap-related settings such as the sizelimit and the ttl. To add a new GRIS to the given II, it suffices to add a new entry like the one just shown to the grid-info-site-giis.conf file.
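As an illustration of the entry format, the following minimal Python sketch builds the text of one GRIS registration entry so that it can be appended to grid-info-site-giis.conf. The function name format_gris_entry and its parameters are hypothetical (they are not part of the WP1 software); the default values are simply those of the example entry, and the code assumes a recent Python 3 interpreter.

```python
def format_gris_entry(dn, hostname, port=2135, sizelimit=20, ttl=200,
                      cachettl=50, timeout=30, suffix="o=grid"):
    """Return one GRIS registration entry as text (hypothetical helper)."""
    lines = [
        f"dn: {dn}",
        "objectclass: GlobusTop",
        "objectclass: GlobusDaemon",
        "objectclass: GlobusService",
        "objectclass: GlobusServiceMDSResource",
        "Mds-Service-type: ldap",
        f"Mds-Service-hn: {hostname}",      # GRIS address
        f"Mds-Service-port: {port}",        # 2135 is strongly recommended
        f"Mds-Service-Ldap-sizelimit: {sizelimit}",
        f"Mds-Service-Ldap-ttl: {ttl}",
        f"Mds-Service-Ldap-cachettl: {cachettl}",
        f"Mds-Service-Ldap-timeout: {timeout}",
        f"Mds-Service-Ldap-suffix: {suffix}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the example entry for the GRIS running on bbq.mi.infn.it.
    print(format_gris_entry(
        "service=register, dc=mi, dc=infn, dc=it, o=grid", "bbq.mi.infn.it"))
```

The dn is passed explicitly because this guide does not state how it should be derived for a new GRIS.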

Another file that can be used to configure the II is the start-up script information_index, which specifies the port on which the II listens for requests (the default is 2170). This value can be changed to make the II listen on another port, provided that it matches the value of the MDS_port attribute in the RB configuration file rb.conf (see section 4.2.4.1).

4.3.5. Environment Variables

The only environment variable needed by the II to run is the Globus installation path GLOBUS_LOCATION, which is in any case set by the start-up script information_index.

4.4. USER INTERFACE

This section describes the steps needed to install and configure the User Interface, which is the software module of the WMS that allows the user to access the main services made available by the components of the scheduling sub-layer.

4.4.1. Required software

In order to install the UI you will need the following packages (see section 4.4.2 for details):

workload-profile.X.Y.Z-K.i386.rpm
userinterface-profile.X.Y.Z-K.i386.rpm
userinterface-X.Y.Z-K.i386.rpm

In addition, the following Globus Toolkit 2.0 and Datagrid rpms, available respectively at http://datagrid.in2p3.fr/distribution/globus and http://datagrid.in2p3.fr/distribution/datagrid/wp6, are needed:

globus_gss_assist-gcc32dbgpthr_rtl-2.0-21
globus_gssapi_gsi-gcc32dbgpthr_rtl-2.0-21
globus_ssl_utils-gcc32dbgpthr_rtl-2.1-21
globus_gass_transfer-gcc32dbg_rtl-2.0-21
globus_openssl-gcc32dbgpthr_rtl-0.9.6b-21
globus_ftp_control-gcc32dbg_rtl-1.0-21
globus_user_env-noflavor_data-2.1-21
globus_gss_assist-gcc32dbg_rtl-2.0-21
globus_gssapi_gsi-gcc32dbg_rtl-2.0-21
globus_ftp_client-gcc32dbg_rtl-1.1-21
globus_ssl_utils-gcc32dbg_rtl-2.1-21
globus_ssl_utils-gcc32dbg_pgm-2.1-21
globus_gass_copy-gcc32dbg_rtl-2.0-21
globus_gass_copy-gcc32dbg_pgm-2.0-21
globus_openssl-gcc32dbg_rtl-0.9.6b-21
globus_common-gcc32dbg_rtl-2.0-21
globus_profile-edgconfig-0.9-1
globus_io-gcc32dbg_rtl-2.0-21
globus_core-edgconfig-0.6-2
obj-globus-1.0-4.edg
globus_cyrus_sasl-gcc32dbgpthr_rtl-1.5.27-21
globus_libtool-gcc32dbgpthr_rtl-1.4-21

globus_mds_common-gcc32dbg_pgm-2.2-21
globus_openldap-gcc32dbg_pgm-2.0.14-21
globus_openldap-gcc32dbgpthr_rtl-2.0.14-21
globus_core-gcc32dbg_pgm-2.1-21

Moreover, the set of security configuration rpms for all the Certificate Authorities in Testbed1, available at http://datagrid.in2p3.fr/distribution/datagrid/security/RPMS/, has to be installed together with the rpm used for renewing your certificate with your CA, which is available at http://datagrid.in2p3.fr/distribution/datagrid/security/RPMS/local/. The Python interpreter, version 2.1.1, also has to be installed on the submitting machine. The rpm for this package is available at http://datagrid.in2p3.fr/distribution/external/RPMS as:

python-2.1.1-3.i386.rpm

Information about python and the package sources can be found at www.python.org. Since the Linux RH 6.2 and RH 7.2 distributions already ship with Python 1.5 installed, and the recent standard Python2 rpms from RedHat and from python.org avoid the conflict with previous versions by only creating python2* binaries, the UI scripts use the “python2” executable as Python interpreter. Before using the UI commands it is hence important to check that the “python2” executable is available on the submission platform; if it is not, the necessary symbolic link should be created.

All the needed packages can be downloaded with the command

wget -nd -r <URL>/<rpm name>

and installed with

rpm -ivh <rpm name>
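The check for the “python2” executable mentioned above can be sketched as follows. This is only an illustrative helper (find_python2 is a hypothetical name, not a UI command) and it assumes that a recent Python 3 interpreter is available to run the check itself.

```python
import shutil


def find_python2(path=None):
    """Return the full path of the python2 executable, or None if it is missing.

    path is an optional search path (same syntax as $PATH); when it is None
    the PATH of the current environment is searched.
    """
    return shutil.which("python2", path=path)


if __name__ == "__main__":
    location = find_python2()
    if location is None:
        print("python2 not found: create the necessary symbolic link")
    else:
        print("python2 found at", location)
```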

4.4.2. RPM installation

In order to install the User Interface, the following commands have to be issued with root privileges:

rpm -ivh workload-profile.X.Y.Z-K.i386.rpm
rpm -ivh userinterface-profile.X.Y.Z-K.i386.rpm
rpm -ivh userinterface-X.Y.Z-K.i386.rpm

By default the rpms install the software in the “/opt/edg” directory.

4.4.3. The tree structure

After the userinterface* and the workload rpms have been installed, the following directory tree is created:

<install-path>/bin
<install-path>/bin/JobAdv.py
<install-path>/bin/JobAdv.pyc
<install-path>/bin/UIchecks.py
<install-path>/bin/UIchecks.pyc
<install-path>/bin/UIutils.py
<install-path>/bin/UIutils.pyc
<install-path>/bin/dg-job-cancel
<install-path>/bin/dg-job-get-logging-info
<install-path>/bin/dg-job-get-output
<install-path>/bin/dg-job-id-info
<install-path>/bin/dg-job-list-match
<install-path>/bin/dg-job-status
<install-path>/bin/dg-job-submit
<install-path>/bin/libRBapi.py
<install-path>/bin/libRBapi.pyc
<install-path>/etc
<install-path>/etc/UI_ConfigENV.cfg
<install-path>/etc/UI_Errors.cfg
<install-path>/etc/UI_Help.cfg
<install-path>/etc/job_template.tpl
<install-path>/lib
<install-path>/lib/libLBapi.a
<install-path>/lib/libLBapi.la
<install-path>/lib/libLBapi.so
<install-path>/lib/libLBapi.so.0
<install-path>/lib/libLBapi.so.0.0.0
<install-path>/lib/libLOGapi.a
<install-path>/lib/libLOGapi.la
<install-path>/lib/libLOGapi.so
<install-path>/lib/libLOGapi.so.0
<install-path>/lib/libLOGapi.so.0.0.0
<install-path>/lib/libRBapic.a
<install-path>/lib/libRBapic.la
<install-path>/lib/libRBapic.so
<install-path>/lib/libRBapic.so.0
<install-path>/lib/libRBapic.so.0.0.0
<install-path>/share
<install-path>/share/doc
<install-path>/share/doc/Workload
<install-path>/share/doc/Workload/UserInterface
<install-path>/share/doc/Workload/UserInterface/COPYING
<install-path>/share/doc/Workload/UserInterface/NEWS
<install-path>/share/doc/Workload/UserInterface/README

/etc/profile.d/wl-ui-env.sh
/etc/profile.d/wl-ui-env.csh

The bin directory contains all the UI python scripts, including the commands made available to the user. The lib directory contains the shared libraries of the API wrappers, while etc contains the configuration file (UI_ConfigENV.cfg), the error messages file (UI_Errors.cfg), the help file (UI_Help.cfg) and a template of a job description in JDL (job_template.tpl).

4.4.4. Configuration

The User Interface is configured by editing the file “<install-path>/etc/UI_ConfigENV.cfg” and setting the parameters it contains appropriately. They are listed hereafter together with their meanings:

DEFAULT_STORAGE_AREA_IN defines the path of the directory where the files coming from the RB (i.e. the jobs Output Sandbox files) are stored if no location is specified by the user through command options. Default value for this parameter is:

DEFAULT_STORAGE_AREA_IN = /tmp

requirements and rank represent the values that are assigned by the UI to the corresponding job attributes (mandatory attributes) if these have not been provided by the user in the JDL file describing the job. Default values are:

requirements = TRUE
rank = - other.EstimatedTraversalTime

ErrorStorage represents the path of the location where the UI creates log files. Default location is:

ErrorStorage = /tmp

RetryCountLB and RetryCountJobId are the numbers of UI retries on fatal errors, respectively when opening a connection with an LB and when querying the LB for information about a given job. Default values for these parameters are:

RetryCountLB = 1
RetryCountJobId = 1

LoggingTimeout represents the timeout of the dgLogTransfer LB API called by the UI for logging the JobTransfer event. This parameter makes the UI set the environment variable DGLOG_TIMEOUT accordingly. If not provided in the configuration file, it defaults to 2 seconds (a value suited to the UI and the logging services running on the same host). The recommended value for UIs that are non-local to the logging services is 10 to 15 seconds. The value for this variable in the UI configuration file is

LoggingTimeout = 10

Moreover there are two sections reserved for the addresses of the LBs and RBs that are accessible to the UI from the machine where it is installed. Special markers (e.g. %%beginLB%%), which must not be modified, delimit the beginning and the end of each section. An example of the two sections is reported hereafter:

%%beginLB%%
https://grid013g.cnaf.infn.it:7846
https://grid004f.cnaf.infn.it:7846
https://skurut.cesnet.cz:7846
%%endLB%%

%%beginRB%%
grid013g.cnaf.infn.it:7771
grid004f.cnaf.infn.it:7771
%%endRB%%

LB addresses must be in the format:

[<protocol>://]<hostname>:<port>

where, if not provided, the default for <protocol> is “https” and for <port> is 7846.

RB addresses must instead be in the format:

<hostname>:<port>

i.e. no protocol is admitted. If not provided, default for <port> is 7771.
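The address rules above can be illustrated with the following minimal Python sketch, which extracts the two marked sections from the text of a configuration file and fills in the documented defaults. The function names (read_section, normalize_lb, normalize_rb) are hypothetical and do not correspond to the actual UI implementation; the sketch assumes that a missing port is written by simply omitting the “:<port>” part.

```python
LB_DEFAULT_PROTOCOL = "https"
LB_DEFAULT_PORT = 7846
RB_DEFAULT_PORT = 7771


def read_section(text, name):
    """Return the non-empty lines between %%begin<name>%% and %%end<name>%%."""
    begin, end = f"%%begin{name}%%", f"%%end{name}%%"
    inside, lines = False, []
    for raw in text.splitlines():
        line = raw.strip()
        if line == begin:
            inside = True
        elif line == end:
            inside = False
        elif inside and line:
            lines.append(line)
    return lines


def normalize_lb(address):
    """Apply the LB defaults: protocol https, port 7846."""
    protocol, sep, rest = address.partition("://")
    if not sep:                      # no protocol given
        protocol, rest = LB_DEFAULT_PROTOCOL, address
    host, _, port = rest.partition(":")
    return f"{protocol}://{host}:{port or LB_DEFAULT_PORT}"


def normalize_rb(address):
    """Apply the RB default port 7771; a protocol is rejected."""
    if "://" in address:
        raise ValueError("no protocol is admitted in RB addresses")
    host, _, port = address.partition(":")
    return f"{host}:{port or RB_DEFAULT_PORT}"
```

For instance, normalize_lb("skurut.cesnet.cz") yields "https://skurut.cesnet.cz:7846", while normalize_rb("https://grid013g.cnaf.infn.it") raises an error.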

The LB addresses are used by the User Interface to know which LB servers have to be contacted when querying for job information. They are used only when the issued command pertains to “all jobs owned by a user” (e.g. see dg-job-status –all in section 6.1.3). In this case all the listed LBs are queried, whilst when a job identifier (dg_jobId) is specified the LB address is taken directly from the dg_jobId.

The RB addresses are used by the User Interface to know which Resource Brokers can be accessed for job submission. When the user submits a job, the first RB in the list is considered; if it is not available for some reason, a connection to the second one is tried, and so on until an available RB is found. The same happens when asking for the list of matching CEs for a job (see the dg-job-submit and dg-job-list-match commands in section 6.1.3).

The RB addresses are instead used in a way similar to the LB addresses when the user asks for the cancellation of all his or her jobs. In this case all the listed RBs are asked to delete the jobs owned by the requesting user (see dg-job-cancel –all in section 6.1.3).
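The two ways in which the RB list is used (first available RB for submission, all RBs for a global cancellation) can be sketched as follows. This is not the UI code: first_available, all_targets and the is_reachable callback are hypothetical names introduced only for illustration, and how reachability is actually tested is left to the caller.

```python
def first_available(rb_addresses, is_reachable):
    """Return the first RB in list order for which is_reachable(address) is true.

    Used for job submission and list-match: the RBs are tried one after the
    other until an available one is found.
    """
    for address in rb_addresses:
        if is_reachable(address):
            return address
    raise RuntimeError("no Resource Broker in the list could be contacted")


def all_targets(rb_addresses):
    """Return every listed RB: a global cancellation is sent to all of them."""
    return list(rb_addresses)
```

With the example list given earlier and the first RB down, first_available returns "grid004f.cnaf.infn.it:7771".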

4.4.5. Environment variables

The environment variables that have to be set for the User Interface are listed hereafter. The following ones are required by GSI:

X509_USER_KEY the user private key file path. Default value is $HOME/.globus/userkey.pem

X509_USER_CERT the user certificate file path. Default value is $HOME/.globus/usercert.pem

X509_CERT_DIR the trusted certificate directory and ca-signing-policy directory. Default value is /etc/grid-security/certificates

X509_USER_PROXY the user proxy certificate file path. Default value is /tmp/x509up_u<UID> where UID is the user identifier on the machine

Moreover there are:

EDG_WL_UI_INSTALL_PATH UI install path. It has to be set only if installation has been made in a non default location. It defaults to /opt/edg

EDG_WL_UI_CONFIG_PATH Non standard location of the UI configuration file UI_ConfigENV.cfg. This variable points to the file absolute path.

GLOBUS_LOCATION The Globus rpms installation path.
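The defaults listed above can be summarised by the following Python sketch, which resolves each variable from a given environment and falls back to the documented default. The function resolve_ui_environment is a hypothetical helper, not part of the UI; the default location of the configuration file is assumed to be <install-path>/etc/UI_ConfigENV.cfg as stated in section 4.4.4, and GLOBUS_LOCATION is omitted because no default is documented for it.

```python
def resolve_ui_environment(environ, home, uid):
    """Return the effective values of the UI-related variables.

    environ: mapping of environment variables (e.g. os.environ)
    home:    home directory of the user
    uid:     numeric user identifier on the machine
    """
    install = environ.get("EDG_WL_UI_INSTALL_PATH", "/opt/edg")
    return {
        "X509_USER_KEY": environ.get("X509_USER_KEY", f"{home}/.globus/userkey.pem"),
        "X509_USER_CERT": environ.get("X509_USER_CERT", f"{home}/.globus/usercert.pem"),
        "X509_CERT_DIR": environ.get("X509_CERT_DIR", "/etc/grid-security/certificates"),
        "X509_USER_PROXY": environ.get("X509_USER_PROXY", f"/tmp/x509up_u{uid}"),
        "EDG_WL_UI_INSTALL_PATH": install,
        # Assumption: the standard configuration file lives under the install path.
        "EDG_WL_UI_CONFIG_PATH": environ.get(
            "EDG_WL_UI_CONFIG_PATH", f"{install}/etc/UI_ConfigENV.cfg"),
    }


if __name__ == "__main__":
    import os
    for name, value in resolve_ui_environment(
            os.environ, os.path.expanduser("~"), os.getuid()).items():
        print(f"{name}={value}")
```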

The Logging library, i.e. the library that is linked into the UI for logging the job transfer events, reads its immediate logging destination from the variable DGLOG_DEST. The correct format for this variable is:

DGLOG_DEST=x-dglog://HOST:PORT

where HOST defaults to localhost and PORT defaults to 15830. On the submitting machine, if the variable is not set, it is dynamically assigned by the UI with the value:

DGLOG_DEST=x-dglog://<LB_CONTACT>:15830

where LB_CONTACT is the hostname of the machine running the LB server currently associated to the RB used for submitting jobs.
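A minimal Python sketch of the DGLOG_DEST handling described above is given below. The helper names are hypothetical; the sketch assumes that a default is requested by leaving the HOST or PORT part empty, which the guide does not state explicitly.

```python
PREFIX = "x-dglog://"
DEFAULT_HOST = "localhost"
DEFAULT_PORT = 15830


def parse_dglog_dest(value):
    """Split an x-dglog://HOST:PORT value into (host, port), applying defaults."""
    if not value.startswith(PREFIX):
        raise ValueError("DGLOG_DEST must start with " + PREFIX)
    host, _, port = value[len(PREFIX):].partition(":")
    return (host or DEFAULT_HOST, int(port) if port else DEFAULT_PORT)


def dglog_dest_for(lb_contact, environ):
    """Keep an existing DGLOG_DEST, otherwise build it from the LB hostname."""
    return environ.get("DGLOG_DEST") or f"{PREFIX}{lb_contact}:{DEFAULT_PORT}"
```

For example, with an empty environment and the LB running on grid013g.cnaf.infn.it, dglog_dest_for returns "x-dglog://grid013g.cnaf.infn.it:15830".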

4.5. DOCUMENTATION

The userguide documentation package (see section 3.2.2 for more details) provides all the information needed to download, configure, install and use the Datagrid software. Once you have installed the userguide rpm, the following directory tree is created:

<install-path>/share
<install-path>/share/doc
<install-path>/share/doc/DataGrid_01_TEN_0118_0_X_Document.pdf

5. OPERATING THE SYSTEM

For security purposes all the WMS daemons run with proxy certificates. These certificates are generated by the start-up scripts described in the following sections, before the applications are started. The lifetime of the proxies created by the start-up scripts is 24 hours. In order to provide the daemons with valid proxies for their whole lifetime, the administrators need to ensure the regular generation of new proxies. This can be achieved by adding the following lines to the machine /etc/crontab:

57 2,8,14,20 * * * root service locallogger proxy
57 2,8,14,20 * * * root service lbserver proxy
57 2,8,14,20 * * * root service broker proxy
57 2,8,14,20 * * * root service jobsubmission proxy

With these entries cron regenerates the proxies every six hours.
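When the hours in the crontab entries are changed, it is worth verifying that the longest interval between two renewals stays below the 24-hour proxy lifetime. The following Python sketch (max_gap_hours is a hypothetical helper, not part of the WMS) computes that interval from the list of hours in the hour field of the crontab line.

```python
def max_gap_hours(hours):
    """Largest interval, in hours, between consecutive daily runs.

    hours: the hour values of the crontab entry, e.g. [2, 8, 14, 20].
    The interval between the last run of one day and the first run of the
    next day is taken into account.
    """
    ordered = sorted(set(hours))
    following = ordered[1:] + ordered[:1]
    # A difference of 0 only occurs with a single daily run, i.e. a 24 h gap.
    return max(((b - a) % 24) or 24 for a, b in zip(ordered, following))


if __name__ == "__main__":
    PROXY_LIFETIME_HOURS = 24
    gap = max_gap_hours([2, 8, 14, 20])
    print("longest gap between renewals:", gap, "hours")
    print("schedule is safe:", gap < PROXY_LIFETIME_HOURS)
```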

5.1. LB LOCAL-LOGGER

5.1.1. Starting and stopping daemons

To run the LB local-logger services, it suffices to issue as root the following command:

/etc/rc.d/init.d/locallogger start

if the locallogger-profile rpm has been installed. Otherwise you can use

<install path>/sbin/locallogger start

This makes both the dglogd and the interlogger processes start. The same can be done by issuing the following commands:

<install path>/sbin/dglogd <options>
<install path>/sbin/interlogger <options>

Both daemons recognize a common set of options:

--key=<keyfile> host certificate private key file (this option overrides the value of the environment variable X509_USER_KEY). Example of option usage:
--key=/etc/grid-security/hostkey.pem

--cert=<certfile> host certificate file (this option overrides the value of the environment variable X509_USER_CERT). Example of option usage:

--cert=/etc/grid-security/hostcert.pem

--CAdir=<certdir> trusted certificate and ca-signing-policy directory (this option overrides the value of the environment variable X509_CERT_DIR). Example of option usage:
--CAdir=/etc/grid-security/certificates

--file-prefix=<file path> absolute path of the file where the logged events are stored locally. The default value is /tmp/dglog, which entails a risk of data loss in case of reboot. Note that the same value must be specified for dglogd and interlogger.

--debug make the process run in foreground to produce diagnostics

Using the options explicitly is recommended rather than relying on the corresponding environment variables.

The LB local-logger services can be stopped using the locallogger script with the stop option.
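As an illustration of the recommendation to pass the options explicitly, and of the requirement that dglogd and interlogger share the same --file-prefix value, the following Python sketch builds the two command lines from a single set of values. The function local_logger_commands is a hypothetical helper; it only assembles the argument lists and does not start the daemons.

```python
def local_logger_commands(install_path, key, cert, ca_dir, file_prefix):
    """Return the argument lists for dglogd and interlogger.

    The same options, and in particular the same --file-prefix, are used
    for both daemons.
    """
    options = [
        f"--key={key}",
        f"--cert={cert}",
        f"--CAdir={ca_dir}",
        f"--file-prefix={file_prefix}",
    ]
    return [[f"{install_path}/sbin/{daemon}"] + options
            for daemon in ("dglogd", "interlogger")]


if __name__ == "__main__":
    # /var/dglog is only an example of a prefix outside /tmp.
    for command in local_logger_commands(
            "/opt/edg",
            "/etc/grid-security/hostkey.pem",
            "/etc/grid-security/hostcert.pem",
            "/etc/grid-security/certificates",
            "/var/dglog"):
        print(" ".join(command))
```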

5.1.2. Troubleshooting

If the LB local-logger services are started in debug mode (i.e. using the --debug option), the daemons log fatal failures with syslog().

5.2. LB SERVER

5.2.1. Starting and stopping daemons

To run the LB server services, it suffices to issue as root the following command:

/etc/rc.d/init.d/lbserver start

if the lbserver-profile rpm has been installed. Otherwise you can use

<install path>/sbin/lbserver start

This makes both the bkserver and the ileventd processes start. The same can be done by issuing the following commands:

<install path>/sbin/ileventd <options>
<install path>/sbin/bkserver <options>

Both daemons recognize a common set of options:

--key=<keyfile> host certificate private key file (this option overrides the value of the environment variable X509_USER_KEY). Example of option usage:
--key=/etc/grid-security/hostkey.pem

--cert=<certfile> host certificate file (this option overrides the value of the environment variable X509_USER_CERT). Example of option usage:
--cert=/etc/grid-security/hostcert.pem

--CAdir=<certdir> trusted certificate and ca-signing-policy directory (this option overrides the value of the environment variable X509_CERT_DIR). Example of option usage:
--CAdir=/etc/grid-security/certificates

--debug make the process run in foreground to produce diagnostics

Using the options explicitly is recommended rather than relying on the corresponding environment variables.

The LB server services can be stopped using the lbserver script with the stop option.

5.2.2. Purging the LB database

The bkpurge process, whose executable is installed in <install path>/sbin, is not a daemon but a utility which should be run periodically (e.g. using a cron job) in order to remove inactive jobs (i.e. those that entered the Cleared status more than a certain amount of time ago) from the LB database. This utility recognizes the following set of options:

--log data being purged from the database are dumped to stdout

--outfile=<file> data being purged from the database are dumped to the file named <file>

--mysql=<database> name of the database to be purged. It must be the same used by bkserver (this option is not required in the standard set-up)

--timeout=<timeout>[smhd] removes data for all jobs that entered the “Cleared” status more than <timeout> [seconds/minutes/hours/days] ago

--debug print diagnostics on stderr

--nopurge dry run mode. It doesn't really purge (useful for debugging purposes)

--aborted, -a delete from the database also the data of jobs that have entered the “Aborted” status

If --log is specified, the data in ULM format are dumped to stdout (or to <file>). Normally information is appended to the file. The file is locked with flock (_LOCK_EX) to prevent race conditions, e.g. when rotating logs.

An example of usage of this utility is to issue once a day, using a cron job, a bkpurge command like:

bkpurge --log --outfile=/var/log/dglb-data.log --timeout=14d
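To make the --timeout syntax concrete, the following Python sketch converts a value such as 14d into seconds. It is only an illustration of the <timeout>[smhd] notation (timeout_seconds is a hypothetical name, not part of bkpurge) and it assumes that the unit letter is always present.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_seconds(value):
    """Convert a bkpurge-style timeout such as '14d' or '90m' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if match is None:
        raise ValueError("expected <number> followed by s, m, h or d: " + value)
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


if __name__ == "__main__":
    # The example above purges jobs cleared more than 14 days ago.
    print(timeout_seconds("14d"), "seconds")
```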

5.2.3. Troubleshooting

If the LB server services are started in debug mode (that is, using the --debug option), the daemons log fatal failures with syslog().

5.3. RB AND JSS

5.3.1. Starting PostgreSQL

Both RB and JSS use the service offered by the PostgreSQL database. It must be started before either of these daemons, using its own startup script:

/etc/rc.d/init.d/postgresql start

or using RedHat service command:

service postgresql start

Stopping is achieved by the same commands with the stop parameter:

/etc/rc.d/init.d/postgresql stop

or

service postgresql stop

5.3.2. Starting and stopping JSS and RB daemons

The packages *-profile.X.Y.Z.rpm provide the SysV RedHat-like scripts that allow starting these daemons. In particular, the startup of RB or JSS can be achieved by issuing directly:

/etc/rc.d/init.d/broker start
/etc/rc.d/init.d/jobsubmission start

or, indirectly, using RedHat dedicated commands:

service broker start
service jobsubmission start

In the same way stopping is achieved by:

/etc/rc.d/init.d/broker stop
/etc/rc.d/init.d/jobsubmission stop

or

service broker stop
service jobsubmission stop

The startup script for JSS also starts and stops the underlying CondorG service. If any of the configuration steps described in section 4.2 has been followed, these scripts will start the daemons with the correct selected users (see also Table 2 in section 7.6). However, do not forget to put the right files (hostkey.pem and hostcert.pem) in the locations pointed to respectively by the variables X509_HOST_KEY and X509_HOST_CERT (this must be located in the subdirectory hostcert of the home directory of the dguser account).

Startup scripts can also be used to know the current status of the daemons using the status option:

service broker status
service jobsubmission status

Moreover it is strongly recommended to configure the machine in such a way that all these services (PostgreSQL, RB and JSS) are started at system startup. For this purpose, refer to the RedHat chkconfig SysV script manager command.

5.3.3. RB troubleshooting

The RB provides a log file recording its various events. This file can be used to debug abnormal behaviours of the service. The RB log file name and other properties, namely the debug level and the maximum file size in bytes, can be changed by directly modifying the rb.conf configuration file.

5.3.4. JSS troubleshooting

The script responsible for starting JSS also includes the definition of the JSS log files. There are two of them, and their pathnames are set respectively to /var/tmp/JSSserver.log and /var/tmp/JSSparser.log. Modifying these locations requires editing the following two lines of the /etc/rc.d/init.d/jobsubmission script:

SERVERLOG=/var/tmp/JSSserver.log
PARSERLOG=/var/tmp/JSSparser.log

5.4. INFORMATION INDEX

5.4.1. Starting and stopping daemons

To start/stop the II, the following command has to be used as root:

/etc/rc.d/init.d/information_index {start | stop}

6. USER GUIDE

The User Interface is the software module of the WMS that allows the user to access the main services made available by the components of the scheduling sub-layer; it hence represents the entry point to the whole system. Sections 6.1.1 and 6.1.2 provide a general description of the UI, dealing with security management, common behaviours, environment variables to be set etc. Section 6.1.3 instead describes the Job Submission User Interface commands in a Unix man-page style.

6.1. USER INTERFACE

The Job Submission UI is the module of the WMS allowing the user to access the main services made available by the components of the scheduling sub-layer. The user interaction with the system is assured by means of a JDL and a command-driven user interface providing commands to perform a certain set of basic operations. The main operations made possible by the UI are:

- Submit a job for execution on a remote Computing Element, also encompassing:
    automatic resource discovery and selection
    staging of the application sandbox (input sandbox)
- Find the list of resources suitable to run a specific job
- Cancel one or more submitted jobs
- Retrieve the output files of a completed job (output sandbox)
- Retrieve and display bookkeeping information about submitted jobs
- Retrieve and display logging information about submitted jobs.

The User Interface depends on two other Workload Management System components:

- the Resource Broker, which provides support for the job control functionality
- the Logging and Bookkeeping Service, which provides support for the job monitoring functionality.

6.1.1. Security

For the DataGrid to be an effective framework for largely distributed computation, users, user processes and grid services must work in a secure environment.

Due to this, all interactions between WMS components, especially those that are network-separated, will be mutually authenticated: depending on the specific interaction, an entity authenticates itself to the other peer using either its own credential or a delegated user credential or both. For example when the User Interface passes a job to the Resource Broker, the UI authenticates using a delegated user credential (a proxy certificate) whereas the RB uses its own service credential. The same happens when the UI interacts with the Logging and Bookkeeping service. The UI uses a delegated user credential to limit the risk of compromising the original credential in the hands of the user.

The user or service identity and their public key are included in an X.509 certificate signed by a DataGrid trusted Certification Authority (CA), whose purpose is to guarantee the association between that public key and its owner.

Accordingly, to take advantage of the UI commands the user has to possess a valid X.509 certificate on the submitting machine, consisting of two files: the certificate file and the private key file. The location of these two files is assumed to be either the one pointed to respectively by “$X509_USER_CERT” and “$X509_USER_KEY” or, if the X509 environment variables are not set, “$HOME/.globus/usercert.pem” and “$HOME/.globus/userkey.pem”. The user certificate and private key files are needed for the creation of the delegated user credentials and for reading the user certificate subject that acts as an identifier for the submitter.

All UI commands, when started, check for the existence and expiration date of a user proxy certificate in the location pointed to by “$X509_USER_PROXY” or in “/tmp/x509up_u<UID>” (<UID> is the user identifier in the submitting machine OS) if the X509 environment variable is not set. If the proxy certificate does not exist or has expired, a new one with a default duration of 24 hours is automatically created by the UI using the GSI services (grid-proxy-init and grid-proxy-info). The user proxy certificate is created either as “$X509_USER_PROXY” or as “/tmp/x509up_u<UID>”.

Once a job has been submitted by the UI, it passes through several components of the WMS (e.g. the RB, the JSS etc.) before it completes its execution. At each step, operations related to the job could require authentication by a certificate. For example during the scheduling phase, the RB needs to get some information about the user who wants to schedule a job and the certificate of the user could be needed to access this information. Similarly, a valid user’s certificate is needed by JSS to submit a job to the CE. Moreover JSS has to be able to repeat this process, e.g. in case of a crash of the CE on which the job is running; therefore, a valid user’s certificate is needed for the whole job lifetime.

A job gets a valid proxy certificate when it is submitted by the UI to the RB. The validity of such a certificate is usually set to 12 hours, hence problems could occur if the job spends more time on the CE (in a queue or running) than the lifetime of its proxy certificate.

The UI dg-job-submit command (see description later in this document) supplies an option (--hours H) allowing the specification of the duration in hours of the proxy certificate that is created on behalf of the user.
As a result, while the search paths for the certificate files remain as before, the proxy checking mechanism for this command differs slightly from that of the other commands:

- If the “--hours H” option has not been specified, the proxy certificate check is done as explained before.
- If the “--hours H” option has been specified, then a new proxy certificate with a duration of H hours is created both when no existing proxy is found and when the existing proxy lifetime is less than H. In the latter case the existing proxy certificate is destroyed before creating the new one.
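
The proxy check described above can be summarised in a few lines. The following Python sketch is illustrative only and is not part of the UI code: the function names and the representation of the remaining lifetime as a number of hours (None when no proxy file exists) are assumptions made for the example. This guide does not say what happens when the existing lifetime is exactly H hours; the sketch keeps the existing proxy in that case.

```python
def proxy_path(env, uid):
    """Return the location where the UI looks for the user proxy certificate."""
    # $X509_USER_PROXY is used when set; otherwise /tmp/x509up_u<UID>.
    return env.get("X509_USER_PROXY") or "/tmp/x509up_u{}".format(uid)


def must_create_proxy(remaining_hours, requested_hours=None):
    """Decide whether a new proxy has to be created before submitting.

    remaining_hours: lifetime left on the existing proxy, or None if no
                     proxy exists (hypothetical representation).
    requested_hours: the H value given with --hours, or None if the
                     option was not specified.
    """
    if remaining_hours is None or remaining_hours <= 0:
        return True  # missing or expired proxy
    if requested_hours is None:
        return False  # standard check: any valid proxy is accepted
    # With --hours H, an existing proxy shorter than H is destroyed
    # and replaced (equality keeps the proxy: an assumption).
    return remaining_hours < requested_hours
```

For instance, must_create_proxy(5, 24) is true because a proxy with 5 hours left cannot cover a request for 24 hours, whereas must_create_proxy(5) is false because without --hours any valid proxy is accepted.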

This allows the user to submit jobs running longer than the default proxy duration (12 hours).

A more secure way of achieving this is to exploit the features of the MyProxy package. The underlying idea is that the user registers in a MyProxy server a valid long-term proxy certificate that will be used by the JSS to perform a periodic credential renewal for the submitted job; in this way the user is no longer obliged to create proxies with a very long lifetime when submitting long-running jobs. A more detailed description of this mechanism is provided in the following paragraph.

6.1.1.1. MyProxy

The MyProxy credential repository system consists of a server and a set of client tools that can be used to delegate and retrieve credentials to and from a server. Normally, a user would start by using the myproxy-init client program along with the permanent credentials necessary to contact the server and delegate a set of proxy credentials to the server along with authentication information and retrieval restrictions.

The MyProxy Toolkit is available at the following URL:
http://lindir.ics.muni.cz/dg_public/myproxy-0.4.4-edg.tar.gz

To compile the package, follow the usual Unix/Linux configure/make procedure:

./configure --with-gsi=/opt/globus --with-globus-flavor=gcc32dbg \
    --disable-anonymous-auth --prefix=/opt/myproxy

Type ./configure --help for a detailed list of options (such as binaries, server configuration paths, etc.).

Once the configure script has completed successfully, compile the source and install the package by running 'make' and 'make install'.

Before using the MyProxy tools, you have to restrict the users that are allowed to store credentials within the myproxy server and, more importantly, the clients that are allowed to retrieve credentials from the myproxy server. To do that, locate and edit the myproxy-server.config file (if you installed the package in the standard directory the file can be found in /opt/myproxy/etc/). For DataGrid it is sufficient to define the accepted_credentials and authorized_renewers fields. The following is an example excerpt of a myproxy server configuration file:

accepted_credentials "/C=CZ/O=CESNET/*"
accepted_credentials "/O=CESNET/*"
accepted_credentials "/C=IT/O=INFN/*"
authorized_renewers "/C=CZ/O=CESNET/O=Masaryk University/CN=host/*"
authorized_renewers "/O=CESNET/O=Masaryk University/CN=host/*"
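
To get a feeling for which certificate subjects such entries let through, the patterns can be tried out with a few lines of Python. This is only an approximation under a stated assumption: it treats each entry as a shell-style wildcard pattern, while the matching rules actually applied by the myproxy server are defined by the MyProxy package and may differ. The subject names used below are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the accepted_credentials lines of the example above.
ACCEPTED_CREDENTIALS = [
    "/C=CZ/O=CESNET/*",
    "/O=CESNET/*",
    "/C=IT/O=INFN/*",
]


def subject_allowed(subject, patterns):
    """Return True if the certificate subject matches at least one pattern.

    Assumption: '*' matches any run of characters, as in shell wildcards.
    """
    return any(fnmatchcase(subject, pattern) for pattern in patterns)


# Hypothetical subjects, used only to exercise the patterns.
print(subject_allowed("/C=IT/O=INFN/CN=Mario Rossi", ACCEPTED_CREDENTIALS))
print(subject_allowed("/C=FR/O=CNRS/CN=Jean Dupont", ACCEPTED_CREDENTIALS))
```

Under this assumption the first subject is accepted because it begins with /C=IT/O=INFN/, and the second is rejected because none of the three patterns covers it.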

MyProxy Server

myproxy-server is a daemon that runs on a trusted, secure host and manages a database of proxy credentials for use from remote sites. Proxies have a lifetime that is controlled by the myproxy-init program. When a proxy is requested from the myproxy-server via the myproxy-get-delegation command, further delegation ensures that the lifetime of the new proxy is less than that of the original, to enforce greater security.

A configuration file is responsible for maintaining a list of trusted portals and users that can access this service.

To launch the daemon, run the binary '<prefix>/sbin/myproxy-server'. The program will start up and background itself. It accepts connections on TCP port 7512, forking off a separate child to handle each incoming connection. It logs information via the syslog service.

MyProxy Client

The set of binaries provided for the client consists of the following files:

myproxy-init
myproxy-info
myproxy-destroy
myproxy-get-delegation

The myproxy-init command allows you to create and send a delegated proxy to a myproxy server for later retrieval. To launch it, you have to make sure you are able to execute the grid-proxy-init GLOBUS command (i.e. the binary is visible from your $PATH environment and the required certificate files are either stored in the common path or specified with the X509 variables). You can use the command as follows (you will be asked for your PEM passphrase):

myproxy-init -s <host name> -t <hours> -d -n

The myproxy-init command stores a user proxy in the repository specified by <host name> (the -s option). The default lifetime of proxies retrieved from the repository will be set to <hours> (the -t option) and no password authorization is permitted when fetching the proxy from the repository (the -n option). The proxy is stored under a username equal to the subject of your certificate (the -d option).

The myproxy-info command returns the remaining lifetime of the proxy in the repository along with the subject name of the proxy owner (in our case it will be the same as in your proxy certificate). So if you want to get information about the stored proxies you can issue:

myproxy-info -s <host name> -d

where the -s and -d options have already been explained for the myproxy-init command.

The myproxy-destroy command simply destroys any existing proxy stored in the myproxy server. You can use it as follows:

myproxy-destroy -s <host name> -d

where the -s and -d options have already been explained for the myproxy-init command.

The myproxy-get-delegation command is used to retrieve a delegated proxy from the myproxy server. You can use it as follows:

myproxy-get-delegation -s <host name> -d -t <hours> \
    -o <output file> -a <user proxy>

You should end up with a retrieved proxy in <output file>, which is valid for <hours> hours.

It is worth noting that the environment variable MYPROXY_SERVER can be set to tell all these programs the hostname where the myproxy server is running.

6.1.2. Common behaviours

A User Interface installation mainly consists of three directories, bin, lib and etc, that are created under the UI installation path, which is usually pointed to by the EDG_WL_UI_INSTALL_PATH environment variable. If this variable is not set or its value is not correct, the default value is assumed to be “/opt/edg”.

bin contains the command executables, hence it is recommended to add it to the user PATH environment variable so that UI commands can be used from any location. lib contains the shared libraries (wrappers of the RB/LB APIs) implementing the functionality for accessing the RB and LB services, whereas etc is the UI configuration area.

The UI configuration area etc contains the job description template file job_template.jdl, the file containing the mapping between error codes and error messages UI_Errors.cfg, and the actual configuration file UI_ConfigEnv.cfg. The latter file is the only one that may need to be edited and tailored to the user/platform characteristics and needs. It contains the following information, which is read by the commands and influences their behaviour (see section 4.4.4 for details):

- address and port of accessible RBs ordered by priority,
- address and port of accessible LBs ordered by priority,
- default location of the local storage areas for the Input/Output sandbox files,
- default values for the JDL mandatory attributes,
- default number of retrials on fatal errors when connecting to the LB.

When started, UI commands first check whether EDG_WL_UI_INSTALL_PATH is set and then search for the etc directory containing their configuration files in the following locations, in order of precedence: “$EDG_WL_UI_INSTALL_PATH”, “/”, “/usr/local” and “/opt/edg”. If none of the locations contains the needed files an error is returned to the user.

Since several users on the same machine can use a single installation of the UI, people concurrently issuing UI commands share the same configuration files. However, users (or groups of users) with particular needs can “customise” the UI configuration through the --config option supported by each UI command. Every command launched with “--config file_path” reads its configuration settings from the file “file_path” instead of the default configuration file. Hence the user only needs to create such a file according to her/his needs and use the --config option to work under “private” settings.

Moreover, if the user wants to make this change permanent, avoiding the use of the --config option for each issued command, she/he can set the environment variable EDG_WL_UI_CONFIG_PATH to point to the non-standard path of the configuration file. If that variable is set, commands will read their settings from the file “$EDG_WL_UI_CONFIG_PATH”. In any case the --config option takes precedence over all other settings.

The options that are common to all UI commands (with the exception of dg-job-id-info, which is a local utility) are listed hereafter:

- --config file_path
- --noint
- --debug
- --logfile file_path
- --version
- --help
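
The precedence rules for locating the configuration file (the --config option first, then EDG_WL_UI_CONFIG_PATH, then the default search locations) can be pictured with the following Python sketch. It is not the UI implementation: the function name, its parameters and the assumption that the file looked for in each etc directory is UI_ConfigEnv.cfg are illustrative choices made for this example.

```python
import os
import posixpath

CONFIG_NAME = "UI_ConfigEnv.cfg"
# Default search roots listed in this section, after the install path.
FALLBACK_ROOTS = ["/", "/usr/local", "/opt/edg"]


def resolve_config(cli_config, env, exists=os.path.isfile):
    """Return the configuration file a UI command would read.

    cli_config: value given with --config, or None.
    env:        mapping of environment variables.
    exists:     predicate used to test candidate paths (injectable so
                the logic can be exercised without touching the disk).
    """
    if cli_config:
        return cli_config  # --config takes precedence over everything
    if env.get("EDG_WL_UI_CONFIG_PATH"):
        return env["EDG_WL_UI_CONFIG_PATH"]
    roots = []
    if env.get("EDG_WL_UI_INSTALL_PATH"):
        roots.append(env["EDG_WL_UI_INSTALL_PATH"])
    roots.extend(FALLBACK_ROOTS)
    for root in roots:
        candidate = posixpath.join(root, "etc", CONFIG_NAME)
        if exists(candidate):
            return candidate
    raise FileNotFoundError("no UI configuration file found")
```

Passing the exists predicate as a parameter is purely a convenience for trying the lookup order with made-up paths.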

The --noint option skips all interactive questions to the user and goes ahead with the command execution. All warning messages and errors (if any) are written to the file <command_name>_<UID>_<PID>.log in the “/tmp” directory instead of the standard output. It is important to note that when --noint is specified some checks on “dangerous actions” are skipped. For example, if job cancellation is requested with this option, the action will be performed without asking the user for confirmation. The same applies if the command output would overwrite an existing file, so it is recommended to use the --noint option only in a safe context.

The --debug option is mainly intended for testing and debugging purposes; it makes the commands print additional information while running. Every time an external API function call is encountered during the command execution, the values of the parameters passed to the API are printed to the user. The info messages are displayed on the standard output and are also written, together with possible errors, to the <command_name>_<UID>_<PID>.log file in the /tmp directory. An example of the debug message format is as follows:

#### Debug API #### - The function 'dgLBJobStatus' has been called with the following parameter(s):
>>Struct 'dgLBContext':
-> 0
-> 0
>>Struct 'dgJobId':
-> lx01.hep.ph.ic.ac.uk/124445102160554
-> grid004f.cnaf.infn.it
-> 7846
-> grid013g.cnaf.infn.it:7771
>> 0

If the --noint option is specified together with the --debug option, the debug messages will not be printed on the standard output.

The --logfile <file_path> option allows relocating the command log files to the location pointed to by file_path.

The --version and --help options make the commands display the UI current version and the command usage respectively.

Two further options that are common to almost all commands are --input and --output. The latter makes the commands redirect their outcome to the file specified as option argument, whilst the former reads a list of input items from the file given as option argument. The only exception is the dg-job-list-match command, which does not have the --input option.

For all commands, the file given as argument to the --input option shall contain a list of job identifiers in the following format: one dg_jobId per line, with comments beginning with a “#” or a “*” character. If the input file contains only one dg_jobId (see the description of the dg-job-submit command later in this document for details about the dg_jobId format), then the request is directly submitted taking the dg_jobId as input; otherwise a menu is displayed to the user listing all the contained items, i.e. something like:

------------------------------------------------------------------------
1 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133711137156527?grid013g.cnaf.infn.it:7781
2 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133747137833158?grid013g.cnaf.infn.it:7781
3 : https://grid004f.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/133957138124219?grid004f.cnaf.infn.it:7771
4 : https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/134030138239274?grid013g.cnaf.infn.it:7771
5 : https://grid001f.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/140706140477638?grid013g.cnaf.infn.it:7771
a : all
q : quit
------------------------------------------------------------------------
Choose one or more dg_jobId(s) in the list - [1-10]all:

The user can choose one or more jobs from the list by entering the corresponding numbers. E.g.:

- 2 makes the command take the second listed dg_jobId as input
- 1,4 makes the command take the first and the fourth listed dg_jobIds as input
- 2-5 makes the command take listed dg_jobIds from 2 to 5 (ends included) as input
- all makes the command take all listed dg_jobIds as input
- q makes the command quit

The default value for the choice is all. If the --input option is used together with --noint, then all dg_jobIds contained in the input file are taken into account by the command.

The only command whose --input behaviour differs from the one just described is dg-job-submit. First of all, in this case the input file contains CEIds instead of dg_jobIds; moreover, only one CE at a time can be the target of a submission, hence the user is allowed to choose one and only one CEId. The default value for the choice is “1”, i.e. the first CEId in the list. This is also the choice automatically made by the command when the --input option is used together with --noint.
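
The handling of the --input file and of the menu answer for dg_jobIds can be illustrated with a short Python sketch. It is an illustration of the rules stated above and not the code of the UI commands: the function names are hypothetical, and accepting a mix of single numbers and ranges in one answer (such as 1,3-4) is an assumption, since the guide only shows each form separately.

```python
def read_items(lines):
    """Return the identifiers listed in an --input file.

    One identifier per line; empty lines and lines starting with
    '#' or '*' (comments) are skipped.
    """
    items = []
    for line in lines:
        text = line.strip()
        if not text or text[0] in "#*":
            continue
        items.append(text)
    return items


def parse_choice(answer, count):
    """Translate a menu answer into a sorted list of 1-based positions.

    Returns None when the user asks to quit. An empty answer selects
    the default, which is 'all'.
    """
    answer = answer.strip().lower()
    if answer in ("q", "quit"):
        return None
    if answer in ("", "a", "all"):
        return list(range(1, count + 1))
    chosen = set()
    for part in answer.split(","):
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
        else:
            low = high = int(part)
        if not 1 <= low <= high <= count:
            raise ValueError("selection out of range: " + part)
        chosen.update(range(low, high + 1))
    return sorted(chosen)
```

With a list of five identifiers, the answers 2, 1,4 and 2-5 select positions [2], [1, 4] and [2, 3, 4, 5] respectively, mirroring the examples given above.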

6.1.3. Commands description

In this section we describe the syntax and behaviour of the commands made available by the UI to allow job submission, monitoring and control.

In the command synopses the mandatory arguments are shown between angle brackets (<arg>), whilst the optional ones are shown between square brackets ([arg]).

– dg-job-submit

Allows the user to submit a job for execution on remote resources in a grid.

SYNOPSIS

dg-job-submit [options] <jdl file>

Options:
    --help
    --version
    --template
    --input, -i <input_file>
    --resource, -r <Ce_id>
    --notify, -n <e-mail_address(es)>
    --hours, -h <hours_number>
    --nomsg
    --config, -c <config_file>
    --output, -o <output_file>
    --noint
    --debug
    --logfile <log_file>

DESCRIPTION

dg-job-submit is the command for submitting jobs to the DataGrid and hence allows the user to run a job at one or several remote resources. dg-job-submit requires as input a job description file in which job characteristics and requirements are expressed by means of Condor class-ad-like expressions. The order of the other arguments does not matter, but the job description file has to be the last argument of this command.

The job description file given as input to this command is syntactically checked and default values are assigned to some of the mandatory attributes that have not been provided, in order to create a meaningful class-ad. The resulting job-ad is sent to the Resource Broker, which finds the resource best matching the job (match-making) and submits the job to it. The match-making algorithm is described in detail in Annex 7.5.
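
When the command is driven from a script, the rule that the job description file must come last is easy to enforce by building the argument list programmatically. The helper below is a hypothetical Python sketch, not part of the UI distribution: it only assembles an argument list from options shown in the synopsis and does not run anything.

```python
def build_submit_argv(jdl_file, resource=None, hours=None, notify=(),
                      output=None, noint=False):
    """Build the argument list for dg-job-submit.

    Options may appear in any order, but the job description file is
    always appended last, as required by the command.
    """
    argv = ["dg-job-submit"]
    if resource:
        argv += ["--resource", resource]
    if hours is not None:
        argv += ["--hours", str(hours)]
    if notify:
        # Several addresses are joined with commas and no blanks.
        argv += ["--notify", ",".join(notify)]
    if output:
        argv += ["--output", output]
    if noint:
        argv.append("--noint")
    argv.append(jdl_file)
    return argv


# Example: 24-hour proxy, no interactive questions.
print(" ".join(build_submit_argv("jobad.jdl", hours=24, noint=True)))
```

The resulting list can be handed to whatever mechanism the script uses to launch external programs; the e-mail addresses and file names passed to the helper are the caller's own.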

Upon successful completion this command returns to the user the submitted job identifier dg_jobId (a string that unambiguously identifies the job in the whole DataGrid), generated by the User Interface, which can later be used as a handle to perform monitoring and control operations on the job (e.g. see dg-job-status described later in this document). The format of the dg_jobId is as follows:

<LBname>/<UIaddress>/<time><PID><RND>?<RBname>

where:

- LBname is the LB server name and port
- UIaddress is the UI machine IP address (or FQDN)
- time is the current UTC time on the submitting machine in hhmmss format
- PID is the command process identifier
- RND is a random number generated at each job submission
- RBname is the RB server hostname and port
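
As an illustration of this format, the following Python sketch splits a dg_jobId into its parts. It is an example written for this guide, not a UI library function, and its names are hypothetical. Since the format does not delimit PID and RND, the sketch keeps them together as one trailing string instead of guessing where one ends and the other begins.

```python
from collections import namedtuple

JobId = namedtuple("JobId", "lb_server ui_address time pid_rnd rb_server")


def parse_job_id(job_id):
    """Split a dg_jobId of the form
    <LBname>/<UIaddress>/<time><PID><RND>?<RBname>
    """
    head, sep, rb_server = job_id.partition("?")
    if not sep or not rb_server:
        raise ValueError("missing '?<RBname>' part")
    # Split from the right so that the '//' in the LB address survives.
    lb_server, ui_address, unique = head.rsplit("/", 2)
    if len(unique) < 6 or not unique.isdigit():
        raise ValueError("unexpected <time><PID><RND> field")
    hhmmss = unique[:6]
    time = "{}:{}:{}".format(hhmmss[:2], hhmmss[2:4], hhmmss[4:6])
    return JobId(lb_server, ui_address, time, unique[6:], rb_server)


job = parse_job_id(
    "https://grid013g.cnaf.infn.it:7846/lx01.hep.ph.ic.ac.uk/"
    "133711137156527?grid013g.cnaf.infn.it:7781"
)
print(job.lb_server, job.ui_address, job.time, job.rb_server)
```

The identifier used in the example is one of those shown in the menu of section 6.1.2; its time field 133711 corresponds to 13:37:11 UTC.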

The structure of the dg_jobId, which may appear somewhat complex and not easily readable, has been conceived to ensure uniqueness and at the same time to contain the information needed by the components of the WMS to fulfil user requests.

The --resource option can be used to target the job submission to a specific known resource identified by the provided Computing Element identifier ce_id (returned by dg-job-list-match, described later in this document). A resource will be either a queue of an underlying LRMS, assuming that this queue represents a set of “homogeneous” resources, or a “single” node. The CE identifier is a string, assigned by WP4 and published in the GIS (the CEId field), that uniquely identifies a resource belonging to the Grid. The CEId is obtained by “combining” the GlobusResourceContactString and QueueName attributes, e.g. if lxde01.pd.infn.it:2119 is the Globus resource contact string and grid01 is the queue name, then the CEId looks like lxde01.pd.infn.it:2119/jobmanager-lsf-grid01. In other words the admitted format for CEId is:

<full-hostname>:<port-number>/jobmanager-<service>-<queue-name>

where <service> can be lsf, pbs or bqs.

When the --resource option is specified, the Resource Broker skips the match-making process and directly submits the job to the requested CE.

It is also possible to specify the target CE to which the job is submitted using the --input option. With the --input option an input_file must be supplied containing a list of target CE ids. In this case the dg-job-submit command parses the input_file and displays on the standard output the list of CE Ids written in the input_file. The user is then asked to choose one CEId among the listed ones. The command then behaves exactly as already explained for the --resource option. The basic idea is to use as input_file the output file generated by the dg-job-list-match command when used with the --output option (see dg-job-list-match), which contains the list of CE Ids (if any) matching the requirements specified in the jobad.jdl file. An example of a possible sequence of commands is:

>$ dg-job-list-match jobad.jdl --output CEList.out
>$ dg-job-submit --input CEList.out jobad.jdl

If CEList.out contains more than one CEId then the user is prompted to choose one Id from the list.

When dg-job-submit is used with the --notify option, the following schema is used to notify the user about job status changes:

- an e-mail notification is sent to the specified e_mail_address when the match-making process has finished and the job is ready to be submitted to JSS (READY status)

- an e-mail notification is sent to the specified e_mail_address when the job starts running on the CE (RUNNING status)

- an e-mail notification is sent to the specified e_mail_address when the job has finished (ABORTED or DONE status).

The notification message will contain basic information about the job such as the job identifier, the Id of the assigned CE and a brief description of its status. Notification to multiple contacts can be requested by specifying the corresponding e-mail addresses separated by commas and without blanks.

It is possible to redirect the returned dg_jobId to an output file using the --output option. If the file already exists, a check is performed: if the file was previously created by the dg-job-submit command (i.e. it contains a well-defined header), the returned dg_jobId is appended to the existing file every time the command is launched. If the file was not created by the dg-job-submit command, the user is prompted to choose whether to overwrite the file or not. If the answer is no, the command aborts.

The dg-job-submit command has a particular behaviour when the job description file contains the InputSandbox attribute, whose value is a list of file paths on the UI machine local disk. The purpose of the InputSandbox attribute is to stage, from the UI to the CE, files that are not available in any SE and are not published in any Replica Catalogue.

As an example, suppose a job needs for its execution a certain set of small files available on the submitting machine. Suppose also that for performance reasons it is preferable not to go through the WP2 data transfer services for staging these files on the executing node. The user can then use the InputSandbox attribute to specify the files that have to be staged from the submitting machine to the executing CE. All of them are transferred at job submission time, together with the job class-ad, to the RB, which stores them temporarily on its local disk. The JSS then performs the staging of these files on the executing node.

The size of the files to be transferred to the RB should be small, since once the RB local storage is full no more jobs of this type can be submitted.

This mechanism can also be used to stage a job executable available locally on the UI machine to the executing CE. In this case the user has to include this file in the InputSandbox list (specifying its absolute path in the file system of the UI machine) and only has to specify the file name as the Executable attribute value. On the contrary, if the executable is already available in the file system of the executing machine, the user has to specify as Executable an absolute path name for this file (if necessary using environment variables). The same applies to the standard input file, which is specified through the StdInput JDL attribute.

For the standard output and error of the job the user shall instead always specify just file names (without any directory path) through the StdOutput and StdError JDL attributes. To have them staged back to the UI machine it suffices to list them in the OutputSandbox and, after job completion, use the dg-job-get-output command described later in this document.

The list of data specification JDL attributes is completed by the InputData attribute, which refers to data used as input by the job that are not subject to staging and are stored in one or more storage elements and published in replica catalogues. For this reason, when the user specifies the InputData attribute she/he also has to provide the name of the replica catalogue (ReplicaCatalog attribute) where these data are published and the protocol her/his application is able to “speak” for accessing data (DataAccessProtocol attribute). The InputData attribute should normally contain a list of logical and/or physical file names. If InputData only contains PFNs then the ReplicaCatalog attribute specification is no longer mandatory.

Since the InputSandbox expression can consist of a great number of file names, the use of wildcards and environment variables is allowed to specify the value of this attribute. Syntax and allowed wildcards are described in Annex 7.4.

The --hours option allows the user to specify the user proxy duration H, in hours, needed for submitting the job. This option has to be used for long-lasting jobs: a submitted job needs to be accompanied by a valid proxy certificate during its whole lifetime, and the default duration of the user proxy created by UI commands is 12 hours, which in some cases may not be enough.

Recall, however, that a safer way of submitting long-running jobs is to use the myproxy-init command (see section 6.1.1.1) before dg-job-submit. The myproxy-init command registers in a MyProxy server a valid long-term proxy certificate that will be used by the JSS to perform a periodic credential renewal for the submitted job. When using the myproxy-init command, the hostname of the MyProxy server where the proxy certificate is to be stored has to be specified. If the server host name used is different from the default one used for the credential renewal, reported in the RB configuration file (rb.conf), it has to be specified within the JDL job description through the MyProxyServer attribute. An example is provided hereafter:

MyProxyServer = "skurut.cesnet.cz";

Note that the port number must not be provided.

Lastly, the --nomsg option makes the command display neither messages nor errors on the standard output. Only the dg_jobId assigned to the job is printed to the user if the command was successful. Otherwise the location of the generated log file containing error messages is printed on the standard output. This option has been provided to make it easier to use the dg-job-submit command inside scripts, as an alternative to the --output option.

JOB DESCRIPTION FILE

A job description file contains a description of job characteristics and constraints in a class-ad style. Details on the class-ad language are reported in the document [A1], also available at the following URL:
http://www.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-0_2.pdf

The job description file must be edited by the user to insert relevant information about the job that is later needed by the RB to perform the match-making. A template of the job description file, containing a basic set of attributes, can be obtained by calling the dg-job-submit command with the --template option.

Job description file entries are strings having the format attribute = expression and are terminated by the semicolon character. If an entry spans more than one line, the end of line has to be indicated with a backslash (\) character. Comments must be preceded by a sharp character (#) at the beginning of each line.

Since class-ad is an extensible language, there is no fixed set of admitted attributes, i.e. the user can insert in the job description file whatever attribute she/he believes meaningful to describe her/his jobs. However, only the attributes that can be in some way connected with the resource attributes published in the GIS are taken into account by the Resource Broker for the match-making process. Unrelated attributes are simply ignored except when they are used to build the Requirements expression. In the latter case they are evaluated and could affect the match-making result. The attributes taken into account by the RB, together with their meaning, are reported in document [A7].

There is a small subset of class-ad attributes that are compulsory, i.e. that have to be present in a job class-ad before it is sent to the Resource Broker in order to make the match-making process possible. They can be grouped in two categories: some of them must be provided by the user, whilst others, if not provided, are filled in by the UI with configurable default values. Table 1 summarises this.

Attribute           Mandatory                                    Mandatory with default value
                                                                 (default value)
------------------  -------------------------------------------  -----------------------------------
Executable          yes
Requirements                                                     yes (TRUE)
Rank                                                             yes (-other.EstimatedTraversalTime)
InputData           yes (only if the ReplicaCatalog and/or the
                    DataAccessProtocol attributes have been
                    specified)
ReplicaCatalog      yes (only if the InputData attribute has
                    been specified)
DataAccessProtocol  yes (only if the InputData attribute has
                    been specified)

Table 1 Mandatory Attributes

In Table 1 the default values for Requirements and Rank can be interpreted respectively as follows:

- If the user has not provided job constraints then Requirements is set to TRUE, i.e. the characteristics of the computing element where the job has to be executed do not matter, and the RB will take into account all sites where the user is authorised to run her/his application.

- Since in the JDL the greater the value of Rank the better the match is considered, if no expression for Rank has been provided, then the resources where the job waits the shortest time to pass from the SCHEDULED to the RUNNING status are preferred.

The default values for the Requirements and Rank attributes can be set in the UI_ConfigEnv.cfg file. If the job description file contains attributes that are unknown to the RB/JSS, the UI will print a warning (when used with the --debug option) listing all the unknown attributes. These attributes are passed through anyway but will be ignored by the RB.
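
The rules of Table 1 can be restated as a small check-and-complete step. The Python sketch below is an interpretation of the table for illustration purposes and is not the UI implementation: the job-ad is represented as a plain dictionary of attribute names and expression strings, and the function name and the only_pfns flag (standing for the case where InputData contains only PFNs, so that ReplicaCatalog is not required) are hypothetical.

```python
# Default values quoted in Table 1; in the UI they are configurable.
DEFAULTS = {
    "Requirements": "TRUE",
    "Rank": "-other.EstimatedTraversalTime",
}


def complete_job_ad(job_ad, defaults=DEFAULTS, only_pfns=False):
    """Return a copy of job_ad with defaults filled in, or raise
    ValueError if a mandatory attribute without default is missing."""
    ad = dict(job_ad)
    if "Executable" not in ad:
        raise ValueError("Executable is mandatory and has no default")
    for name, value in defaults.items():
        ad.setdefault(name, value)
    has_data = "InputData" in ad
    has_catalog = "ReplicaCatalog" in ad
    has_protocol = "DataAccessProtocol" in ad
    if (has_catalog or has_protocol) and not has_data:
        raise ValueError("InputData is required when ReplicaCatalog "
                         "or DataAccessProtocol is specified")
    if has_data and not has_protocol:
        raise ValueError("DataAccessProtocol is required with InputData")
    if has_data and not has_catalog and not only_pfns:
        raise ValueError("ReplicaCatalog is required with InputData")
    return ad


ad = complete_job_ad({"Executable": '"/bin/hostname"'})
print(ad["Requirements"], ad["Rank"])
```

In the example only Executable is supplied, so Requirements and Rank receive the default values shown in Table 1.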

OPTIONS

--help

displays command usage.

--version

displays UI version.

--resource ce_id
-r ce_id

if the command is launched with this option, the job-ad sent to the RB contains a line of the type SubmitTo = ce_id and the job is submitted by the Resource Broker to the resource identified by ce_id without going through the match-making process. The accepted format for the CEId is:
<full hostname>:<port number>/jobmanager-<service>-<queue name>
where the valid values for the <service> field are currently lsf, pbs and bqs.

--input input_file
-i input_file
if this option is specified, the user is asked to choose a CEId from a list of CEs contained in input_file. Once a CEId has been selected, the command behaves as explained for the --resource option. If this option is used together with --noint and the input file contains more than one CEId, the first CEId in the list is used for submitting the job.

--notify e_mail_address
-n e_mail_address
when a job is submitted with this option, an e-mail message containing basic information about the job identification and status is sent to the specified e_mail_address when the job enters one of the following statuses:
- READY
- RUNNING
- ABORTED or DONE
Notification to multiple contacts can be requested by specifying the corresponding e-mail addresses separated by commas and without blanks.

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--output out_file
-o out_file
writes the generated dg_jobId assigned to the submitted job in the file specified by out_file. out_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file out_file is created in the current working directory.

--hours H
-h H
allows the user to specify the user proxy duration H, in hours, needed for submitting the job. When used with this option, the dg-job-submit command behaves as follows:

- the command checks for the existence of a user proxy; if the proxy does not exist, a new proxy with a duration of H hours is created

- if the proxy exists, its duration is checked against the value specified with the --hours option. If the proxy duration is greater than H hours, the job is submitted with the existing proxy; otherwise the old proxy is destroyed and a new one with a duration of H hours is created and used for submitting the job.

This mechanism allows the user to create, before submission, a proxy with a suitable duration for her/his job. Moreover, the user does not have to enter the PEM pass-phrase at each submission, i.e. whenever the existing proxy remains valid long enough for the job.
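The proxy handling described for --hours amounts to a three-way decision. The sketch below only illustrates that decision in Python; the function name proxy_action and its return values are hypothetical, and it does not create or destroy any proxy.

```python
from typing import Optional


def proxy_action(existing_hours: Optional[float], requested_hours: float) -> str:
    """Decide what to do with the user proxy before submission.

    existing_hours is the remaining duration of the current proxy,
    or None when no proxy exists (hypothetical representation).
    """
    if existing_hours is None:
        # No proxy: a new one with a duration of H hours is created.
        return "create"
    if existing_hours > requested_hours:
        # The existing proxy lasts longer than H hours: it is reused.
        return "reuse"
    # Otherwise the old proxy is destroyed and a new one is created.
    return "renew"
```

Note that, following the wording above, a proxy whose duration is exactly H hours is renewed, since it is not greater than H.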

--nomsg
this option makes the command print on the standard output only the dg_jobId generated for the job if the submission was successful; otherwise the location of the log file containing messages and diagnostics is printed.

--noint
if this option is specified, every interactive question to the user is skipped and only the dg_jobId is returned on the standard output. All warning and error messages (if any) are written to the file dg-job-submit_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--debug
when this option is specified, information about the parameters used for the API function calls inside the command is displayed on the standard output and is also written to the file dg-job-submit_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

job_description_file
this is the file containing the class-ad describing the job to be submitted. It must be the last argument of the command.

EXIT STATUS
dg-job-submit exits with a status value of 0 (zero) upon success, and 1 (one) upon failure.


EXAMPLES

1. $> dg-job-submit myjob1.jdl

where myjob1.jdl is as follows:

##############################################
#
# -------- Job description file ----------
#
##############################################
Executable = "$(CMS)/fpacini/exe/sum.exe";
InputData = "LF:testbed0-00019";
ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
DataAccessProtocol = "gridftp";
Rank = other.MaxCpuTime;
Requirements = other.LRMSType == "Condor" && \
other.Architecture == "INTEL" && other.OpSys== "LINUX" && \
other.FreeCpus >= 4;

submits sum.exe to a resource (supposed to contain the executable file) whose LRMS is Condor. The command returns the following output to the user, containing the job handle (dg_jobid):

================= dg-job-submit Success ===================================
The job has been successfully submitted to the Resource Broker.
Your job is identified by (dg_jobId):
https://grid004f.cnaf.infn.it:7846/155.198.211.205/161251122764136?grid004f.cnaf.infn.it:7771
Use dg-job-status command to display current job status.
======================================================================

2. $> dg-job-submit --notify [email protected] myjob2.jdl

submits the job described by myjob2.jdl, returns the same output as above to the user and sends an e-mail notification to [email protected] at well-defined job status changes.
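Job description files such as the one in example 1 can also be generated programmatically. The following Python sketch is illustrative only: render_jdl and Expr are hypothetical names, string values are assumed to contain no double quotes, and the brace syntax for lists follows the InputSandbox usage shown elsewhere in this guide.

```python
class Expr(str):
    """Marks a value that must be written unquoted (a class-ad expression)."""


def _render_value(value):
    if isinstance(value, Expr):
        # Expressions such as other.MaxCpuTime are emitted as-is.
        return str(value)
    if isinstance(value, str):
        # Plain strings are enclosed in double quotes.
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        # Lists are written as {"a","b"}.
        return "{" + ",".join(_render_value(v) for v in value) + "}"
    raise TypeError(f"unsupported JDL value: {value!r}")


def render_jdl(attributes):
    """Render a mapping of attribute names to values, one 'Name = value;' per line."""
    return "\n".join(
        f"{name} = {_render_value(value)};" for name, value in attributes.items()
    )


print(render_jdl({
    "Executable": "$(CMS)/fpacini/exe/sum.exe",
    "InputData": "LF:testbed0-00019",
    "DataAccessProtocol": "gridftp",
    "Rank": Expr("other.MaxCpuTime"),
}))
```

The resulting text can be saved to a file and passed as the last argument of dg-job-submit.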

SEE ALSO
[A1], [A2], dg-job-list-match.


– dg-job-get-output
This command requests from the RB the job output files (specified by the OutputSandbox attribute of the job-ad) and stores them on the local disk of the submitting machine.

SYNOPSIS
dg-job-get-output [options] <job Id(s)>
Options:
--help
--version
--input, -i <input_file>
--dir <directory_path>
--config, -c <config_file>
--noint
--debug
--logfile <log_file>

DESCRIPTION
The dg-job-get-output command can be used to retrieve the output files of a job that has been submitted through the dg-job-submit command with a job description file including the OutputSandbox attribute. After the submission, when the job has terminated its execution, the user can download the files generated by the job and temporarily stored on the RB machine, as specified by the OutputSandbox attribute, by issuing dg-job-get-output with the dg_jobId returned by dg-job-submit as input.

It is also possible to specify a list of job identifiers when calling this command, or an input file containing dg_jobIds by means of the --input option. When --input is used, the user is asked to choose all, one or a subset of the job identifiers contained in the input file.

It is important to note that the OutputSandbox of a submitted job can only be retrieved when the job has reached the OutputReady status (see Annex 7.2), which indicates that the job is done and the OutputSandbox files are ready for retrieval on the RB machine. dg-job-get-output always fails for jobs that are not yet in the OutputReady status.

The user can choose the local directory path on the UI machine where these files are stored by means of the --dir option; otherwise the retrieved files are put in a default location specified in the UI_ConfigENV.cfg configuration file (DEFAULT_STORAGE_AREA_IN parameter). In both cases a sub-directory is added to the supplied path. The name of this sub-directory is the “<time><PID><RND>” unique number of the dg_jobId identifier (see command dg-job-submit for details on the dg_jobId structure).

If the user wants to use a “private” configuration file, this can be done with the option --config path_name. The dg-job-get-output command then looks for the file “path_name” instead of the standard configuration file. If this file does not exist, the user is notified with an error message and the command is aborted.


OPTIONS

--help
displays command usage.

--version
displays UI version.

--dir directory_path
retrieved files (previously listed by the user through the OutputSandbox attribute of the job description file) are stored in the location indicated by directory_path/<dg_jobId unique string>.

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-get-output_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--debug
when this option is specified, information about the parameters used for the API function calls inside the command is displayed on the standard output and is also written to the file dg-get_job_output_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

dg_jobId
job identifier returned by dg-job-submit. If a list of one or more job identifiers is specified, the dg_jobIds have to be separated by a blank. Job identifiers must be the last argument of the command.

--input input_file
-i input_file
this option makes the command return the OutputSandbox files for each dg_jobId contained in input_file. This option cannot be used if one or more dg_jobIds have already been specified. The format of the input file must be as follows: one dg_jobId per line; comment lines must begin with a “#” or a “*” character.
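The input file format used by --input (one dg_jobId per line, comment lines starting with “#” or “*”) is simple to produce and to read back. The following Python sketch shows one possible reader; read_job_ids is a hypothetical name, and skipping blank lines is an assumption not stated in this guide.

```python
def read_job_ids(text):
    """Return the dg_jobIds listed in the content of an input file."""
    job_ids = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines are ignored.
        if not line:
            continue
        # Comment lines begin with "#" or "*".
        if line.startswith(("#", "*")):
            continue
        job_ids.append(line)
    return job_ids


sample = """# jobs submitted on Monday
https://grid004f.cnaf.infn.it:7846/155.198.211.205/161251122764136?grid004f.cnaf.infn.it:7771
* another comment
https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577
"""
print(read_job_ids(sample))
```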

EXIT STATUS
dg-job-get-output exits with a status value of 0 (zero) upon success, >0 upon failure and <0 upon partial failure. An example of partial failure is when more than one job identifier has been specified and the OutputSandbox could be retrieved only for some of them.


EXAMPLES
Let us consider the following command:

$> dg-job-get-output --dir /home/data https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577

It retrieves from the RB the files listed in the OutputSandbox attribute of the job identified by https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577 and stores them locally in /home/data/12354732109721.
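The location of the retrieved files can be derived from the dg_jobId, as the example above shows. The following Python sketch reproduces that mapping under the assumption that the unique string is the last path component of the identifier; job_unique_string and output_directory are hypothetical helper names.

```python
import posixpath
from urllib.parse import urlsplit


def job_unique_string(dg_job_id):
    """Return the <time><PID><RND> part of a dg_jobId.

    Assumption: it is the last path component, before the "?" that
    introduces the second host:port pair.
    """
    path = urlsplit(dg_job_id).path
    return path.rstrip("/").rsplit("/", 1)[-1]


def output_directory(base_dir, dg_job_id):
    """Directory where the OutputSandbox files of the job are stored."""
    return posixpath.join(base_dir, job_unique_string(dg_job_id))


job = "https://grid004.it:2234/124.75.74.12/12354732109721?firefox.esrin.esa.it:4577"
print(output_directory("/home/data", job))
```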


– dg-job-list-match
Returns the list of resources fulfilling job requirements.

SYNOPSIS
dg-job-list-match [options] <jdl file>
Options:
--help
--version
--verbose
--config, -c <config_file>
--output, -o <output_file>
--noint
--debug
--logfile <log_file>

DESCRIPTION
dg-job-list-match displays the list of identifiers of the resources that are accessible by the user and satisfy the job requirements included in the job description file. The CE identifiers are returned either on the standard output or in a file, according to the chosen command options, and are strings uniquely identifying the CEs published in the GIS.

dg-job-list-match requires a job description file in which job characteristics and requirements are expressed by means of a Condor class-ad. The job description file is first syntactically checked and then used as the main command-line argument to dg-job-list-match. The Resource Broker is only contacted to find job-compatible resources; the job is never submitted. See the dg-job-submit section, and in particular Table 1, for general rules for building the job description file.

If the user wants to use a “private” configuration file, this can be done with the option --config path_name.

The --verbose option of the dg-job-list-match command can be used to obtain on the standard output the class-ad sent to the RB, generated from the job description.

The --output option makes the command save the list of compatible resources into the specified file. If the provided file name is not an absolute path, the output file is created in the current working directory.

The CEId attribute of the JDL, being a resource attribute, is only taken into account by the dg-job-list-match command if it is present in the Requirements expression and prefixed by “other.”. On the other hand, the job attribute SubmitTo is reserved to the UI and is hence discarded if provided directly in the JDL file by the user.


JOB DESCRIPTION FILE
See dg-job-submit for details.


OPTIONS

--help
displays command usage.

--version
displays UI version.

--verbose
-v
displays on the standard output the job class-ad that is sent to the Resource Broker, generated from the job description file. This differs from the content of the job description file since the UI adds to it some attributes that cannot be directly inserted by the user (e.g. CertificateSubject, defaults for Rank and Requirements if not provided).

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--output output_file
-o output_file
writes the list of CEIds to the file specified by output_file. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

--noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-list-match_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-list-match_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.


--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

job_description_file
this is the file containing the class-ad describing the job to be submitted. It must be the last argument of the command.

EXIT STATUS
dg-job-list-match exits with a status value of 0 (zero) upon success, and a non-zero value upon failure.

EXAMPLES
Let us consider the following command:

$> dg-job-list-match myjob.jdl

where the job description file myjob.jdl looks like:

#########################################
#
# ---- Sample Job Description File ----
#
#########################################
Executable = "sum.exe";
StdInput = "data.in";
InputSandbox = {"/home_firefox/fpacini/exe/sum.exe","/home1/data.in"};
OutputSandbox = {"data.out","sum.err"};
Rank = other.MaxCpuTime;
Requirements = other.LRMSType == "Condor" && other.Architecture == "INTEL" && other.OpSys== "LINUX" &&
other.FreeCpus >= 2;

In this case the job requires CEs that are Condor pools of INTEL LINUX machines with at least 2 free CPUs. Moreover, the Rank expression states that queues with a higher maximum CPU time allowed for jobs are preferred. The response to such a command looks like the following:

***************************************************************************
Computing Element IDs LIST
The following CE(s) matching your job requirements have been found:
- bbq.mi.infn.it:2119/jobmanager-pbs-dque
- skurut.cesnet.cz:2119/jobmanager-pbs-wp1


***************************************************************************

$>
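The way Requirements and Rank combine in the example above can be illustrated with a small Python sketch. It is not the Resource Broker's implementation: the function list_match, the resource dictionaries and the identifiers ce-a to ce-d are all hypothetical, and only the attribute names come from the sample job description file.

```python
def list_match(resources, requirements, rank):
    """Return the ids of the CEs satisfying requirements, best Rank first."""
    matching = [ce for ce in resources if requirements(ce)]
    # The greater the value of Rank, the better the match.
    matching.sort(key=rank, reverse=True)
    return [ce["CEId"] for ce in matching]


# Hypothetical resource descriptions (not taken from a real GIS).
resources = [
    {"CEId": "ce-a", "LRMSType": "Condor", "Architecture": "INTEL",
     "OpSys": "LINUX", "FreeCpus": 4, "MaxCpuTime": 60},
    {"CEId": "ce-b", "LRMSType": "Condor", "Architecture": "INTEL",
     "OpSys": "LINUX", "FreeCpus": 8, "MaxCpuTime": 480},
    {"CEId": "ce-c", "LRMSType": "PBS", "Architecture": "INTEL",
     "OpSys": "LINUX", "FreeCpus": 16, "MaxCpuTime": 720},
    {"CEId": "ce-d", "LRMSType": "Condor", "Architecture": "INTEL",
     "OpSys": "LINUX", "FreeCpus": 1, "MaxCpuTime": 900},
]


def requirements(other):
    # Same constraints as the Requirements expression of myjob.jdl.
    return (other["LRMSType"] == "Condor" and other["Architecture"] == "INTEL"
            and other["OpSys"] == "LINUX" and other["FreeCpus"] >= 2)


def rank(other):
    # Same as Rank = other.MaxCpuTime in myjob.jdl.
    return other["MaxCpuTime"]


print(list_match(resources, requirements, rank))
```

In this sketch ce-c is excluded because its LRMS is not Condor, ce-d because it has fewer than 2 free CPUs, and ce-b is listed before ce-a because of its higher MaxCpuTime.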

SEE ALSO
[A1], [A2], dg-job-submit.

– dg-job-cancel
Cancels one or more submitted jobs.

SYNOPSIS
dg-job-cancel [options] <job Id(s)>

Options:
--help
--version
--all
--input, -i <input_file>
--notify, -n <e-mail_address(es)>
--config, -c <config_file>
--output, -o <output_file>
--noint
--debug
--logfile <log_file>

DESCRIPTION
This command cancels a job previously submitted using dg-job-submit. Before cancellation, it prompts the user for confirmation. The cancel request is sent to the Resource Broker, which forwards it to the JSS, which fulfils it.

dg-job-cancel can remove one or more jobs: the jobs to be removed are identified by their job identifiers (dg_jobIds returned by dg-job-submit), provided as arguments to the command and separated by a blank space. The result of the cancel operation is reported to the user for each specified dg_jobId.

If the --all option is specified, all the jobs owned by the user submitting the command are removed. When the command is launched with the --all option, no dg_jobId can be specified. Note that only the owner of a job can remove it. When the --all option is specified, the dg-job-cancel command contacts every Resource Broker listed in the UI_ConfigEnv.cfg file and asks for the cancellation of all jobs owned by the user identified by her/his certificate subject.


If the user wants to use a “private” configuration file, this can be done with the option --config path_name.

The --input option allows the user to specify a file (input_file) that contains the dg_jobIds to be removed. The format of the file must be as follows: one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option, the user is asked to choose among all, one or a subset of the listed job identifiers. If input_file is not an absolute path, the file is searched for in the current working directory.

Possible job cancellation notifications are:
- Cancel SUCCESS, i.e. the job has been successfully marked for removal.
- Cancel GENERIC_FAILURE, i.e. the user is not the owner of the job, or the cancellation request has reached the JSS but has failed for some unknown reason.
- Cancel CONDOR_FAILURE, i.e. the cancellation request has failed due to a CondorG problem.
- Cancel GLOBUS_FAILURE, i.e. the cancellation request has failed due to a Globus job-manager problem.
- Cancel NOENT_FAILURE, i.e. the job has not been found by the JSS, by CondorG or by the Resource Broker.

The --notify option can be used to receive job cancellation notifications by e-mail. When this option is used, the UI does not wait for the cancel notifications from the RB and returns control to the user immediately after the RB has accepted the cancellation request. This can be useful when a large number of jobs to cancel has been specified and the user wants to be able to perform other operations without waiting for the command results. Notification to multiple contacts can be requested by specifying the corresponding e-mail addresses separated by commas and without blanks.
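A script that post-processes cancellation results may find it convenient to keep the notification codes listed above in a lookup table. The following Python sketch assumes that each outcome is reported on a line of the form "Cancel <CODE> for job:", as in the example output of this command; CANCEL_OUTCOMES and parse_cancel_outcome are hypothetical names.

```python
# Notification codes and their meaning, as described in this section.
CANCEL_OUTCOMES = {
    "SUCCESS": "the job has been successfully marked for removal",
    "GENERIC_FAILURE": "the user is not the owner of the job, or the request "
                       "reached the JSS but failed for some unknown reason",
    "CONDOR_FAILURE": "the request failed due to a CondorG problem",
    "GLOBUS_FAILURE": "the request failed due to a Globus job-manager problem",
    "NOENT_FAILURE": "the job has not been found by the JSS, by CondorG "
                     "or by the Resource Broker",
}


def parse_cancel_outcome(line):
    """Return (code, meaning) for a line such as 'Cancel SUCCESS for job:'."""
    words = line.split()
    if len(words) < 2 or words[0] != "Cancel" or words[1] not in CANCEL_OUTCOMES:
        raise ValueError(f"not a cancel outcome line: {line!r}")
    return words[1], CANCEL_OUTCOMES[words[1]]


print(parse_cancel_outcome("Cancel NOENT_FAILURE for job:"))
```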


OPTIONS

--help
displays command usage.

--version
displays UI version.

--all
cancels all jobs owned by the user submitting the command. This option cannot be used if one or more dg_jobIds have been specified explicitly, or together with the --input option.

--input input_file
-i input_file
cancels the dg_jobIds contained in input_file. This option cannot be used if one or more dg_jobIds have been specified, or together with the --all option.

--notify e_mail_address
-n e_mail_address
when a cancel request is submitted with this option, an e-mail message is sent to the specified e_mail_address. The message reports the success or failure of the cancellation of the specified job. When the --all option has been specified, or the cancellation involves more than one job, an e-mail message is sent to the user for each RB that has performed cancellations on behalf of the UI. Notification to multiple contacts can be requested by specifying the corresponding e-mail addresses separated by commas and without blanks.

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--output output_file
-o output_file
writes the cancel results in the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.


--noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-cancel_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-cancel_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

dg_jobId
job identifier returned by dg-job-submit. The job identifier list must be the last argument of this command.

EXIT STATUS
dg-job-cancel exits with a status value of 0 if all the specified jobs were cancelled successfully, >0 if errors occurred for every specified job id and <0 in case of partial failure. An example of partial failure is when more than one job has been specified: some jobs may be removed successfully while others are not.
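The three-way exit status convention described above (0, >0, <0) can be summarised as a function of the per-job results. The sketch below is only an illustration in Python: cancel_exit_status is a hypothetical name, and the values 1 and -1 are chosen as representatives of ">0" and "<0".

```python
def cancel_exit_status(outcomes):
    """Map per-job results (True = cancelled successfully) to an exit status.

    Returns 0 when every job succeeded, 1 (standing for ">0") when
    every job failed, and -1 (standing for "<0") on partial failure.
    """
    outcomes = list(outcomes)
    if all(outcomes):
        # Note: an empty list of jobs is treated as success in this sketch.
        return 0
    if not any(outcomes):
        return 1
    return -1
```

For example, one successful and one failed cancellation give cancel_exit_status([True, False]) == -1.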

EXAMPLES

1. $> dg-job-cancel dg_jobId1 dg_jobId2

displays the following confirmation message:

Are you sure you want to remove all jobs specified? [y/n]n: y

**********************************************
JOBS CANCEL OUTCOME
Cancel SUCCESS for job:
- dg_jobId1
The job has been successfully marked for removal
------
Cancel NOENT_FAILURE for job:


- dg_jobId2
Job not found by the Resource Broker
**********************************************
$>

In this case the command exit code is -1.

2. $> dg-job-cancel --all

displays the following confirmation message:

Are you sure you want to remove all jobs owned by user Fabrizio Pacini? [y/n]n: y

**********************************************
JOBS CANCEL OUTCOME
Cancel SUCCESS for job:
- dg_jobId1
The job has been successfully marked for removal
------
Cancel SUCCESS for job:
- dg_jobId2
The job has been successfully marked for removal
**********************************************
$>

The exit code in this case is 0.

SEE ALSO
[A2], dg-job-submit.


– dg-job-status
Displays bookkeeping information about submitted jobs.

SYNOPSIS
dg-job-status [options] <job Id(s)>
Options:
--help
--version
--all
--input, -i <input_file>
--full, -f
--config, -c <config_file>
--output, -o <output_file>
--noint
--debug
--logfile <log_file>

DESCRIPTION
This command prints the status of a job previously submitted using dg-job-submit. The job status request is sent to the LB, which provides the requested information. This can be done during the whole job life.

dg-job-status can monitor one or more jobs: the jobs to be checked are identified by one or more job identifiers (dg_jobIds returned by dg-job-submit), provided as arguments to the command and separated by a blank space.

If the --all option is specified, information about all the jobs owned by the user submitting the command is printed on the standard output. When the command is launched with the --all option, neither a dg_jobId nor the --input option can be specified.

The --input option allows the user to specify a file (input_file) that contains the dg_jobIds to monitor. The format of the file must be as follows: one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option, the user is asked to choose among all, one or a subset of the listed job identifiers. If input_file is not an absolute path, it is searched for in the current working directory.

If the user wants to use a “private” configuration file, this can be done with the option --config path_name.

The job information displayed to the user (bookkeeping information) comprises:

- dg_jobId (the job unique identifier)
- Status (the job current status)
- Job Exit Code (the job exit code; if 0)


- Job Owner (User Certificate Subject)
- Location (Id of RB, JSS or CE)
- Destination (Id of the CE where the job will be transferred to)
- Status Enter Time (when the job entered its current state)
- Last Update Time (last known event timestamp)
- Status Reason (reason for being in this state)

If the --full option is specified, dg-job-status displays a long description of the queried jobs by printing in addition the following information:

- CE Node (id of the cluster node(s) where the job is running)
- JssId (job identifier in the JSS)
- GlobusId (job identifier in the Globus job-manager)
- LocalId (id in the CE queue (PBS, LSF, ..))
- Job Description (JDL) (complete JDL description of the job)
- JSS Job Description (JDL) (complete JDL job description as sent to the JSS)
- Job Description (job description for Condor-G built from the JDL one)
- Moving (intermediate state: JobTransfer has been logged, but neither JobAccepted nor JobRefused has been logged yet; in this case ‘state’ and ‘location’ refer to the source of the job transfer)
- Cancelling (whether job cancellation is in progress)
- Cancel Reason (cancellation status message)

Information fields that are not available (i.e. not returned by the LB) are not printed at all. The possible values of the job Status are reported in Annex 7.2. Details on the Job Status Diagram can be found in [A4].

OPTIONS

--help
displays command usage.

--version
displays UI version.

--all
displays status information about all jobs owned by the user submitting the command. This option cannot be used if one or more dg_jobIds have been specified or if the --input option has been specified. All LBs listed in the UI configuration file UI_ConfigENV.cfg are contacted to fulfil this request.


--input input_file
-i input_file
displays bookkeeping information about the dg_jobIds contained in input_file. When using this option, the user is asked to choose among all, one or a subset of the listed job identifiers. This option cannot be used if one or more dg_jobIds have been specified or if the --all option has been specified.

--full
displays a long description of the queried jobs.

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--output output_file
-o output_file
writes the bookkeeping information in the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

--noint
if this option is specified, every interactive question to the user is skipped. All warning and error messages (if any) are written to the file dg-job-status_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-status_<UID>_<PID>.log under the /tmp directory. The log file location is configurable.

--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

dg_jobId

job identifier returned by dg-job-submit. Job identifiers must always be provided as last arguments of the command.

EXIT STATUS
dg-job-status exits with a value of 0 if the status of all the specified jobs is retrieved correctly, >0 if errors occurred for every specified job identifier and <0 in case of partial failure. An example of partial failure is when more than one job is specified: status info could be successfully retrieved for some jobs and not for others.

EXAMPLES
$> dg-job-status dg_jobId2

displays the following lines:

********************************************************************
BOOKKEEPING INFORMATION
Printing status for the job: https://grid004f.cnaf.infn.it:7846/155.198.211.205/085936117861491?grid004f.cnaf.infn.it:7771
---
dg_JobId = https://grid004f.cnaf.infn.it:7846/155.198.211.205/085936117861491?grid004f.cnaf.infn.it:7771
Job Owner = /C=IT/O=ESA/OU=ESRIN/CN=Fabrizio Pacini/[email protected]
Status = Ready
Location = grid004f.cnaf.infn.it
Job Destination = skurut.cesnet.cz:2119/jobmanager-pbs-wp1
Status Enter Time = Wed Sep 19 08:20:53 2001
Last Update Time = Wed Sep 19 08:35:13 2001
********************************************************************
$>

SEE ALSO
[A1], [A2], [A4], dg-job-submit.

– dg-job-get-logging-info
Displays logging information about submitted jobs.

SYNOPSIS
dg-job-get-logging-info [options] <job Id(s)>
Options:
--help
--version
--all
--input, -i <input_file>
--from <T1>
--to <T2>
--full
--level, -l
--config, -c <config_file>
--output, -o <output_file>
--noint
--debug
--logfile <log_file>

DESCRIPTION
This command queries the LB persistent DB for logging information about jobs previously submitted using dg-job-submit. The job logging information is stored permanently by the LB service and can be retrieved even after the job has terminated its life-cycle, unlike the bookkeeping information, which is in some way “consumed” by the user during the job's existence. The dg-job-get-logging-info request is sent to the LB service, which queries the DB and returns the retrieved information. The logging information contains:

- Event Type (possible event types are listed in Annex 7.3)
- dg_jobId
- Logging Level
- Date (UTC)
- Job Transfer Destination
- Host Name
- Job Run Node
- Source Program

- Job Owner

If the command is issued with the --full option, additional information is printed: the job description in JDL, in the Condor-G format, or both, according to the WMS component that logged the event.

Data on several jobs can be queried by specifying a list of job identifiers, separated by blank spaces, as arguments of the command. Moreover the --input option allows the user to specify a file (input_file) containing the dg_jobIds whose information is requested. The format of the file must be as follows: one dg_jobId per line; comment lines must begin with a “#” or a “*” character. When using this option the user is prompted to choose among all, one or a subset of the listed job identifiers. If input_file is not an absolute path, it is searched for in the current working directory.

If the --all option is used, logging information about all the jobs owned by the user submitting the command is printed on the standard output. When the command is launched with the --all option, neither dg_jobIds nor the --input option can be specified.

To perform more complex queries, the user can specify a time range of interest by using the --from (T1) and --to (T2) options. These options take as input timestamps in the format hhmmssDDMMYYYY (UTC) and make the command retrieve job logging information only for the specified time interval. If these options are not specified, the default values are the Unix Epoch (for T1) and the current time, i.e. the time at which the command is issued (for T2).

Each event logged in the LB has an associated log level according to the “Universal Format for Logger Messages” (see draft-abela-ulm-05.txt available at http://www-didc.lbl.gov/NetLogger/draft-abela-ulm-05.txt). The default log level used by WMS components is System; however, in special situations where problems need to be investigated, additional events are logged with the Debug log level. The --level option of the dg-job-get-logging-info command makes the LB also return events having a Debug log level. If no --level option is used, only events with System log level are returned.

The --output option can be used to have the retrieved information written to the file identified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

If the user wants to use a “private” configuration file, this can be done with the option --config path_name.
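The input file format accepted by the --input option (one dg_jobId per line, comment lines starting with “#” or “*”) can be illustrated with a short Python sketch. This is not the UI's own code: the function name read_job_ids is hypothetical, and skipping blank lines and trimming surrounding whitespace are assumptions, since the guide does not say how such lines are treated.

```python
from typing import Iterable, List

# Comment markers stated in the guide for --input files.
COMMENT_PREFIXES = ("#", "*")


def read_job_ids(lines: Iterable[str]) -> List[str]:
    """Return the dg_jobIds found in `lines`, one per line.

    Lines starting with '#' or '*' are treated as comments.
    Blank lines are skipped (an assumption, not stated in the guide).
    """
    job_ids = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(COMMENT_PREFIXES):
            continue
        job_ids.append(line)
    return job_ids


if __name__ == "__main__":
    sample = [
        "# jobs submitted for the simulation run",
        "https://grid004f.cnaf.infn.it:7846/131.154.99.104/14010479391529?grid004f.cnaf.infn.it:7771",
        "* another comment line",
        "",
    ]
    for job_id in read_job_ids(sample):
        print(job_id)
```

The function takes any iterable of lines, so an open file object can be passed directly.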

OPTIONS

--help
displays command usage.

--version
displays UI version.

--all
retrieves logging information about all jobs owned by the user submitting the command. If used, this option must be provided as the first command argument.

--input input_file
-i input_file
retrieves logging info for all dg_jobIds contained in input_file. This option cannot be used if one or more dg_jobIds are specified or if the --all option is used.

--from T1
gets job events logged since the specified date T1. T1 must be in the form hhmmss[DDMMYYYY]. If DDMMYYYY is not provided, the input time is taken to be in the current day.

--to T2
gets job events logged up to the specified date T2. T2 must be in the form hhmmss[DDMMYYYY]. If DDMMYYYY is not provided, the input time is taken to be in the current day.

--full
makes the command display additional job information fields (i.e. the job description in JDL and/or the one for Condor-G).

--level
makes the command retrieve job information for events having a log level equal to System or Debug. Otherwise only events with a System log level are returned.

--config path_name
-c path_name
if the command is launched with this option, the configuration file pointed to by path_name is used instead of the standard configuration file.

--output output_file
-o output_file

writes the logging information in the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

--noint
if this option is specified, every interactive question to the user is skipped. All warning messages and errors (if any) are written to the file dg-job-logging_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

--debug
when this option is specified, information about the API functions called inside the command is displayed on the standard output and is also written to the file dg-job-logging_<UID>_<PID>.log under the /tmp directory. The location of the log file is configurable.

--logfile log_file
when this option is specified, the command log file is relocated to the location pointed to by log_file.

dg_jobId
job identifier returned by dg-job-submit. Job identifiers must always be provided as last arguments of this command.

EXIT STATUS
dg-job-get-logging-info exits with a value of 0 if the logging information of all the specified jobs is retrieved correctly, >0 if errors occurred for every specified job and <0 in case of partial failure. An example of partial failure is when more than one job is specified: logging info could be successfully retrieved for some jobs and not for others.

EXAMPLES
1. $> dg-job-get-logging-info --all --from 12150005052001 --to 10000006052001 --output mylog.txt

writes to the file mylog.txt in the current working directory the logging information about the user's jobs logged from 12:15 on 5 May 2001 up to 10:00 on 6 May 2001.

2. $> dg-job-get-logging-info --from 113500 dg_jobId1

where dg_jobId1 = https://grid004f.cnaf.infn.it:7846/131.154.99.104/14010479391529?grid004f.cnaf.infn.it:7771

displays the following output:

*******************************************************************
LOGGING INFORMATION:

Printing info for the Job : dg_jobId1
For the Event : JobTransfer
---
Event Type = JobTransfer
dg_jobId = dg_jobId1
Logging Level = System
Date(UTC) = Tue Sep 4 16:12:56 2001
Job Destination = ResourceBroker/grid004f.cnaf.infn.it:7771
Host Name = lx01
Source Program = UserInterface
Job Owner = /O=Grid/O=UKHEP/OU=hep.ph.ic.ac.uk/CN=Fabrizio Pacini
Job Descr (JDL) = [
- InputSandboxPath = "/tmp/datamat_161251122764136"
- CertificateSubject = "/O=Grid/O=UKHEP/OU=hep.ph.ic.ac.uk/CN=Fabrizio Pacini"
- Rank = other.FreeCPUs * other.AverageSI00 - other.EstimatedTraversalTime
- Executable = "WP1testA"
- UserContact = "[email protected]"
- StdInput = "sim.dat"
- InputSandbox = {"/home/datamat/HandsOn-0409/file*","/home/datamat/DATA/*"}
- StdOutput = "sim.out"
- StdError = "sim.err"
- requirements = other.OpSys == "Linux RH 6.1" || other.OpSys == "Linux RH 6.2"
- OutputSandbox = {"/tmp/sim.err","/tmp/test.out"}
- dg_jobId = dg_jobId1
]
********************************************************************
$>
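The hhmmss[DDMMYYYY] timestamps taken by the --from and --to options can be produced and read back with the Python standard library. The sketch below is only an illustration: the function names are hypothetical, and interpreting “the current day” as the current UTC day is an assumption, since the guide only says that the timestamps are in UTC.

```python
from datetime import date, datetime, timezone
from typing import Optional


def to_lb_timestamp(moment: datetime, with_date: bool = True) -> str:
    """Format a timezone-aware datetime as hhmmssDDMMYYYY (or hhmmss) in UTC."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%H%M%S%d%m%Y" if with_date else "%H%M%S")


def parse_lb_timestamp(text: str, today: Optional[date] = None) -> datetime:
    """Parse hhmmss[DDMMYYYY] into a UTC datetime.

    When the date part is missing, `today` is used; it defaults to the
    current UTC day (an assumption about what "current day" means).
    """
    if len(text) == 14:
        parsed = datetime.strptime(text, "%H%M%S%d%m%Y")
        return parsed.replace(tzinfo=timezone.utc)
    if len(text) == 6:
        day = today or datetime.now(timezone.utc).date()
        clock = datetime.strptime(text, "%H%M%S").time()
        return datetime.combine(day, clock, tzinfo=timezone.utc)
    raise ValueError(f"expected hhmmss or hhmmssDDMMYYYY, got {text!r}")


if __name__ == "__main__":
    start = datetime(2001, 5, 5, 12, 15, 0, tzinfo=timezone.utc)
    print(to_lb_timestamp(start))
    print(parse_lb_timestamp("10000006052001").isoformat())
```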

SEE ALSO
[A2], [A4], dg-job-submit.

- dg-job-id-info
This is a simple utility for the user; it just parses the dg_jobId string and displays, in formatted form, the information contained in the job identifier. This is a “local” command since it does not need any interaction of the UI with the other WMS components.

SYNOPSIS
dg-job-id-info [options] <job Id(s)>
Options:
--help
--version
--input, -i <input_file>
--output, -o <output_file>

DESCRIPTION
This command displays formatted information extracted from the dg_jobId of a previously submitted job. One or more dg_jobIds can be supplied as input to this command. Moreover the dg_jobIds listed in a file can be parsed using the --input option. The parsed information is printed on standard output; the output can be redirected to a file through the --output option.

Since this command involves no interaction of the UI with external components, it does not need any certificate to work.

OPTIONS

--help
displays command usage.

--version
displays UI version.

--input input_file
-i input_file
parses the dg_jobIds listed in input_file. This option cannot be used together with one (or more) dg_jobIds specified on the command line.

--output output_file

-o output_file
writes the formatted information to the file specified by output_file instead of the standard output. output_file can be either a simple name or an absolute path (on the submitting machine). In the former case the file output_file is created in the current working directory.

dg_jobId
job identifier returned by dg-job-submit. Job identifiers must be the last arguments of this command.

EXIT STATUS
dg-job-id-info exits with a value of 0 if no error occurs, >0 if errors occurred for every specified job identifier and <0 in case of partial failure.

EXAMPLES
$> dg-job-id-info https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234?http://grid004f.cnaf.infn.it:4577

displays the following output:

********************************************************************
JOB ID INFO
Printing info for the Job ID :
https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234?http://grid004f.cnaf.infn.it:4577

Logging and Bookkeeping Server Address = https://grid001f.pd.infn.it
Logging and Bookkeeping Server Port = 2234
Resource Broker Server Address = http://grid004f.cnaf.infn.it
Resource Broker Server Port = 4577
Submission Time (hh:mm:ss) = 13:45:34 (UTC)
User Interface Machine IP Address = 124.75.74.12
User Interface Process Identifier = 53453
Randomly Generated Number (0000-9999) = 4234
********************************************************************
$>
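The fields printed in the example above suggest how a dg_jobId is laid out. The following Python sketch parses an identifier under that reading; it is not the dg-job-id-info implementation. The assumed layout (inferred from the examples in this guide, not from a format specification) is: LB address and port, the UI machine IP address, then a digit string made of the submission time (6 digits), the UI process identifier (variable length) and a 4-digit random number, followed by “?” and the RB address and port. The names parse_job_id and JobIdInfo are hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed layout, inferred from the examples in this guide:
#   <LB address>:<LB port>/<UI IP>/<hhmmss><PID><4-digit random>?<RB address>:<RB port>
_JOB_ID = re.compile(
    r"^(?P<lb_address>[^?]+):(?P<lb_port>\d+)"
    r"/(?P<ui_ip>\d{1,3}(?:\.\d{1,3}){3})"
    r"/(?P<time>\d{6})(?P<pid>\d+)(?P<rand>\d{4})"
    r"\?(?P<rb_address>.+):(?P<rb_port>\d+)$"
)


@dataclass
class JobIdInfo:
    lb_address: str
    lb_port: int
    rb_address: str
    rb_port: int
    submission_time: str  # hh:mm:ss (UTC)
    ui_ip: str
    ui_pid: int
    random_number: str  # kept as text to preserve leading zeros


def parse_job_id(job_id: str) -> JobIdInfo:
    """Split a dg_jobId into its components, or raise ValueError."""
    match = _JOB_ID.match(job_id.strip())
    if match is None:
        raise ValueError(f"not a recognisable dg_jobId: {job_id!r}")
    t = match["time"]
    return JobIdInfo(
        lb_address=match["lb_address"],
        lb_port=int(match["lb_port"]),
        rb_address=match["rb_address"],
        rb_port=int(match["rb_port"]),
        submission_time=f"{t[0:2]}:{t[2:4]}:{t[4:6]}",
        ui_ip=match["ui_ip"],
        ui_pid=int(match["pid"]),
        random_number=match["rand"],
    )


if __name__ == "__main__":
    print(parse_job_id(
        "https://grid001f.pd.infn.it:2234/124.75.74.12/134534534534234"
        "?http://grid004f.cnaf.infn.it:4577"
    ))
```

Because the process identifier has no fixed width, the sketch takes the first six digits as the time and the last four as the random number, and treats whatever lies between as the PID.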

7. ANNEXES

7.1. JDL ATTRIBUTES
The JDL is a fully extensible language (i.e. it does not rely on a fixed schema), hence the user is allowed to use any attribute for the description of a job without incurring errors. However, only a certain set of attributes (referred to as “supported” attributes) can be taken into account by the WMS components for scheduling a submitted job. Indeed, in order to be actually used for selecting a resource, an attribute used in a job class-ad needs to correlate with some characteristic of the resources that are published in the GIS (aka MDS).

The “supported” attributes, their meaning and the way to use them to describe a job are dealt with in detail in document [A7], also available at the following URL:
http://www.infn.it/workload-grid/docs/DataGrid-01-NOT-0101-0_5.pdf

7.2. JOB STATUS DIAGRAM
Figure 1 shows the states that a job can assume during its life cycle.

Figure 1 Job Life Cycle

The job states in Figure 1 are briefly described below (see [A4] for further details):

STATUS:

- SUBMITTED: job is submitted but not yet received by the RB (i.e. it is waiting in the UI).
- WAITING: job is waiting in the queue in the RB for various reasons (e.g. no appropriate CE (cluster) found, required dataset not available, dependency on another job, etc.).
- READY: appropriate CE found; job is transferred to the CE.
- SCHEDULED: job is waiting in the queue on the CE.

- RUNNING: job is running.
- CHKPT: job is check-pointed and is waiting for restart; this is a system checkpointing of jobs running on a CE, independent of Application Checkpointing.
- DONE: job exited.
- OUTPUTREADY: job exited and RB is ready to return output sandbox.
- ABORTED: job was aborted for various reasons (e.g. waiting in the queue in RB, JSS or CE for too long, over-use of quotas, expiration of user credentials, etc.).
- CLEARED: output files were transferred to the user, job is removed from bookkeeping database.

7.3. JOB EVENT TYPES
The job event types that can be returned to the user by the dg-job-get-logging-info command are listed below. They are organized in several categories:

Events concerning a job transfer between components:

- JobTransfer: A component generates this event when it tries to transfer a job to some other component. This event contains the identification of the receiver and possibly the job description expressed in the language accepted by the receiver. The result of the transfer, i.e. success or failure, as seen by the sender is also included.
- JobAccept: A component generates this event when it receives a job from another WMS component. This event also contains the locally assigned job identifier.
- JobRefuse: A component generates this event when the transfer of a job to some other component fails. The source of this event, which also includes the reason for the failure, can be either the receiver or the sender, e.g. when the receiver is not available.

Events concerning a job state change during processing within a component:

- JobAbort: The job processing is stopped due to system conditions or user request. The reason is included in the event.
- JobRun: The job is started on a CE.
- JobChkpt: The job is check-pointed on a CE. The reason is included in the event.
- JobDone: The job has completed. The process exit status is included in the event.
- JobClear: The user has successfully retrieved the job results, e.g. the output files specified in the output sandbox (see Section 6.1.3); the job will be removed from the bookkeeping database in the near future.
- JobScheduled: The job has been successfully submitted to the appropriate CE, i.e. passed to the LRMS.
- JobCancel: The job has been successfully marked for removal.
- JobFail: The job failed during its execution on the CE.

Events associated with the Resource Broker only:

- JobMatch: An appropriate match between a job and a Computing Element has been found. The event contains the identifier of the selected CE.
- JobPending: A match between a job and a suitable Computing Element was not found, so the job is kept pending by the RB. The event contains the reason why no match was found.

- JobStatus: It contains information about resources consumed by the job. This event is generated periodically by the CE and eliminates the need for direct communication from the LB Service to the CE. Two types of information should be considered: cumulative information (e.g. CPU time) and non-cumulative information (e.g. memory consumption). For cumulative properties only the most recent value is kept in the database. For non-cumulative properties, on the other hand, all the values are stored in order to allow for example.

More details on job event types can be found in [A4].

7.4. WILDCARD PATTERNS
The wildcard patterns that can be included in the InputSandbox attribute expression are used by the UI to perform file name “globbing” in a fashion similar to the UNIX csh shell. The result of the “globbing” is the list of files whose names match any of the specified patterns. The admitted special characters, together with their meaning, are listed below:

- *  wildcard for any string
- ?  wildcard for any single character
- [chars]  delimits a wildcard matching any of the enclosed characters. If chars contains a sequence of the form a-b then any character between a and b (inclusive) will match. Such an expression can be negated by means of the special character “!” ([!chars] matches any character not in chars).

EXAMPLES
Consider a directory where “ls -F” gives:

1file   a1.f     apple.o  bob.o  h4374.f  john.o
2files  ab       apps/    foo.c  h4374.o  mydir/
ABS     ab.f     bob      foo.f  john     stuff/
a1      apple.f  bob.f    gh     john.f

That is, some files and directories. The examples below show how the wildcards described above are expanded (the notation => indicates the result of typing the command).

1) Every two-character file name:
echo ?? => a1 ab gh

2) Every two-character name starting with “a”:
echo a? => a1 ab

3) Every file starting with j, o, h, or n:
echo [john]* => h4374.f h4374.o john john.f john.o

4) Include a range, e.g. everything starting with an upper case letter or a digit:
echo [A-Z0-9]* => 1file 2files ABS

5) Negate a range:

echo [!john]*.f => a1.f ab.f apple.f bob.f foo.f

6) Every file starting with “a” and ending in .f:
echo a*.f => a1.f ab.f apple.f
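The expansions above can be reproduced with Python's standard fnmatch module, which understands the same *, ?, [chars] and [!chars] syntax. This is only an illustration of the pattern semantics, not the globbing code used by the UI; the helper name expand is hypothetical, and directory names are given without the trailing “/” that “ls -F” appends.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def expand(pattern: str, names: Iterable[str]) -> List[str]:
    """Return the names matching `pattern`, sorted, with case-sensitive matching."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Directory content from the example above (directories without trailing "/").
LISTING = [
    "1file", "2files", "ABS", "a1", "a1.f", "ab", "ab.f", "apple.f",
    "apple.o", "apps", "bob", "bob.f", "bob.o", "foo.c", "foo.f", "gh",
    "h4374.f", "h4374.o", "john", "john.f", "john.o", "mydir", "stuff",
]

if __name__ == "__main__":
    for pattern in ("??", "a?", "[john]*", "[A-Z0-9]*", "[!john]*.f", "a*.f"):
        print(f"echo {pattern} => {' '.join(expand(pattern, LISTING))}")
```

Unlike a shell, fnmatchcase treats a leading “.” and the “/” character like any other character, so the comparison holds only for plain file names such as those in this listing.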

7.5. THE MATCH MAKING ALGORITHM
The main task performed by the RB is to find the most suitable Computing Element on which to execute the job. In order to accomplish this task the RB interacts with the other WMS components. More precisely, the Replica Catalogue (RC) and the Information Index (II) are the two main WMS components which supply the RB with all the information required for the actual resolution of the matches between job requirements and Computing Element capabilities (i.e. runtime environments, data access features, processing resources etc.). The following sections describe the matchmaking algorithm performed by the RB. To this end it is worth identifying three different scenarios to be dealt with separately: direct job submission, job submission without data-access requirements, and job submission with data-access requirements.

7.5.1. Direct Job Submission
The simplest scenario is the one where the JDL submitted by the UI contains a link to the resource to which the job is to be submitted, i.e. the Computing Element identifier (CEId). In this case the RB does not perform any matchmaking at all, but simply delegates the submission request to the JSS for the actual submission.

Figure 2 - Submission with CEId known

Note that, if the CEId is specified, the RB neither checks whether the user who owns the job is authorised to access the given CE, nor interacts with the RC to resolve file requirements, if any. The only check performed by the RB is on the JDL syntax, made while converting the JDL into a ClassAd.

7.5.2. Job submission without data-access requirements
The next scenario is the one where the user specifies a job with given execution requirements, but without data file requirements. Once the JDL has been received by the RB and successfully converted into a ClassAd (job-ad), the RB starts the actual matchmaking algorithm to find out whether the characteristics and status of Grid resources match the job requirements. The matchmaking algorithm consists of two different phases: the requirements check and the rank computation.

During the requirements check phase the RB contacts the II in order to create the set of CEs suitable for executing the job, i.e. those compliant with the user requirements and with the user certificate subject as well. Taking into account that all the CE attributes involved in the JDL requirements (defined by the user to express his/her needs) are almost constant in time (i.e. it is improbable that a CE changes its operating system or its runtime environment in the very short term, e.g. every half an hour), the information cached in the II represents a good source for testing matches between job requirements and CE features. It is clearly more efficient than contacting each CE to find out the same information.

Figure 3 - Requirements checking phase

Once the RB has created the set of suitable CEs where the job can be executed, it performs the second phase of the matchmaking algorithm, which allows it to acquire information about the “quality” of the suitable CEs just found. If, on the other hand, no suitable CE has been found, the RB sends an e-mail notification to the user address specified in the JDL.

In the ranking phase the RB directly contacts the LDAP server of each involved CE to obtain the values of the attributes that appear in the rank expression of the received JDL. Unlike in the previous phase, it is better to contact each suitable CE rather than use the II as the source of information, since the rank attributes are variables that change very frequently (e.g. FreeCPUs, FreeMemory).

Currently, if all the suitable CEs are assigned the same rank value, the RB performs a “random” choice, i.e. the first CE in the list of suitable ones receives the job for execution. Clearly a more sophisticated method should be adopted by the RB in the case of equally ranked CEs, relieving the user of the need to define significant rank expressions. One possibility could be a post-ranking selection which, depending on performance factors still to be defined, supplies the user with the optimal CE choice for the actual submission of a given job. Rank computation is depicted in Figure 4.

Figure 4 – Rank computation phase
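The two phases described in this section (requirements check, then rank computation) can be summarised by the following Python sketch. It is a simplified model, not the RB code: CEs are represented as plain dictionaries, the requirements and rank expressions as Python callables, and the CE names are invented. Preferring the highest rank value is an assumption; the tie-breaking rule (the first CE in the list wins) follows the text above.

```python
from typing import Callable, Dict, List, Optional

CE = Dict[str, object]  # hypothetical attribute dictionary describing one CE


def match_make(
    ces: List[CE],
    requirements: Callable[[CE], bool],
    rank: Callable[[CE], float],
) -> Optional[CE]:
    """Return the selected CE, or None when no CE satisfies the requirements."""
    # Phase 1: requirements check (done against the II in the real system).
    suitable = [ce for ce in ces if requirements(ce)]
    if not suitable:
        return None  # here the RB would notify the user by e-mail
    # Phase 2: rank computation. max() returns the first maximal element,
    # so equally ranked CEs are resolved in favour of the first in the list.
    return max(suitable, key=rank)


if __name__ == "__main__":
    ces = [
        {"CEId": "ce-a.example.org", "OpSys": "Linux RH 6.2",
         "FreeCPUs": 2, "AverageSI00": 400, "EstimatedTraversalTime": 100},
        {"CEId": "ce-b.example.org", "OpSys": "Linux RH 6.1",
         "FreeCPUs": 8, "AverageSI00": 380, "EstimatedTraversalTime": 40},
        {"CEId": "ce-c.example.org", "OpSys": "Solaris",
         "FreeCPUs": 64, "AverageSI00": 500, "EstimatedTraversalTime": 10},
    ]
    chosen = match_make(
        ces,
        requirements=lambda ce: ce["OpSys"] in ("Linux RH 6.1", "Linux RH 6.2"),
        rank=lambda ce: ce["FreeCPUs"] * ce["AverageSI00"] - ce["EstimatedTraversalTime"],
    )
    print(chosen["CEId"] if chosen else "no suitable CE")
```

The requirements and rank callables mirror the expressions shown in the JDL of the dg-job-get-logging-info example earlier in this guide.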

7.5.3. Job submission with data-access requirements
The Resource Broker interacts with the Replica Management services in order to find the most suitable Computing Element, taking into account the Storage Elements where the input data sets are physically stored and where the output data sets should be staged on completion of job execution. Before describing the actions taken by the RB upon reception of a JDL where both data-access and computing requirements are present, it is worth recalling the JDL attributes which represent a data requirement on the RB side: OutputSE, InputData and DataAccessProtocol, respectively representing the Storage Element (SE) where the output file should be staged, the input files (LFN, PFN) required for the actual job execution and the protocol “spoken” by the application to access such files.

The two main phases of the matchmaking algorithm performed by the RB remain unchanged, but the RB executes the requirements check and the ranking for each class of CEs satisfying the data-access requirements. Additionally, the RB performs a pre-match processing to find and classify the CEs satisfying both data-access and user authorisation requirements.

During the pre-match processing phase the RB contacts the RC (the one specified through the ReplicaCatalog JDL attribute) in order to resolve logical file names and collect all the information about the SEs containing at least one input data file. This information is used to write the broker-info-file, which is sent to the JSS within the input sandbox for the actual submission. At this point the RB is ready to start the CE classification procedure, during which it contacts the II in order to find the CEs that satisfy the authorization requirements and have the OutputSE within their own LAN (CloseSE). Using the information retrieved during the file name resolution, the RB classifies those CEs according to the number of input files stored in storage element(s) which are close to the CE itself and speak at least one of the protocols specified in the DataAccessProtocol attribute within the JDL.

Figure 5 - CEs classification procedure

Upon completion of the CE data classification, the RB is ready for the actual matchmaking and starts the requirements checking phase for each CE belonging to the first non-empty class of CEs, i.e. the class whose CEs can access the highest number of distinct files. If a CE does not satisfy the user requirements it is removed from its class. The requirements checking phase is repeated until at least one CE matching the user requirements is found.

Once the requirements checking phase is completed, either the RB knows a set of CEs satisfying both data-access and computing requirements and having access to the maximum number of distinct input files, or no suitable CE matching such requirements exists. In the first case the RB starts the ranking phase in order to find the best CE to which to submit the job. In the second case the RB sends an e-mail notification to the user address specified in the JDL.

Figure 6 - Match-Making algorithm
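The classification and class-by-class matching described in this section can be sketched in Python as follows. This is one possible reading of the text, not the RB implementation: the data structures (dictionaries and sets), the function names and the SE and CE names are all hypothetical, the authorisation check is assumed to have been applied already to the list of CEs, and moving on to the next class when every CE of a class fails the requirements check is an interpretation of “repeated until at least one CE ... is found”.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Set

CE = Dict[str, object]  # hypothetical: {"CEId": str, "CloseSEs": set of SE names, ...}


def classify_ces(
    ces: List[CE],
    file_locations: Dict[str, Set[str]],   # logical file name -> SEs holding a replica
    se_protocols: Dict[str, Set[str]],     # SE name -> protocols it speaks
    protocols: Set[str],                   # DataAccessProtocol from the JDL
    output_se: Optional[str] = None,       # OutputSE from the JDL, if any
) -> Dict[int, List[CE]]:
    """Group CEs by the number of distinct input files reachable on close SEs."""
    classes: Dict[int, List[CE]] = defaultdict(list)
    for ce in ces:
        close = ce["CloseSEs"]
        if output_se is not None and output_se not in close:
            continue  # the OutputSE must be close to the CE
        usable = {se for se in close if protocols & se_protocols.get(se, set())}
        count = sum(1 for ses in file_locations.values() if usable & ses)
        classes[count].append(ce)
    return dict(classes)


def data_aware_match(
    classes: Dict[int, List[CE]],
    requirements: Callable[[CE], bool],
    rank: Callable[[CE], float],
) -> Optional[CE]:
    """Walk the classes from most to fewest accessible files and rank the first matches."""
    for count in sorted(classes, reverse=True):
        suitable = [ce for ce in classes[count] if requirements(ce)]
        if suitable:
            return max(suitable, key=rank)  # first CE wins on equal rank
    return None  # here the RB would notify the user by e-mail


if __name__ == "__main__":
    locations = {"lfn:a": {"se1"}, "lfn:b": {"se1", "se2"}, "lfn:c": {"se3"}}
    speaks = {"se1": {"gridftp"}, "se2": {"file"}, "se3": {"rfio"}}
    ces = [
        {"CEId": "ce1", "CloseSEs": {"se1"}, "FreeCPUs": 1},
        {"CEId": "ce2", "CloseSEs": {"se2"}, "FreeCPUs": 10},
        {"CEId": "ce3", "CloseSEs": {"se1", "se3"}, "FreeCPUs": 0},
    ]
    classes = classify_ces(ces, locations, speaks, {"gridftp", "file"})
    best = data_aware_match(classes, lambda ce: ce["FreeCPUs"] > 0, lambda ce: ce["FreeCPUs"])
    print(best["CEId"] if best else "no suitable CE")
```

In this model a CE with better data locality is preferred over a CE with a higher rank, because ranking only takes place inside the first class that contains at least one CE passing the requirements check.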

7.6. PROCESS/USER MAPPING TABLE
Table 2 reports, for each daemon process needed by the WMS to accomplish its tasks, the user identifier under which it has to be run. Installing the profile rpms and using (as root) the scripts they install in /etc/rc.d/init.d to start the WMS components is strongly recommended in order to comply with this table.

Service Name         User Name
mysqld               mysql
postgres             postgres
condor_gridmanager   dguser
condor_master        dguser
condor_schedd        dguser
jssparser            dguser
jssserver            dguser
RBserver             dguser
interlogger          root
dglogd               root
bkserver             root
ileventd             root
slapd                root

Table 2 Process/User mapping
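As an illustration of how Table 2 might be used by an administrator, the Python sketch below compares a list of observed (process name, user name) pairs with the expected mapping. It is not part of the WMS distribution: the function name find_mismatches is hypothetical, and how the observed pairs are collected (for example from the output of ps) is left to the caller.

```python
from typing import Iterable, List, Tuple

# Expected account for each WMS daemon, as listed in Table 2.
EXPECTED_USER = {
    "mysqld": "mysql",
    "postgres": "postgres",
    "condor_gridmanager": "dguser",
    "condor_master": "dguser",
    "condor_schedd": "dguser",
    "jssparser": "dguser",
    "jssserver": "dguser",
    "RBserver": "dguser",
    "interlogger": "root",
    "dglogd": "root",
    "bkserver": "root",
    "ileventd": "root",
    "slapd": "root",
}


def find_mismatches(observed: Iterable[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (process, actual user, expected user) for daemons run by the wrong account.

    Processes that do not appear in Table 2 are ignored.
    """
    return [
        (name, user, EXPECTED_USER[name])
        for name, user in observed
        if name in EXPECTED_USER and user != EXPECTED_USER[name]
    ]


if __name__ == "__main__":
    running = [("mysqld", "mysql"), ("RBserver", "root"), ("sshd", "root")]
    for name, actual, expected in find_mismatches(running):
        print(f"{name} runs as {actual}, expected {expected}")
```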
