12th EELA TUTORIAL - USERS AND SYSTEM ADMINISTRATORS
www.eu-eela.org
E-infrastructure shared between Europe and Latin America

CE + WN installation and configuration
Vanessa Hamar, Universidad de Los Andes, Mérida, Venezuela
12th EELA Tutorial, Lima, 24-29 September 2007

Outline

• What is a Computing Element (CE)?
• What is a Torque Server?
• What is a Worker Node?
• How to install and configure a Computing Element with Torque Server.
• How to install and configure a Worker Node with Torque.

What is CE?

• The CE is a service representing a computing resource.

• Its main functionality is job management (job submission, job control, etc.).

• For job submission, the CE can work in:
  – push model (where the job is pushed to a CE for its execution).
  – pull model (where the CE asks the WMS for jobs).

What is Torque?

• TORQUE (Tera-scale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.

• The Torque system is composed of:
  – pbs_server, which provides the basic batch services such as receiving/creating a batch job or protecting the job against system crashes.
  – job_scheduler, which contains the site's policy used to decide which job must be executed.
  – pbs_mom, which places the job into execution. It is also responsible for returning the job’s output to the user.

What is a Worker Node?

• The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
  – the gLite I/O Client,
  – the Logging and Bookkeeping Client,
  – the R-GMA Client, and
  – the WMS Checkpointing library.

Installing CE + Torque Server

WN + Torque

Preliminary and common steps

• Start from an installation of SLC 3.0.8
• Install the Java SDK
• Remove LAM and Postfix
• Check the hostname
• Install and configure the ntp daemon
• Install the X.509 host certificates in /etc/grid-security and check their file permissions
• Install the latest version of glite-yaim
• Install the middleware

Installing pre-requisites

• Java is not included in the distribution. Install it separately (version >= 1.4.2_08):

apt-get install j2sdk

Installing pre-requisites

• Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. If it is, remove it:

apt-get remove lam

• There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail client (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package:

apt-get remove postfix

Installing pre-requisites

• Check the FQDN hostname

– Ensure that the hostnames of your machines are correctly set. Run the command:

hostname -f
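
The hostname check can also be scripted. The following is a minimal sketch, assuming that "correctly set" means the name has at least two dot-separated labels made of letters, digits and hyphens; it is a heuristic and performs no DNS lookup. The function name is_fqdn is hypothetical.

```shell
#!/bin/sh
# is_fqdn NAME: succeed if NAME looks like a fully qualified domain name.
# Assumption (not from the slides): at least one dot, no empty labels,
# and only letters, digits, hyphens and dots.
is_fqdn() {
    case "$1" in
        ""|.*|*.|*..*) return 1 ;;   # empty name or empty label
        *.*) ;;                      # at least one dot: keep checking
        *) return 1 ;;               # short name such as "localhost"
    esac
    case "$1" in
        *[!A-Za-z0-9.-]*) return 1 ;;  # character outside the allowed set
    esac
    return 0
}

# Example use on a real node (not executed here):
#   is_fqdn "$(hostname -f)" || echo "hostname is not fully qualified" >&2
```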

Installing pre-requisites

• Synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system:
  – apt-get install ntp

• Add your time server to /etc/ntp.conf:
  – restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
  – server <time_server_name>
  – (you can use ntp-1.infn.it, IP 193.206.144.10)

• Edit /etc/ntp/step-tickers, adding the hostname(s) of your time server(s).

• If you are running a firewall, you will have to allow inbound communication on the NTP port:
  – -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT

• Activate the ntpd service with the following commands:
  – ntpdate <your ntp server name>
  – service ntpd start
  – chkconfig ntpd on

• You can check ntpd’s status with:

ntpq -p
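
The /etc/ntp.conf lines and the firewall rule can be generated for a given time server instead of typed by hand. This is a small sketch that only prints the lines shown on the slide; the function names ntp_conf_lines and ntp_firewall_rule are hypothetical, and appending the output to the real files is left to the administrator.

```shell
#!/bin/sh
# ntp_conf_lines SERVER_NAME SERVER_IP
# Print the two /etc/ntp.conf lines from the slide for one time server.
ntp_conf_lines() {
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$2"
    printf 'server %s\n' "$1"
}

# ntp_firewall_rule SERVER_IP
# Print the iptables rule that accepts inbound NTP traffic (UDP port 123)
# from that server, in the same rule-file syntax the slide uses.
ntp_firewall_rule() {
    printf '%s\n' "-A INPUT -s $1 -p udp --dport 123 -j ACCEPT"
}

# Example (not executed here), using the server named on the slide:
#   ntp_conf_lines ntp-1.infn.it 193.206.144.10 >> /etc/ntp.conf
```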

Installing pre-requisites

• Install glite-yaim:
  – apt-get install glite-yaim-core
  – apt-get install glite-yaim-clients

Installing pre-requisites

• Request host certificates for the CE from a CA.

• Copy the host certificate files (hostcert.pem and hostkey.pem) into /etc/grid-security.

• Change the permissions:
  – chmod 644 hostcert.pem
  – chmod 400 hostkey.pem
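
The file permissions can be verified after the chmod step. Below is a minimal sketch that checks for exactly the two modes given on the slide (644 for hostcert.pem, 400 for hostkey.pem); the function names has_mode and check_host_cert are hypothetical.

```shell
#!/bin/sh
# has_mode FILE MODE: succeed if FILE exists and its permission bits are
# exactly MODE (octal). Uses find -perm, which matches the exact mode.
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2" 2>/dev/null)" ]
}

# check_host_cert DIR: succeed if DIR holds hostcert.pem with mode 644
# and hostkey.pem with mode 400.
check_host_cert() {
    has_mode "$1/hostcert.pem" 644 && has_mode "$1/hostkey.pem" 400
}

# Example (not executed here):
#   check_host_cert /etc/grid-security || echo "check certificate permissions" >&2
```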

Installing CE+Torque Server via apt

• All site configuration values have to be set in a site configuration file using key-value pairs.

• This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.

• Copy the /opt/glite/yaim/examples/site-info.def template (which comes with the glite-yaim-core package) to your reference directory for the installation (e.g. /root/siteinfo):
  – cp /opt/glite/yaim/examples/site-info.def /root/siteinfo/site-info.def

• A good syntax test for your site configuration file is to source it manually:
  – source site-info.def
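
Sourcing the file in your login shell leaves all of its variables set there. A slightly more cautious sketch is shown below, assuming bash is available: it first parses the file without executing it (bash -n) and then sources it in a separate bash process, so nothing leaks into the calling shell. The function name check_site_info is hypothetical.

```shell
#!/bin/sh
# check_site_info FILE: succeed if FILE parses as shell and can be sourced.
# Step 1: bash -n reads the file and reports syntax errors without running it.
# Step 2: a child bash sources the file, so its variables stay in that child.
# Pass a path containing a slash (e.g. ./site-info.def or an absolute path).
check_site_info() {
    bash -n "$1" || return 1
    bash -c '. "$1"' check_site_info "$1" >/dev/null
}

# Example (not executed here):
#   check_site_info /root/siteinfo/site-info.def && echo "site-info.def looks fine"
```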

Installing CE+Torque Server via apt

• The configuration is stored in a directory structure which will be extended in the near future. Currently the following files are used: site-info.def and the vo.d directory.

Installing CE+Torque Server via apt

• The /root/siteinfo/vo.d directory

• Each file name in this directory has to be the lower-cased version of a VO name defined in site-info.def. The matching file should contain the definitions for that VO; these override the ones defined in site-info.def.

• Example contents of such a file:
  – SW_DIR=$VO_SW_DIR/eela
  – DEFAULT_SE=$CLASSIC_HOST
  – STORAGE_DIR=$CLASSIC_STORAGE_DIR/eela
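
The lower-casing rule for file names in vo.d is easy to get wrong by hand. A minimal sketch of deriving the file path from a VO name follows; the function name vo_file_path is hypothetical, and the directory argument is whatever reference directory you chose.

```shell
#!/bin/sh
# vo_file_path SITEINFO_DIR VO_NAME
# Print the path of the vo.d file for VO_NAME: the file name is the
# lower-cased VO name, as the slide requires.
vo_file_path() {
    printf '%s/vo.d/%s\n' "$1" "$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
}

# Example (not executed here):
#   vi "$(vo_file_path /root/siteinfo EELA)"
```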

Installing CE+Torque Server via apt

• Edit /opt/glite/yaim/etc/wn-list.conf and list the Worker Node hostnames, one per line:

limaXX.ring.pucp.edu.pe

limaXX.ring.pucp.edu.pe

...
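
A quick sanity check of the Worker Node list can catch a hostname that was pasted twice. This is only a sketch: the function name wn_list_duplicates is hypothetical, and it does not verify that the hosts exist.

```shell
#!/bin/sh
# wn_list_duplicates FILE: print every hostname that appears more than once
# in FILE (blank lines are ignored). No output means no duplicates.
wn_list_duplicates() {
    grep -v '^[[:space:]]*$' "$1" | sort | uniq -d
}

# Example (not executed here):
#   wn_list_duplicates /opt/glite/yaim/etc/wn-list.conf
```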

Installing CE+Torque Server via apt

• Install the node

• /opt/glite/yaim/bin/yaim -i -s /root/siteinfo/site-info.def -m glite-CE

• Configure the node

• /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n lcg-CE_torque -n MPI_CE -n BDII_site
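
The two yaim calls follow one pattern: -i with one -m per meta-package to install, and -c with one -n per node type to configure. The sketch below only builds and prints the command line so it can be reviewed before running it by hand; it does not call yaim itself. The function name yaim_command is hypothetical, and only the options shown on the slide are used.

```shell
#!/bin/sh
# yaim_command ACTION SITE_INFO TYPE...
# Print the yaim command line for ACTION:
#   install   -> -i, with each TYPE passed as -m TYPE
#   configure -> -c, with each TYPE passed as -n TYPE
yaim_command() {
    if [ "$#" -lt 3 ]; then
        echo "usage: yaim_command install|configure SITE_INFO TYPE..." >&2
        return 1
    fi
    case "$1" in
        install)   yc_mode=-i; yc_flag=-m ;;
        configure) yc_mode=-c; yc_flag=-n ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
    yc_cmd="/opt/glite/yaim/bin/yaim $yc_mode -s $2"
    shift 2
    for yc_type in "$@"; do
        yc_cmd="$yc_cmd $yc_flag $yc_type"
    done
    printf '%s\n' "$yc_cmd"
}

# Example (prints the configure command from the slide):
#   yaim_command configure /root/siteinfo/site-info.def lcg-CE_torque MPI_CE BDII_site
```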

Installing CE+Torque Server via apt

• If the installation completes successfully, the following components are installed:

– gLite in /opt/glite

– Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)

– Globus in /opt/globus

– Tomcat in /var/lib/tomcat5

– Torque in /var/spool/pbs
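
A quick way to confirm the result is to look for the listed directories. The following sketch reports any that are missing; the function name report_missing is hypothetical, and the Condor directory is left out because its name depends on the installed version.

```shell
#!/bin/sh
# report_missing PREFIX: print each expected install directory that does not
# exist under PREFIX. Use an empty PREFIX ("") to check the real filesystem.
# The Condor directory is not checked because its name is version-specific.
report_missing() {
    for rm_dir in /opt/glite /opt/globus /var/lib/tomcat5 /var/spool/pbs; do
        [ -d "$1$rm_dir" ] || printf '%s\n' "$rm_dir"
    done
}

# Example (not executed here): no output means all four directories exist.
#   report_missing ""
```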

Installing CE+Torque Server via apt

• Edit /etc/ssh/sshd_config and add the following lines at the end:

HostbasedAuthentication yes

IgnoreUserKnownHosts yes

IgnoreRhosts yes

• Restart the server with:

/sbin/service sshd restart
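
If this step is repeated, the same three lines should not pile up in the file. Below is a minimal sketch of an idempotent append; the function names ensure_line and enable_hostbased_auth are hypothetical. It only adds missing lines and does not touch an existing, different setting for the same keyword.

```shell
#!/bin/sh
# ensure_line FILE LINE: append LINE to FILE unless an identical line exists.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# enable_hostbased_auth FILE: make sure the three settings from the slide
# are present in FILE. Running it twice adds nothing the second time.
enable_hostbased_auth() {
    ensure_line "$1" 'HostbasedAuthentication yes'
    ensure_line "$1" 'IgnoreUserKnownHosts yes'
    ensure_line "$1" 'IgnoreRhosts yes'
}

# Example (not executed here):
#   enable_hostbased_auth /etc/ssh/sshd_config && /sbin/service sshd restart
```

Because sshd uses the first value it reads for each keyword, an earlier uncommented line such as "HostbasedAuthentication no" would still have to be edited by hand.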

Installing CE+Torque Server via apt

• On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:
  – edg-pbs-shostsequiv
  – edg-pbs-knownhosts

• Copy that file to all the Worker Nodes.
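
The copy step can be driven from the Worker Node list. The sketch below only prints one scp command per node so the list can be reviewed first; it assumes root SSH access to the Worker Nodes, which the slides do not state, and the function name print_copy_commands is hypothetical.

```shell
#!/bin/sh
# print_copy_commands WN_LIST FILE
# Print one scp command per hostname in WN_LIST, copying FILE to the same
# path on that node. Blank lines in WN_LIST are skipped.
# Assumption: files are copied as root; adjust the user for your site.
print_copy_commands() {
    while IFS= read -r pc_wn; do
        [ -n "$pc_wn" ] || continue
        printf 'scp %s root@%s:%s\n' "$2" "$pc_wn" "$2"
    done < "$1"
}

# Example (not executed here): review the output, then pipe it to sh.
#   print_copy_commands /opt/glite/yaim/etc/wn-list.conf /etc/ssh/ssh_known_hosts
```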

Installing WN Server via apt

• Install the node

• /opt/glite/yaim/bin/yaim -i -s /root/siteinfo/site-info.def -m glite-WN -m glite-torque-client-config

• Configure the node

• /opt/glite/yaim/bin/yaim -c -s /root/siteinfo/site-info.def -n WN_torque

References

• https://twiki.cern.ch/twiki/bin/view/LCG/GenericInstallGuide301

• https://twiki.cern.ch/twiki/bin/view/LCG/YaimGuide310
