
PBS Pro Configuration Commands


  • Topics Covered

    1. PBS Pro Configuration at ANURAG

    a. PBS Pro HA Complex

    b. Compute Nodes

    c. Queue Configuration

    d. Scheduling Policy

    e. Access Control List

    f. Job submission model

    2. How to submit Jobs

    3. User and Admin commands

  • PBS Pro Configuration at ANURAG

    Picture 1: PBS Pro complex configuration at ANURAG

  • PBS Pro HA Complex

    PBS Pro is configured in high availability mode.

    Failure of one host doesn't affect users' jobs; the secondary server (in passive mode) becomes active in case of primary failure.

    idc-master1 is the primary PBS Pro server.

    idc-master2 is the secondary PBS Pro server.

    The PBS Pro server and scheduler daemons run on whichever of these systems is active.
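
    As a rough sketch of what such a failover pair usually looks like at the configuration level (the values below are illustrative assumptions, not the actual ANURAG settings), /etc/pbs.conf on each host names the primary and secondary server:

    # /etc/pbs.conf (illustrative sketch only)
    PBS_SERVER=idc-master1
    PBS_PRIMARY=idc-master1
    PBS_SECONDARY=idc-master2
    PBS_START_SERVER=1
    PBS_START_SCHED=1
    PBS_START_MOM=0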

    PBS Pro Compute Nodes

    Compute nodes from cs1-cn1 to cs1-cn49 are configured as execution hosts.

    The PBS MoM daemon runs on these nodes. PBS MoM talks to the PBS Server, and vice versa, for user program execution, usage reporting, etc.

    Users' jobs are assigned one or more of these nodes based on the policy in force and on the user's resource request.

    Note: Nodes from cn1-cn5 have 32 cores and nodes from cn6-cn49 have 16 cores.
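
    For example, the configured core count of an execution host can be checked with pbsnodes (the node name below just follows the naming above; output fields can vary with the PBS Pro version):

    # show the full status of one execution host
    pbsnodes cs1-cn1
    # show only the configured CPU slots
    pbsnodes cs1-cn1 | grep resources_available.ncpus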

    Queue Configuration

    The following queues are configured:

    ilight Queue

    The ilight queue is intended for jobs that require very little CPU resource, so the nodes in this queue are oversubscribed.

    cs1-cn6 has 16 physical cores but is configured with 320 slots.

    iheavy Queue

    The iheavy queue is intended for jobs that require more CPU resource, but not 100% of a core. The nodes in this queue are therefore oversubscribed at 2 slots per physical core (each slot corresponds to roughly 50% of a core).

    Nodes cs1-cn7 to cs1-cn10 are assigned to this queue, with 128 slots per node.

    bnormal Queue

    This is the default queue.

    The rest of the nodes are assigned to this queue.

    The nodes in this queue are not oversubscribed: 1 slot per core.

    Jobs that require high CPU resources are submitted to this queue.
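
    A minimal sketch of how such a layout is typically expressed with qmgr follows; the queue and node names match the text above, but the exact attributes and values used at ANURAG are assumptions:

    # create and enable an execution queue (illustrative only)
    qmgr -c "create queue ilight queue_type=execution"
    qmgr -c "set queue ilight enabled=true"
    qmgr -c "set queue ilight started=true"
    # oversubscribe cs1-cn6 by advertising 320 CPU slots
    qmgr -c "set node cs1-cn6 resources_available.ncpus=320"
    # associate the node with the ilight queue
    qmgr -c "set node cs1-cn6 queue=ilight"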

    Scheduling Policy

    Fairshare is configured at ANURAG.

    Concept of fairshare:

    A fair method for ordering the start times of jobs, using resource usage history.

    A scheduling tool which allocates certain percentages of the system to specified users or collections of users.

    Ensures that jobs are run in the order of how deserving they are.

    Basic outline of how fairshare works:

    The scheduler collects usage from the server at every scheduling cycle.

    The resource whose usage is tracked for fairshare is ncpus*walltime.

    The scheduler chooses which fairshare entity is most deserving.

    The job to be run next is selected from the set of jobs belonging to the most deserving entity, then the next most deserving entity, and so on.

    Fairshare is configured to give equal shares to all users at ANURAG.

  • Example of fairshare model:
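
    Since the original figure is not reproduced here, the sketch below shows how an equal-share setup is commonly expressed in the PBS Pro scheduler configuration; the file locations, entity names and share values are illustrative assumptions, not the actual ANURAG settings:

    # $PBS_HOME/sched_priv/sched_config (relevant lines only)
    fair_share: true ALL
    fairshare_usage_res: ncpus*walltime
    fairshare_entity: euser
    unknown_shares: 10

    # $PBS_HOME/sched_priv/resource_group
    # <entity name> <unique id> <parent group> <shares>
    userA 60 root 10
    userB 61 root 10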

    Access Control List

    The queues are configured with a group-level access control list.

    Only users belonging to the VLSI group are authorized to submit jobs to the ilight, iheavy and bnormal queues.

    However, this can be reconfigured as needed.
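
    A hedged sketch of how such a group ACL is typically set with qmgr (the group name vlsi is an assumption based on the text above, and the same settings would be repeated for iheavy and bnormal):

    # enable group-based access control on the ilight queue
    qmgr -c "set queue ilight acl_group_enable=true"
    # allow only members of the vlsi group to submit jobs
    qmgr -c "set queue ilight acl_groups=vlsi"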

    User job submission model

    Terminal Servers:

    idc-tcserver1

    idc-tcserver2

    Terminal servers are installed with PBS Pro commands.

    Users at ANURAG connect to Terminal servers for job submission.

    The PBS Pro commands talk to the PBS Server for job submission, job query, job monitoring, etc.

  • User Commands

    How to Submit Jobs

    To use PBS, you create a batch job, usually just called a job, which you then hand off, or submit, to PBS. A batch job is a set of commands and/or applications you want to run on one or more execution machines, contained in a file or typed at the command line. You can include instructions which specify job characteristics such as the job name, and resource requirements such as memory, CPU time, etc., that your job needs. The job file can be a shell script under UNIX, a cmd batch file under Windows, a Python script, a Perl script, etc.

    For example, here is a simple PBS batch job file which requests one hour of time, 400MB of memory, 4 CPUs, and runs my_application:

    #!/bin/sh
    #PBS -l walltime=1:00:00
    #PBS -l mem=400mb,ncpus=4
    ./my_application

    To submit the job to PBS, you use the qsub command and give the job script as an argument to qsub. For example, to submit the script named my_script:

    qsub my_script

    Command       Purpose
    pbsnodes -a   Show status of nodes in cluster
    qsub          Command to submit a job
    qdel          Command to delete a job
    qstat         Show job, queue, and server status
    qstat -B      Briefly show PBS Server status
    qstat -Bf     Show detailed PBS Server status
    qstat -Q      Briefly list all queues
    qstat -Qf     Show detailed queue information
    tracejob      Extracts job info from log files
    xpbs          PBS graphical user interface
    xpbsmon       PBS graphical monitoring tool

  • qsub:

    The command qsub allows you to submit jobs to the batch system. qsub uses the following syntax:

    qsub [options] job_script [ argument ... ]

    where job_script represents the path to a script containing the commands to be run on the system using a shell.

    There are two ways to submit a job.

    Method 1:

    You may add the options directly to the qsub command, like:

    qsub -o pbs_out.dat -e pbs_err.dat job_script [ argument ... ]

    Method 2:

    The qsub options can also be written to the job description file job_script, which is then passed on to qsub on the command line:

    qsub job_script [ argument ... ]

    The contents of job_script may look like the following:

    #!/bin/bash

    #PBS -o pbs_out.dat

    #PBS -e pbs_err.dat

    ./your_commands

    Please note that any PBS directives after the first shell command are ignored by the system, so keep PBS directives at the beginning of your job file.

  • Description of the most commonly used options to qsub:

    Input/Output

    -o path                     standard output file
    -e path                     standard error file
    -j oe                       join standard error to standard output

    Notification

    -M email_address            notifications will be sent to this email address
    -m b|e|a|n                  notifications on the following events: begin, end,
                                abort, no mail (default). Do not forget to specify
                                an email address (with -M) if you want to receive
                                these notifications.

    Resources

    -l walltime=[hours:minutes:]seconds
                                requested real time; the default (maximum) depends
                                on the system and, if applicable, the chosen queue
    -l select=N:ncpus=NCPU      request N times NCPU slots (= CPU cores) for the
                                job (default for NCPU: 1)
    -l select=N:mem=size        request N times size bytes of memory for the job
    -W depend=afterok:job-id    start the job only if the job with job id job-id
                                has finished successfully (see the example after
                                this list)

    Other useful options

    -N name                     optional name of the job
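
    As an illustration of the -W depend option, the two-step submission below chains a hypothetical preprocessing script and a main script (both names are placeholders); qsub prints the job id of the first job, which is captured and reused:

    # submit the first job and capture its job id
    JOBID=$(qsub preprocess_script)
    # start the second job only after the first one has finished successfully
    qsub -W depend=afterok:$JOBID main_script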

    Sequential jobs

    qsub job_script

    where the contents of job_script may look like this:

    #!/bin/bash

    # Redirect output stream to this file.

    #PBS -o pbs_out.dat

    # Redirect error stream to this file.

    #PBS -e pbs_err.dat

    # Send status information to this email address.

    #PBS -M [email protected]

    # Send an e-mail when the job is done.

    #PBS -m e

    # Change to current working directory (directory where qsub was executed)

    cd $PBS_O_WORKDIR

    # For example, an additional script file to be executed in the current
    # working directory. In such a case, ensure that script.sh has
    # execution rights (chmod +x script.sh).

    ./script.sh

    Parallel MPI jobs

    qsub job_script

    where the contents of job_script may look like this:

    #!/bin/bash

    # Redirect output stream to this file.

    #PBS -o pbs_out.dat

    # Join the error stream to the output stream.

    #PBS -j oe

    # Send status information to this email address.

    #PBS -M [email protected]

    # Send me an e-mail when the job has finished.

    #PBS -m e

    # Reserve resources for 16 parallel MPI processes on one node
    # (mpiprocs determines how many entries appear in $PBS_NODEFILE)
    #PBS -l select=1:ncpus=16:mpiprocs=16

    # Change to current working directory (directory where qsub was executed)

    cd $PBS_O_WORKDIR

    NSLOTS=`cat $PBS_NODEFILE | wc -l`

    mpirun -np $NSLOTS ./your_mpi_executable [extra params]
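
    The number of entries in $PBS_NODEFILE, and hence the number of MPI ranks started above, follows the mpiprocs value in the select statement. As a sketch of a larger run (the node count here is arbitrary and must fit the queue in use), a two-node job with one rank per core could request:

    # 2 chunks (nodes), 16 cores and 16 MPI ranks per chunk = 32 ranks in total
    #PBS -l select=2:ncpus=16:mpiprocs=16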

  • Parallel OpenMP jobs

    qsub job_script

    where the contents of job_script may look like this:

    #!/bin/bash

    # Redirect output stream to this file.

    #PBS -o pbs_out.dat

    # Join the error stream to the output stream.

    #PBS -j oe

    # Send status information to this email address.

    #PBS -M [email protected]

    # Send me an e-mail when the job has finished.

    #PBS -m e

    # Reserve resources for 16 threads

    #PBS -l select=1:ncpus=16

    # Change to current working directory (directory where qsub was executed)

    cd $PBS_O_WORKDIR

    # OMP_NUM_THREADS is automatically set to 16 with the above select statement

    ./your_openmp_executable
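
    Whether OMP_NUM_THREADS is pre-set can vary with the PBS Pro version and its configuration, so a defensive fallback inside the job script does no harm; the value 16 simply mirrors the select statement above:

    # fall back to 16 threads if OMP_NUM_THREADS was not set by PBS
    export OMP_NUM_THREADS=${OMP_NUM_THREADS:-16}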

    Submitting interactive jobs

    The submission of interactive jobs is useful in situations where a job requires some sort of direct intervention. This is usually the case for X-Windows applications or in situations in which further processing depends on your interpretation of immediate results. A typical example of both cases is a graphical debugging session.

    Note: Interactive sessions are particularly helpful for getting acquainted with the system or when building and testing new programs.

    Starting an interactive session within PBS only requires specifying the option -I to the qsub command, so in the simplest case:

    qsub -I

    This will bring up a further Bash session on the system. To ensure X forwarding to the X-server display indicated by your current DISPLAY environment variable, add the option -v DISPLAY to your qsub command. All qsub options can be specified as usual on the command line, or they can be provided within an option file with

    qsub -I optionfile

    A valid option file might contain the following lines (only #PBS directives are parsed within this file):

    # Name your job

    #PBS -N myname

    # Export some of your current environment variables,

    #PBS -v var1[=val1],var2[=val2],...

    # e.g. your current DISPLAY variable for graphical applications

    #PBS -v DISPLAY

    # Reserve resources for 8 MPI processes

    #PBS -l select=8

    # Specify one hour runtime

    #PBS -l walltime=1:0:0

    Once your session has started, prepare your environment as needed, e.g. by loading all necessary modules, and then start your programs as usual. For parallel applications, refer to the job script examples for parallel MPI and parallel OpenMP batch jobs above.

    Note: Make sure to end your interactive sessions as soon as they are no longer needed!
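
    As a concrete illustration (the resource values are arbitrary), an interactive session with X forwarding for one hour on four cores could be requested with:

    qsub -I -v DISPLAY -l select=1:ncpus=4 -l walltime=1:00:00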

    Observing jobs (qstat)

    To get information about running or waiting jobs use

    qstat [options]

  • Useful qstat options:

    -u user     prints all jobs of a given user
    -f job-id   prints full information of the job with the given job-id. Note that
                information such as resources_used.ncpus refers to resources
                allocated to the job by PBS Pro, not actual resource consumption.
                Use top -u $USER to observe your program's behaviour in real time.
    -aw         prints more information in a wider display
    -help       prints all possible qstat options

    In case of pending jobs, you might get a hint on when your job will be started by executing

    qstat -awT

    In this case the runtime of waiting jobs is replaced with their estimated start time. Don't be alarmed, though; this only provides a worst-case estimate.

    Deleting a job (qdel)

    To delete a job with the job identifier job-id, execute

    qdel job-id
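
    For example, the job identifier can first be looked up with qstat and then passed to qdel (the job id below is purely hypothetical):

    # list your own jobs; the job identifier is shown in the first column
    qstat -u $USER
    # delete the job, e.g. a job with id 12345.idc-master1
    qdel 12345.idc-master1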

    Input/output staging

    The two streams stdout and stderr are directed to a special directory (naming: pbs.$PBS_JOBID.x8z) subordinate to your home directory and are copied to the job's working directory only at the end of the job.

    Note: If your application writes large amounts of data to stdout and/or stderr, please redirect this output directly to a location within your $SCRATCH space. Otherwise your jobs might be terminated for exceeding your $HOME quota. You can redirect both the stdout and stderr output of your application my_executable to a file job_output.dat within your current working directory with:

    ./my_executable >job_output.dat 2>&1

  • PBS umask setting

    By default, PBS jobs run under the uncommon umask setting 0077 (i.e. all read/write/execute permissions for group members and others are removed). If you need another umask setting for your job runs (e.g. if you want to share information with group members), please specify your desired setting explicitly within your job scripts. For example

    umask 0022

    sets the usual Linux bit mask, removing write permissions for group and others.
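
    A minimal sketch of where such a umask call would sit in a job script (the file names are placeholders):

    #!/bin/bash
    #PBS -o pbs_out.dat
    #PBS -j oe
    # relax the default 0077 umask so group members can read the results
    umask 0022
    cd $PBS_O_WORKDIR
    ./your_commands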

    For more information, please refer to the PBS Professional documentation:

    Administrator's Guide

    User Guide

    Reference Guide

    These can be downloaded from the User Area.