Cluster Computing Applications for Bioinformatics • Thurs., Sept. 20, 2007 • process management • shell scripting • Sun Grid Engine • running parallel programs


Page 1: Cluster Computing Applications for Bioinformatics

Cluster Computing Applications for Bioinformatics

• Thurs., Sept. 20, 2007

• process management

• shell scripting

• Sun Grid Engine

• running parallel programs


Accessing the Cluster

• ssh username@server

– -X to enable X forwarding

• ssh compute-#-# to access specific node

• qrsh to access the least busy node

• cluster-fork command to run on every node
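
The access commands above can be sketched as a small dry-run script. The `run` helper, the `DRY_RUN` switch, and the node name `compute-0-3` are illustrative assumptions; `username@server` is the placeholder used on the slide, and `uptime` is just an example command for `cluster-fork`.

```shell
#!/bin/sh
# Dry-run sketch of the cluster access commands.
# DRY_RUN, run(), and the node name compute-0-3 are hypothetical.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run ssh -X username@server      # log in to the head node with X forwarding
run ssh compute-0-3             # hop to one specific compute node
run qrsh                        # let SGE pick the least busy node
run cluster-fork uptime         # run uptime on every node
```

Setting `DRY_RUN=0` would execute the commands instead of printing them.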


Managing Processes

• ps – list your running processes

– -f : full-format listing (UID, PID, PPID, start time, command)

– -e : list every process on the system, not just your own

• top – current top processes by CPU and memory use

• kill – terminate a process by process ID (PID)

– killall to kill by program name

• command & – run in background

– jobs – list background tasks

– bg – resume a stopped job in the background

• nice / renice – set the scheduling priority of a new / already running process
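
A minimal walk-through of these commands, using `sleep 300` as a harmless stand-in for a long-running program. The niceness values 10 and 15 are arbitrary choices for illustration, and `jobs` is the shell builtin that lists background tasks.

```shell
#!/bin/sh
# Start a low-priority background job, inspect it, then terminate it.
nice -n 10 sleep 300 &          # "&" puts the command in the background
pid=$!                          # PID of the most recent background job

jobs                            # list this shell's background jobs
ps -f -p "$pid" || true         # full-format listing for that one PID
renice -n 15 -p "$pid" || true  # lower its priority further

kill "$pid"                     # send SIGTERM to the process
wait "$pid" 2>/dev/null || true # reap it; ignore the non-zero exit status
```

`killall sleep` would have the same effect here, but it would also terminate any other `sleep` processes you own.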


The Shell

• Unix command interpreter

• bash – Bourne Again Shell

• .bashrc and .bash_profile

– settings for your shell environment

cd ~

ls -a

vi .bash_profile

echo $PATH
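
As a sketch of the kind of setting that goes in `~/.bash_profile`, the lines below put `~/bin` on the search path so that personal scripts can be run by name. This is assumed content, not the profile used in the course.

```shell
# Sketch of lines one might add to ~/.bash_profile (assumed content).
# Prepend ~/bin to PATH once, so personal scripts can be run by name.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                     # already present, do nothing
    *) PATH="$HOME/bin:$PATH" ;;
esac
export PATH

echo "$PATH"                                # confirm the new search path
```

The `case` guard keeps the entry from being added again each time the file is sourced.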


Shell Scripting

• Automate common tasks

– e.g. create the directory structure required for sequence assembly

mkdir ~/bin

cd /share/bio/examples/

cp makeseqdir ~/bin

cd TFL

makeseqdir
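
The contents of `makeseqdir` are not shown in the slides. The sketch below is a hypothetical stand-in that assumes the three-directory layout used by phred/phrap/consed (`chromat_dir`, `phd_dir`, `edit_dir`); the function name `make_seq_dirs` and the scratch directory are also assumptions.

```shell
#!/bin/sh
# Hypothetical stand-in for makeseqdir; the real script is not shown.
# Assumes the phred/phrap/consed layout of three sibling directories.
make_seq_dirs() {
    project_dir=${1:-.}                     # default: current directory
    for sub in chromat_dir phd_dir edit_dir; do
        mkdir -p "$project_dir/$sub" || return 1
    done
    echo "created assembly directories in $project_dir"
}

# Example: build the layout in a scratch directory.
demo_dir=$(mktemp -d)
make_seq_dirs "$demo_dir"
ls "$demo_dir"
```

Saved as an executable file in `~/bin`, a script like this can be run from any project directory once `~/bin` is on `PATH`.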


Distributed Shell Scripts

• Preface CPU-intensive commands with qrsh -cwd

• qtcsh – shell that does this automatically based on the ~/.qtask file

– Does not work

cd /share/bio/examples/

cp assemble ~/bin

assemble
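
The `assemble` script itself is not reproduced in the slides. A minimal sketch of the pattern, assuming a hypothetical helper `run_on_grid` that prefixes a command with `qrsh -cwd` when SGE is installed and otherwise runs it locally:

```shell
#!/bin/sh
# Sketch of a distributed script; run_on_grid is a hypothetical helper.
# With SGE installed, heavy commands go to a compute node via qrsh -cwd
# (-cwd keeps the current working directory). Without qrsh they run locally.
run_on_grid() {
    if command -v qrsh >/dev/null 2>&1; then
        qrsh -cwd "$@"
    else
        "$@"
    fi
}

# Cheap bookkeeping stays on the local machine.
echo "starting in $(pwd)"

# A CPU-intensive step would be prefixed with the helper. Here uname -n
# stands in for it and reports which machine actually ran the command.
run_on_grid uname -n
```

Because `-cwd` relies on the same path existing on the compute node, this pattern assumes the working directory is on a shared filesystem.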


Sun Grid Engine - SGE

• Job queue and load balancing

• commands:

– qrsh / qtcsh

– qstat -f : show status of jobs / queues

– qdel : delete a job from the queue

– qmon : graphical interface

– qsub : submit job
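
A minimal sketch of a batch job for `qsub`. The file name `hello_job.sh`, the job name `hello`, and the log name `hello.log` are illustrative. Lines starting with `#$` are SGE directives: `-S` selects the interpreter, `-N` names the job, `-cwd` runs it in the submission directory, `-j y` merges stderr into stdout, and `-o` names the output file.

```shell
#!/bin/sh
# Write a minimal SGE job script, then submit it if qsub is available.
# The file name hello_job.sh and job name "hello" are illustrative.
cat > hello_job.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -N hello
#$ -cwd
#$ -j y
#$ -o hello.log
echo "running on $(uname -n)"
EOF

if command -v qsub >/dev/null 2>&1; then
    qsub hello_job.sh       # prints the job ID on success
    qstat -f                # show queues and the jobs running in them
    # qdel <job-id>         # would remove the job from the queue
else
    echo "qsub not found; hello_job.sh written but not submitted"
fi
```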


Running Parallel Programs

• MPI – Message Passing Interface

• must be launched with mpirun or as a script with qsub

• mpiblast – parallel version of BLAST

– modify ~/.ncbirc

– first run mpiformatdb with --nfrags=n (number of database fragments)

cd /share/bio/examples

cp .ncbirc ~

cp mpiblast.sh ~

cd ~

qsub -pe mpich 8 mpiblast.sh
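
The contents of `mpiblast.sh` are not shown in the slides. The sketch below generates a hypothetical equivalent under several assumptions: a nucleotide database `mydb.fasta`, a query file `query.fasta`, an output file `results.txt`, and an `mpich` parallel environment whose start script writes the host list to `$TMPDIR/machines`. The fragment count of 8 is illustrative only.

```shell
#!/bin/sh
# Generate a hypothetical mpiblast job script (not the course's own file).
# Assumed names: mydb.fasta (database), query.fasta, results.txt.
cat > mpiblast_sketch.sh <<'EOF'
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -j y
# NSLOTS is set by SGE to the slot count requested with "-pe mpich 8".
# The machines file is assumed to be written by the mpich PE start script.
mpirun -np "$NSLOTS" -machinefile "$TMPDIR/machines" \
    mpiblast -p blastn -d mydb.fasta -i query.fasta -o results.txt
EOF

# One-time preparation and submission, shown as comments because they
# need mpiBLAST and SGE to be installed:
#   mpiformatdb --nfrags=8 -i mydb.fasta -p F   # split DB into 8 fragments
#   qsub -pe mpich 8 mpiblast_sketch.sh
sh -n mpiblast_sketch.sh && echo "mpiblast_sketch.sh passes a syntax check"
```

Taking the process count from `$NSLOTS` instead of hard-coding it keeps the script in step with whatever slot count is passed to `qsub -pe`.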