Content
1. Basic overview of Linux
2. The Linux ‘terminal’ and structure of commands
3. Everyday Linux commands and working with files and directories
4. The Linux survival ‘cheat-sheet’
5. Simple exercises
What is Linux?
• Simply an operating system, e.g.
– Windows
– Mac OS X
• Controls a computer system’s resources
• Allows us to use those resources while making it difficult enough to ‘break’ things
Why Linux? (Computing Perspective)
• FREE - can be downloaded in its entirety
• Portable to almost any hardware platform
• Made to keep running for years
• Secure, versatile, scalable
• Very few bugs and loopholes (if any) survive more than a few hours
In this case, free = better quality
Why Linux? (Bioinformatics Perspective)
• Software developed first for Linux/Unix
• Convenient built-in system utilities that can be strung together for doing complex analyses
• Security + Scalability
• The ‘command line’ – best bioinformatics software first released without fancy interfaces
• Control + Flexibility
• Automation - easy to ‘pipeline’ processes
The Linux Filesystem
Just like with any window-based system, you MUST make sub-directories in your home directory to organise your work
The ‘commandline’
• One of 2 user interfaces (other is the GUI)
• Textual ‘shell’ for issuing commands to the system and for viewing/capturing outputs
• Very powerful way of getting exactly what you want
• Working remotely on a server is typically via a command line shell
• Bit of a learning curve, but once it ‘clicks’ you rarely want to click anymore
Commands
• General structure of Linux commands:
$ command -options targets
• Can automatically save to a file:
$ command -options targets > output_file
• Can let multiple commands ‘play tag’:
$ command -options targets | command2 | command3 > output_file
NOTES:
• Spaces are NOT optional
• RARELY do we view or grab things from non-text formatted files
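The three command forms above can be tried out with a small sketch like the one below. It works in a throwaway directory, and the file names and gene symbols are made up for illustration only.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# 1. Plain command: the output is written to the screen.
printf 'BRCA2\nTP53\nBRCA2\nEGFR\n'

# 2. Redirect: the same output is saved to a file instead.
printf 'BRCA2\nTP53\nBRCA2\nEGFR\n' > genes.txt

# 3. Pipeline: sort the lines, drop the duplicates, save the result.
sort genes.txt | uniq > unique_genes.txt

# Look at what was saved.
cat unique_genes.txt
```

Note that `>` overwrites output_file without asking, so it is worth checking the target name before pressing enter.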
‘pwd’ & ‘ls’
• In pure command line mode, you can’t see where you are in the file system.
• pwd prints your current working directory
• ls lists the contents of the working directory
– it is the equivalent of double-clicking a folder to see what is inside
rm
• removes a file from the computer
• Typing rm myfile removes the file called ‘myfile’
• NOTE: there is no ‘Trash’ or ‘Recycle Bin’, so once a file is gone, it’s gone for good.
• Very rarely will you need to do rm * , which deletes ALL files in the current folder, so always try to convince yourself NOT to
Getting information about directories
• pwd (print working directory)
• ls (list directory contents)
• ls -l (long listing of directory contents)
• ls -lh (file size units: KB, MB, GB, etc)
• ls -lt (sort by modification time, newest first)
• ls -ltr (same as above, but reversed)
• ‘Wildcards’ allowed, eg. ls -ltrh *.txt
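As a rough illustration of the listing options above, the sketch below creates three small files in a throwaway directory and lists them in different ways. The file names are hypothetical.

```shell
# Work in a throwaway directory with a few made-up files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'first\n' > notes.txt
printf 'second file, a bit longer\n' > results.txt
printf 'not a .txt file\n' > reads.fastq

pwd               # where am I?
ls                # names only
ls -l             # long listing: permissions, owner, size, date, name
ls -lh            # long listing with human-readable sizes
ls -ltr           # sorted by modification time, newest at the bottom
ls -ltrh *.txt    # the same, restricted to files ending in .txt
```

Putting the newest file at the bottom (`-ltr`) is handy in crowded directories, because the most recent output ends up right above your prompt.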
Getting basic information about files
• wc counts the number of characters, words and lines in a file
• If used without extra options it reports all 3
• For specific parts only, use:
– wc -c filename (bytes, which equals characters for plain ASCII text)
– wc -w filename (words)
– wc -l filename (lines)
(We’ll see examples in the practical)
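Ahead of the practical, here is a minimal sketch of wc on a two-line file. The file name sample.txt and its contents are made up for this example.

```shell
# Work in a throwaway directory with a small made-up file.
workdir=$(mktemp -d)
cd "$workdir"
printf 'one two three\nfour five\n' > sample.txt

# All three counts at once: lines, words, bytes, then the file name.
wc sample.txt

# One count at a time. Reading via '<' makes wc print only the number.
lines=$(wc -l < sample.txt)
words=$(wc -w < sample.txt)
bytes=$(wc -c < sample.txt)
echo "lines=$lines words=$words bytes=$bytes"
```

The file has 2 lines, 5 words and 24 bytes (each newline counts as one byte).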
Creating, renaming, moving and deleting directories
• mkdir directory (creates a new directory)
• mv directory_name new_name
– renames if new_name doesn’t exist
– moves directory into new_name if new_name exists (NB: Will not ask first)
• rmdir directory_name (deletes directory ONLY if empty)
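The two behaviours of mv, and the refusal of rmdir to delete a non-empty directory, can be seen in a sketch like this one. The directory names are hypothetical and everything happens in a throwaway location.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"

mkdir project            # create a new directory
mv project analysis      # 'analysis' does not exist yet, so this RENAMES

mkdir archive
mv analysis archive      # 'archive' exists, so 'analysis' MOVES inside it
ls archive               # shows: analysis

# rmdir only removes empty directories; 'archive' still contains 'analysis'.
rmdir archive || echo "archive is not empty, so rmdir refuses"

rmdir archive/analysis   # empty, so it is removed
rmdir archive            # now empty too
```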
Navigating the directory structure
• ‘/’ is the root of the directory tree
• cd directory (changes your current working directory to the new directory)
– cd / takes you to the root directory
– cd takes you home
– cd .. takes you one step ‘up’ (closer to root)
– cd full-path-to-directory will take you to that directory regardless of where you currently are
• eg. cd /home/john/music/jack_parow/
Navigation tips
• Use ‘pwd’ a LOT to get your bearings
• A plain ‘cd’ will always take you home
• ‘cd -’ (minus) takes you to the previous directory you were working in
• Use ‘cd ..’ a lot to become used to navigating from memory
• eg. cd ../../microarray_data/affymetrix/
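A short practice run of these navigation commands might look like the sketch below. It builds a small, made-up directory tree in a throwaway location (mkdir -p creates the parent directories as needed) and then walks around in it.

```shell
# Build a small practice tree in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p music/jack_parow microarray_data/affymetrix

cd music/jack_parow                   # two levels down in one step
pwd                                   # get your bearings

cd ../../microarray_data/affymetrix   # two levels up, then down another branch
pwd

cd -                                  # back to the previous directory (jack_parow)
cd ..                                 # one level up: now in 'music'
pwd

# A plain 'cd' at this point would take you to your home directory.
```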
Copying
• cp source-file(s) destination, where destination is:
– another directory, eg. copying files from an archive to a working directory
– a new filename, eg. cp mainfile workingfile
• Copying entire directories also possible
– cp -fr exome1/ exome1_Mar2012/ (identical)
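The three kinds of copy can be sketched as follows. The files and their contents are invented for the example, and the target directory exome1_Mar2012 is assumed not to exist beforehand (if it did, the copy would be placed inside it).

```shell
# Work in a throwaway directory with made-up files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir exome1 working
printf 'chr1\t100\n' > exome1/variants.txt
printf 'master copy\n' > mainfile

cp mainfile workingfile        # copy to a new filename
cp mainfile working/           # copy into an existing directory
cp -r exome1 exome1_Mar2012    # -r: copy a directory and everything in it

ls exome1_Mar2012              # shows: variants.txt
```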
Working with files
• Rename: mv a.fastq sample_1_1.fq
• Moving: mv sample_1_1.fq ngs/raw/
• Deleting: rm a.fastq.backup
• Wildcards work (but be careful!)
– rm * (removes every file in a directory)
– rm *.backup (only operates on files ending in .backup)
– mv *.fastq.backup ngs/backups/
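One way to stay careful with wildcards is to preview what they match before running the real command. The sketch below does this with echo, using empty, made-up files in a throwaway directory (touch simply creates empty files).

```shell
# Work in a throwaway directory with empty, made-up files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p ngs/raw ngs/backups
touch a.fastq a.fastq.backup b.fastq.backup notes.txt.backup

mv a.fastq sample_1_1.fq          # rename
mv sample_1_1.fq ngs/raw/         # move into a directory

# Preview: echo prints the command with the wildcard expanded,
# without actually running it.
echo mv *.fastq.backup ngs/backups/
mv *.fastq.backup ngs/backups/    # now do it for real

echo rm *.backup                  # preview: only notes.txt.backup is left
rm *.backup

ls                                # only the 'ngs' directory remains here
```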
Viewing (text) file contents
• cat – Short for concatenate
– displays contents of file(s) one after the other
– writes file contents to terminal (screen)
• ‘Problem’: if files are bigger than 1 screen’s worth, text just goes flying past and you don’t get to see anything useful
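As a small sketch of cat living up to its name, the example below prints one file, then two files one after the other, and finally saves the concatenation to a new file. The FASTA files and sequences are made up.

```shell
# Work in a throwaway directory with two tiny, made-up FASTA files.
workdir=$(mktemp -d)
cd "$workdir"
printf '>seq1\nACGT\n' > 1.fasta
printf '>seq2\nTTGA\n' > 2.fasta

cat 1.fasta                        # one file to the screen
cat 1.fasta 2.fasta                # both files, one after the other
cat 1.fasta 2.fasta > both.fasta   # same output, saved to a file

wc -l both.fasta                   # 2 lines + 2 lines
```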
Viewing (text) file contents (2)
• more
– like a cat with pause
– still writes to the screen, but in segments
• more 1.fasta
– has a pattern finding facility, which enables you to jump to the first occurrence of the search string
• sequence: more 1.fasta <enter>, /string <enter>, n for next hit
Viewing (text) file contents (3)
• nano
– simple terminal-based text viewer/editor
– basic scrolling and search functions
– USAGE: nano 1.fasta
– CTRL-W to find
– CTRL-O to save
– CTRL-X to exit
• CAREFUL working with large files…
Cheat sheets
• http://files.fosswire.com/2007/08/fwunixref.pdf
• This covers everyday Linux commands, but…
• The best cheatsheet is the one you create and keep adding to as you learn new ‘tricks’
3 good options for using Linux
1. Existing Windows computer:
– Virtual machine
– Dual boot
– ‘Linux on a stick’ (See pendrivelinux.com)
2. Dedicated desktop (commodity hardware)
3. Remote server e.g. get an account at your institute or Center for High Performance Computing: www.chpc.ac.za
Accessing a remote server
• Assumes you have an account on said machine
• Requires a ‘client’ to talk to remote machine
• Linux and Mac – simply use terminal
• Windows – download a special program
– PuTTY is a good option: www.chiark.greenend.org.uk/~sgtatham/putty/
ssh
• Network protocol for secure data communication between two computers, connecting them via a secure channel over an insecure network
• Connection is between an SSH server and an SSH client
• Perfect for shell account access
• Replaces telnet (never use) which sends passwords in plaintext that can be intercepted
Example ssh session
sapphire-5:~ junaid$ ssh junaid@rocket
junaid@rocket's password:
Linux rocket.sanbi.ac.za 2.6.35-22-server #35-Ubuntu SMP Sat Oct 16 22:02:33 UTC 2010 x86_64 GNU/Linux
Welcome to the Ubuntu Server!
Last login: Mon May 27 10:25:02 2013 from 172.20.1.223
junaid@rocket:~$ ls
20Apr2011-Afr-vs-Rest.zip gene.rc rvm-installer bin/ KGP.bed target_intervals.bed.gz
Afr-vs-All-1000G-coding-20Apr2011.txt CE Certification Policy.doc.PDF snpEff tmp
Afr-vs-All-HapMap3-coding-20Apr2011.txt tophat_sample2
junaid@rocket:~$ more Afr-vs-All-1000G-coding-20Apr2011.txt
1 13301 13301 C T rs180734498 DDX11L1
1 69510 69510 G A rs75062661 OR4F5
1 137824 137824 G A rs147252685 LOC729737
1 714018 714018 A G rs114983708 LOC100288069
1 761731 761731 T C rs2286139 LINC00115
1 761751 761751 T C rs1057213 LINC00115
1 761760 761760 A G rs143121524 LINC00115
1 761763 761763 G A rs144708130 LINC00115
1 762021 762021 T C rs186393233 LINC00115
1 762272 762272 A G rs3115849 LINC00115
Moving files across servers
• Can be between 1) your personal machine and a server or 2) two remote servers
1. Two methods
• Commandline (personally preferred) - ftp or sftp
• Graphical client, eg.
– FileZilla: https://filezilla-project.org/ or
– WinSCP: http://winscp.net/
2. Commandline – 2 step
• ssh into remote server
• ftp or sftp into second server and initiate transfers
Exercise
(Use your notes and ‘cheatsheet’)
1. Open a terminal
2. Check that you are in your home directory
3. Create a directory called ‘exercises’
4. Can you move this directory to the same level as your home directory?
5. Copy testfiles.zip into the new folder
6. Go into exercises (change working directory) and uncompress the file called ‘testfiles.zip’
– $ unzip testfiles.zip
7. Do a directory listing to see what happened
8. Notice a new directory? Go into it.
9. Do the following and note the differences:
– cat gene_expression.diff.txt
– more gene_expression.diff.txt (hit ENTER a few times, then hit the SPACEBAR)
– nano gene_expression.diff.txt
NOTE: copy testfiles.zip into ‘exercises’
10. Concatenate all the files as follows:
• cat *.txt > joined.txt
11. Make a copy of joined.txt and name it anything you want
12. Do a ‘long’ directory listing to compare sizes of old files and new file. Do the numbers add up?
13. Go back one directory level (cd ..)
14. Now copy entire testdata directory as newdata