Applied CyberInfrastructure Concepts
ISTA 420/520 Fall 2013

Nirav Merchant (nirav@email.arizona.edu), Bio Computing & iPlant Collaborative
Eric Lyons ([email protected]), Plant Sciences & iPlant Collaborative
University of Arizona
http://goo.gl/p4j3m or https://sites.google.com/site/appliedciconcepts/

Will Computers Crash Genomics? Science Vol 331 Feb 2011


Tasks for today

• Managing your VM
• Add user, permission, security considerations etc.
• Understanding where the files are
• Terminal, editors etc.
• Shell and scripting
• Start building your “Data Science ToolBox”


Step #1 for Big Data Toolkit

Command line competency


Permissions

• Why do you need them?
• What is an ACL (Access Control List)?
• The UNIX model of permissions (next slides are from Greg Wilson at http://software-carpentry.org)
• Path statement and finding things
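For the "path statement" item, a minimal sketch of how the shell's search path can be inspected. It assumes a POSIX-style shell; `ls` is used only as an example command to look up.

```shell
# PATH is a colon-separated list of directories the shell searches for commands.
# Print one directory per line.
echo "$PATH" | tr ':' '\n'

# command -v reports what the shell would run for a given name.
ls_location=$(command -v ls)
echo "ls resolves to: $ls_location"
```

Directories are searched in the order listed, so an entry earlier in PATH takes precedence over a later one.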


user

Has unique user name and user ID

User name is text: "imhotep", "larry", "vlad", …

User ID is numeric (easier for computer to store)
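A short sketch of how to see both forms of identity for the current account, using the standard `id` utility. The fallback text "unknown" is an illustrative choice for systems where the numeric ID has no name entry.

```shell
# Text user name of the current account (falls back if no name is registered).
user_name=$(id -un 2>/dev/null || echo "unknown")

# Numeric user ID of the same account.
user_id=$(id -u)

echo "user name: $user_name"
echo "user ID:   $user_id"
```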


user group

Has unique group name and group ID

User can belong to zero or more groups

List is usually stored in /etc/group
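Group membership can be checked the same way. This is a sketch assuming a typical Linux or Unix system; `/etc/group` is only read if it exists, since some systems keep group data elsewhere.

```shell
# Numeric IDs of every group the current user belongs to.
my_gids=$(id -G)
echo "group IDs: $my_gids"

# Numeric ID of the user's primary group.
primary_gid=$(id -g)
echo "primary group ID: $primary_gid"

# /etc/group holds one line per group in the form name:password:GID:members.
# Show the first three entries when the file is readable.
if [ -r /etc/group ]; then
    head -n 3 /etc/group
fi
```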


user group all

Everyone else


Each file has user and group IDs


          user   group   all
read       ✔      ✔      ✗
write      ✔      ✗      ✗
execute    ✗      ✗      ✗

File's owner can read and write it

Others in group can read


That's all
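The permission table above can be reproduced on a scratch file. This is a minimal sketch; the file name `notes.txt` and the temporary directory are hypothetical and exist only for the demonstration.

```shell
# Create a scratch file in a fresh temporary directory.
workdir=$(mktemp -d)
demo_file="$workdir/notes.txt"
echo "sample data" > "$demo_file"

# u=rw: owner can read and write; g=r: group can read; o=: everyone else gets nothing.
chmod u=rw,g=r,o= "$demo_file"

# The first ten characters of ls -l show the file type and the nine permission bits.
perms=$(ls -l "$demo_file" | cut -c1-10)
echo "$perms"   # prints -rw-r-----
```

The same setting can be written in octal as `chmod 640`, where each digit sums read (4), write (2), and execute (1) for user, group, and all.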


Where are my files?

• Understanding layout of data
  – Home
  – Root
  – Tmp
• Permissions
• Storage space and planning for it
• Managing runaway items (more in next class)
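A sketch of commands for locating these areas and checking storage space. It takes "Root" to mean the top of the filesystem (`/`) and assumes `/tmp` is the temporary directory unless `TMPDIR` says otherwise; both are assumptions about the course VM.

```shell
# The three locations named on the slide.
tmp_dir="${TMPDIR:-/tmp}"
echo "home: ${HOME:-not set}"
echo "root: /"
echo "tmp:  $tmp_dir"

# Free and used space on the filesystem that holds /.
root_usage=$(df -h / | tail -n 1)
echo "$root_usage"

# Total size of one directory tree; unreadable subdirectories are skipped quietly.
du -sh "$tmp_dir" 2>/dev/null || true
```

Running `df` before a large download and `du` after it is a simple way to plan for space.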


Security considerations

• Update your OS (how can you do that?)
• Why you should NEVER run as root (how do I add a user?)
• Password and keys (and dual factor)
• Ssh foo
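As a small illustration of the "never run as root" point, the sketch below checks whether the current session is root, relying on the fact that root has user ID 0. The variable name `session_kind` and the messages are illustrative. Commands for updating the OS and adding users differ between distributions and are not shown.

```shell
# Root always has numeric user ID 0.
if [ "$(id -u)" -eq 0 ]; then
    session_kind="root"
    echo "Running as root: switch to a regular account for everyday work."
else
    session_kind="regular"
    echo "Running as a regular user (ID $(id -u))."
fi
```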


What is Shell?

• Shell is
  – Command Interpreter that turns text that you type (at the command line) into actions
  – User Interface: takes the command from the user
• Programming Shell can do
  – Customization of a Unix session
  – Scripting
  – Many, many automation steps
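To make "scripting" and "automation" concrete, here is a minimal sketch of a loop that repeats one step over several files. The sample files are created by the script itself and their names are hypothetical.

```shell
# Set up two small sample files in a temporary directory.
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/sample1.txt"
printf 'd\ne\n'    > "$workdir/sample2.txt"

# Automate a repeated step: count the lines in every .txt file and keep a total.
total=0
for f in "$workdir"/*.txt; do
    n=$(wc -l < "$f")
    echo "$(basename "$f"): $n lines"
    total=$((total + n))
done
echo "total: $total lines"
```

The same loop works unchanged for two files or two thousand, which is the main reason to script a task instead of typing it.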


Customization of a Session

• Each shell supports some customization.
  – User prompt
  – Where to find mail
  – Shortcuts (alias)
• The customization takes place in startup files
  – Startup files are read by the shell when it starts up
  – The startup files can differ for different shells
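The three customizations listed above might look like this in a bash startup file. This is a sketch, not the course's actual configuration: the prompt format, the `/var/mail` location, and the `ll` alias are all illustrative assumptions.

```shell
# Lines like these would typically live in a startup file such as ~/.bashrc.

# User prompt: show user name, host name, and current directory (bash escapes).
PS1='\u@\h:\w\$ '

# Where to find mail: MAIL names the mailbox file the shell checks.
# /var/mail is an assumed location; it varies between systems.
MAIL="/var/mail/${USER:-unknown}"

# Shortcut: "ll" becomes a long listing that includes hidden files.
alias ll='ls -la'
```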


Popular Shells

sh          Bourne Shell
ksh         Korn Shell
csh, tcsh   C Shell (for this course)
bash        Bourne-Again Shell


Flavors of Unix Shells

• Two main flavors of Unix Shells
  – Bourne (or Standard Shell): sh, ksh, bash, zsh
    • Fast
    • $ for command prompt
  – C shell: csh, tcsh
    • better for user customization and scripting
    • %, > for command prompt
• To check shell:
  – % echo $SHELL (shell is a pre-defined variable)
• To switch shell:
  – % exec shellname (e.g., % exec bash)
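One caveat worth a sketch: `SHELL` normally records the login shell, which can differ from the shell currently running after an `exec`. The example below prints both, assuming a `ps` that accepts `-p` and `-o comm=`; if it does not, the fallback text "unknown" is used.

```shell
# SHELL holds the user's login shell.
echo "login shell:   ${SHELL:-not set}"

# ps reports the name of the process that is interpreting these commands right now.
current_shell=$(ps -p $$ -o comm= 2>/dev/null || echo "unknown")
echo "current shell: $current_shell"
```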


Startup files and why you should care

bash:
/etc/profile (out-of-the-box login shell settings)
/etc/bash.bashrc (out-of-the-box non-login settings)
/etc/bash.bashrc.local (global non-login settings)
~/.bash_profile (login shell user customization)
~/.bashrc (non-login shell user customization)
~/.bash_logout (user exits from interactive login shell)

http://cli.learncodethehardway.org/bash_cheat_sheet.pdf
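Not every system ships every one of these files. The sketch below reports which of the six listed startup files are present on the current machine; the counter `checked` is only there to show how many paths were examined.

```shell
# Check each startup file from the list above and report whether it exists.
checked=0
for f in /etc/profile /etc/bash.bashrc /etc/bash.bashrc.local \
         "$HOME/.bash_profile" "$HOME/.bashrc" "$HOME/.bash_logout"; do
    if [ -f "$f" ]; then
        echo "present: $f"
    else
        echo "absent:  $f"
    fi
    checked=$((checked + 1))
done
echo "checked $checked files"
```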


Some Special Keys

• How do you invoke tcsh?
• Ctrl-U = Delete everything on the command line
• Ctrl-A = Move cursor to the front
• Ctrl-E = Move cursor to the end
• Ctrl-P = Set the current command line to the previous command
• Ctrl-N = Set the current command line to the next command
• TAB = Filename completion


Preview pieces of toolbox

• http://datascienceatthecommandline.com/
• We will work through Step 5 and go straight to commands


Next class

Preparing to play with your data set
  – Can you download a piece of it?
Learn about space and process management
Introduction to shell scripting and automation
Start building your Big Data command line tool kit