CCPR Computing Services Workshop 1: Programming Basics, Unix, Remote Computing October 13, 2004

Part 1: Programming Basics

Motivation
Before you start coding
Programming Conventions
  Documentation
  Names, comments
  Directory Structure
Basic Constructs
Miscellaneous (debugging, cross-checking results)

Motivation

Facilitate research
Save time
Cleaner code
Easily share programs
Basic Concepts: MUCH better programming

Programming Conventions

What are conventions?
  Examples
Who cares?
  Readability of code
  Organization
  Transferring code to others
Apply conventions consistently to:
  variable names and function names
  comments
  directory structure

Before you start coding…

THINK
WRITE down the problem
WRITE down the algorithm in English (not code)
  Modularity
  Comments
  Create test (if reasonable)
TRANSLATE one section to code
TEST the SECTION thoroughly
Translate/Test next section, etc.
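
The steps above can be sketched in Python. This is only an illustration: the problem, the function name mean_age, and the sample values are hypothetical and not part of the workshop material. The algorithm is written first as plain-English comments, then one section is translated to code and tested before moving on.

```python
# Problem: report the mean age of respondents, ignoring missing values.
#
# Algorithm in English (written before any code):
#   1. Drop missing ages (recorded here as None).
#   2. Add up the remaining ages.
#   3. Divide by the number of remaining ages.


def mean_age(ages):
    """Return the mean of the non-missing ages, or None if all are missing."""
    valid_ages = [age for age in ages if age is not None]  # step 1
    if not valid_ages:
        return None
    return sum(valid_ages) / len(valid_ages)  # steps 2 and 3


# Test this section thoroughly before translating the next one.
assert mean_age([20, None, 40]) == 30
assert mean_age([None, None]) is None
```

Keeping the English steps as comments means the code and its intent can be compared line by line when something goes wrong.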

Documentation - File Header

#Laura Piersol ([email protected])
#HRS project
#/u/socio/laurapiersol/HRS/
#October 11, 2004
#Python version 2.4
#Stata version 8
#Purpose: Create and merge two datasets in Stata,
#         then convert data to SAS
#Input programs:
#    HRS/staprog/H2002.do,
#    HRS/staprog/x2002.do,
#    HRS/staprog/mergeFiles.do
#Output:
#    HRS/stalog/H2002.log,
#    HRS/stalog/x2002.log,
#    HRS/stalog/mergeFiles.log
#    HRS/stadata/Hx2002.dta
#    HRS/sasdata/Hx2002.sas
#Special instructions: Check log files for errors
#    check for duplicates upon new data release

File header includes:
  Name
  Project
  Project location
  Date
  Version of software
  Purpose
  Inputs
  Outputs
  Special Instructions

Naming Conventions

Not a detail!
Good names clarify your code
Portray meaning/purpose
Adopt a convention and BE CONSISTENT

Naming Conventions, cont.

Use language standard (if it exists)
If no standard, pick one and BE CONSISTENT
  Functions: getStats, calcBetas, showResults
  Scalar variables: scPi, scGravity, scWorkHours
  String variables: stName, stCareer
  Global variables: _Country, _Nbhd
Be aware of language-specific rules
  Max length, underscore, case, reserved words

Naming Conventions, cont.

Differentiating log files:
  Programs: MergeHH.sas, MergeHH.do
  Log files: MergeHHsas.log, MergeHHsta.log
Meaningful variable names:
  LogWt vs. var1
  AgeLt30 vs. x
Procedure that cleans missing values of Age:
  fixMissingAge
Matrix multiplication, X transpose times X:
  matXX
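
As a rough illustration of "use language standard (if it exists)", here is how the names above might look in Python. Python's style guide (PEP 8) recommends lowercase_with_underscores for variables and functions, so the slide's camelCase names are translated accordingly. The sample values are made up for the example.

```python
import math

# Unclear: the reader must guess what var1 and x mean.
var1 = math.log(150.0)
x = 27 < 30

# Descriptive: the names carry the meaning.
weight_lb = 150.0
age_years = 27
log_wt = math.log(weight_lb)   # LogWt on the slide
age_lt_30 = age_years < 30     # AgeLt30 on the slide


def fix_missing_age(ages, fill_value):
    """Replace missing ages (None) with fill_value (fixMissingAge on the slide)."""
    return [fill_value if age is None else age for age in ages]


print(fix_missing_age([25, None, 41], fill_value=0))
```

Both halves compute the same values; only the second half can be read without looking up what each name refers to.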

Commenting Code

Good code is SELF-COMMENTING
  Naming conventions, structure, header explain 95%
Comments explain
  PURPOSE, not every detail
  TRICKS
  (good) reasons for unusual coding
Comments DO NOT
  fix sloppy code
  translate syntax

Commenting Code - Stata example

SAMPLE 1

    program def function1
    foreach v of varlist _all {
    local x = lower("`v'")
    if `"`v'"' != `"`x'"' {
    rename `v' `=lower("`v'")'
    }
    }
    end

No conventions, comments, structure

SAMPLE 2

    *Convert names in dataset to lowercase.
    program def lowerVarNames
        foreach v of varlist _all {
            local LowName = lower("`v'")
            *If variable is already lowercase,
            *rename statement throws error.
            if `"`v'"' != `"`LowName'"' {
                rename `v' `=lower("`v'")'
            }
        }
    end

Comments: succinct and not overdone
Names:
  lowerVarNames
    - action word for program
    - distinct use of case
  LowName
    - descriptive
    - distinct use of case
  v
    - looping variable and short scope
    - non-descriptive, but does not detract from meaning!
Structure: indentations, parentheses lined up!

Directory Structure

A project consists of many different types of files

Use folders to SEPARATE files in a logical way

Be consistent across projects if possible

ATTIC folder for older versions

HOME
  PROJECT NAME
    DATA
    RESULTS
    LOG
    PROGRAMS
    ATTIC
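
A layout like this can be created the same way for every project with a few lines of Python. The sketch below assumes all five folders sit directly under the project folder; the function name and the project name "HRS" are used only for illustration.

```python
from pathlib import Path
import tempfile

SUBFOLDERS = ["DATA", "RESULTS", "LOG", "PROGRAMS", "ATTIC"]


def create_project_folders(home, project_name):
    """Create HOME/PROJECT NAME with one subfolder per type of file."""
    project_dir = Path(home) / project_name
    for name in SUBFOLDERS:
        # parents=True creates the project folder itself on first use;
        # exist_ok=True makes the function safe to run again.
        (project_dir / name).mkdir(parents=True, exist_ok=True)
    return project_dir


# Demonstrate in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as home:
    project = create_project_folders(home, "HRS")
    created = sorted(p.name for p in project.iterdir())
    print(created)
```

Using one function for every project is a simple way to stay consistent across projects.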

Miscellaneous Tips

BACKUP!
  Weekly zip file stored externally
README.txt file to describe folder
BE ORGANIZED
CROSS-VERIFY results
Something not working?
  Remember the computer is following YOUR directions… go back to your code
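
The weekly zip backup can be automated. The following is a minimal sketch using Python's standard library. The function name, the folder names, and the date-stamped file name are assumptions for illustration, and the "external" location here is just a temporary folder standing in for a real external drive or server.

```python
import shutil
import tempfile
import zipfile
from datetime import date
from pathlib import Path


def backup_project(project_dir, backup_dir):
    """Zip the contents of project_dir into backup_dir; return the zip path."""
    project_dir = Path(project_dir)
    # Date-stamp the archive so older backups are not overwritten.
    archive_base = Path(backup_dir) / f"{project_dir.name}_{date.today():%Y%m%d}"
    return shutil.make_archive(str(archive_base), "zip", root_dir=project_dir)


# Demonstrate with throwaway folders.
with tempfile.TemporaryDirectory() as work:
    project = Path(work) / "HRS"
    project.mkdir()
    (project / "README.txt").write_text("Describes the folder contents.\n")
    external = Path(work) / "external_drive"
    external.mkdir()

    zip_path = backup_project(project, external)
    with zipfile.ZipFile(zip_path) as archive:
        archived_names = archive.namelist()

print(archived_names)
```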

Programming Constructs

Tools to simplify and clarify your coding
Available in virtually all languages

Constructs - Looping

Repeat section of code
START value, INCREMENT, STOP value
Example: convert uppercase to lowercase for each variable in a dataset

Constructs - Looping Examples

C for loop: Start with x=1, Increment = x+1, Stop when x==10

    for(x=1; x<10; x++) {
        …code…
    }

PERL while loop: Start with count=1, Increment = count+1, Stop when count==11

    $count=1;
    while ($count<11) {
        print "$count\n";
        $count++;
    }

STATA foreach loop: Start = first variable in varlist, Increment = next variable in varlist, Stop = last variable in varlist

    foreach v of varlist _all {
        local LowName = lower("`v'")
        if `"`v'"' != `"`LowName'"' {
            rename `v' `=lower("`v'")'
        }
    }
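
For comparison, the same three loops might look like this in Python. This is a sketch only; the list of variable names is hypothetical and stands in for the variables of a dataset.

```python
# for loop: start at 1, increment by 1, stop before 10 (like the C example).
for x in range(1, 10):
    pass  # ...code...

# while loop: start at 1, increment by 1, stop when count reaches 11
# (like the Perl example); this collects the numbers 1 through 10.
counts = []
count = 1
while count < 11:
    counts.append(count)
    count += 1

# for-each loop over a collection (like the Stata example): build a
# lowercase version of every variable name.
var_names = ["Age", "INCOME", "state"]
lower_names = []
for name in var_names:
    lower_names.append(name.lower())

print(counts)
print(lower_names)
```

In each case the start value, increment, and stop condition are the same three pieces of information; only the syntax differs.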

Constructs - If/then/else

Execute section of code if condition is true:

    if condition then
        {execute this code if condition true}
    end

Execute one of two sections of code:

    if condition then
        {execute this code if condition true}
    else
        {execute this code if condition false}
    end

Constructs - Elseif/case

Elseif - Execute one of many sections of code:

    if condition1 then
        {execute this code if condition1 true}
    elseif condition2 then
        {execute this code if condition2 true}
    else
        {execute this code if condition1 and condition2 are both false}
    end

Case - same idea, different name:

    case condition1 then
        {execute this code if condition1 true}
    case condition2 then
        {execute this code if condition2 true}
    etc.
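
In Python the elseif keyword is spelled elif. The sketch below uses a hypothetical function and made-up age cutoffs to show that exactly one of the three sections runs for any input.

```python
def age_group(age):
    """Classify an age in years; the cutoffs are illustrative only."""
    if age < 18:
        return "child"    # runs if condition1 is true
    elif age < 65:
        return "adult"    # runs if condition1 is false and condition2 is true
    else:
        return "senior"   # runs if both conditions are false


for age in (10, 30, 70):
    print(age, age_group(age))
```

The conditions are checked in order, so the second test does not need to repeat "age >= 18".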

Constructs - And, or, xor

AND - BOTH conditions must be true for the result to be True
OR - AT LEAST ONE condition must be true for the result to be True
XOR - EXACTLY ONE condition must be true for the result to be True

1 AND 1   True       1 OR 1   True       1 XOR 1   False
1 AND 0   False      1 OR 0   True       1 XOR 0   True
0 AND 0   False      0 OR 0   False      0 XOR 0   False
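
The truth tables can be reproduced with a short Python sketch. The helper name truth_row is hypothetical. Python spells the operators and, or, and (for booleans) ^ for XOR.

```python
def truth_row(a, b):
    """Return (a AND b, a OR b, a XOR b) for two booleans."""
    return (a and b, a or b, a ^ b)


# Print every combination of inputs, including the 0/1 case.
for a in (True, False):
    for b in (True, False):
        and_result, or_result, xor_result = truth_row(a, b)
        print(a, b, and_result, or_result, xor_result)
```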

Constructs - Break

Stop execution of program
Examples:
  Debugging. If particular error occurs then break.
  Parameters in function call are nonsensical. Print error and break.
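
A minimal Python sketch of the second example follows; the function name and error message are made up for illustration. Note that in Python the keyword break only leaves the current loop, so stopping the whole program is done here with sys.exit.

```python
import sys


def calc_mean(values):
    """Return the mean of values; stop the program if there are none."""
    if len(values) == 0:
        # Nonsensical argument: print an error and stop execution.
        print("Error: calc_mean needs at least one value", file=sys.stderr)
        sys.exit(1)
    return sum(values) / len(values)


print(calc_mean([2, 4]))
```

Stopping early with a clear message is usually easier to debug than letting the program continue with bad input.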

Constructs - keywords

Looping - for, foreach, do, while
If statements - if, then, else, case
And/Or/Xor - logical, and, or, xor, &, |
Break - exit, break

PART 2: Unix

Motivation
Basic Commands
Job submission and management
Pipes
Unix Shell
Script files

Unix

Motivation
A quick history
Unix variants (AIX, Solaris, FreeBSD, Linux)
Where?
  Nicco (SSC’s server)
  CCPR’s linux cluster (coming soon)
  CCPR’s data server (coming soon)

Unix - Basic Commands
Getting Help

man command        list help for command (man is short for manual)
man -k keyword     search the manual pages for keyword
whatis command     give a brief description of command
apropos keyword    list commands with keyword in the NAME section of their man page

Unix - Basic Commands
File Management

ls                            list files (options -l for long, -a for all)
mv filename1 filename2        rename filename1 to filename2
cp filename1 filename2        make a copy of filename1 and call it filename2
rm filename                   delete filename
more filename                 print contents of filename to the screen, one page at a time
cat filename                  print contents of filename to the screen
cat filename1 >> filename2    append contents of filename1 to the file filename2
cat part1 part2 >> bothparts  append the contents of file part1 and file part2 to the file bothparts
head filename                 show first 10 lines on screen
tail filename                 show last 10 lines on screen

Unix - Basic Commands
Directory Management

pwd              show current directory
mkdir dirname    create new directory dirname
rmdir dirname    remove directory dirname
cd dirname       change to directory dirname
cd               change to your home directory
cd ..            move one directory up
cd ../..         move two directories up
~                home directory
cd ~/scripts     change to the scripts folder in your home directory

Unix - Basic Commands
Using Previous Commands

!!               repeat last command
!v               repeat the last command that started with v
!-2              repeat second to last command
arrow up/down    scroll through list of previous commands
history          list history of commands used

Unix - Basic Commands
Other Useful Unix Tools

*                           wildcard (matches any number of characters)
?                           wildcard (matches single character)
grep word filename          list lines of filename containing word
diff filename1 filename2    shows differences between filename1 and filename2
wc filename                 counts number of lines, words, and characters in filename
sort < filename             sorts the lines of filename
who                         lists users currently on the system
cal                         displays current month's calendar
date                        displays date

Unix - Basic Commands
Disk Usage

du -s    Total kilobytes used in current directory
du -a    Same as above, but more detail
ls -l    Gives individual file sizes in bytes

Editing Files in Unix

Vi Editor and Emacs
Neither is user-friendly for beginners
Look at CCPR internet when you start
Best way to learn is to start editing a document
Once you get used to them, they’re easy and fast to use.

Being nice

Always run your jobs “nicely”
Prevents interfering with other users
Precede command with “nice +19” (no quotes)

    user@nicco% nice +19 stata -b do jobfile.do

Job Submission in Unix

Interactive

    user@nicco% stata

Foreground jobs

    user@nicco% nice +19 stata -b do jobfile.do
    user@nicco% nice +19 sas jobfile.sas

Background jobs - &

    user@nicco% nice +19 stata -b do jobfile.do &
    user@nicco% nice +19 sas jobfile.sas &

Background jobs with logoff - nohup

    user@nicco% nohup nice +19 stata -b do jobfile.do &
    user@nicco% nohup nice +19 sas jobfile.sas &

Job Management

Ctrl-c               Cancel a foreground job
Ctrl-z               Suspend a job in the foreground
bg                   Move a suspended foreground job to the background
ps -u                List information for processes you own (under current shell)
ps -ux               Lists information for processes owned by you and others
ps -aux              Lists information for all processes (including root, bin, etc.)
ps -aux | more       Display output of ps -aux one page at a time
ps -aux | grep PID   List lines from command ps -aux containing PID
kill PID             Kills (cancels) process number PID
top                  Lists 15 processes using the most cpu processing power
q                    Stops command top

Pipes

“|” redirects the output of command 1 to the input of command 2

Command              Output from   Input to       Result
top | grep piersol   top           grep piersol   Jobs in top 15 containing piersol
ps -aux | more       ps -aux       more           Jobs listed one page at a time
ls | wc -l           ls            wc -l          Count number of files and directories in current directory

Extends to more than 2 commands
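
To make the "output of command 1 becomes input of command 2" idea concrete, here is a rough Python sketch that wires up ls | wc -l by hand with the standard subprocess module. It assumes a Unix-like system where ls and wc are available, and it uses a temporary folder with three made-up files so the result is predictable.

```python
import subprocess
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    for name in ("a.txt", "b.txt", "c.txt"):
        (Path(folder) / name).touch()

    # Command 1: its standard output goes into a pipe instead of the screen.
    ls = subprocess.Popen(["ls", folder], stdout=subprocess.PIPE)
    # Command 2: its standard input is the read end of that same pipe.
    wc = subprocess.Popen(
        ["wc", "-l"], stdin=ls.stdout, stdout=subprocess.PIPE, text=True
    )
    ls.stdout.close()  # only wc should hold the read end of the pipe now
    output, _ = wc.communicate()
    ls.wait()

file_count = int(output.strip())
print(file_count)
```

On the command line the shell does all of this wiring when it sees the "|" character.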

Unix Shell

What’s a Unix Shell?
  The Unix shell is the program that provides the interface between the user and the kernel
What’s a shell script?
  A list of commands put into a file that can be interpreted by the Unix Shell.
What are scripting languages?
  Generally easier to code but less efficient
  Shell scripts, Perl, Python

Remote Computing

Windows to Whitney

Remote Login
  Remote Desktop Connection
    >= XP: included*
    < XP: download*
  Web-based Remote Desktop Connection*
File Transfer
  Map a Drive via Windows Explorer

*See http://www.ccpr.ucla.edu/asp/compserv.asp, XP users get latest version

Remote Computing

Windows to Unix

Remote Login
  SSH Secure Shell Client*
File transfer
  SSH Secure File Transfer Client*
  Map a Drive via Windows Explorer - Need Samba account**

* http://computing.sscnet.ucla.edu/training/tutorial_SSH.htm
** http://computing.sscnet.ucla.edu/training/tutorial_samba.htm

Finally…

Questions and Feedback