Lab 1: Introduction to Python Programming
1/20/17
Slide credits: Nicole Rockweiler


A few preliminary words…

Overview

• Schedule
• Logistics
• Getting Started
• Intro to Unix
• Intro to Python
• Assignment 1

Getting the most out of this course

1. Start the homework EARLY
2. Collaborate
3. Use your resources: tutors, TAs, professors, lab mates, discussion groups, and most of all, the internet.
4. Think big

Logistics

• Register for 4 credits
• Labs are a continuation of the concepts learned from lectures
• Lab material is generally not tested on exams
• Course website: http://genetics.wustl.edu/bio5488/
• Bring your laptop to every lab

Where to get help (a.k.a. how to maintain your sanity)

• Come to office hours
  • Mondays after class (11:30am-12:30pm) in the 4th floor classroom 4515 McKinley/area outside the classroom, and by appointment
• Come to tutoring sessions
  • Tuesdays 5:30-7pm in 6001B* Scott McKinley Building
  • *4/4 will be in 5001B
  • FREE FOOD!!
• Use the google docs to ask/answer questions - https://docs.google.com/spreadsheets/d/11KW_lu9mE59LBtF0X8EtrCJfHQZ22fQwz8AC3AMZSs8/edit?usp=sharing
• [email protected]
• Work in groups


Assignments

• Assignments are posted on the course website Wednesdays at 10am
• Assignments are due the following Wednesday at 10am
• Assignment format
  • Given a bioinformatics problem
  • Write/complete a Python script
  • Analyze data with your script
  • Answer biological questions about your results
• Turn-in format
  • More on this in a bit :)

Schedule

[Figure: weekly timeline running Wed, Thurs, Fri, Sat, Sun, Mon, Tue, Wed]
• HW released
• Class discussion & work time, 10-11:30am
• Office hours, 11:30-12:30pm
• Tutoring session, 5-7:30pm
• HW due, 10am

Schedule (cont.)

Assignment  Released  Due   Topic
1           1/18      1/27  Introduction
2           1/25      2/1   Sequence Comparison
3           2/1       2/8   Next Gen Sequencing
4           2/8       2/15  Gene Expression
5           2/15      2/22  Epigenomics
6           2/22      3/1   Motif Finding
7           3/1       3/22  Synthetic Gene Assembly
8           3/1       3/22  Metagenomics
9           3/22      3/29  Genetic Variation
10          3/29      4/5   Wright-Fisher Model
11          4/5       4/12  TBD
12          4/12      4/19  Substitution Rates
13          4/19      4/26  Cis Regulatory Evolution

Note: 2 labs over spring break

Assignment policies

• See the Course Information → Assignment policies document on the course website
• There are 13 assignments
  • You must turn in all assignments
  • All assignments are weighted equally
• Late policy
  • 25% penalty for turning in an assignment 1 day late
  • Assignments that are >1 day late will be given a 0
  • Email us (early) to request an extension
• Auditors
  • We'll give comments on your programs, but won't grade the short answer questions
  • Same late policy applies
• Collaboration
  • Group work is encouraged, but plagiarism is unacceptable
  • Try to "Google it" first
  • Cite your sources
• Work on the assignment before coming to lab

Grading

• Each assignment is out of 10 points
• Graded on
  • Does the code work?
    • It doesn't have to be the "fastest" or "most efficient" to get full credit
    • If it doesn't work, describe where you had problems
  • Is the code well commented and readable? (more on commenting later :))
  • Are the answers correct?
• Grades will be returned in a file called grades.txt on the class server
  • Only you and the TAs will be able to read this file

Getting started

Remote computers

• We will be doing all of our work on a remote computer with the hostname genomic.wustl.edu
• This is a Unix-based computer that we can securely connect to through a protocol called secure shell (SSH).

What is the shell?

• The shell is a program that takes commands from the keyboard and gives them to the operating system to execute
• There are many different shell programs
• We'll be using the most common shell: the Bourne-Again Shell (bash)

How do I access the shell?

• Most of us are familiar with graphical user interfaces (GUI) to control our computers
• Another way is with command-line interfaces (CLI)
• A terminal emulator is a program that allows you to interact with the shell through a CLI
• There are many different terminal programs that vary across OSs
  • We'll be using PuTTY (Windows) and Terminal (Mac)

[Figures: a Windows GUI, a PuTTY window, a Terminal window]

Why should I learn how to use shells and terminals?

• CLIs are common in scientific computing → get used to them!
• The shell is a really powerful way of interacting with your computer → become a superuser!

Bio5488 command convention

• We highly recommend that you type all of the commands/code yourself rather than copying and pasting
• Here's an example of a command line "snippet"

Template:
    $ type_me_exactly <modify_me>
    output

Example:
    $ ls <assignment>
    README.txt

• The "$" is called the command prompt. It means, "I'm ready for a command!" Don't type the "$".
• Don't type the "<>"

How to log on to the remote computer (Windows users)

1. Launch PuTTY
2. In the hostname field, enter genomic.wustl.edu
3. Enter a session nickname, e.g., bio5488
4. Click Save
5. Click Open

How to log on to the remote computer (Mac users)

1. Open Terminal (found in /Applications/Utilities)
2. SSH to the remote computer. Type:
    ssh <username>@genomic.wustl.edu
   where <username> is replaced with your username
3. A security message may be printed. Type yes and hit enter.

How to log on to the remote computer (Mac users, cont.)

4. Enter your password. It will not show that you are typing! Hit enter.

A couple of notes

• When you log on to the class server you will be located in YOUR home directory.
• Every command that you run after logging on to a remote computer will be run on that computer.

Sublime Text

• Sublime Text is a text editor for writing and editing scripts
• We'll use Sublime to edit both local and remote files
• Documentation: http://www.sublimetext.com/support

Cyberduck

• Cyberduck is a secure file transfer client and will allow you to transfer files from your local computer to a remote computer

Exercise: setting up Cyberduck

• Create a bookmark
  • Launch the Cyberduck application
  • Click Bookmark → New Bookmark
  • Select SFTP (SSH File Transfer Protocol) from the dropdown menu
  • Enter a nickname for the bookmark, e.g., bio5488
  • Enter genomic.wustl.edu as the server name
  • Click the X
• Set the default text editor
  • Click Cyberduck/Edit → Preferences → Editor
  • Select Sublime Text from the dropdown menu. (You may need to browse your computer for the editor)
  • Check Always use this application
  • Restart Cyberduck

Exercise: transferring files with Cyberduck

• To download a file to your local computer
  • Drag and drop a file from Cyberduck to your Finder/File Explorer window
  • Or, double-click
• To upload a file to the remote computer
  • Drag and drop a file from Finder/File Explorer to Cyberduck

Exercise: editing remote files with Sublime Text and Cyberduck

• New files
  • Click File → New file
  • Enter a filename
  • Click edit
  • Sublime Text should now launch
  • Add some text to the file
  • Click File → Save or ctrl+s
• Existing files
  • Select the file by clicking the filename once
  • Click the Edit button in the navigation bar
  • Edit the file
  • Click File → Save or ctrl+s

Basic Unix

The file system

• The file system is the part of the operating system (OS) responsible for managing files and folders
  • In Unix, folders are called directories.
• Unix keeps files arranged in a hierarchical structure
  • The topmost directory is called the root directory
  • Each directory can contain
    • Files
    • Subdirectories
• You will always be "in" a directory
  • When you open a terminal you will be in your own home directory.
  • Only you can modify things in your home directory

Determining where you are (pwd)

• If you get lost in the file system, you can determine where you are by typing:
    $ pwd
    /home/aclemens
• pwd stands for print working directory
• pwd prints the full path of the current working directory

Listing directory contents (ls)

• To list the contents of a directory:
    $ ls
    assignment1 foo
• ls stands for list directory contents

Changing directories (cd)

• To change to a different directory
    $ cd <directory_name>
  where <directory_name> = the path you want to move to
• A path is a location in the file system
• cd stands for change directory
• To get back to your home directory
    $ cd ~
• ~ is shorthand for your home directory

Changing directories (cont.)

• To move one directory above the current directory
    $ cd ../
• To move two directories above the current directory
    $ cd ../../
• You can string together as many ../ as you need to

Making directories (mkdir)

• To make a directory
    $ mkdir <new_directory_name>
  where <new_directory_name> = name of the directory to create
• mkdir stands for make directory
• Do not use spaces or "/" in directory or file names

Exercise: create some directories

Try to create this directory structure:

[Figure: directory tree]

Hints
• Use pwd to determine where you are in the directory structure
• Use cd to navigate through the directory structure.
• Use mkdir to create new directories

Copying things (cp)

• To create a copy of a file
    $ cp -i <filename> <copy_of_filename>
  where
    <filename> = file you want to copy
    <copy_of_filename> = name of copied file
  The -i flag is a safety feature to make sure you do not overwrite a file that already exists (interactive)
• To create a copy of a directory
    $ cp -r <directory> <copy_of_directory>
  where
    <directory> = directory you want to copy
    <copy_of_directory> = name of copied directory
  The -r flag is required to copy all of the directory's files and subdirectories

Copying things (cont.) (cp)

• cp stands for copy files/directories
• To copy a file into the current directory and keep the name the same
    $ cp -i <filename> .
  where <filename> = file you want to copy
• The shortcut is the same for directories, just remember to include the -r flag

Exercise: copying things

Copy /home/assignments/assignment1/README.txt to your work directory. Keep the name the same.

Renaming/moving things (mv)

• To rename/move a file/directory
    $ mv -i <original_filename> <new_filename>
  where
    <original_filename> = name of file/dir you want to rename
    <new_filename> = name you want to rename it to
• mv stands for move files/directories

Printing contents of files (cat)

• To print a file
    $ cat <filename>
  where <filename> = name of file you want to print
• cat stands for concatenate file and print to the screen
• Other useful commands for printing parts of files:
  • more
  • less
  • head
  • tail

Exercise: printing contents of files

Print the contents of your README.txt

Experiment with using different commands, e.g., cat, head, and tail. How do the commands differ?

Deleting things (rm)

• To delete a file
    $ rm <file_to_delete>
  where <file_to_delete> = name of the file you want to delete
• To delete a directory
    $ rm -r -i <directory_to_delete>
  where <directory_to_delete> = name of the directory you want to delete
• rm stands for remove files/directories

IMPORTANT: there is no recycle bin/trash folder on Unix!! Once you delete something, it is gone forever.

Be very careful when you use rm!!

TIP: Check that you're going to delete the correct files by first testing with 'ls' and then committing to 'rm'

Exercise: deleting things

Delete the test directory that you created in a previous exercise.

Saving output to files

• Save the output to a file
    $ <cmd> > <output_file>
  where
    <cmd> = command
    <output_file> = name of output file
  • WARNING: this will overwrite the output file if it already exists!
• Append the output to the end of a file (note that there are 2 ">")
    $ <cmd> >> <output_file>

Learning more about a command (man)

• To view a command's documentation
    $ man <cmd>
  where <cmd> = command
• man stands for manual page
• Use the ↑ and ↓ arrow keys to scroll through the manual page
• Type "q" to exit the manual page

Exercise: reading documentation

Determine what the following command does
    $ cal

Getting yourself out of trouble

• Abort a command: Ctrl+C
• Temporarily stop a command: Ctrl+Z
• Resume a stopped job
    $ fg <job_id>

Unix commands cheat sheet -- your new bestie

https://ubuntudanmark.dk/filer/fwunixref.pdf

Assignment 1

How to complete & "turn in" assignments

1. Create a separate directory for each assignment
2. Create "submission" and "work" subdirectories
  • Work = scratch work
  • Submission = final version
  • The TAs will only grade content that is in your submission directory
3. Copy the starter scripts and README to your work directory
4. Copy the final version of the files to your submission directory
  • Don't touch the submission folder again! Timestamps of the files are used to determine if the assignment was turned in on time

README files

• A README.txt file contains information on how to run your code and answers to any of the questions in the assignment
• A template will be provided for each assignment
  • Copy the template to your work folder
  • Replace the text in {} with your answers
  • Leave all other lines alone :)

A README.txt template:
    Question 1:
    {nuc_count.py nucleotide count output}
    -
    Comments:
    {Things that went wrong or you can not figure out}
    -

A filled-out README.txt:
    Question 1:
    A: 10
    C: 15
    G: 20
    T: 12
    -
    Comments:
    The wording for part 2 was confusing.
    -

Usage statements in README.txt

• Purpose
  • Tells a user (you, TA, anyone unfamiliar with your code) how to run the script
  • Documents how you created your results
• Good practices
  • Write out exactly how you ran the script:
      python3 foo.py 10 bar
  • AND/OR, write out how to run the script in general, i.e., with placeholders for command-line arguments:
      python3 foo.py <#_of_genes> <gene_of_interest>
• TIP: copy and paste your commands into your README
• TIP: use the command history to view previous commands (up arrow)
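One way to keep a usage statement from going stale is to have the script print it whenever it is called with the wrong number of arguments. The sketch below assumes a hypothetical script foo.py that takes the two arguments from the example above; the names parse_args and USAGE are illustrative and are not part of the course's starter code.

```python
import sys

# Hypothetical usage string, written in the same placeholder style as README.txt
USAGE = "usage: python3 foo.py <#_of_genes> <gene_of_interest>"

def parse_args(argv):
    """Return (number_of_genes, gene_of_interest) from an argv-style list.

    Exits with the usage statement if the argument count is wrong.
    """
    if len(argv) != 3:  # script name + 2 arguments
        sys.exit(USAGE)
    return int(argv[1]), argv[2]

# A real script would pass sys.argv; a literal list is used here
# so that the example runs on its own.
num_genes, gene = parse_args(["foo.py", "10", "bar"])
print(num_genes, gene)
```

Keeping the usage text in one constant means the same string can be pasted into README.txt.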


Assignment 1 TODOs

• Download chr20 via FTP (here we use wget)
• You will be given a starter script (nuc_count.py) that counts the total number of A, C, G, T nucleotides
  • Modify the script to calculate the nucleotide frequencies
  • Modify the script to calculate the dinucleotide frequencies
• Modify a starter script (make_seq.py) to generate a random sequence given nucleotide frequencies
  • Use make_seq.py to generate a random sequence with the same nucleotide frequencies as chr20
• Compare the chr20 di/nucleotide frequencies (observed) with the random model (expected)
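To make "frequency" concrete, here is a minimal sketch on a toy string. It is not the starter script: the function names are hypothetical, and two choices are assumptions rather than assignment requirements: ambiguous bases such as N are skipped, and dinucleotides are counted in overlapping windows.

```python
from collections import Counter

def nucleotide_frequencies(seq):
    """Fraction of A, C, G, and T among the unambiguous bases in seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGT"}
    return {base: counts[base] / total for base in "ACGT"}

def dinucleotide_frequencies(seq):
    """Fraction of each overlapping 2-base window made only of A, C, G, T."""
    seq = seq.upper()
    windows = (seq[i:i + 2] for i in range(len(seq) - 1))
    counts = Counter(w for w in windows if w[0] in "ACGT" and w[1] in "ACGT")
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

toy = "AACGTN"  # made-up sequence, not real chr20 data
print(nucleotide_frequencies(toy))
print(dinucleotide_frequencies(toy))
```

Under a random model with independent bases, the expected frequency of a dinucleotide is the product of its two nucleotide frequencies, which is what the observed values can be compared against.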

Fasta file format

• A standard text-based file format used to define sequences, e.g., nucleotide or peptide sequences
• .fa or .fasta extension
• Each sequence is defined by multiple lines
  • Line 1: Description of sequence. Starts with ">"
  • Lines 2-N: Sequence
• A fasta file can contain ≥1 sequence

Example fasta file:
    >chr22
    ACGGTACGTACCGTAGATNAGTAN
    >chr23
    ACCGATGTGTGTAGGTACGTNACGTAGTGATGTAT
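The format rules above translate fairly directly into code. The following is a sketch only: read_fasta is a hypothetical helper, not part of the starter scripts, and the place where the chr23 sequence is split across two lines is arbitrary, chosen to show that sequence lines are joined.

```python
def read_fasta(lines):
    """Return a dict mapping each description (without ">") to its sequence."""
    sequences = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue                  # skip blank lines
        if line.startswith(">"):
            name = line[1:]           # a new sequence starts here
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line   # sequence may span several lines
    return sequences

example = """>chr22
ACGGTACGTACCGTAGATNAGTAN
>chr23
ACCGATGTGTGTAGGTACGT
NACGTAGTGATGTAT
"""

records = read_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq))
```

Building strings with += is fine for a toy example; for a whole chromosome, collecting the lines in a list and joining them once would likely be faster.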

Requirements

• Due next Friday (1/27) at 10am
• Your submission folder should contain:
  □ A Python script to count nucleotides (nuc_count.py)
  □ A Python script to make a random sequence file (make_seq.py)
  □ An output file with a random sequence (random_seq_1M.txt)
  □ A README.txt file with instructions on how to run your programs and answers to the questions.
• Remember to comment your script!

Python basics

Recycling Nicole's slides from year 2016*

What is Python?

• Python is a widely used programming language
• First implemented in 1989 by Guido van Rossum
• Free, open-source software with community-based development
• Trivia: Python is named after the BBC show "Monty Python's Flying Circus" and has nothing to do with reptiles
• Van Rossum is known as a "Benevolent Dictator For Life" (BDFL)

Which Python?

• There are 2 widely used versions of Python: Python 2.7 and Python 3.x
• We'll use Python 3
• Many help forums still refer to Python 2, so make sure you're aware which version is being referenced

Interacting with Python

There are 2 main ways of interacting with Python:

">>>" is Python's command prompt. It means, "I'm ready for a command!" Don't type the ">>>"

Variables

• The most basic components of any programming language are "things," also called variables
• A variable has a name and an associated value
• The most common types of variables in Python are:

Type      Description                      Example
Integers  A whole number                   x = 10
Floats    A real number                    x = 5.6
Strings   Text (1 or more characters)      x = "Genomics"
Booleans  A binary outcome: true or false  x = True

For strings, you can use single quotes or double quotes

Variables (cont.)

• To save a variable, use =
    >>> x = 2
• To determine what type of variable, use the type function
    >>> type(x)
    <class 'int'>
• IMPORTANT: the variable name must be on the left hand side of the =
    >>> x = 2    (correct: the name of the variable is on the left, the value of the variable is on the right)
    >>> 2 = x    (incorrect)
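The four common types and the left-hand-side rule can be checked in a few lines. This is a small sketch with made-up variable names and values; compile() is used only so that the invalid assignment can be shown without stopping the script.

```python
gene_count = 10        # int
mean_score = 5.6       # float
course = "Genomics"    # str
is_done = True         # bool

for value in (gene_count, mean_score, course, is_done):
    print(type(value), value)

# Putting a value on the left of "=" is a syntax error.
# compile() checks the statement without running it.
try:
    compile("2 = x", "<example>", "exec")
    assignment_failed = False
except SyntaxError:
    assignment_failed = True

print(assignment_failed)
```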

Variable naming (best) practices

• Must start with a letter
• Can contain letters, numbers, and underscores ← no spaces!
• Python is case-sensitive: x ≠ X
• Variable names should be descriptive and have reasonable length
• Use ALL CAPS for constants, e.g., PI
• Do not use names already reserved for other purposes (min, max, int)

Want to learn more tips? Check out http://www.makinggoodsoftware.com/2009/05/04/71-tips-for-naming-variables/
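The last rule is easy to break by accident, so here is a short illustration of what goes wrong. The variable names and values are invented for the example; the shadowing is done inside a function so the built-in max is left untouched everywhere else.

```python
PI = 3.14159                 # constant: ALL CAPS
num_upregulated_genes = 42   # descriptive, uses underscores instead of spaces

gene = "SDHD"
Gene = "SUCLA2"              # a different variable: Python is case-sensitive
print(gene == Gene)

def shadowing_demo():
    """Show what happens when a variable reuses the name of a built-in."""
    max = 5                  # hides the built-in max inside this function
    try:
        return max([3, 1, 2])
    except TypeError as err:
        return "TypeError: " + str(err)

print(shadowing_demo())
print(max([3, 1, 2]))        # outside the function, the built-in still works
```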

Exercise: defining variables

• Create variables for the following:
  • Your favorite gene name
  • The expression level of a gene
  • The number of upregulated genes
  • Whether the HOXA1 gene was differentially expressed
• What is the type for each variable?

Collections of things

• Why is this concept useful?
  • We often have collections of things, e.g.,
    • A list of genes in a pathway
    • A list of gene fusions in a cancer cell line
    • A list of probe IDs on a microarray and their intensity values
• We could store each item in a collection in a separate variable, e.g.,
    gene1 = 'SUCLA2'
    gene2 = 'SDHD'
    ...
• A better strategy is to put all of the items in one container
• Python has several types of containers
  • List (similar to arrays)
  • Set
  • Dictionary

Lists: what are they?

• Lists hold a collection of things in a specified order
  • The things do not have to be the same type
• Many methods can be used to manipulate lists.

Syntax
• Create a list: <list_name> = [<item1>, <item2>]
• Index a list: <list_name>[<position>] (output in the slide's example: 'SDHD')
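The two pieces of syntax above look like this in practice. The sketch reuses the gene names from the earlier slides; the list name genes and the mixed list are made up for illustration.

```python
genes = ['SUCLA2', 'SDHD']   # create a list

print(genes[0])              # positions are counted from 0
print(genes[1])

genes.append('HOXA1')        # a list method: add an item to the end
print(len(genes))

mixed = ['SDHD', 11, 0.5, True]   # items do not have to share a type
print(mixed)
```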

Lists: where can I learn more?

• Python.org tutorial: https://docs.python.org/3.4/tutorial/datastructures.html#more-on-lists
• Python.org documentation: https://docs.python.org/3.4/library/stdtypes.html#list

Doing stuff to variables

• There are 3 common tools for manipulating variables
  • Operators
  • Functions
  • Methods

Operators

• Operators are a special type of function
  • Operators are symbols that perform some mathematical or logical operation
• Basic mathematical operators:

Operator  Description     Example
+         Addition        >>> 2 + 3
                          5
-         Subtraction     >>> 2 - 3
                          -1
*         Multiplication  >>> 2 * 3
                          6
/         Division        >>> 2 / 3
                          0.6666666666666666
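As a slightly larger example, the operators can be combined to turn counts into a fraction. The counts are the ones from the filled-out README example; the variable names are hypothetical. The last three lines show three further Python operators that the table does not list.

```python
# Counts taken from the filled-out README example (A: 10, C: 15, G: 20, T: 12)
a_count, c_count, g_count, t_count = 10, 15, 20, 12

total = a_count + c_count + g_count + t_count
gc_fraction = (c_count + g_count) / total   # "/" returns a float in Python 3

print(total)
print(gc_fraction)

print(7 // 2)   # floor division: drops the fractional part
print(7 % 2)    # remainder
print(2 ** 3)   # exponentiation
```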

Operators (cont.)

You can also use operators on strings!

Operator  Description                     Example
+         Combine strings together        >>> 'Bio' + '5488'
                                          'Bio5488'
                                          >>> 'Bio' + 5488
                                          Traceback (most recent call last):
                                            File "<stdin>", line 1, in <module>
                                          TypeError: Can't convert 'int' object to str implicitly
*         Repeat a string multiple times  >>> 'Marsha' * 3
                                          'MarshaMarshaMarsha'

• Is it a bird? Is it a plane? No, it's a string! ('5488' in quotes is a string)
• Strings and ints cannot be combined with +
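The usual way around this TypeError is to convert one side explicitly with the built-in str or int functions. A minimal sketch, with made-up variable names:

```python
course_number = 5488

label = 'Bio' + str(course_number)   # int -> str, then combine
next_number = int('5488') + 1        # str -> int, then add
separator = '-' * 10                 # repeating a string with an int is allowed

print(label)
print(next_number)
print(separator)
```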

Relational operators

• Relational operators compare 2 things
• Return a boolean

Operator  Description               Example
<         Less than                 >>> 2 < 3
                                    True
<=        Less than or equal to     >>> 2 <= 3
                                    True
>         Greater than              >>> 2 > 3
                                    False
>=        Greater than or equal to  >>> 2 >= 3
                                    False
==        Equal to                  >>> 2 == 3
                                    False
!=        Not equal to              >>> 2 != 3
                                    True

• == is used to test for equality
• = is used to assign a value to a variable
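Relational operators work on variables as well as literal numbers, and the result can itself be stored. The values and names below are invented for illustration.

```python
expression_level = 5.6
threshold = 2.0

is_high = expression_level >= threshold   # "=" assigns the result of ">="
same_gene = 'SDHD' == 'sdhd'              # string comparison is case-sensitive
in_range = 0 <= expression_level < 10     # comparisons can be chained

print(is_high)
print(same_gene)
print(in_range)
```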

Logical operators

• Perform a logical function on 2 things
• Return a boolean

Operator  Description                              Example
and       Return True if both arguments are true   >>> True and True
                                                   True
                                                   >>> True and False
                                                   False
or        Return True if either argument is true   >>> True or False
                                                   True
                                                   >>> False or False
                                                   False
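Logical operators are most useful when they join comparisons. In this sketch the thresholds and variable names are made up, and not (which flips a boolean) is shown in addition to the two operators in the table.

```python
is_upregulated = True
is_significant = False

print(is_upregulated and is_significant)
print(is_upregulated or is_significant)
print(not is_significant)

# Combining relational and logical operators (illustrative cutoffs)
fold_change = 3.2
p_value = 0.01
is_hit = fold_change > 2 and p_value < 0.05
print(is_hit)
```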

Functions: what are they?

• Why are functions useful?
  • Allow you to reuse the same code
  • Programmers are lazy!
• A block of reusable code used to perform a specific task
  • Take in arguments (optional) → Do something → Return something (optional)
• Similar to mathematical functions, e.g., f(x) = x^2
• 2 types:
  • Built-in: functions prewritten for you
    • print: print something to the terminal
    • float: convert something to a floating point #
  • User-defined: you create your own functions

Functions: how can I call a function?

Syntax                                     Example   Output
Call a function that takes no arguments
  <function_name>()
Call a function that takes argument(s)
  <function_name>(<arg1>, <arg2>)                    8
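
As an illustration of the two call forms, here is a short sketch using built-in functions. The functions chosen (print, max, float) are illustrative picks and are not necessarily the ones used in the original slide example.

```python
# Call a function that takes no arguments: print() with nothing prints a blank line.
print()

# Call a function that takes arguments: max returns the larger of its two arguments.
largest = max(3, 8)
print(largest)

# float converts something (here, a string) to a floating point number.
as_float = float('3.5')
print(as_float)
```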

Python functions: where can I learn more?

• Python.org tutorial
  • User-defined functions: https://docs.python.org/3/tutorial/controlflow.html#defining-functions
• Python.org documentation
  • Built-in functions: https://docs.python.org/3/library/functions.html

Methods: what are they?

• First a preamble...
  • Methods are a close cousin of functions
  • For this class we'll treat them as basically the same
  • The syntax for calling a method is different from that for calling a function
  • If you want to learn about the differences, google object oriented programming (OOP)
• Why are methods useful?
  • Allow you to reuse the same code

String methods

<str>.upper()
  • Returns the string with all letters uppercased
  >>> x = "Genomics"
  >>> x.upper()
  'GENOMICS'

<str>.lower()
  • Returns the string with all letters lowercased
  >>> x.lower()
  'genomics'

<str>.find(<pattern>)
  • Returns the first index of <pattern> in the string
  • Returns -1 if <pattern> is not found
  >>> x.find('nom')
  2

<str>.count(<pattern>)
  • Returns the number of times <pattern> is found in the string
  • HINT: explore how .count deals with overlapping patterns
  >>> x.count('g')
  0

<str>[<index>]
  • Returns the letter at the <index>th position
  >>> x[1]
  'e'

Index:   0 1 2 3 4 5 6 7
Letter:  G e n o m i c s

https://docs.python.org/3.4/library/stdtypes.html?#string-methods
https://docs.python.org/3/library/stdtypes.html#str
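
The sketch below runs the string methods from the table on the same example string and follows up on the hint about overlapping patterns. The 'AAAA' test string is an assumption added for illustration.

```python
x = "Genomics"

upper = x.upper()             # all letters uppercased
lower = x.lower()             # all letters lowercased
first_nom = x.find('nom')     # index where 'nom' first starts
missing = x.find('xyz')       # -1 because 'xyz' is not in the string
lowercase_g = x.count('g')    # 0: matching is case-sensitive, and x has only 'G'
second_letter = x[1]          # indexing starts at 0, so index 1 is 'e'

# .count only counts non-overlapping matches:
# 'AAAA' contains 'AA' at indices 0, 1, and 2, but count reports 2.
overlap = 'AAAA'.count('AA')

print(upper, lower, first_nom, missing, lowercase_g, second_letter, overlap)
```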

Making choices (conditional statements)

• Why is this concept useful?
  • Often we want to check if a condition is true and take one action if it is, and another action if the condition is false
  • E.g., if the alternative allele read coverage at a particular location is high enough, annotate the position as a SNP; otherwise, annotate the position as reference

Conditional statement syntax

If
  Syntax:
    if <condition>:
        # Do something
  Output: x is positive

If/else
  Syntax:
    if <condition>:
        # Do something
    else:
        # Do something else
  Output: x is NOT positive

If/else if/else
  Syntax:
    if <condition1>:
        # Do something
    elif <condition2>:
        # Do something else
    else:
        # Do something else
  Output: x is negative

Indentation matters!!!
• Indent the lines of code that belong to the same code block
• Use 1 tab
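
The three forms above can be sketched as follows. The values of x and the helper names (describe_if_else, describe_sign) are assumptions chosen so that the printed messages match the outputs listed above. The code is indented with four spaces here; what matters is that every line in a block is indented the same way.

```python
# If: act only when the condition is true.
x = 5
if x > 0:
    print('x is positive')


def describe_if_else(x):
    """If/else: pick one of two messages."""
    if x > 0:
        return 'x is positive'
    else:
        return 'x is NOT positive'


def describe_sign(x):
    """If/elif/else: test conditions in order and stop at the first true one."""
    if x > 0:
        return 'x is positive'
    elif x < 0:
        return 'x is negative'
    else:
        return 'x is zero'


print(describe_if_else(-2))
print(describe_sign(-2))
```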

Commenting your code

• Why is this concept useful?
  • Makes it easier for you, your future self, TAs :), and anyone unfamiliar with your code to understand what your script is doing
• Comments are human-readable text. They are ignored by Python.
• Add comments for
  • The how
    • What the script does
    • How to run the script
    • What a function does
    • What a block of code does
  • The why
    • Biological relevance
    • Rationale for design and methods
    • Alternatives

TREAT YOUR CODE LIKE A LAB NOTEBOOK

Commenting rule of thumb

Always code [and comment] as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. Code for readability.

-- John Woods

Commenting your code (cont.)

• Commenting is extremely important!
• Points will be deducted if you do not comment your code
• If you use code from a resource, e.g., a website, cite it

Comment syntax

Block comment
  # <your_comment>
  # <your_comment>

In-line comment
  <code> # <your_comment>
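
A small sketch of both comment styles in use. The GC-content calculation and the sequence are assumptions picked only to give the comments something to describe.

```python
# Block comment: compute the GC fraction of a short sequence.
# GC content is a common summary statistic for nucleotide sequences.
seq = 'ACGTTGATACGTA'

g_and_c = seq.count('G') + seq.count('C')   # in-line comment: number of G and C bases
gc_fraction = g_and_c / len(seq)            # fraction between 0 and 1

print(gc_fraction)
```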

Python modules

• A module is a file containing Python definitions and statements for a particular purpose, e.g.,
  • Generating random numbers
  • Plotting
• Modules must be imported at the beginning of the script
  • This loads the variables and functions from the module into your script, e.g.,
      import sys
      import random
• To access a module's features, type <module>.<feature>, e.g., sys.exit()

Random module

• Contains functions for generating random numbers for various distributions
• TIP: will be useful for assignment 1

Function         Description
random.choice    Return a random element from a list
random.randint   Return a random integer in a given range
random.random    Return a random float in the range [0, 1)
random.seed      Initialize the (pseudo)random number generator

https://docs.python.org/3.4/library/random.html
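
A brief sketch of the four functions in the table. The seed value, the nucleotide list, and the 1 to 6 range are arbitrary choices for illustration; the exact values printed depend on the seed and the Python version.

```python
import random

# Seeding makes the sequence of "random" values repeatable from run to run.
random.seed(5488)

nucleotides = ['A', 'C', 'G', 'T']
base = random.choice(nucleotides)   # one random element from the list
roll = random.randint(1, 6)         # random integer; both endpoints are included
fraction = random.random()          # random float in [0, 1)

print(base, roll, fraction)
```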

How to repeat yourself (for loops)

• Why is this useful?
  • Often, you want to do the same thing over and over again
    • Calculate the length of each chromosome in a genome
    • Look up the gene expression value for every gene
    • Align each RNA-seq read to the genome
  • A for loop takes out the monotony of doing something a bazillion times by executing a block of code over and over for you
  • Remember, programmers are lazy!
• A for loop iterates over a collection of things
  • Elements in a list
  • A range of integers
  • Keys in a dictionary

For loop syntax

Syntax:
  for <counter> in <collection_of_things>:
      # Do something

Example outputs:
  • Output 1: Hello! printed 10 times
  • Output 2: the integers 0 through 9

• The <counter> variable is the value of the current item in the collection of things
  • You can ignore it
  • You can use its value in the loop
• All code in the for loop's code block is executed at each iteration
• TIP: If you find yourself repeating something over and over, you can probably convert your code to a for loop!

Indentation matters!!!
• Indent the lines of code that belong to the same code block
• Use 1 tab
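
The two outputs described above can be reproduced with a sketch like the following. Using range(10) as the collection is an assumption; the first loop ignores the counter and the second uses its value.

```python
# Ignore the counter: print the same greeting 10 times.
for i in range(10):
    print('Hello!')

# Use the counter: range(10) yields the integers 0 through 9.
numbers = []
for i in range(10):
    print(i)
    numbers.append(i)

# A for loop also iterates over the elements of a list.
bases_seen = []
for base in ['A', 'C', 'G', 'T']:
    bases_seen.append(base.lower())
```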

Which option would you rather do?

(Figure: two alternatives, labeled A and B)

How to repeat yourself (cont.)

• For loops have a close cousin called while loops
• The major difference between the 2
  • For loops repeat a block of code a predetermined number of times (really, a collection of things)
  • While loops repeat a block of code as long as an expression is true
    • e.g., while it's snowing, repeat this block of code
  • While loops can turn into infinite while loops → the expression is never false so the loop never exits. Be careful!
• See http://learnpythonthehardway.org/book/ex33.html for a tutorial on while loops
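
A minimal while loop sketch. The countdown is an assumed example; the key point is that something inside the loop must eventually make the expression false, otherwise the loop never exits.

```python
countdown = 3
steps = []

# Repeat the block as long as the expression countdown > 0 is true.
while countdown > 0:
    print(countdown)
    steps.append(countdown)
    countdown -= 1   # without this line the expression would stay true forever

print('Done')
```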

Command-line arguments

• Why are they useful?
  • Passing command-line arguments to a Python script allows a script to be customized
• Example
  • make_nuc.py can create a random sequence of any length
  • If the length wasn't a command-line argument, the length would be hard-coded
  • To make a 10 bp sequence, we would have to 1) edit the script, 2) save the script, and 3) run the script.
  • To make a 100 bp sequence, we'd have to 1) edit the script, 2) save the script, and 3) run the script.
  • This is tedious & error-prone
  • Remember: be a lazy programmer!

Command-line arguments

• Python stores the command-line arguments as a list called sys.argv
  • sys.argv[0] # script name
  • sys.argv[1] # 1st command-line argument
  • …
• IMPORTANT: arguments are passed as strings!
  • If the argument is not a string, convert it, e.g., int(), float()
• sys.argv is a list of variables
  • The values of the variables, e.g., the A frequency, are not "plugged in" until the script is run
  • Use A_freq to stand for the A frequency that was passed as a command-line argument
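
A hedged sketch of reading and converting a command-line argument. To keep it runnable without a real command line, it uses a hand-built list (example_argv) in place of sys.argv; the helper name parse_a_freq and the idea that the first argument to make_nuc.py is the A frequency are assumptions for illustration.

```python
import sys


def parse_a_freq(argv):
    """Return the A frequency from an argv-style list as a float."""
    # argv[0] is the script name; argv[1] is the 1st command-line argument.
    # Arguments always arrive as strings, so convert before doing math.
    return float(argv[1])


# Stand-in for what sys.argv would hold after: python3 make_nuc.py 0.25
example_argv = ['make_nuc.py', '0.25']
A_freq = parse_a_freq(example_argv)
print(A_freq)

# In a real script the call would be: A_freq = parse_a_freq(sys.argv)
```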

Reading (and writing) to files in Python

Why is this concept useful?
• Often your data is much larger than just a few numbers:
  • Billions of base pairs
  • Millions of sequencing reads
  • Thousands of genes
• It may not be feasible to write all of this data in your Python script
  • Memory
  • Maintenance

How do we solve this problem?

Reading (and writing) to files in Python

The solution:
• Store the data in a separate file
• Then, in your Python script
  • Read in the data (line by line)
  • Analyze the data
  • Write the results to a new output file or print them to the terminal
• When the results are written to a file, other scripts can read in the results file to do more analysis

Input file → Python script 1 → Output file 1 → Python script 2 → Output file 2

Reading a file syntax

Syntax:
  with open(<file>, 'r') as <file_handle>:
      for <current_line> in <file_handle>:
          <current_line> = <current_line>.rstrip()
          # Do something

Output:
  >chr1
  ACGTTGATACGTA
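
The sketch below applies the syntax above to a small file. So that it is self-contained, it first writes a temporary two-line FASTA-style file; the temporary file and the variable names are assumptions for illustration, and the file contents mirror the output shown above.

```python
import os
import tempfile

# Create a small input file to read (stand-in for a real data file).
with tempfile.NamedTemporaryFile('w', suffix='.fa', delete=False) as tmp:
    tmp.write('>chr1\nACGTTGATACGTA\n')
    fasta_path = tmp.name

lines = []
# 'with' closes the file automatically when the block ends.
with open(fasta_path, 'r') as fasta_file:
    for line in fasta_file:
        line = line.rstrip()   # remove the trailing newline
        lines.append(line)
        print(line)

os.remove(fasta_path)          # clean up the temporary file
```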

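The anatomy of a (simple) script

• The first line should always be #!/usr/bin/env python3
• This special line is called a shebang
• The shebang tells the computer how to run the script
• It is NOT an ordinary comment: the operating system reads it to decide which interpreter runs the script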

• The text after the shebang is a special type of comment called a docstring, or documentation string
• Docstrings are used to explain 1) what the script does and 2) how to run it
• ALWAYS include a docstring
• Docstrings are enclosed in triple quotes, """

• The script also contains a comment
• Comments help the reader better understand the code
• Always comment your code!

• The import statement loads variables and functions from an external Python module
• The sys module contains system-specific parameters and functions

• The script grabs the command-line argument using sys.argv and stores it in a variable called name

• The script prints a statement to the terminal using the print function
• The first list of arguments are the items to print
• The argument sep="" says do not print a delimiter (i.e., a separator) between the items
• The default separator is a space.
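
Putting the pieces together, a script with this anatomy might look like the sketch below. This is a reconstruction under assumptions, not the original script: the file name hello.py, the greeting text, and the fallback to "world" when no argument is given are all made up for illustration.

```python
#!/usr/bin/env python3
"""
hello.py (hypothetical script name)

Print a greeting for the name given on the command line.

Usage: python3 hello.py <name>
"""

import sys

# Grab the first command-line argument and store it in a variable called name.
# The fallback value is an addition for this sketch so it also runs
# when no argument is supplied.
if len(sys.argv) > 1:
    name = sys.argv[1]
else:
    name = "world"

# sep="" means no separator is printed between the items;
# with the default separator there would be a space before the name and the "!".
print("Hello, ", name, "!", sep="")
```

Running it as python3 hello.py Marsha would print Hello, Marsha! under these assumptions.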
