Course Introduction
BBM101 - Introduction to Programming I
Hacettepe University, Fall 2016
Fuat Akal, Aykut Erdem, Erkut Erdem
Slides based on material prepared by Ruth Anderson, Michael Ernst and Bill Howe in the course CSE 140, University of Washington

Welcome to BBM101
• This course teaches core programming concepts with an emphasis on data manipulation tasks from science, engineering, and business
• Goal by the end of the semester: Given a data source and a problem description, you can independently write a complete, useful program to solve the problem

Course Staff
• Lecturers:
  – Asst. Prof. Dr. Fuat Akal
  – Asst. Prof. Dr. Aykut Erdem
  – Asst. Prof. Dr. Erkut Erdem
• TAs (Teaching Assistants):
  – Necva Bölücü
  – Selma Dilek
  – Burcu Yalçıner
  – Selim Yılmaz
Do not hesitate to ask TAs for help!

Learning Objectives
• Computational problem-solving
  – Writing a program will become your “go-to” solution for data analysis tasks.
• Basic Python proficiency
  – Including experience with relevant libraries for data manipulation, scientific computing, and visualization.
• Experience working with real datasets
  – Astronomy, biology, linguistics, oceanography, open government, social networks, and more.
  – You will see that these are easy to process with a program, and that doing so yields insight.

What This Course Is Not
• A “skills course” in Python
  – …though you will become proficient in the basics of the Python programming language
  – …and you will gain experience with some important Python libraries
• A data analysis / “data science” / data visualization course
  – There will be very little statistics knowledge assumed or taught
• A “project” course
  – The assignments are “real,” but are intended to teach specific programming concepts
• A “software engineering” course
  – Programming is the starting point of computer science and software engineering

“It’s a great time to be a data geek.”
-- Roger Barga, Microsoft Research

“The greatest minds of my generation are trying to figure out how to make people click on ads”
-- Jeff Hammerbacher, co-founder, Cloudera

All of Science is Reducing to Computational Data Manipulation
Old model: “Query the world” (Data acquisition coupled to a specific hypothesis)
New model: “Download the world” (Data acquisition supports many hypotheses)
  – Astronomy: High-resolution, high-frequency sky surveys (SDSS, LSST, PanSTARRS)
  – Biology: lab automation, high-throughput sequencing
  – Oceanography: high-resolution models, cheap sensors, satellites
Figure labels: 40 TB / 2 nights; ~1 TB / day; 100s of devices

Example: Assessing Treatment Efficacy
Spreadsheet columns:
  – Zip code of clinic
  – Zip code of patient
  – Number of follow-ups within 16 weeks after treatment enrollment
Question: Does the distance between the patient’s home and clinic influence the number of follow-ups, and therefore treatment efficacy?

Python Program to Assess Treatment Efficacy

    # This program reads an Excel spreadsheet whose penultimate
    # and antepenultimate columns are zip codes.
    # It adds a new last column for the distance between those zip
    # codes, and outputs in CSV (comma-separated values) format.
    # Call the program with two numeric values: the first and last
    # row to include.
    # The output contains the column headers and those rows.

    # Libraries to use
    import random
    import sys
    import xlrd  # library for working with Excel spreadsheets
    import time
    from gdapi import GoogleDirections

    # No key needed if few queries
    gd = GoogleDirections('dummy-Google-key')

    wb = xlrd.open_workbook('mhip_zip_eScience_121611a.xls')
    sheet = wb.sheet_by_index(0)

    # User input: first row to process, first row not to process
    first_row = max(int(sys.argv[1]), 2)
    row_limit = min(int(sys.argv[2]) + 1, sheet.nrows)

    def comma_separated(lst):
        return ",".join([str(s) for s in lst])

    headers = sheet.row_values(0) + ["distance"]
    print(comma_separated(headers))

    for rownum in range(first_row, row_limit):
        row = sheet.row_values(rownum)
        (zip1, zip2) = row[-3:-1]
        if zip1 and zip2:
            # Clean the data
            zip1 = str(int(zip1))
            zip2 = str(int(zip2))
            row[-3:-1] = [zip1, zip2]
            # Compute the distance via Google Maps
            try:
                distance = gd.query(zip1, zip2).distance
            except Exception:
                print("Error computing distance:", zip1, zip2, file=sys.stderr)
                distance = ""
            # Print the row with the distance
            print(comma_separated(row + [distance]))
            # Avoid too many Google queries in rapid succession
            time.sleep(random.random() + 0.5)

23 lines of executable code!

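The data-cleaning and output steps of the program above can be tried without the spreadsheet or the Google service. What follows is a minimal Python 3 sketch under stated assumptions: the sample row is made up for illustration, and the `clean_zip` helper name is hypothetical (it only wraps the `str(int(...))` step from the program). xlrd returns numeric cells as floats, which is why the zip codes pass through `int` before becoming strings.

```python
def comma_separated(lst):
    """Join the items of lst into one comma-separated string."""
    return ",".join(str(s) for s in lst)


def clean_zip(value):
    """Turn a numeric cell such as 98105.0 into the string '98105'.

    Hypothetical helper: it wraps the str(int(...)) step of the program.
    """
    return str(int(value))


# Made-up sample row (an assumption, not real data): the last three
# cells are two zip codes followed by a follow-up count.
row = ["patient-1", 98105.0, 98004.0, 3]

# Replace the antepenultimate and penultimate cells with cleaned strings.
row[-3:-1] = [clean_zip(z) for z in row[-3:-1]]

# Placeholder distance; the program above asks a directions service for it.
distance = ""

line = comma_separated(row + [distance])
print(line)
```

Note that converting through `int` would drop a leading zero from a zip code, so a real dataset containing such codes would need extra care.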
Some statistics (from U.S.)
Slide credit: code.org
• The value of a computer science education (source: Brookings)
• Computing jobs are the #1 source of new wages in the United States
  – 500,000 current openings: These jobs are in every industry and every state, and they’re projected to grow at twice the rate of all other jobs.
• The STEM* problem is in computer science:
  – 71% of all new jobs in STEM are in computing
  – 8% of STEM graduates are in computer science
  – Sources: Bureau of Labor Statistics, National Center for Education Statistics
*STEM = Science, Technology, Engineering, and Math

Course Logistics
• Website: http://web.cs.hacettepe.edu.tr/~bbm101/
• See the website for all administrative details
• Read the handouts and required texts before the lecture
• Take notes!
• Follow the course on Piazza: https://piazza.com/hacettepe.edu.tr/fall2016/bbm101

Academic Integrity
• Honest work is required of a scientist or engineer.
• Collaboration policy on the course web. Read it!
  – Discussion is permitted.
  – Carrying materials from discussion is not permitted.
  – Everything you turn in must be your own work.
• Cite your sources, explain any unconventional action.
  – You may not view others’ work.
  – If you have a question, ask.
• We trust you completely.
• But we have no sympathy for trust violations, nor should you!

How to Succeed
• No prerequisites
• Non-predictors for success:
  – Past programming experience
  – Enthusiasm for games or computers
• Programming and data analysis are challenging
• Every one of you can succeed
  – There is no such thing as a “born programmer”
  – Work hard
  – Follow directions
  – Be methodical
  – Think before you act
  – Try on your own, then ask for help
  – Start early

Integrated Development Environment (IDE)
• There are many!

Our Recommendation: PyCharm

Python Version
• Whatever IDE you choose to work with, always stick to Python version 3.5.2
• Always use this version to code your assignments.

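One way to confirm which interpreter an IDE is actually running is to ask Python itself. The sketch below is only an illustration: the version number is the one named on the slide, while the `version_matches` helper and the warning text are hypothetical names introduced here.

```python
import sys

REQUIRED = (3, 5, 2)  # version named on the slide


def version_matches(info=None, required=REQUIRED):
    """Return True if the major, minor and micro fields equal `required`.

    Hypothetical helper; `info` defaults to the running interpreter.
    """
    if info is None:
        info = sys.version_info
    return tuple(info[:3]) == required


if not version_matches():
    # Report the running version next to the required one.
    print("Warning: running Python %d.%d.%d, course asks for %d.%d.%d"
          % (tuple(sys.version_info[:3]) + REQUIRED))
```

Running `python --version` in a terminal reports the same information for the interpreter on the command line, which may differ from the one configured in the IDE.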
Books
• There are many!

Our Recommendation for Books
• The Python Tutorial, available from the Python website.
  – This is good for explaining the nuts and bolts of how Python works.
• Introduction to Computation and Programming Using Python, Second Edition, John V. Guttag, MIT Press, August 2016
• Think Python, 2nd edition
  – Freely available online in HTML and PDF.
  – Also available for purchase as a printed book, but don't buy the first edition.
  – This book introduces more conceptual material, motivating computational thinking.
• There is an interactive version of “How to Think Like a Computer Scientist” (the first edition of “Think Python”), which lets you type and run Python code directly while reading the book.