Databases

Standard stuff

• Class webpage: cs.rhodes.edu/db
• Textbook: get it somewhere; used is fine
  – Stay up with reading!
• Prerequisite: CS 241
• Coursework:
  – Homework, group project, midterm, final
• Be prepared to bring laptops every so often.

Group project

• You will design and implement your own database-driven website.
• Ideas: shopping, auctions, write a better BannerWeb, library/bibliography system, reviews à la Yelp, bank, finance/stocks, job postings, social networking à la Facebook, recipes, movies, apartments, …
• Groups: probably 4-5 people, formed on your own.
• Spread out over the whole semester; check-ins along the way.

Why study databases?

• Academic reasons
• Programming reasons
• Business (get a job) reasons
• Student reasons

What will you learn?

• Database design
  – How do you model your data so it can be stored in a database?
• Database programming
  – How do I use a database to ask it questions?
• Database implementation
  – How does the database itself work; i.e., how does it store, find, and retrieve data efficiently?

What is the goal of a database?

• Electronic record-keeping, enabling fast and convenient access to the information inside.
• DBMS = Database management system
  – Software that stores individual databases and knows how to search the information inside.
  – RDBMS = Relational DBMS
  – Examples: Oracle, MS SQL Server, MS Access, MySQL, PostgreSQL, IBM DB2, SQLite

DBMS Features

• Support massive amounts of data
  – Giga-, tera-, petabytes
• Persistent storage
  – Data continues to live long after the program finishes.
• Efficient and convenient access
  – Efficient: don't search the entire thing to answer a question!
  – Convenient: allow users to ask questions as easily as possible.
• Secure, concurrent, and atomic access

Example: build a better BannerWeb

• Professors offer classes, students sign up, get grades
• What are some questions we (students or faculty) could ask?
  – Find my GPA.
  – …
• Why are security, concurrency, and atomicity important here?

Obvious solution: Folders

• Advantages?

• Disadvantages?

Obvious solution++

• Text files and Python/C++/Java programs

Obvious solution++

• Let's use CSV:

    Hermione,Granger,R123,Potions,A
    Draco,Malfoy,R111,Potions,B
    Harry,Potter,R234,Potions,A
    Ronald,Weasley,R345,Potions,C

    Hermione,Granger,R123,Potions,A
    Draco,Malfoy,R111,Potions,B
    Harry,Potter,R234,Potions,A
    Ronald,Weasley,R345,Potions,C
    Harry,Potter,R234,Herbology,B
    Hermione,Graner,R123,Herbology,A

File 1:

    Hermione,Granger,R123
    Draco,Malfoy,R111
    Harry,Potter,R234
    Ronald,Weasley,R345

File 2:

    R123,Potions,A
    R111,Potions,B
    R234,Potions,A
    R345,Potions,C
    R234,Herbology,B
    R123,Herbology,A

Problems

• Inconvenient
  – Need to know Python/C++/Java to get at data!
• Redundancy/inconsistency
• Integrity problems
• Atomicity problems
• Concurrent access problems
• Security problems
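The redundancy/inconsistency problem can be made concrete with a short script. This is only a minimal sketch, assuming the six-row CSV from the earlier slide is held in a string; the column order (first, last, student ID, course, grade) is inferred from the data, and the helper name `find_inconsistent_names` is made up for illustration.

```python
import csv
import io
from collections import defaultdict

# The six-row CSV from the earlier slide; the last row spells the name "Graner".
CSV_TEXT = """\
Hermione,Granger,R123,Potions,A
Draco,Malfoy,R111,Potions,B
Harry,Potter,R234,Potions,A
Ronald,Weasley,R345,Potions,C
Harry,Potter,R234,Herbology,B
Hermione,Graner,R123,Herbology,A
"""


def find_inconsistent_names(text):
    """Return {student_id: names} for every ID stored under more than one name."""
    names_by_id = defaultdict(set)
    for first, last, sid, _course, _grade in csv.reader(io.StringIO(text)):
        names_by_id[sid].add((first, last))
    return {sid: names for sid, names in names_by_id.items() if len(names) > 1}


inconsistent = find_inconsistent_names(CSV_TEXT)
print(inconsistent)
```

Because each student's name is repeated on every enrollment row, nothing stops two rows for R123 from disagreeing. Splitting the data into File 1 and File 2 stores each name once.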

Why are there problems?

• Two main reasons:
  – The description of how the files are laid out is buried within the Python/C++/Java code itself (if it's documented at all)
  – There is no support for transactions (supporting concurrency, atomicity, integrity, and recovery)
• DBMSs handle exactly these two problems.

Relational database systems

• Edgar F. Codd was a researcher at IBM who conceived a new way of organizing data based on the mathematical concept of a relation.
• Relation: a set of ordered tuples (oh, no, CS 172 stuff…)
• RDBMS = Relational database management system.
• The relational model uses relations (aka tables) to structure data.

• Grades relation:

First     Last     Course   Grade
Hermione  Granger  Potions  A
Draco     Malfoy   Potions  B
Harry     Potter   Potions  A
Ronald    Weasley  Potions  C

• Relational model is an abstraction.
• Separates the logical view (as viewed by the DB user) from the physical view (DB's internal representation of the data)


• Structured Query Language (SQL) for accessing/modifying data:
• Find all students who are getting a B.
  – SELECT First, Last FROM Grades WHERE Grade = 'B'
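To try the query, one option is Python's built-in `sqlite3` module with an in-memory database. This is a sketch, not the course's required setup: the all-TEXT column types are an assumption, and the rows are the ones from the Grades relation shown earlier.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Grades (First TEXT, Last TEXT, Course TEXT, Grade TEXT)")
conn.executemany(
    "INSERT INTO Grades VALUES (?, ?, ?, ?)",
    [
        ("Hermione", "Granger", "Potions", "A"),
        ("Draco", "Malfoy", "Potions", "B"),
        ("Harry", "Potter", "Potions", "A"),
        ("Ronald", "Weasley", "Potions", "C"),
    ],
)

# Keep only the rows with Grade 'B', and only the First and Last columns.
b_students = conn.execute(
    "SELECT First, Last FROM Grades WHERE Grade = 'B'"
).fetchall()
print(b_students)  # [('Draco', 'Malfoy')]
conn.close()
```

The query says what is wanted, not how to find it; the DBMS decides how to scan or index the table.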


Transaction processing

• One or more DB operations can be grouped into a transaction.
• For a DBMS to properly implement transactions, it must guarantee:
  – Atomicity: All-or-nothing execution of transactions.
  – Consistency: A DB can have consistency rules that should not be violated.
  – Isolation: Each transaction must appear to be executed as if no other transactions are happening simultaneously.
  – Durability: Any changes a transaction makes must never be lost.
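Atomicity and consistency can be seen in a small experiment. The sketch below uses a hypothetical `Accounts` table (borrowed from the "bank" project idea, not from the course materials) and a made-up `transfer` helper. It relies on `sqlite3` connections acting as context managers that commit on success and roll back on an exception.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (an assumption for this example): balances may not go negative.
conn.execute(
    "CREATE TABLE Accounts (Owner TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("Harry", 100), ("Ronald", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing transaction."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Owner = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Owner = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Ronald has only 20, so the debit violates the CHECK and the credit is undone too.
overdraft_ok = transfer(conn, "Ronald", "Harry", 50)
balances_after_failure = dict(conn.execute("SELECT Owner, Balance FROM Accounts"))

transfer_ok = transfer(conn, "Harry", "Ronald", 30)
balances_after_success = dict(conn.execute("SELECT Owner, Balance FROM Accounts"))
print(overdraft_ok, balances_after_failure, transfer_ok, balances_after_success)
```

The credit is deliberately applied before the debit so that the failed transfer leaves a half-finished change behind for the rollback to undo.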

On to the real stuff now…

Data Models

• A way of describing data.
  – Better: a description of how to conceptually structure the data, what operations are possible on the data, and any constraints on the data.
• Structure: how we view the data abstractly
• Operations: what is possible to do with the data?
• Constraints: how can we control what data is legal and what is not?

Relational model

• Structure: relation (table)
• Operations: relational algebra (select certain rows, certain columns, where properties are true/false)
• Constraints: can enforce restrictions like Grade must be in {A, B, C, D, F}
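The Grade restriction can be expressed as a `CHECK` constraint. The following is one possible sketch using Python's `sqlite3` module; the column types and the invalid grade 'Z' are chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Grades (
        First  TEXT,
        Last   TEXT,
        Course TEXT,
        Grade  TEXT CHECK (Grade IN ('A', 'B', 'C', 'D', 'F'))
    )
    """
)

# A legal grade is accepted.
conn.execute("INSERT INTO Grades VALUES ('Draco', 'Malfoy', 'Potions', 'B')")

# An illegal grade is rejected by the DBMS itself, not by application code.
try:
    conn.execute("INSERT INTO Grades VALUES ('Ronald', 'Weasley', 'Potions', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Grades").fetchone()[0]
print(rejected, row_count)
```

Declaring the rule in the schema means every program that touches the table is held to it.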


Other models

• Semi-structured: data that is still "structured" but not in relational format.
  – XML, JSON
• Object databases, or object-relational
• Graph databases
• NoSQL, NewSQL

Semi-structured model

• Structure: Trees or graphs
  – e.g., XML
• Operations: Follow paths in the implied tree from one element to another.
  – e.g., XQuery
• Constraints: can constrain data types, possible values, etc.
  – e.g., DTDs (document type definition), XML Schema
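As a rough illustration of "following paths", the sketch below uses Python's standard `xml.etree.ElementTree` module. Its limited XPath subset stands in for XQuery here and is not XQuery itself. The XML layout (`grades`, `student`, `course` elements and their attributes) is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML encoding of some of the grade data.
XML_TEXT = """
<grades>
  <student first="Hermione" last="Granger">
    <course name="Potions" grade="A"/>
    <course name="Herbology" grade="A"/>
  </student>
  <student first="Draco" last="Malfoy">
    <course name="Potions" grade="B"/>
  </student>
</grades>
"""

root = ET.fromstring(XML_TEXT.strip())

# Follow the path grades -> student -> course, keeping courses with grade "A".
a_courses = [
    (student.get("first"), course.get("name"))
    for student in root.findall("student")
    for course in student.findall("course[@grade='A']")
]
print(a_courses)
```

Unlike a relation, the tree lets each student carry a different number of nested course elements.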

Object-relational

• Similar to relational, but
  – Values in a table can have their own structure, rather than being simple strings or ints.
  – Relations can have associated methods.

Relational model is most common

• Simple: built around a single concept for modeling data: the relation or table.
  – A relational database is a collection of relations.
  – Each relation is a table with rows and columns.
  – An RDBMS can manage many databases at once.
• Supports high-level programming language (SQL)
  – Limited but useful set of operations.
• Has elegant mathematical theory behind it.

Relation Terminology

• Relation == 2D table
  – Attribute == column name
  – Tuple == row (not the header row)
• Database == collection of relations

First     Last     Course   Grade
Hermione  Granger  Potions  A
Draco     Malfoy   Potions  B
Harry     Potter   Potions  A
Ronald    Weasley  Potions  C

Relation Terminology

• A relation includes two parts:
  – The relation schema defines the column headings of the table (attributes/fields)
  – The relation instance defines the data rows (tuples, rows, or records) of the table.


Schema

• A schema is written by the name of the relation followed by a parenthesized list of attributes.
  – Grades(First, Last, Course, Grade)
• A relational database schema is the set of schemas for all the relations in a DB.


Domains

• A relational DB requires that every component of a row (tuple) have a specific elementary data type, or domain.
  – string, int, float, date, time (no complicated objects!)

Grades(First:string, Last:string, Course:string, Grade:char)

Equivalent representations of a relation

Grades(First, Last, Course, Grade)

• Relation is a set of tuples, not a list.
• Attributes in a schema are a set as well.
  – However, the schema specifies a "standard" order for the attributes.
• How many equivalent representations are there for a relation with m attributes and n tuples?
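One way to reason about the question: the n tuples can be listed in any of n! row orders and the m attributes in any of m! column orders, which suggests m!·n! table layouts for the same relation (assuming all tuples are distinct). The sketch below checks this by brute force on a deliberately small, cut-down relation; the helper names `layouts` and `as_relation` are illustrative only.

```python
from itertools import permutations
from math import factorial

# A cut-down relation (3 attributes, 2 tuples) so brute force stays tiny.
attributes = ("First", "Last", "Grade")
tuples = [("Hermione", "Granger", "A"), ("Draco", "Malfoy", "B")]


def layouts(attrs, rows):
    """Yield every (header, rows) layout obtained by reordering columns and rows."""
    for col_order in permutations(range(len(attrs))):
        header = tuple(attrs[i] for i in col_order)
        reordered = [tuple(row[i] for i in col_order) for row in rows]
        for row_order in permutations(reordered):
            yield (header, row_order)


def as_relation(header, rows):
    """Order-free view: a set of rows, each row a set of (attribute, value) pairs."""
    return frozenset(frozenset(zip(header, row)) for row in rows)


all_layouts = set(layouts(attributes, tuples))
distinct_relations = {as_relation(header, rows) for header, rows in all_layouts}

m, n = len(attributes), len(tuples)
print(len(all_layouts), factorial(m) * factorial(n), len(distinct_relations))
```

Every layout collapses to a single order-free relation. By the same reasoning, the full Grades relation (m = 4, n = 4) would have 4!·4! = 576 layouts.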


Degree and cardinality

• Degree/arity of a relation is the number of attributes in a relation.
• Cardinality is the number of tuples in a relation.


Keys to a good relation(ship)

Keys of a relation

• Keys are a kind of integrity constraint.
• A set of attributes K forms a key for a relation R if:
  – we forbid two tuples in an instance of R to have the same values for all attributes of K.

First     Last     Course   Grade
Hermione  Granger  Potions  A
Draco     Malfoy   Potions  B
Harry     Potter   Potions  A
Ronald    Weasley  Potions  C

Grades(First, Last, Course, Grade)
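A key is a rule we declare about every possible instance, so a single instance can only show that a candidate key fails, never prove that it holds. With that caveat, the sketch below tests candidate attribute sets against the Grades instance. The function name `violates_key` is made up for illustration, and the extra Herbology row is borrowed from the CSV example.

```python
def violates_key(attrs, rows, key):
    """Return True if two distinct rows agree on every attribute in `key`."""
    idx = [attrs.index(a) for a in key]
    seen = set()
    for row in set(rows):  # a relation is a set, so exact duplicates collapse
        projection = tuple(row[i] for i in idx)
        if projection in seen:
            return True
        seen.add(projection)
    return False


attrs = ("First", "Last", "Course", "Grade")
grades = [
    ("Hermione", "Granger", "Potions", "A"),
    ("Draco", "Malfoy", "Potions", "B"),
    ("Harry", "Potter", "Potions", "A"),
    ("Ronald", "Weasley", "Potions", "C"),
]

grade_fails = violates_key(attrs, grades, ["Grade"])          # two students have an A
name_fails = violates_key(attrs, grades, ["First", "Last"])   # no clash in this instance

# Once a student takes a second course, {First, Last} stops working as a key.
more_grades = grades + [("Harry", "Potter", "Herbology", "B")]
name_fails_later = violates_key(attrs, more_grades, ["First", "Last"])
name_course_fails = violates_key(attrs, more_grades, ["First", "Last", "Course"])
print(grade_fails, name_fails, name_fails_later, name_course_fails)
```

This is why the choice of key depends on what the data means, not just on the rows that happen to be in the table today.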

Keys of a relation

• Keys help associate tuples in different relations.

SID  CRN  Grade
123  777  A
111  777  B
234  777  A
345  777  C

SID  First     Last
123  Hermione  Granger
111  Draco     Malfoy
234  Harry     Potter
345  Ronald    Weasley

CRN  Name             Semester  Year
777  Potions          Fall      1997
888  Potions          Spring    1997
999  Transfiguration  Fall      1996
789  Transfiguration  Spring    1996
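The association through keys can be sketched with a join. The slides do not name the three relations, so `Students`, `Courses`, and `Enrollments` below are assumed names, as are the column types and the primary-key choices; the data is taken from the three tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Students (SID INTEGER PRIMARY KEY, First TEXT, Last TEXT);
    CREATE TABLE Courses (CRN INTEGER PRIMARY KEY, Name TEXT, Semester TEXT, Year INTEGER);
    CREATE TABLE Enrollments (
        SID   INTEGER REFERENCES Students(SID),
        CRN   INTEGER REFERENCES Courses(CRN),
        Grade TEXT,
        PRIMARY KEY (SID, CRN)
    );
    INSERT INTO Students VALUES
        (123, 'Hermione', 'Granger'), (111, 'Draco', 'Malfoy'),
        (234, 'Harry', 'Potter'), (345, 'Ronald', 'Weasley');
    INSERT INTO Courses VALUES
        (777, 'Potions', 'Fall', 1997), (888, 'Potions', 'Spring', 1997),
        (999, 'Transfiguration', 'Fall', 1996), (789, 'Transfiguration', 'Spring', 1996);
    INSERT INTO Enrollments VALUES
        (123, 777, 'A'), (111, 777, 'B'), (234, 777, 'A'), (345, 777, 'C');
    """
)

# SID links each enrollment to a student; CRN links it to a course offering.
rows = conn.execute(
    """
    SELECT s.First, s.Last, c.Name, c.Semester, c.Year, e.Grade
    FROM Enrollments AS e
    JOIN Students AS s ON s.SID = e.SID
    JOIN Courses  AS c ON c.CRN = e.CRN
    ORDER BY s.SID
    """
).fetchall()
for row in rows:
    print(row)
```

Each fact is stored once (a name in Students, a course offering in Courses), and the join reassembles the wide Grades-style view on demand.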

Example

• Let's expand these relations to handle the kinds of things you'd like to see in BannerWeb.
• Keep track of students, professors, courses, who teaches what, enrollments, pre-requisites, grades, departments & their chairs.
  – Only one chair per department.
  – Student cannot enroll in multiple copies of the same course in one semester.
  – Other constraints that are logical.