Database Management Systems
– Chapter 5 –
Foundations of Information Systems (WS 2008/09)
© 2009 Prof. Dr. Rainer Manthey, Foundations of IM

DBMS architecture
Main components of a database management system (DBMS) and their interaction:
[Figure: User Interface; Schema Manager, Query Manager, Transaction Manager; Storage Manager; Data Dictionary, Database, Log]

Storage manager basics

Storage hierarchy of a computer
Access times (approx.):
• Register: 1-10 ns
• Cache: 10-100 ns
• Main memory: 100-1000 ns (this is where queries are executed!)
• Disc storage: 10 ms (this is where data are stored!)
• Archive storage: sec
1 ms = 1/1000 sec, 1 μs = 1/1000 ms, 1 ns = 1/1000 μs
"Access gap" between main memory and disc storage: factor 10^5
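
The size of the access gap can be checked with a little arithmetic. The sketch below assumes representative values picked from the approximate ranges above (100 ns for main memory, 10 ms for disc); the variable names are illustrative only.

```python
# Representative access times in nanoseconds (assumed values from the ranges above).
NS_PER_MS = 1_000_000            # 1 ms = 1000 μs = 1,000,000 ns

main_memory_ns = 100             # lower end of the 100-1000 ns range
disc_ns = 10 * NS_PER_MS         # 10 ms

access_gap = disc_ns // main_memory_ns
print(access_gap)                # 100000, i.e. a factor of 10^5
```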

Database buffer and storage hierarchy
[Figure: main memory (processor, transient) holding the buffered pages; background store (disc, persistent); pages move between the two by "fetch" and "replace"]
• Operations on data are always performed in main memory only!
• Consequence: Data to be manipulated have to be read into main store before modification ("buffering") and have to be written back to background storage media some time later by the DBMS.
• For the DBMS, the machine's main memory thus serves as a buffer.
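
The fetch/replace cycle can be sketched as a toy buffer manager. This is only an illustration: the class name, the dictionary standing in for the disc, and the least-recently-used replacement policy are assumptions, not something the slides prescribe.

```python
from collections import OrderedDict

class BufferManager:
    """Toy database buffer: keeps at most `capacity` pages in main memory."""

    def __init__(self, disc, capacity):
        self.disc = disc                 # dict page_no -> content (stands in for the background store)
        self.capacity = capacity
        self.frames = OrderedDict()      # page_no -> [content, dirty flag], kept in LRU order

    def fetch(self, page_no):
        """Return the buffered page, reading it from disc if it is not buffered yet."""
        if page_no in self.frames:
            self.frames.move_to_end(page_no)          # mark as most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._replace()
            self.frames[page_no] = [self.disc[page_no], False]
        return self.frames[page_no][0]

    def write(self, page_no, content):
        """Modify a page in main memory only; the disc copy is updated later."""
        self.fetch(page_no)
        self.frames[page_no] = [content, True]

    def _replace(self):
        """Evict the least recently used page, writing it back if it was modified."""
        victim, (content, dirty) = self.frames.popitem(last=False)
        if dirty:
            self.disc[victim] = content

disc = {1: "page one", 2: "page two", 3: "page three"}
buf = BufferManager(disc, capacity=2)
buf.write(1, "page one (modified)")   # change exists in the buffer only
buf.fetch(2)
buf.fetch(3)                          # buffer full: page 1 is replaced and written back
```

Until the replacement happens, the modified page exists only in the transient buffer, which is why the transaction manager needs a log.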

Storing tuples in files on pages
General principle of mapping relations to pages (on disc):
• per relation: A file subdivided into several consecutive pages
• per page: An internal record table containing pointers to all tuples/records on this page
• Addressing individual tuples via a tuple identifier (TID)
• TID consists of two parts:
  • A page number (absolute address)
  • A pointer in the record table pointing to a particular position within the page where the respective tuple starts (relative address)
• Indirection is useful during reorganisation of the page.
[Figure: page # 4711 with record table entries 1, 2, 3 and the records "5001 Foundations ...", "5041 Databases ...", "4052 Logics ..."; the TID (4711, 2) refers to entry 2 of the record table on this page]
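
The benefit of the indirection can be illustrated with a small sketch of a page that has a record table. The class and method names are hypothetical, and the record table here stores a length next to each offset, which is an assumption made for simplicity. Which record belongs to which slot in the figure is not recoverable, so the slot assignment below is also just an example.

```python
class Page:
    """Toy page: a byte area plus a record table of (offset, length) entries; slots start at 1."""

    def __init__(self, page_no):
        self.page_no = page_no
        self.data = bytearray()
        self.record_table = []

    def insert(self, record: bytes):
        """Store a record and return its TID (page number, slot number)."""
        self.record_table.append((len(self.data), len(record)))
        self.data += record
        return (self.page_no, len(self.record_table))

    def read(self, tid):
        """Follow the TID: page number first, then the record table entry."""
        page_no, slot = tid
        assert page_no == self.page_no
        offset, length = self.record_table[slot - 1]
        return bytes(self.data[offset:offset + length])

    def reorganise(self):
        """Reverse the physical order of the records; only the record table changes."""
        new_data, new_table = bytearray(), [None] * len(self.record_table)
        for slot in reversed(range(len(self.record_table))):
            offset, length = self.record_table[slot]
            new_table[slot] = (len(new_data), length)
            new_data += self.data[offset:offset + length]
        self.data, self.record_table = new_data, new_table

page = Page(4711)
page.insert(b"5001 Foundations")
tid = page.insert(b"5041 Databases")   # (4711, 2)
page.insert(b"4052 Logics")
page.reorganise()                      # records move inside the page, the TID stays valid
```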

Access paths and indexes
• DBMS normally make use of various internal auxiliary data structures in order to speed up access to larger tables. Such additional data structures are called access paths.
• The most important such access aids are called indexes, special forms of search trees offering direct access to values of particularly important and frequently accessed attributes of a table (rather than sequentially searching all fields of all rows). Often, at least the primary key attribute(s) of a table are supported by an index for rapid access.
• The notion 'index' is motivated by the concept of index pages of books. Here, an additional part at the end of the book contains keyword-page reference lists, so that certain keywords need not be searched for sequentially.
• SQL offers special DDL commands for creating and deleting indexes on one or several columns of a table:
CREATE INDEX <index-name> ON <table-name> (<list-of-column-names>)
DROP INDEX <index-name>
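
As a concrete instance of the two commands, the following sketch runs them against an in-memory SQLite database through Python's standard `sqlite3` module. The table layout and the index name `idx_students_name` are assumptions chosen for illustration; other DBMS expose their catalogue differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (MatrNr INTEGER PRIMARY KEY, Name TEXT, Semester INTEGER)")

# Secondary index on a frequently searched non-key column.
conn.execute("CREATE INDEX idx_students_name ON Students (Name)")

catalogue_query = "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Students'"
index_names = [row[0] for row in conn.execute(catalogue_query)]

conn.execute("DROP INDEX idx_students_name")
remaining = [row[0] for row in conn.execute(catalogue_query)]
```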

Indexes: Overview
• Per relation, one index normally supports the primary key (often created automatically): the primary index.
• In addition, several secondary indexes may exist for other frequently used columns, even if these columns/attributes do not possess the key property.
• For implementing indexes, special data structures are used by the DBMS, the choice of which can be influenced by specially authorized DB administrators in large commercial systems:
  • Search trees:
    • ISAM files
    • B-trees and variants
    • R-trees and variants (higher-dimensional index structures)
  • Hash tables (tables associated with special access functions)
• When using an index, there is always a "tradeoff" between resources: A reduction in access time has to be weighed against an increase in administration time for index maintenance and in storage used by the extra index data.

ISAM organisation
Index-Sequential Access Method (ISAM)
• Simplest form of indexing.
• Each entry on the index pages consists of values of the indexed attribute (search key S). Each entry on the data pages is a value of another attribute (data D).
• On the index pages, search key values alternate with pointers to corresponding data pages.
• Search within a page is done sequentially! Modifications on index pages may lead to high administration overhead (rebalancing of neighbour pages, redirection of pointers, creation of new pages, moving of key values).
[Figure: an index page with search keys S1, S2, ..., Sk, ..., Sl, ..., Sm; between the keys, pointers lead to data pages holding D1 D2 ... Ds, Dt ..., Dv ... Dw; the pointer between S1 and S2 leads to the data page with keys ≥ S1 and < S2]
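
A lookup in such a structure can be sketched as follows. The key values, the list-based page layout, and the function name are assumptions for illustration; both the index page and the selected data page are scanned sequentially, as stated above.

```python
# One index page: sorted search keys; data_pages[i] holds the keys >= index_keys[i]
# and < index_keys[i + 1] (example values are made up).
index_keys = [10, 40, 70]
data_pages = [
    [(10, "a"), (25, "b")],
    [(40, "c"), (55, "d")],
    [(70, "e"), (90, "f")],
]

def isam_lookup(key):
    """Find the data page via the index page, then scan that page sequentially."""
    page_no = None
    for i, s in enumerate(index_keys):      # sequential search on the index page
        if s <= key:
            page_no = i
        else:
            break
    if page_no is None:
        return None                         # key is smaller than every indexed key
    for k, value in data_pages[page_no]:    # sequential search on the data page
        if k == key:
            return value
    return None
```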

B-tree: Idea
B-Tree (Bayer, McCreight 1970)
Generalization of the ISAM idea to multiple levels of indexing plus mixing data and index values:
• Each node corresponds to one page in store
• Tree is balanced: equally long branches
• At least 50% of each page is filled with data
[Figure: a B-tree node in which pointers alternate with search keys; each search key is accompanied by a data entry (for a primary index) or a TID (for a secondary index)]

Storage structures: Summary of main ideas
• A database physically consists of a collection of files accessed solely by the storage manager of the DBMS.
• These files reside on large, persistent storage media, normally on discs. This part of a computer's store is called secondary storage or background store, as opposed to its comparatively small, but fast primary or main store.
• There is no such thing as rows or columns in files, but just sequences of individual symbols (actually just bits!). Thus, the DBMS has to
  • encode relational tuples/fields as bit sequences when storing them away
  • decode them again when fetching them for manipulation
• There are a few more basic principles of the physical structure of a database:
  • The beginning of each tuple (stored as a bit sequence) is marked by a reference called tuple identifier (TID).
  • Tuples are grouped into larger units called pages: when accessing the database, no individual tuples, but entire pages are fetched.
  • Pages in turn are organized in the form of search trees, called indexes.
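
The encode/decode step can be made concrete with Python's standard `struct` module. The fixed-length record layout (two 4-byte integers around a 30-byte title) and the sample values are assumptions for illustration; real systems use more elaborate formats.

```python
import struct

# Assumed record layout: 4-byte integer, 30-byte title (zero-padded), 4-byte integer.
RECORD = struct.Struct("<i30si")

def encode(course_nr, title, read_by):
    """Turn a tuple into a fixed-length byte sequence for storage."""
    return RECORD.pack(course_nr, title.encode("utf-8"), read_by)

def decode(raw):
    """Turn the stored byte sequence back into a tuple."""
    course_nr, title, read_by = RECORD.unpack(raw)
    return course_nr, title.rstrip(b"\x00").decode("utf-8"), read_by

raw = encode(5041, "Databases", 2125)   # 38 bytes, no rows or columns visible
```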

Query manager basics

Query processing: Overview
• One of the most important tasks of a DBMS is efficient processing of queries.
• In principle, queries are treated like programs written in a programming language, i.e., they are given to a compiler.
• The most important phase of the compilation process is the attempt to improve the efficiency of the final execution phase, called query optimization.
[Figure: SQL query → 1. scanning, parsing, view expansion → RA-expression → 2. optimization → execution plan → 3. code generation / interpretation → executable program]

Query optimization: Phases
• The optimization phase tries to improve efficiency of execution, even though it may not always reach an "optimal" solution (thus the notion is misleading).
• Logical optimization: From an initial internal representation of the query in RA, an equivalent but more efficiently executable algebra expression is generated by equivalence-preserving transformations.
• Physical optimization: Afterwards a suitable implementation variant is chosen for each RA-operator. This is done by exploiting several special parameters of the operand relations such as size, selectivity, sorting, indexing etc.; often cost models are used, too.
• At the end of the optimization phase a query execution plan (short: QEP) is generated which can be further improved individually ("by hand") by a DB specialist when using a commercial DBMS ("database tuning").
• Strategies and heuristics used by a commercial query optimizer are often very badly documented as they constitute a "well hidden" secret of the respective vendor.

Example of logical optimization (1)
In which semester are those students enrolled who attend courses read by professor Sokrates?
in SQL:
SELECT DISTINCT s.Semester
FROM Students s, Attends a, Courses c, Professors p
WHERE p.Name = 'Sokrates'
  AND p.PersNr = c.ReadBy
  AND a.CourseNr = c.CourseNr
  AND a.MatrNr = s.MatrNr
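
To see the query at work, the sketch below executes it on a tiny in-memory SQLite database. The table definitions beyond the columns used in the query, and all sample rows, are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Professors (PersNr INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Courses    (CourseNr INTEGER PRIMARY KEY, Title TEXT, ReadBy INTEGER);
CREATE TABLE Students   (MatrNr INTEGER PRIMARY KEY, Name TEXT, Semester INTEGER);
CREATE TABLE Attends    (MatrNr INTEGER, CourseNr INTEGER, PRIMARY KEY (MatrNr, CourseNr));

INSERT INTO Professors VALUES (1, 'Sokrates'), (2, 'Kant');
INSERT INTO Courses    VALUES (10, 'Logics', 1), (20, 'Ethics', 2);
INSERT INTO Students   VALUES (100, 'Ann', 3), (200, 'Bob', 5), (300, 'Cy', 3);
INSERT INTO Attends    VALUES (100, 10), (200, 20), (300, 10);
""")

rows = conn.execute("""
SELECT DISTINCT s.Semester
FROM Students s, Attends a, Courses c, Professors p
WHERE p.Name = 'Sokrates'
  AND p.PersNr = c.ReadBy
  AND a.CourseNr = c.CourseNr
  AND a.MatrNr = s.MatrNr
""").fetchall()
print(rows)   # [(3,)]: Ann and Cy both attend course 10, DISTINCT keeps one row
```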

Example of logical optimization (2)
Operator tree in canonical representation of the SQL query:
π[s.Semester](
  σ[p.Name = 'Sokrates' ∧ p.PersNr = c.ReadBy ∧ a.CourseNr = c.CourseNr ∧ a.MatrNr = s.MatrNr](
    ((s × a) × c) × p ) )

Example of logical optimization (3)
Splitting and pushing selections
[Figure: the canonical operator tree π[s.Semester]( σ[p.Name = 'Sokrates' ∧ p.PersNr = c.ReadBy ∧ a.CourseNr = c.CourseNr ∧ a.MatrNr = s.MatrNr]( ((s × a) × c) × p ) ); the conjunctive selection is split into its individual conditions, which are then pushed down the tree. One selection is annotated "not movable!".]

Example of logical optimization (4)
Forming joins from products and selections
Operator tree with split selections, each placed directly above the product that provides its attributes:
π[s.Semester](
  σ[p.PersNr = c.ReadBy](
    σ[a.CourseNr = c.CourseNr]( σ[a.MatrNr = s.MatrNr](s × a) × c )
    × σ[p.Name = 'Sokrates'](p) ) )

Example of logical optimization (5)
Reordering joins
Operator tree with each product and its selection combined into a join (the three joins are numbered 1 to 3 in the figure):
π[s.Semester](
  ((s ⋈[a.MatrNr = s.MatrNr] a) ⋈[a.CourseNr = c.CourseNr] c)
  ⋈[p.PersNr = c.ReadBy] σ[p.Name = 'Sokrates'](p) )
Annotation: Smallest operand should participate in the 1st join!

Example of logical optimization (6)
Operator tree after reordering (joins numbered 1 to 3 from the innermost to the outermost):
π[s.Semester](
  ((σ[p.Name = 'Sokrates'](p) ⋈[p.PersNr = c.ReadBy] c)
   ⋈[a.CourseNr = c.CourseNr] a)
  ⋈[a.MatrNr = s.MatrNr] s )
... all other joins are then ordered automatically.

Logical optimization: Theoretical foundations
• Justification for reorderings during logical optimization: In each step, equivalence-preserving transformations are applied, e.g.:
    σ[p](R1 × R2) ⇒ (σ[p](R1)) × R2    if attr(p) ⊆ attr(R1) and attr(p) ∩ attr(R2) = ∅
• Always:
  • Answer set is identical over each DB state before and after transformation ("equivalence").
  • Evaluation of the transformed expression is often (in many cases, not always) more efficient than before the transformation.
• But:
  • Transformations do not come with any guarantee for efficiency gains!
  • Transformation rules are heuristics ("rules of thumb"), which have to be applied in a "clever" way.
• Therefore:
  • Notion "optimization" is misleading!
  • If at all, only certain improvements of efficiency can be achieved; a (theoretically imaginable) optimal solution will rarely be reached.
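
The selection rule can be checked on a small example. The sketch below models relations as lists of dictionaries; the helper names `cross` and `select` and the sample tuples are assumptions for illustration.

```python
from itertools import product

R1 = [{"PersNr": 1, "Name": "Sokrates"}, {"PersNr": 2, "Name": "Kant"}]
R2 = [{"CourseNr": 10, "ReadBy": 1}, {"CourseNr": 20, "ReadBy": 2}]

def cross(r, s):
    """Cartesian product of two relations given as lists of dicts."""
    return [{**x, **y} for x, y in product(r, s)]

def select(pred, r):
    """Selection: keep the tuples satisfying the predicate."""
    return [t for t in r if pred(t)]

def p(t):
    # attr(p) = {Name}, which is a subset of attr(R1) and disjoint from attr(R2)
    return t["Name"] == "Sokrates"

before = select(p, cross(R1, R2))      # selection applied to all 4 product tuples
after = cross(select(p, R1), R2)       # product built from 1 tuple of R1 only
```

Both expressions yield the same two tuples, but the second one never materialises the full product.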

Physical optimization: Overview
"logical" RA-expression → logical optimization → improved logical RA-expression → physical optimization → evaluation plan (using physical RA-operators)
Main tasks of physical optimization:
• Choosing suitable algorithms for physically realizing RA-operators: physical operators, physical algebra
• Exploiting existing access paths to storage structures: indexes, B-trees, hash tables
• Sorting of intermediate results
• Creating temporary indexes and hash tables
• Planning of intermediate storage usage
• Estimating costs of different implementation variants based on predefined cost models

Physical optimization: Example
Result of logical optimization:
π[s.Semester]( ((σ[p.Name = 'Sokrates'](p) ⋈[p.PersNr = c.ReadBy] c) ⋈[a.CourseNr = c.CourseNr] a) ⋈[a.MatrNr = s.MatrNr] s )
QEP after physical optimization (operators of the figure, listed bottom-up):
• Select[p.Name = 'Sokrates'] on p
• IndexJoin[p.PersNr = c.ReadBy] with c
• Sort[c.CourseNr] and Sort[a.CourseNr]
• MergeJoin[a.CourseNr = c.CourseNr]
• IndexJoin[a.MatrNr = s.MatrNr] with s
• Project[s.Semester]
• IndexDup[Hash]

Operators of physical algebra
Each operator of RA ("logical algebra") is associated with one or more procedures realizing this operation in physical storage: operators of physical algebra
• Projection without duplicate elimination: Project[a]
• Union without duplicate elimination: Union
• Duplicate elimination:
  • "naive" basic version: NestedDup
  • using sorting: SortDup
  • using an index: IndexDup
• Selection:
  • without using an index: Select[c]
  • using an index: IndexSelect[c]
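
The three duplicate-elimination variants might look as follows. The operator names come from the list above, but the implementations are only a sketch under the assumption that rows are hashable, comparable tuples and that a Python set can stand in for the index.

```python
def nested_dup(rows):
    """Naive version: compare each row against all rows kept so far (quadratic)."""
    result = []
    for row in rows:
        if all(row != kept for kept in result):
            result.append(row)
    return result

def sort_dup(rows):
    """Sort first, then drop every row equal to its predecessor."""
    result = []
    for row in sorted(rows):
        if not result or result[-1] != row:
            result.append(row)
    return result

def index_dup(rows):
    """Use an index (here a hash-based set) to detect rows seen before."""
    seen, result = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result

semesters = [(3,), (5,), (3,)]
```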

Operators of physical algebra (2)
• Join (analogously: difference, intersection):
  • "naive" basic version: NestedLoop[c]
  • on sorted input relations: MergeJoin[c]
  • using an index: IndexJoin[c]
  • using a hash table in main memory: HashJoin[c]
• Intermediate storing:
  • with temporary materialisation (M.) of result relations in memory: Bucket
  • M. with sorting of result relation: Sort[a]
  • M. with indexing the result relation: Hash[a], Tree[a]
• Sorting of intermediate result relations: MergeSort
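
The join variants can be contrasted in a short sketch. Relations are modelled as lists of dictionaries; the function signatures and the sample data are assumptions for illustration, and the index-based variant is omitted because it depends on an existing index structure.

```python
from collections import defaultdict

def nested_loop_join(r, s, key_r, key_s):
    """Naive version: compare every pair of tuples."""
    return [(x, y) for x in r for y in s if x[key_r] == y[key_s]]

def hash_join(r, s, key_r, key_s):
    """Build a hash table on r in main memory, then probe it with each tuple of s."""
    table = defaultdict(list)
    for x in r:
        table[x[key_r]].append(x)
    return [(x, y) for y in s for x in table.get(y[key_s], [])]

def merge_join(r, s, key_r, key_s):
    """Join two inputs that are already sorted on their join attributes."""
    result, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][key_r] < s[j][key_s]:
            i += 1
        elif r[i][key_r] > s[j][key_s]:
            j += 1
        else:
            k = j                      # pair r[i] with the whole run of equal keys in s
            while k < len(s) and s[k][key_s] == r[i][key_r]:
                result.append((r[i], s[k]))
                k += 1
            i += 1
    return result

courses = [{"CourseNr": 10, "ReadBy": 1}, {"CourseNr": 20, "ReadBy": 2}]
attends = [{"MatrNr": 100, "CourseNr": 10}, {"MatrNr": 300, "CourseNr": 10},
           {"MatrNr": 200, "CourseNr": 20}]

def pairs(joined):
    """Canonical form of a join result, used to compare the three variants."""
    return sorted((c["CourseNr"], a["MatrNr"]) for c, a in joined)
```

All three produce the same result; they differ in the number of comparisons and in what they require of the inputs (sorted order for the merge join, enough main memory for the hash table).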

Discussion of sample evaluation plan
Annotations on the operators of the QEP:
• IndexDup[Hash] (applied to the result of Project[s.Semester]): temporary hash table in main memory
• IndexJoin[a.MatrNr = s.MatrNr]: primary index on s
• IndexJoin[p.PersNr = c.ReadBy]: secondary index on c
• MergeJoin[a.CourseNr = c.CourseNr] on Sort[c.CourseNr] and Sort[a.CourseNr]: only part of the primary key on a ⇒ no IndexJoin
• Select[p.Name = 'Sokrates']: no index on 'Name'

Transaction manager basics

Transactions
• Every change (update) of a database is performed in special units comprising one or more update statements (possibly combined with queries): transactions.
• Transactions in SQL are marked by special keywords:
  • BEGIN_OF_TRANSACTION, for starting a transaction (written START TRANSACTION in standard SQL, BEGIN in many systems),
  • COMMIT, for successful completion, or
  • ROLLBACK, for aborting a failed transaction explicitly.
• Transactions are "packages" of individual steps treated with "particular care" by the DBMS: Transactions always guarantee some form of logical and physical consistency of the database!
• Individual update statements are implicitly treated like "mini-transactions", even without enclosing them in BEGIN-COMMIT.
• The transaction manager is one of the most important components of a DBMS.

Transactions: Motivation
Transactions take place in many areas of real life all the time, e.g. when withdrawing cash at a cash machine or when transferring money by home banking.
e.g.: Transfer 50 € from account A to account B!
begin_of_transaction;
  read(A, a); a := a - 50; write(A, a);
  read(B, b); b := b + 50; write(B, b);
commit
(a and b are local variables.)
• Initial state (consistent): A = 300, B = 200
• Intermediate state (inconsistent): A = 250, B = 200
• Final state (consistent): A = 250, B = 250
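
The transfer can be tried out against SQLite through Python's standard `sqlite3` module. This is a sketch, not the slides' own notation: the table `Accounts`, its CHECK constraint, and the helper names are assumptions, and SQLite spells the start of a transaction `BEGIN`.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN, we control transactions ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Accounts (Id TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.execute("INSERT INTO Accounts VALUES ('A', 300), ('B', 200)")

def transfer(amount, source, target):
    """Move `amount` between two accounts; either both updates happen or neither."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE Accounts SET Balance = Balance - ? WHERE Id = ?", (amount, source))
        conn.execute("UPDATE Accounts SET Balance = Balance + ? WHERE Id = ?", (amount, target))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")       # restore the state from before BEGIN
        return False

def balances():
    return dict(conn.execute("SELECT Id, Balance FROM Accounts ORDER BY Id"))

ok = transfer(50, "A", "B")            # A: 300 -> 250, B: 200 -> 250
after_ok = balances()
failed = transfer(1000, "A", "B")      # would make A negative: rolled back
after_failed = balances()
```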

Transactions: Motivation (2)
Transaction management often takes place in a context where several competing transactions concurrently try to access a central, jointly used database.
e.g.: Book a seat on the next flight from Cologne to Delhi!
Mr. A (transaction T_A):
begin_of_transaction; read(free, a); write(occupied, a); commit
Mr. B (transaction T_B):
begin_of_transaction; read(free, b); write(occupied, b); commit
What happens if there is only one seat left? Who gets it? What does the other one see?
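
What can go wrong is easiest to see in a deterministic simulation of one unlucky interleaving. The sketch below simplifies the example to a counter of free seats held in a dictionary; this model and the function names are assumptions, not the slides' notation.

```python
def book_interleaved(db):
    """T_A and T_B both read before either writes: both believe they got the seat."""
    a = db["free"]                 # T_A reads the number of free seats
    b = db["free"]                 # T_B reads the same, still unchanged value
    got_a = a > 0
    got_b = b > 0
    if got_a:
        db["free"] = a - 1         # T_A writes
    if got_b:
        db["free"] = b - 1         # T_B writes, based on a stale read
    return got_a, got_b

def book_serial(db):
    """The same two transactions executed one after the other (isolated)."""
    results = []
    for _ in ("T_A", "T_B"):
        seats = db["free"]
        if seats > 0:
            db["free"] = seats - 1
            results.append(True)
        else:
            results.append(False)
    return tuple(results)

overbooked = book_interleaved({"free": 1})   # (True, True): one seat sold twice
isolated = book_serial({"free": 1})          # (True, False): only one booking succeeds
```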
Policy of transaction handling

• Sequences of updates are reasonably handled only if they are executed entirely and without observable interruption.
• Transactions expect
    • Consistency of the final state (i.e., satisfaction of all integrity constraints)
    • Persistence of the resulting changes
    • Resistance of the DBMS against machine errors and concurrent access
• If any transaction has reached a state where any of these expectations can no longer be satisfied, it has to be interrupted by the DBMS, and the initial state from which this particular transaction started has to be reconstructed: Rollback
• If a transaction had to be rolled back due to external errors, the DBMS will automatically try to restart it at the next possible moment in time: Recovery
• However, if the transaction has been stopped due to integrity violations it has caused itself, rollback is performed, but no restart is attempted for this transaction.
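The policy above can be sketched in a few lines. Everything named here is an assumption for illustration: the exception classes `ExternalError` and `IntegrityViolation`, the dictionary standing in for the database, the snapshot used in place of a real log, and the `max_restarts` limit (the slide only says the DBMS retries "at the next possible moment").

```python
import copy

class ExternalError(Exception):
    """Hypothetical: failure caused from outside, e.g. a machine error."""

class IntegrityViolation(Exception):
    """Hypothetical: the transaction itself violated an integrity constraint."""

def restore(db, snapshot):
    """Rollback: reconstruct the state the transaction started from."""
    db.clear()
    db.update(snapshot)

def run_transaction(db, body, max_restarts=3):
    """Run body(db); roll back on failure, restart only after external errors."""
    for _ in range(1 + max_restarts):
        snapshot = copy.deepcopy(db)  # initial state of this attempt
        try:
            body(db)
            return True               # commit: the changes stay in db
        except ExternalError:
            restore(db, snapshot)     # rollback, then try again
        except IntegrityViolation:
            restore(db, snapshot)     # rollback, but no restart
            return False
    return False

attempts = []

def flaky_deposit(db):
    db["A"] += 100
    attempts.append(1)
    if len(attempts) == 1:            # fail on the first attempt only
        raise ExternalError("simulated crash")

def overdraw(db):
    db["A"] -= 10_000
    if db["A"] < 0:
        raise IntegrityViolation("balance must not be negative")

db = {"A": 5300}
ok_flaky = run_transaction(db, flaky_deposit)  # rolled back once, then succeeds
ok_overdraw = run_transaction(db, overdraw)    # rolled back, not restarted
print(ok_flaky, ok_overdraw, db)
```

The deposit is applied exactly once although its first attempt failed halfway, and the overdraft leaves no trace in the database.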
Characteristics of transactions: The ACID paradigm

Transactions are thus characterised by the following four properties, addressed together by the notion of the "ACID paradigm":

• Atomicity: Transactions are indivisible, i.e., they are either executed entirely, or not at all ("all or nothing" principle).
• Consistency: Transactions lead from consistent initial states to consistent final states (but possibly via inconsistent intermediate states).
• Isolation: Concurrently performed transactions on the same database (in multi-user mode) do not influence each other, even though their individual steps may be interleaved by the DBMS for improving efficiency.
• Durability: The effect of a successfully completed transaction is maintained persistently in the DB, unless explicitly reversed later on by follow-up transactions.
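Atomicity in particular is easy to observe in practice. The following sketch uses Python's `sqlite3` module with an in-memory database; the `account` table, its `CHECK (balance >= 0)` constraint and the `transfer` helper are assumptions made for this illustration. The transfer deliberately credits the target account first, so that a failing debit shows the already executed first step being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 5300), ("B", 4700)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst: either both updates happen, or neither."""
    try:
        with conn:  # commit if the block succeeds, roll back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:  # raised when the CHECK constraint fails
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer(conn, "A", "B", 300)       # succeeds
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 6000)  # debit would make A negative
after_failed = balances(conn)
print(ok, after_ok, failed, after_failed)
```

After the failed transfer, account B does not keep the 6000 it was credited in the first step: the whole transaction is rolled back.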
Components of a transaction manager

Resulting from the ACID paradigm: three main components of a transaction manager

• Consistency checker:
    • Checks all affected integrity constraints after each individual step in a transaction (IMMEDIATE) or at the end of a transaction (DEFERRED).
    • Stops the transaction on integrity violations and initiates rollback.
• Scheduler:
    • Determines a proper ("non-interfering") sequence of steps of concurrent transactions ("interleaving"), using resources efficiently.
    • Maintains the "illusion" of a sequential execution (serializability).
• Recovery manager:
    • Records individual steps of each transaction in a special protocol storage area, called the DB log ("logging").
    • Manages rollback and restart after system failures according to the log entries.
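The difference between IMMEDIATE and DEFERRED checking can be illustrated with a toy consistency checker. This is a sketch under assumptions: the database is a plain dictionary, and the integrity constraint "the balances of A and B always add up to 10000" is invented for the example. A transfer necessarily violates this constraint between its two steps, so it only passes when checking is deferred to the end of the transaction.

```python
def run_transaction(steps, db, constraint, mode):
    """Return (committed, state); after a rollback, state is the unchanged db."""
    work = dict(db)                   # the transaction works on a private copy
    for step in steps:
        step(work)
        if mode == "IMMEDIATE" and not constraint(work):
            return False, db          # violation right after a single step
    if mode == "DEFERRED" and not constraint(work):
        return False, db              # violation at the end of the transaction
    return True, work

def debit_a(state):
    state["A"] -= 300

def credit_b(state):
    state["B"] += 300

def total_is_10000(state):            # hypothetical integrity constraint
    return state["A"] + state["B"] == 10000

db = {"A": 5300, "B": 4700}
steps = [debit_a, credit_b]
immediate = run_transaction(steps, db, total_is_10000, "IMMEDIATE")
deferred = run_transaction(steps, db, total_is_10000, "DEFERRED")
print(immediate)
print(deferred)
```

With IMMEDIATE checking the transfer is stopped after the debit; with DEFERRED checking the inconsistent intermediate state is tolerated and the transaction commits.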
Lost update problem

Example of two unsynchronized transactions whose execution causes changes to be "lost":

T1: Transferring 300 € from account A to B
T2: Paying 3% interest to account A

Step  T1                T2                 A         B
                                           5300 €    4700 €
1.    read(A,a1)
2.    a1 := a1 - 300
3.                      read(A,a2)
4.                      a2 := a2 * 1.03
5.                      write(A,a2)        5459 €
6.    write(A,a1)                          5000 €
7.    read(B,b1)
8.    b1 := b1 + 300
9.    write(B,b1)                                    5000 €
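The loss can be replayed with a small schedule interpreter. This is a sketch, not part of the lecture: `run_schedule` and the tuple encoding of the steps are made up for the illustration, the step order assumed is T1 in steps 1, 2 and 6 to 9 and T2 in steps 3 to 5, and the 3% interest is computed in integer arithmetic to avoid floating-point rounding.

```python
def run_schedule(schedule, db):
    """Execute the steps of a schedule in order; return the final database state."""
    db = dict(db)   # shared database items A and B
    local = {}      # private variables of the transactions (a1, a2, b1)
    for op, target, arg in schedule:
        if op == "read":        # read(X, v): copy database item X into v
            local[arg] = db[target]
        elif op == "write":     # write(X, v): copy v into database item X
            db[target] = local[arg]
        else:                   # "calc": v := f(v)
            local[target] = arg(local[target])
    return db

def minus_300(v):
    return v - 300

def plus_300(v):
    return v + 300

def add_interest(v):
    return v * 103 // 100       # 3% interest, integer arithmetic

lost_update = [
    ("read", "A", "a1"),          # 1. T1
    ("calc", "a1", minus_300),    # 2. T1
    ("read", "A", "a2"),          # 3. T2 still sees the old value of A
    ("calc", "a2", add_interest), # 4. T2
    ("write", "A", "a2"),         # 5. T2 writes A = 5459
    ("write", "A", "a1"),         # 6. T1 overwrites it with A = 5000
    ("read", "B", "b1"),          # 7. T1
    ("calc", "b1", plus_300),     # 8. T1
    ("write", "B", "b1"),         # 9. T1
]

result = run_schedule(lost_update, {"A": 5300, "B": 4700})
print(result)
```

The final state contains no trace of the interest payment: the value written by T2 in step 5 is overwritten in step 6 by a value that T1 computed from the old balance.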
Lost update problem (2)

There would have been two intuitively correct solutions:

1. case: T1 executed before T2

T1: Transferring 300 € from account A to B
T2: Paying 3% interest to account A

Step  T1                T2                 A         B
                                           5300 €    4700 €
1.    read(A,a1)
2.    a1 := a1 - 300
3.    write(A,a1)                          5000 €
4.    read(B,b1)
5.    b1 := b1 + 300
6.    write(B,b1)                                    5000 €
7.                      read(A,a2)
8.                      a2 := a2 * 1.03
9.                      write(A,a2)        5150 €
Lost update problem (3)

Another semantically "clean" solution:

2. case: T2 executed before T1

T1: Transferring 300 € from account A to B
T2: Paying 3% interest to account A

Step  T1                T2                 A         B
                                           5300 €    4700 €
1.                      read(A,a2)
2.                      a2 := a2 * 1.03
3.                      write(A,a2)        5459 €
4.    read(A,a1)
5.    a1 := a1 - 300
6.    write(A,a1)                          5159 €
7.    read(B,b1)
8.    b1 := b1 + 300
9.    write(B,b1)                                    5000 €
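Both serial cases can be recomputed mechanically. The sketch below enumerates every serial order of the two transactions and runs it against the initial balances; the helper `run_schedule` and the tuple encoding of the steps are assumptions made for this illustration, and the interest is computed in integer arithmetic.

```python
from itertools import permutations

def run_schedule(schedule, db):
    """Execute read/calc/write steps in order; return the final database state."""
    db, local = dict(db), {}
    for op, target, arg in schedule:
        if op == "read":
            local[arg] = db[target]
        elif op == "write":
            db[target] = local[arg]
        else:  # "calc"
            local[target] = arg(local[target])
    return db

transactions = {
    "T1": [  # transfer 300 from A to B
        ("read", "A", "a1"), ("calc", "a1", lambda v: v - 300), ("write", "A", "a1"),
        ("read", "B", "b1"), ("calc", "b1", lambda v: v + 300), ("write", "B", "b1"),
    ],
    "T2": [  # pay 3% interest to A (integer arithmetic)
        ("read", "A", "a2"), ("calc", "a2", lambda v: v * 103 // 100),
        ("write", "A", "a2"),
    ],
}

initial = {"A": 5300, "B": 4700}
serial_results = {
    order: run_schedule(
        [step for name in order for step in transactions[name]], initial
    )
    for order in permutations(transactions)
}
for order, state in serial_results.items():
    print(" before ".join(order), state)
```

The two serial orders end in different states (A = 5150 or A = 5159), and both count as correct: serial execution does not prescribe which transaction comes first, only that neither sees a half-finished state of the other.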
Lost update problem (4)

An "interleaved" execution of both transactions producing a "correct" result:

T1: Transferring 300 € from account A to B
T2: Paying 3% interest to account A

[Schedule table with steps 1 to 9 of T1 and T2. Initial values: A = 5300 €, B = 4700 €; values shown after the writes: 5459 €, 5159 €, 5000 €.]
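One simple way to test whether an interleaved execution is "correct" in this sense is to compare its final state with the final states of all serial executions. The sketch below does that for one initial state only, which is weaker than a full serializability test. The helper names are made up, and the interleaving used is an assumption chosen for illustration, not necessarily the one shown on the slide: T2 only starts after T1 has written A, and T1's work on B is interleaved with T2's steps.

```python
from itertools import permutations

def run_schedule(schedule, db):
    """Execute read/calc/write steps in order; return the final database state."""
    db, local = dict(db), {}
    for op, target, arg in schedule:
        if op == "read":
            local[arg] = db[target]
        elif op == "write":
            db[target] = local[arg]
        else:  # "calc"
            local[target] = arg(local[target])
    return db

def matches_some_serial_order(schedule, transactions, db):
    """True if the schedule ends in the same state as some serial execution."""
    final = run_schedule(schedule, db)
    return any(
        final == run_schedule([step for txn in order for step in txn], db)
        for order in permutations(transactions)
    )

T1 = [  # transfer 300 from A to B
    ("read", "A", "a1"), ("calc", "a1", lambda v: v - 300), ("write", "A", "a1"),
    ("read", "B", "b1"), ("calc", "b1", lambda v: v + 300), ("write", "B", "b1"),
]
T2 = [  # pay 3% interest to A (integer arithmetic)
    ("read", "A", "a2"), ("calc", "a2", lambda v: v * 103 // 100),
    ("write", "A", "a2"),
]
initial = {"A": 5300, "B": 4700}

# Hypothetical interleaving: T2 reads A only after T1 has written it.
interleaved = [T1[0], T1[1], T1[2], T2[0], T1[3], T2[1], T1[4], T2[2], T1[5]]
# The lost-update order: T2 reads A before T1 writes it, and T1 writes last.
lost_update = T1[:2] + T2 + T1[2:]

interleaved_ok = matches_some_serial_order(interleaved, [T1, T2], initial)
lost_update_ok = matches_some_serial_order(lost_update, [T1, T2], initial)
print(interleaved_ok, lost_update_ok)
```

The interleaved schedule ends in the same state as "T1 before T2" and is accepted, while the lost-update schedule matches neither serial order and is rejected.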
Summary

There is much, much more to be learnt about databases and database management, ...

... but: This lecture cannot provide more than just a "taster" of this exciting and very relevant area of information processing.

Time is over, folks!

Next week, we'll do a last round of "exam training". Make sure everybody attends:

I want each of you to pass straightaway!