Upload
trinhhanh
View
215
Download
0
Embed Size (px)
Citation preview
Announcements
• MidtermnextWednesday
• Inclass:roughly70minutes• Comein10minutesearlier!
MidtermReview
Midterm
• Format:• Regularquestions.Nomultiple-choice.• ThisimpliesfewerquestionsJ• Acouplebonusquestions• Closedbookandnoaids!Wewillprovideacheatsheet
• Material:Everythingincludingbuffermanagement• Externalsortisnotfairgame!
• High-lights:• SimpleSQLandSchemaDefinitions• JoinSemantics• FunctionDependenciesandClosures• Decompositions(BCNFandProperties)• BufferPoolandReplacementPolicies
MidtermReview
High-Level:SQL
• Basicterminology:• relation/table(+“instanceof”),row/tuple,column/attribute,multiset
• TableschemasinSQL
• Single-tablequeries:• SFW(selection+projection)• BasicSQLoperators:LIKE,DISTINCT,ORDERBY
• Multi-tablequeries:• Foreignkeys• JOINS:
• BasicSQLsyntax&semanticsof
MidtermReview>SQL
5
TablesinSQL
PName Price Manufacturer
Gizmo $19.99 GizmoWorks
Powergizmo $29.99 GizmoWorks
SingleTouch $149.99 Canon
MultiTouch $203.99 Hitachi
Product
Arelation ortable isamultiset oftupleshavingtheattributesspecifiedbytheschema
Amultiset isanunorderedlist(or:asetwithmultipleduplicateinstancesallowed)
Anattribute (orcolumn)isatypeddataentrypresentineachtupleintherelation
Atuple orrow isasingleentryinthetablehavingtheattributesspecifiedbytheschema
MidtermReview>SQL
6
TableSchemas
• Theschema ofatableisthetablename,itsattributes,andtheirtypes:
• Akey isanattributewhosevaluesareunique;weunderlineakey
Product(Pname: string, Price: float, Category: string, Manufacturer: string)
Product(Pname: string, Price: float, Category: string, Manufacturer: string)
MidtermReview>SQL
7
SQLQuery
• Basicform(therearemanymanymorebellsandwhistles)
CallthisaSFW query.
SELECT <attributes>FROM <one or more relations>WHERE <conditions>
MidtermReview>SQL
8
LIKE:SimpleStringPatternMatching
SELECT *FROM ProductsWHERE PName LIKE ‘%gizmo%’
DISTINCT:EliminatingDuplicates
SELECT DISTINCT CategoryFROM Product
ORDERBY:SortingtheResults
SELECT PName, PriceFROM ProductWHERE Category=‘gizmo’ORDER BY Price, PName
MidtermReview>SQL
9
Joins
PName Price Category ManufGizmo $19 Gadgets GWorks
Powergizmo $29 Gadgets GWorks
SingleTouch $149 Photography Canon
MultiTouch $203 Household Hitachi
ProductCompany
Cname Stock CountryGWorks 25 USACanon 65 JapanHitachi 15 Japan
PName PriceSingleTouch $149.99
SELECT PName, PriceFROM Product, CompanyWHERE Manufacturer = CName
AND Country=‘Japan’AND Price <= 200
MidtermReview>SQL
AnexampleofSQLsemantics
10
SELECT R.AFROM R, SWHERE R.A = S.B
A13
B C2 33 43 5
A B C1 2 31 3 41 3 53 2 33 3 43 3 5
CrossProduct
A B C3 3 43 3 5
A33
ApplyProjectionApply
Selections/Conditions
Output
MidtermReview>SQL
High-Level:AdvancedSQL
• Setoperators• INTERSECT,UNION,EXCEPT,[ALL]• Subtletiesofmultiset operations
• Nestedqueries• IN,ANY,ALL,EXISTS• Correlatedqueries
• Aggregation• AVG,SUM,COUNT,MIN,MAX,…
• GROUPBY
• NULLs&OuterJoins
MidtermReview>AdvancedSQL
12
SELECT DISTINCT R.AFROM R, S, TWHERE R.A=S.A OR R.A=T.A
AnUnintuitiveQuery
ComputesRÇ (SÈ T)
ButwhatifS=f?
S T
R
Gobacktothesemantics!
MidtermReview>AdvancedSQL
INTERSECT
13
SELECT R.AFROM R, SWHERE R.A=S.AINTERSECTSELECT R.AFROM R, TWHERE R.A=T.A
Q1 Q2 Q1 Q2
UNION
SELECT R.AFROM R, SWHERE R.A=S.AUNIONSELECT R.AFROM R, TWHERE R.A=T.A
Q1 Q2
EXCEPT
SELECT R.AFROM R, SWHERE R.A=S.AEXCEPTSELECT R.AFROM R, TWHERE R.A=T.A
MidtermReview>AdvancedSQL
14
Nestedqueries:Sub-queriesReturningRelations
SELECT c.cityFROM Company cWHERE c.name IN (
SELECT pr.makerFROM Purchase p, Product prWHERE p.product = pr.nameAND p.buyer = ‘Joe Blow‘)
“CitieswhereonecanfindcompaniesthatmanufactureproductsboughtbyJoeBlow”
Company(name, city)Product(name, maker)Purchase(id, product, buyer)
MidtermReview>AdvancedSQL
NestedQueries:OperatorSemantics
SELECT nameFROM ProductWHERE price > ALL(SELECT priceFROM ProductWHERE maker = ‘G’)
Product(name, price, category, maker)
Findproductsthataremoreexpensivethanallproducts producedby“G”
ALLSELECT nameFROM ProductWHERE price > ANY(SELECT priceFROM ProductWHERE maker = ‘G’)
ANYSELECT nameFROM Product p1WHERE EXISTS (SELECT *FROM Product p2WHERE p2.maker = ‘G’AND p1.price =
p2.price)
EXISTS
Findproductsthataremoreexpensivethananyoneproduct producedby“G”
Findproductswherethereexistssome productwiththesamepriceproducedby“G”
MidtermReview>AdvancedSQL
NestedQueries:OperatorSemantics
SELECT nameFROM ProductWHERE price > ALL(X)
Product(name, price, category, maker)
Pricemustbe>all entriesinmultiset X
ALLSELECT nameFROM ProductWHERE price > ANY(X)
ANYSELECT nameFROM Product p1WHERE EXISTS (X)
EXISTS
Pricemustbe>atleastone entryinmultiset X
Xmustbenon-empty
*Notethatp1canbereferencedinX(correlatedquery!)
MidtermReview>AdvancedSQL
17
CorrelatedQueries
SELECT DISTINCT titleFROM Movie AS mWHERE year <> ANY(
SELECT yearFROM MovieWHERE title = m.title)
Movie(title, year, director, length)
Notealso:thiscanstillbeexpressedassingleSFWquery…
Findmovieswhosetitleappearsmorethanonce.
Notethescopingofthevariables!
MidtermReview>AdvancedSQL
18
SimpleAggregationsPurchaseProduct Date Price Quantitybagel 10/21 1 20banana 10/3 0.5 10banana 10/10 1 10bagel 10/25 1.50 20
SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’
50(=1*20+1.50*20)
MidtermReview>AdvancedSQL
19
Grouping&Aggregations:GROUPBY
Findtotalsalesafter10/1/2005,onlyforproductsthathavemorethan10totalunitssold
HAVINGclausescontainsconditionsonaggregates
SELECT product, SUM(price*quantity)FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10
WhereasWHEREclausesconditiononindividualtuples…
MidtermReview>AdvancedSQL
20
GROUPBY:(1)ComputeFROM-WHERE
Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10Craisins 11/1 2 5Craisins 11/3 2.5 3
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10
FROMWHERE
MidtermReview>AdvancedSQL
21
GROUPBY:(2)AggregatebytheGROUPBYSELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10
GROUP BY Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10Craisins 11/1 2 5Craisins 11/3 2.5 3
Product Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
Craisins11/1 2 511/3 2.5 3
MidtermReview>AdvancedSQL
22
GROUPBY:(3)FilterbytheHAVING clause
HAVING
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 30
Product Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
Craisins11/1 2 511/3 2.5 3
Product Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
MidtermReview>AdvancedSQL
23
GROUPBY:(3)SELECT clause
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 100
Product TotalSales
Bagel 50
Banana 15
SELECTProduct Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
MidtermReview>AdvancedSQL
24
GeneralformofGroupingandAggregationSELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2
Evaluationsteps:1. EvaluateFROM-WHERE:applyconditionC1 onthe
attributesinR1,…,Rn2. GROUPBYtheattributesa1,…,ak3. ApplyHAVINGconditionC2 toeachgroup(mayhave
aggregates)4. ComputeaggregatesinSELECT,S,andreturntheresult
MidtermReview>AdvancedSQL
25
NullValues
• Fornumericaloperations,NULL->NULL:• Ifx=NULLthen4*(3-x)/7isstillNULL
• Forboolean operations,inSQLtherearethreevalues:
FALSE= 0UNKNOWN= 0.5TRUE= 1
• Ifx=NULLthenx=“Joe”isUNKNOWN
MidtermReview>AdvancedSQL
26
NullValues
• C1ANDC2=min(C1,C2)• C1OR C2=max(C1,C2)• NOTC1=1– C1
SELECT *FROM PersonWHERE (age < 25)
AND (height > 6 AND weight > 190)
Won’treturne.g.(age=20height=NULLweight=200)!
RuleinSQL:includeonlytuplesthatyieldTRUE/1.0
MidtermReview>AdvancedSQL
27
NullValues
Unexpectedbehavior:
SELECT *FROM PersonWHERE age < 25
OR age >= 25
SomePersonsarenotincluded!
MidtermReview>AdvancedSQL
SELECT *FROM PersonWHERE age < 25
OR age >= 25 OR age IS NULL
NowitincludesallPersons!
CantestforNULLexplicitly:• xISNULL• xISNOTNULL
28
RECAP:InnerJoinsBy default,joinsinSQLare“innerjoins”:
SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName
SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName
Product(name, category)Purchase(prodName, store)
Bothequivalent:BothINNERJOINS!
MidtermReview>AdvancedSQL
29
name category
Gizmo gadget
Camera Photo
OneClick Photo
prodName store
Gizmo Wiz
Camera Ritz
Camera Wiz
name store
Gizmo Wiz
Camera Ritz
Camera Wiz
Product PurchaseINNERJOIN:
SELECT Product.name, Purchase.storeFROM Product
INNER JOIN Purchase ON Product.name = Purchase.prodName
Note:anotherequivalentwaytowriteanINNERJOIN!
MidtermReview>AdvancedSQL
30
name category
Gizmo gadget
Camera Photo
OneClick Photo
prodName store
Gizmo Wiz
Camera Ritz
Camera Wiz
name store
Gizmo Wiz
Camera Ritz
Camera Wiz
OneClick NULL
Product PurchaseLEFTOUTERJOIN:
SELECT Product.name, Purchase.storeFROM Product
LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName
MidtermReview>AdvancedSQL
Generalclarification:Setsvs.Multisets
• Intheory,andinanymoreformalmaterial,bydefinition allrelationsaresets oftuples
• InSQL,relations(i.e.tables)aremultisets,meaningyoucanhaveduplicatetuples
• WeneedthisbecauseintermediateresultsinSQLdon’teliminateduplicates
• Ifyougetconfused:juststateyourassumptions&we’llbeforgiving!
MidtermReview>AdvancedSQL
High-Level:ERDiagrams
• ERdiagrams!• Entities(vs.EntitySets)• Relationships• Multiplicity• Constraints:Keys,single-value,referential,participation,etc…
MidtermReview>ERDiagrams
Entitiesvs.EntitySets
Example:
33
Product
name category
price
EntitySet
Product
Name:XboxCategory:Total
MultimediaSystemPrice:$250
Name:MyLittlePonyDollCategory:ToyPrice:$25
Entity
EntityAttribute
Entitiesarenot explicitlyrepresentedinE/Rdiagrams!
MidtermReview>ERDiagrams
WhatisaRelationship?
34
MakesProduct
name category
price
Company
name
Arelationship betweenentitysetsPandCisasubsetofallpossiblepairsofentitiesinPandC,withtuplesuniquelyidentifiedbyPandC’skeys
MidtermReview>ERDiagrams
WhatisaRelationship?
35
name category price
Gizmo Electronics $9.99
GizmoLite Electronics $7.50
Gadget Toys $5.50
name
GizmoWorks
GadgetCorp
ProductCompanyC.name P.name P.category P.price
GizmoWorks Gizmo Electronics $9.99
GizmoWorks GizmoLite Electronics $7.50
GizmoWorks Gadget Toys $5.50
GadgetCorp Gizmo Electronics $9.99
GadgetCorp GizmoLite Electronics $7.50
GadgetCorp Gadget Toys $5.50
CompanyC×ProductP
C.name P.name
GizmoWorks Gizmo
GizmoWorks GizmoLite
GadgetCorp Gadget
MakesMakesProduct
name categoryprice
Company
name
Arelationship betweenentitysetsPandCisasubsetofallpossiblepairsofentitiesinPandC,withtuplesuniquelyidentifiedbyPandC’skeys
MidtermReview>ERDiagrams
36
MultiplicityofE/RRelationships
Indicatedusingarrows
123
abcd
One-to-one:
123
abcd
Many-to-one:
123
abcd
One-to-many:
123
abcd
Many-to-many:
X->YmeansthereexistsafunctionmappingfromXtoY(recallthedefinitionofafunction)
MidtermReview>ERDiagrams
37
ConstraintsinE/RDiagrams• FindingconstraintsispartoftheE/Rmodelingprocess.Commonlyused
constraintsare:
• Keys: Implicitconstraintsonuniquenessofentities• Ex:AnSSNuniquelyidentifiesaperson
• Single-valueconstraints:• Ex:apersoncanhaveonlyonefather
• Referentialintegrityconstraints: Referencedentitiesmustexist• Ex:ifyouworkforacompany,itmustexistinthedatabase
• Otherconstraints:• Ex:peoples’agesarebetween0and150
RecallFOREIGNKEYs!
MidtermReview>ERDiagrams
38
RECALL:Mathematicaldef.ofRelationship
• Amathematicaldefinition:
• LetA,Bbesets• A={1,2,3},B={a,b,c,d},
• AxB(thecross-product)isthesetofallpairs(a,b)• A´ B={(1,a),(1,b),(1,c),(1,d),(2,a),(2,b),(2,c),(2,d),(3,a),(3,b),(3,c),(3,d)}
• Wedefinearelationship tobeasubsetofAxB• R={(1,a),(2,c),(2,d),(3,b)}
1
2
3
a
b
c
d
A= B=
MidtermReview>ERDiagrams
39
• Amathematicaldefinition:• LetA,Bbesets• AxB(thecross-product)isthesetofallpairs• Arelationship isasubsetofAxB
• Makes isrelationship- itisasubset ofProduct´ Company:
1
2
3
a
b
c
d
A= B=
makes CompanyProduct
MidtermReview>ERDiagrams
RECALL:Mathematicaldef.ofRelationship
• Therecanonlybeonerelationshipforeveryuniquecombinationofentities
• Thisalsomeansthattherelationshipisuniquelydeterminedbythekeysofitsentities
• Example:thekeyforMakes(toright)is{Product.name,Company.name}
40Whydoesthismakesense?
Thisfollowsfromourmathematicaldefinitionofarelationship- it’saSET!
MakesProduct
name categoryprice
Company
namesince
KeyMakes =KeyProduct ∪ KeyCompany
MidtermReview>ERDiagrams
RECALL:Mathematicaldef.ofRelationship
High-Level:DBDesign
• Redundancy&dataanomalies
• Functionaldependencies• Fordatabaseschemadesign• GivensetofFDs,findothersimplied- usingArmstrong’srules
• Closures• Basicalgorithm• TofindallFDs
• Keys&Superkeys
MidtermReview>DBDesign
ConstraintsPrevent(some)AnomaliesintheData
Student Course RoomMary CS145 B01Joe CS145 B01Sam CS145 B01.. .. ..
Ifeverycourseisinonlyoneroom,containsredundantinformation!
Apoorlydesigneddatabasecausesanomalies:
Ifweupdatetheroomnumberforonetuple,wegetinconsistentdata=anupdate anomaly
Ifeveryonedropstheclass,welosewhatroomtheclassisin!=adelete anomaly
Similarly,wecan’treservearoomwithoutstudents=aninsertanomaly
… CS229 C12
MidtermReview>DBDesign
ConstraintsPrevent(some)AnomaliesintheData
Student CourseMary CS145Joe CS145Sam CS145.. ..
Course RoomCS145 B01CS229 C12
Isthisformbetter?
• Redundancy?• Updateanomaly?• Deleteanomaly?• Insertanomaly?
MidtermReview>DBDesign
APictureOfFDsDefn (again):GivenattributesetsA={A1,…,Am} andB={B1,…Bn}inR,
Thefunctionaldependency Aà BonRholdsifforanyti,tj inR:
if ti[A1]=tj[A1]ANDti[A2]=tj[A2]AND…ANDti[Am]=tj[Am]
then ti[B1]=tj[B1]ANDti[B2]=tj[B2]AND…ANDti[Bn]=tj[Bn]
A1 … Am B1 … Bn
ti
tj
Ift1,t2agreehere.. …theyalsoagreehere!
MidtermReview>DBDesign
FDsforRelationalSchemaDesign
• High-levelidea:whydowecareaboutFDs?
1. Startwithsomerelationalschema
2. Findoutitsfunctionaldependencies(FDs)
3. Usethesetodesignabetterschema1. Onewhichminimizespossibilityofanomalies
Thispartcanbetricky!
MidtermReview>DBDesign
Equivalenttoasking:GivenasetofFDs,F={f1,…fn},doesanFDghold?
Inferenceproblem:Howdowedecide?
Answer:ThreesimplerulescalledArmstrong’sRules.
1. Split/Combine,2. Reduction,and3. Transitivity…ideasbypicture
FindingFunctionalDependencies
MidtermReview>DBDesign
47
ClosureofasetofAttributes
Given asetofattributesA1,…,An andasetofFDsF:Thentheclosure,{A1,…,An}+ isthesetofattributesB s.t. {A1,…,An}à B
{name} à {color}{category} à {department}{color, category} à {price}
Example: F=
ExampleClosures:
{name}+ = {name, color}{name, category}+ ={name, category, color, dept, price}{color}+ = {color}
MidtermReview>DBDesign
48
ClosureAlgorithmStartwithX={A1,…,An},FDsF.Repeatuntil Xdoesn’tchange;do:if {B1,…,Bn}à CisinFand {B1,
…,Bn}⊆ X:then addCtoX.
Return XasX+
F=
{name, category}+ ={name, category}
{name, category}+ ={name, category, color, dept, price}
{name, category}+ ={name, category, color}
{name, category}+ ={name, category, color, dept}{name} à {color}
{category} à {dept}
{color, category} à{price}
MidtermReview>DBDesign
KeysandSuperkeys
Asuperkey isasetofattributesA1,…,An s.t.foranyother attributeB inR,wehave {A1,…,An}à B
Akey isaminimal superkey
I.e.allattributesarefunctionallydeterminedbyasuperkey
Meaningthatnosubsetofakeyisalsoasuperkey
MidtermReview>DBDesign
CALCULATINGKeysandSuperkeys
AlsoseeLecture-5.ipynb!!!
MidtermReview>DBDesign
• Superkey?• ComputetheclosureofA• Seeifit=thefullsetofattributes
• Key?• ConfirmthatAissuperkey• MakesurethatnosubsetofAisasuperkey
• Onlyneedtocheckone‘level’down!
IsSuperkey(A,R,F):A+=ComputeClosure(A,F)Return (A+==R)?
IsKey(A,R,F):If not IsSuperkey(A,R,F):
return FalseFor BinSubsetsOf(A,size=len(A)-1):
ifIsSuperkey(B,R,F):return False
return True
LetAbeasetofattributes,Rsetofallattributes,FsetofFDs:
High-Level:Decompositions
• Conceptualdesign
• Boyce-Codd NormalForm(BCNF)• Definition• Algorithm
• Decompositions• Losslessvs.Lossy• AproblemwithBCNF
MidtermReview>Decompositions
52
BacktoConceptualDesign
NowthatweknowhowtofindFDs,it’sastraight-forwardprocess:
1. Searchfor“bad”FDs
2. Ifthereareany,thenkeepdecomposingthetableintosub-tablesuntilnomorebadFDs
3. Whendone,thedatabaseschemaisnormalized
Recall:thereareseveralnormalforms…
MidtermReview>Decompositions
53
Boyce-Codd NormalForm
BCNFisasimpleconditionforremovinganomaliesfromrelations:
Inotherwords:thereareno“bad”FDs
ArelationRisinBCNF if:
if{A1,...,An}à B isanon-trivial FDinR
then{A1,...,An}isasuperkey forR
Equivalently: ∀ setsofattributesX,either(X+ =X)or(X+ =allattributes)
MidtermReview>Decompositions
54
Example
Whatisthekey?{SSN,PhoneNumber}
Name SSN PhoneNumber CityFred 123-45-6789 206-555-1234 SeattleFred 123-45-6789 206-555-6543 SeattleJoe 987-65-4321 908-555-2121 WestfieldJoe 987-65-4321 908-555-1234 Westfield
{SSN} à {Name,City}
⟹Not inBCNF
ThisFDisbadbecauseitisnot asuperkey
MidtermReview>Decompositions
55
Example
Name SSN CityFred 123-45-6789 SeattleJoe 987-65-4321 Madison
SSN PhoneNumber123-45-6789 206-555-1234123-45-6789 206-555-6543987-65-4321 908-555-2121987-65-4321 908-555-1234
Let’scheckanomalies:• Redundancy?• Update?• Delete?
{SSN} à {Name,City}
NowinBCNF!
ThisFDisnowgoodbecauseitisthekey
MidtermReview>Decompositions
56
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindXs.t.:X+ ≠XandX+≠[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
MidtermReview>Decompositions
57
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
FindasetofattributesXwhichhasnon-trivial“bad”FDs,i.e.isnotasuperkey,usingclosures
MidtermReview>Decompositions
58
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
Ifno“bad”FDsfound,inBCNF!
MidtermReview>Decompositions
59
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
LetYbetheattributesthatXfunctionallydetermines(+thatarenotinX)
AndletZbetheotherattributesthatitdoesn’t
MidtermReview>Decompositions
60
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
X ZY
R1 R2
Splitintoonerelation(table)withXplustheattributesthatXdetermines(Y)…
MidtermReview>Decompositions
61
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
X ZY
R1 R2
AndonerelationwithXplustheattributesitdoesnotdetermine(Z)
MidtermReview>Decompositions
62
BCNFDecompositionAlgorithm
BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
Proceedrecursivelyuntilnomore“bad”FDs!
MidtermReview>Decompositions
R(A,B,C,D,E)BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠
[allattributes]
if (notfound)then Return R
let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)
Return BCNFDecomp(R1),BCNFDecomp(R2)
Example
{A} à {B,C}{C} à {D}
MidtermReview>Decompositions
64
Example
R(A,B,C,D,E){A}+ ={A,B,C,D}≠{A,B,C,D,E}
R1(A,B,C,D){C}+ ={C,D}≠{A,B,C,D}
R2(A,E)R11(C,D) R12(A,B,C)
R(A,B,C,D,E)
{A} à {B,C}{C} à {D}
MidtermReview>Decompositions
LosslessDecompositions
65
BCNFdecompositionisalwayslossless.Why?
Note:don’tneed{A1,...,An}à {C1,...,Cp}
If {A1,...,An}à {B1,...,Bm}Thenthedecompositionislossless.{A1,...,An} isakeyforoneofR1orR2
R(A1,...,An,B1,...,Bm,C1,...,Cp)
R1(A1,...,An,B1,...,Bm) R2(A1,...,An,C1,...,Cp)
MidtermReview>Decompositions
66
AProblemwithBCNF{Unit} à {Company}{Company,Product} à {Unit}
WedoaBCNFdecompositionona“bad”FD:{Unit}+ = {Unit, Company}
WelosetheFD{Company,Product} à {Unit}!!
Unit Company Product… … …
Unit Company… …
Unit Product… …
{Unit} à {Company}
MidtermReview>Decompositions
High-Level:StorageandBuffers
• Ourmodelofthecomputer:Diskvs.RAM
• BufferPool
• ReplacementPolicies
MidtermReview>StorageandBuffers
High-level:Diskvs.MainMemory
Disk:
• Slow: Sequentialaccess• (althoughfastsequentialreads)
• Durable:Wewillassumethatonceondisk,dataissafe!
• Cheap 68
Platters
SpindleDisk head
Arm movement
Arm assembly
Tracks
Sector
Cylinder
RandomAccessMemory(RAM)orMainMemory:
• Fast: Randomaccess,byteaddressable• ~10xfasterforsequentialaccess• ~100,000xfasterforrandomaccess!
• Volatile: Datacanbelostife.g.crashoccurs,powergoesout,etc!
• Expensive: For$100,get16GBofRAMvs.2TBofdisk!
MidtermReview>StorageandBuffers
TheBuffer(Pool)
Disk
MainMemory
Buffer(Pool)• Abuffer isaregionofphysicalmemoryusedtostoretemporarydata
• Inthislecture:aregioninmainmemoryusedtostoreintermediatedatabetweendiskandprocesses
• Keyidea:Reading/writingtodiskisslow-needtocachedata!
MidtermReview>StorageandBuffers
70
BufferManager
• Memorydividedintobufferframes:slotsforholdingdiskpages• Bookkeepingperframe:
• Pincount:#usersofthepageintheframe• Pinning :Indicatethatthepageisinuse• Unpinning :Releasethepage,andalsoindicateifthepageisdirtied
• Dirtybit :Indicatesifchangesmustbepropagatedtodisk
MidtermReview>StorageandBuffers
71
BufferManager• WhenaPageisrequested:
• Inbufferpool->returnahandletotheframe.Done!• Incrementthepincount
• Notinthebufferpool:• Chooseaframeforreplacement(Onlyreplacepageswithpincount==0)
• Ifframeisdirty,writeittodisk• Readrequestedpageintochosenframe• Pinthepageandreturnitsaddress
MidtermReview>StorageandBuffers
72
BufferManager• WhenaPageisrequested:
• Inbufferpool->returnahandletotheframe.Done!• Incrementthepincount
• Notinthebufferpool:• Chooseaframeforreplacement(Onlyreplacepageswithpincount==0)
• Ifframeisdirty,writeittodisk• Readrequestedpageintochosenframe• Pinthepageandreturnitsaddress
MidtermReview>StorageandBuffers
• Howdowechooseaframeforreplacement?• LRU(LeastRecentlyUsed)• Clock• MRU(MostRecentlyUsed)• FIFO,random,…
• Thereplacementpolicyhasbigimpacton#ofI/O’s(dependsontheaccesspattern)
73
Bufferreplacementpolicy
MidtermReview>StorageandBuffers
74
LRU
• usesaqueue ofpointerstoframesthathavepincount=0
• apagerequestusesframesonlyfromthehead ofthequeue
• whenathepincountofaframegoesto0,itisaddedtotheendofthequeue
MidtermReview>StorageandBuffers