75
CS 564 Midterm Review The Best Of Collection (Master Tracks), Vol. 1 Midterm Review

CS 564 Midterm Review - GitHub Pages · CS 564 Midterm Review The Best Of Collection (Master Tracks), Vol. 1 ... 10/10 1 10 Midterm Review > Advanced SQL. 24 General form of …

Embed Size (px)

Citation preview

CS564MidtermReviewTheBestOfCollection(MasterTracks),Vol.1

MidtermReview

Announcements

• MidtermnextWednesday

• Inclass:roughly70minutes• Comein10minutesearlier!

MidtermReview

Midterm

• Format:• Regularquestions.Nomultiple-choice.• ThisimpliesfewerquestionsJ• Acouplebonusquestions• Closedbookandnoaids!Wewillprovideacheatsheet

• Material:Everythingincludingbuffermanagement• Externalsortisnotfairgame!

• High-lights:• SimpleSQLandSchemaDefinitions• JoinSemantics• FunctionDependenciesandClosures• Decompositions(BCNFandProperties)• BufferPoolandReplacementPolicies

MidtermReview

High-Level:SQL

• Basicterminology:• relation/table(+“instanceof”),row/tuple,column/attribute,multiset

• TableschemasinSQL

• Single-tablequeries:• SFW(selection+projection)• BasicSQLoperators:LIKE,DISTINCT,ORDERBY

• Multi-tablequeries:• Foreignkeys• JOINS:

• BasicSQLsyntax&semanticsof

MidtermReview>SQL

5

TablesinSQL

PName Price Manufacturer

Gizmo $19.99 GizmoWorks

Powergizmo $29.99 GizmoWorks

SingleTouch $149.99 Canon

MultiTouch $203.99 Hitachi

Product

Arelation ortable isamultiset oftupleshavingtheattributesspecifiedbytheschema

Amultiset isanunorderedlist(or:asetwithmultipleduplicateinstancesallowed)

Anattribute (orcolumn)isatypeddataentrypresentineachtupleintherelation

Atuple orrow isasingleentryinthetablehavingtheattributesspecifiedbytheschema

MidtermReview>SQL

6

TableSchemas

• Theschema ofatableisthetablename,itsattributes,andtheirtypes:

• Akey isanattributewhosevaluesareunique;weunderlineakey

Product(Pname: string, Price: float, Category: string, Manufacturer: string)

Product(Pname: string, Price: float, Category: string, Manufacturer: string)

MidtermReview>SQL

7

SQLQuery

• Basicform(therearemanymanymorebellsandwhistles)

CallthisaSFW query.

SELECT <attributes>FROM <one or more relations>WHERE <conditions>

MidtermReview>SQL

8

LIKE:SimpleStringPatternMatching

SELECT *FROM ProductsWHERE PName LIKE ‘%gizmo%’

DISTINCT:EliminatingDuplicates

SELECT DISTINCT CategoryFROM Product

ORDERBY:SortingtheResults

SELECT PName, PriceFROM ProductWHERE Category=‘gizmo’ORDER BY Price, PName

MidtermReview>SQL

9

Joins

PName Price Category ManufGizmo $19 Gadgets GWorks

Powergizmo $29 Gadgets GWorks

SingleTouch $149 Photography Canon

MultiTouch $203 Household Hitachi

ProductCompany

Cname Stock CountryGWorks 25 USACanon 65 JapanHitachi 15 Japan

PName PriceSingleTouch $149.99

SELECT PName, PriceFROM Product, CompanyWHERE Manufacturer = CName

AND Country=‘Japan’AND Price <= 200

MidtermReview>SQL

AnexampleofSQLsemantics

10

SELECT R.AFROM R, SWHERE R.A = S.B

A13

B C2 33 43 5

A B C1 2 31 3 41 3 53 2 33 3 43 3 5

CrossProduct

A B C3 3 43 3 5

A33

ApplyProjectionApply

Selections/Conditions

Output

MidtermReview>SQL

High-Level:AdvancedSQL

• Setoperators• INTERSECT,UNION,EXCEPT,[ALL]• Subtletiesofmultiset operations

• Nestedqueries• IN,ANY,ALL,EXISTS• Correlatedqueries

• Aggregation• AVG,SUM,COUNT,MIN,MAX,…

• GROUPBY

• NULLs&OuterJoins

MidtermReview>AdvancedSQL

12

SELECT DISTINCT R.AFROM R, S, TWHERE R.A=S.A OR R.A=T.A

AnUnintuitiveQuery

ComputesRÇ (SÈ T)

ButwhatifS=f?

S T

R

Gobacktothesemantics!

MidtermReview>AdvancedSQL

INTERSECT

13

SELECT R.AFROM R, SWHERE R.A=S.AINTERSECTSELECT R.AFROM R, TWHERE R.A=T.A

Q1 Q2 Q1 Q2

UNION

SELECT R.AFROM R, SWHERE R.A=S.AUNIONSELECT R.AFROM R, TWHERE R.A=T.A

Q1 Q2

EXCEPT

SELECT R.AFROM R, SWHERE R.A=S.AEXCEPTSELECT R.AFROM R, TWHERE R.A=T.A

MidtermReview>AdvancedSQL

14

Nestedqueries:Sub-queriesReturningRelations

SELECT c.cityFROM Company cWHERE c.name IN (

SELECT pr.makerFROM Purchase p, Product prWHERE p.product = pr.nameAND p.buyer = ‘Joe Blow‘)

“CitieswhereonecanfindcompaniesthatmanufactureproductsboughtbyJoeBlow”

Company(name, city)Product(name, maker)Purchase(id, product, buyer)

MidtermReview>AdvancedSQL

NestedQueries:OperatorSemantics

SELECT nameFROM ProductWHERE price > ALL(SELECT priceFROM ProductWHERE maker = ‘G’)

Product(name, price, category, maker)

Findproductsthataremoreexpensivethanallproducts producedby“G”

ALLSELECT nameFROM ProductWHERE price > ANY(SELECT priceFROM ProductWHERE maker = ‘G’)

ANYSELECT nameFROM Product p1WHERE EXISTS (SELECT *FROM Product p2WHERE p2.maker = ‘G’AND p1.price =

p2.price)

EXISTS

Findproductsthataremoreexpensivethananyoneproduct producedby“G”

Findproductswherethereexistssome productwiththesamepriceproducedby“G”

MidtermReview>AdvancedSQL

NestedQueries:OperatorSemantics

SELECT nameFROM ProductWHERE price > ALL(X)

Product(name, price, category, maker)

Pricemustbe>all entriesinmultiset X

ALLSELECT nameFROM ProductWHERE price > ANY(X)

ANYSELECT nameFROM Product p1WHERE EXISTS (X)

EXISTS

Pricemustbe>atleastone entryinmultiset X

Xmustbenon-empty

*Notethatp1canbereferencedinX(correlatedquery!)

MidtermReview>AdvancedSQL

17

CorrelatedQueries

SELECT DISTINCT titleFROM Movie AS mWHERE year <> ANY(

SELECT yearFROM MovieWHERE title = m.title)

Movie(title, year, director, length)

Notealso:thiscanstillbeexpressedassingleSFWquery…

Findmovieswhosetitleappearsmorethanonce.

Notethescopingofthevariables!

MidtermReview>AdvancedSQL

18

SimpleAggregationsPurchaseProduct Date Price Quantitybagel 10/21 1 20banana 10/3 0.5 10banana 10/10 1 10bagel 10/25 1.50 20

SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’

50(=1*20+1.50*20)

MidtermReview>AdvancedSQL

19

Grouping&Aggregations:GROUPBY

Findtotalsalesafter10/1/2005,onlyforproductsthathavemorethan10totalunitssold

HAVINGclausescontainsconditionsonaggregates

SELECT product, SUM(price*quantity)FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10

WhereasWHEREclausesconditiononindividualtuples…

MidtermReview>AdvancedSQL

20

GROUPBY:(1)ComputeFROM-WHERE

Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10Craisins 11/1 2 5Craisins 11/3 2.5 3

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10

FROMWHERE

MidtermReview>AdvancedSQL

21

GROUPBY:(2)AggregatebytheGROUPBYSELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 10

GROUP BY Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10Craisins 11/1 2 5Craisins 11/3 2.5 3

Product Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

Craisins11/1 2 511/3 2.5 3

MidtermReview>AdvancedSQL

22

GROUPBY:(3)FilterbytheHAVING clause

HAVING

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 30

Product Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

Craisins11/1 2 511/3 2.5 3

Product Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

MidtermReview>AdvancedSQL

23

GROUPBY:(3)SELECT clause

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 100

Product TotalSales

Bagel 50

Banana 15

SELECTProduct Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

MidtermReview>AdvancedSQL

24

GeneralformofGroupingandAggregationSELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2

Evaluationsteps:1. EvaluateFROM-WHERE:applyconditionC1 onthe

attributesinR1,…,Rn2. GROUPBYtheattributesa1,…,ak3. ApplyHAVINGconditionC2 toeachgroup(mayhave

aggregates)4. ComputeaggregatesinSELECT,S,andreturntheresult

MidtermReview>AdvancedSQL

25

NullValues

• Fornumericaloperations,NULL->NULL:• Ifx=NULLthen4*(3-x)/7isstillNULL

• Forboolean operations,inSQLtherearethreevalues:

FALSE= 0UNKNOWN= 0.5TRUE= 1

• Ifx=NULLthenx=“Joe”isUNKNOWN

MidtermReview>AdvancedSQL

26

NullValues

• C1ANDC2=min(C1,C2)• C1OR C2=max(C1,C2)• NOTC1=1– C1

SELECT *FROM PersonWHERE (age < 25)

AND (height > 6 AND weight > 190)

Won’treturne.g.(age=20height=NULLweight=200)!

RuleinSQL:includeonlytuplesthatyieldTRUE/1.0

MidtermReview>AdvancedSQL

27

NullValues

Unexpectedbehavior:

SELECT *FROM PersonWHERE age < 25

OR age >= 25

SomePersonsarenotincluded!

MidtermReview>AdvancedSQL

SELECT *FROM PersonWHERE age < 25

OR age >= 25 OR age IS NULL

NowitincludesallPersons!

CantestforNULLexplicitly:• xISNULL• xISNOTNULL

28

RECAP:InnerJoinsBy default,joinsinSQLare“innerjoins”:

SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName

SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName

Product(name, category)Purchase(prodName, store)

Bothequivalent:BothINNERJOINS!

MidtermReview>AdvancedSQL

29

name category

Gizmo gadget

Camera Photo

OneClick Photo

prodName store

Gizmo Wiz

Camera Ritz

Camera Wiz

name store

Gizmo Wiz

Camera Ritz

Camera Wiz

Product PurchaseINNERJOIN:

SELECT Product.name, Purchase.storeFROM Product

INNER JOIN Purchase ON Product.name = Purchase.prodName

Note:anotherequivalentwaytowriteanINNERJOIN!

MidtermReview>AdvancedSQL

30

name category

Gizmo gadget

Camera Photo

OneClick Photo

prodName store

Gizmo Wiz

Camera Ritz

Camera Wiz

name store

Gizmo Wiz

Camera Ritz

Camera Wiz

OneClick NULL

Product PurchaseLEFTOUTERJOIN:

SELECT Product.name, Purchase.storeFROM Product

LEFT OUTER JOIN Purchase ON Product.name = Purchase.prodName

MidtermReview>AdvancedSQL

Generalclarification:Setsvs.Multisets

• Intheory,andinanymoreformalmaterial,bydefinition allrelationsaresets oftuples

• InSQL,relations(i.e.tables)aremultisets,meaningyoucanhaveduplicatetuples

• WeneedthisbecauseintermediateresultsinSQLdon’teliminateduplicates

• Ifyougetconfused:juststateyourassumptions&we’llbeforgiving!

MidtermReview>AdvancedSQL

High-Level:ERDiagrams

• ERdiagrams!• Entities(vs.EntitySets)• Relationships• Multiplicity• Constraints:Keys,single-value,referential,participation,etc…

MidtermReview>ERDiagrams

Entitiesvs.EntitySets

Example:

33

Product

name category

price

EntitySet

Product

Name:XboxCategory:Total

MultimediaSystemPrice:$250

Name:MyLittlePonyDollCategory:ToyPrice:$25

Entity

EntityAttribute

Entitiesarenot explicitlyrepresentedinE/Rdiagrams!

MidtermReview>ERDiagrams

WhatisaRelationship?

34

MakesProduct

name category

price

Company

name

Arelationship betweenentitysetsPandCisasubsetofallpossiblepairsofentitiesinPandC,withtuplesuniquelyidentifiedbyPandC’skeys

MidtermReview>ERDiagrams

WhatisaRelationship?

35

name category price

Gizmo Electronics $9.99

GizmoLite Electronics $7.50

Gadget Toys $5.50

name

GizmoWorks

GadgetCorp

ProductCompanyC.name P.name P.category P.price

GizmoWorks Gizmo Electronics $9.99

GizmoWorks GizmoLite Electronics $7.50

GizmoWorks Gadget Toys $5.50

GadgetCorp Gizmo Electronics $9.99

GadgetCorp GizmoLite Electronics $7.50

GadgetCorp Gadget Toys $5.50

CompanyC×ProductP

C.name P.name

GizmoWorks Gizmo

GizmoWorks GizmoLite

GadgetCorp Gadget

MakesMakesProduct

name categoryprice

Company

name

Arelationship betweenentitysetsPandCisasubsetofallpossiblepairsofentitiesinPandC,withtuplesuniquelyidentifiedbyPandC’skeys

MidtermReview>ERDiagrams

36

MultiplicityofE/RRelationships

Indicatedusingarrows

123

abcd

One-to-one:

123

abcd

Many-to-one:

123

abcd

One-to-many:

123

abcd

Many-to-many:

X->YmeansthereexistsafunctionmappingfromXtoY(recallthedefinitionofafunction)

MidtermReview>ERDiagrams

37

ConstraintsinE/RDiagrams• FindingconstraintsispartoftheE/Rmodelingprocess.Commonlyused

constraintsare:

• Keys: Implicitconstraintsonuniquenessofentities• Ex:AnSSNuniquelyidentifiesaperson

• Single-valueconstraints:• Ex:apersoncanhaveonlyonefather

• Referentialintegrityconstraints: Referencedentitiesmustexist• Ex:ifyouworkforacompany,itmustexistinthedatabase

• Otherconstraints:• Ex:peoples’agesarebetween0and150

RecallFOREIGNKEYs!

MidtermReview>ERDiagrams

38

RECALL:Mathematicaldef.ofRelationship

• Amathematicaldefinition:

• LetA,Bbesets• A={1,2,3},B={a,b,c,d},

• AxB(thecross-product)isthesetofallpairs(a,b)• A´ B={(1,a),(1,b),(1,c),(1,d),(2,a),(2,b),(2,c),(2,d),(3,a),(3,b),(3,c),(3,d)}

• Wedefinearelationship tobeasubsetofAxB• R={(1,a),(2,c),(2,d),(3,b)}

1

2

3

a

b

c

d

A= B=

MidtermReview>ERDiagrams

39

• Amathematicaldefinition:• LetA,Bbesets• AxB(thecross-product)isthesetofallpairs• Arelationship isasubsetofAxB

• Makes isrelationship- itisasubset ofProduct´ Company:

1

2

3

a

b

c

d

A= B=

makes CompanyProduct

MidtermReview>ERDiagrams

RECALL:Mathematicaldef.ofRelationship

• Therecanonlybeonerelationshipforeveryuniquecombinationofentities

• Thisalsomeansthattherelationshipisuniquelydeterminedbythekeysofitsentities

• Example:thekeyforMakes(toright)is{Product.name,Company.name}

40Whydoesthismakesense?

Thisfollowsfromourmathematicaldefinitionofarelationship- it’saSET!

MakesProduct

name categoryprice

Company

namesince

KeyMakes =KeyProduct ∪ KeyCompany

MidtermReview>ERDiagrams

RECALL:Mathematicaldef.ofRelationship

High-Level:DBDesign

• Redundancy&dataanomalies

• Functionaldependencies• Fordatabaseschemadesign• GivensetofFDs,findothersimplied- usingArmstrong’srules

• Closures• Basicalgorithm• TofindallFDs

• Keys&Superkeys

MidtermReview>DBDesign

ConstraintsPrevent(some)AnomaliesintheData

Student Course RoomMary CS145 B01Joe CS145 B01Sam CS145 B01.. .. ..

Ifeverycourseisinonlyoneroom,containsredundantinformation!

Apoorlydesigneddatabasecausesanomalies:

Ifweupdatetheroomnumberforonetuple,wegetinconsistentdata=anupdate anomaly

Ifeveryonedropstheclass,welosewhatroomtheclassisin!=adelete anomaly

Similarly,wecan’treservearoomwithoutstudents=aninsertanomaly

… CS229 C12

MidtermReview>DBDesign

ConstraintsPrevent(some)AnomaliesintheData

Student CourseMary CS145Joe CS145Sam CS145.. ..

Course RoomCS145 B01CS229 C12

Isthisformbetter?

• Redundancy?• Updateanomaly?• Deleteanomaly?• Insertanomaly?

MidtermReview>DBDesign

APictureOfFDsDefn (again):GivenattributesetsA={A1,…,Am} andB={B1,…Bn}inR,

Thefunctionaldependency Aà BonRholdsifforanyti,tj inR:

if ti[A1]=tj[A1]ANDti[A2]=tj[A2]AND…ANDti[Am]=tj[Am]

then ti[B1]=tj[B1]ANDti[B2]=tj[B2]AND…ANDti[Bn]=tj[Bn]

A1 … Am B1 … Bn

ti

tj

Ift1,t2agreehere.. …theyalsoagreehere!

MidtermReview>DBDesign

FDsforRelationalSchemaDesign

• High-levelidea:whydowecareaboutFDs?

1. Startwithsomerelationalschema

2. Findoutitsfunctionaldependencies(FDs)

3. Usethesetodesignabetterschema1. Onewhichminimizespossibilityofanomalies

Thispartcanbetricky!

MidtermReview>DBDesign

Equivalenttoasking:GivenasetofFDs,F={f1,…fn},doesanFDghold?

Inferenceproblem:Howdowedecide?

Answer:ThreesimplerulescalledArmstrong’sRules.

1. Split/Combine,2. Reduction,and3. Transitivity…ideasbypicture

FindingFunctionalDependencies

MidtermReview>DBDesign

47

ClosureofasetofAttributes

Given asetofattributesA1,…,An andasetofFDsF:Thentheclosure,{A1,…,An}+ isthesetofattributesB s.t. {A1,…,An}à B

{name} à {color}{category} à {department}{color, category} à {price}

Example: F=

ExampleClosures:

{name}+ = {name, color}{name, category}+ ={name, category, color, dept, price}{color}+ = {color}

MidtermReview>DBDesign

48

ClosureAlgorithmStartwithX={A1,…,An},FDsF.Repeatuntil Xdoesn’tchange;do:if {B1,…,Bn}à CisinFand {B1,

…,Bn}⊆ X:then addCtoX.

Return XasX+

F=

{name, category}+ ={name, category}

{name, category}+ ={name, category, color, dept, price}

{name, category}+ ={name, category, color}

{name, category}+ ={name, category, color, dept}{name} à {color}

{category} à {dept}

{color, category} à{price}

MidtermReview>DBDesign

KeysandSuperkeys

Asuperkey isasetofattributesA1,…,An s.t.foranyother attributeB inR,wehave {A1,…,An}à B

Akey isaminimal superkey

I.e.allattributesarefunctionallydeterminedbyasuperkey

Meaningthatnosubsetofakeyisalsoasuperkey

MidtermReview>DBDesign

CALCULATINGKeysandSuperkeys

AlsoseeLecture-5.ipynb!!!

MidtermReview>DBDesign

• Superkey?• ComputetheclosureofA• Seeifit=thefullsetofattributes

• Key?• ConfirmthatAissuperkey• MakesurethatnosubsetofAisasuperkey

• Onlyneedtocheckone‘level’down!

IsSuperkey(A,R,F):A+=ComputeClosure(A,F)Return (A+==R)?

IsKey(A,R,F):If not IsSuperkey(A,R,F):

return FalseFor BinSubsetsOf(A,size=len(A)-1):

ifIsSuperkey(B,R,F):return False

return True

LetAbeasetofattributes,Rsetofallattributes,FsetofFDs:

High-Level:Decompositions

• Conceptualdesign

• Boyce-Codd NormalForm(BCNF)• Definition• Algorithm

• Decompositions• Losslessvs.Lossy• AproblemwithBCNF

MidtermReview>Decompositions

52

BacktoConceptualDesign

NowthatweknowhowtofindFDs,it’sastraight-forwardprocess:

1. Searchfor“bad”FDs

2. Ifthereareany,thenkeepdecomposingthetableintosub-tablesuntilnomorebadFDs

3. Whendone,thedatabaseschemaisnormalized

Recall:thereareseveralnormalforms…

MidtermReview>Decompositions

53

Boyce-Codd NormalForm

BCNFisasimpleconditionforremovinganomaliesfromrelations:

Inotherwords:thereareno“bad”FDs

ArelationRisinBCNF if:

if{A1,...,An}à B isanon-trivial FDinR

then{A1,...,An}isasuperkey forR

Equivalently: ∀ setsofattributesX,either(X+ =X)or(X+ =allattributes)

MidtermReview>Decompositions

54

Example

Whatisthekey?{SSN,PhoneNumber}

Name SSN PhoneNumber CityFred 123-45-6789 206-555-1234 SeattleFred 123-45-6789 206-555-6543 SeattleJoe 987-65-4321 908-555-2121 WestfieldJoe 987-65-4321 908-555-1234 Westfield

{SSN} à {Name,City}

⟹Not inBCNF

ThisFDisbadbecauseitisnot asuperkey

MidtermReview>Decompositions

55

Example

Name SSN CityFred 123-45-6789 SeattleJoe 987-65-4321 Madison

SSN PhoneNumber123-45-6789 206-555-1234123-45-6789 206-555-6543987-65-4321 908-555-2121987-65-4321 908-555-1234

Let’scheckanomalies:• Redundancy?• Update?• Delete?

{SSN} à {Name,City}

NowinBCNF!

ThisFDisnowgoodbecauseitisthekey

MidtermReview>Decompositions

56

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindXs.t.:X+ ≠XandX+≠[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

MidtermReview>Decompositions

57

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

FindasetofattributesXwhichhasnon-trivial“bad”FDs,i.e.isnotasuperkey,usingclosures

MidtermReview>Decompositions

58

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

Ifno“bad”FDsfound,inBCNF!

MidtermReview>Decompositions

59

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

LetYbetheattributesthatXfunctionallydetermines(+thatarenotinX)

AndletZbetheotherattributesthatitdoesn’t

MidtermReview>Decompositions

60

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

X ZY

R1 R2

Splitintoonerelation(table)withXplustheattributesthatXdetermines(Y)…

MidtermReview>Decompositions

61

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

X ZY

R1 R2

AndonerelationwithXplustheattributesitdoesnotdetermine(Z)

MidtermReview>Decompositions

62

BCNFDecompositionAlgorithm

BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

Proceedrecursivelyuntilnomore“bad”FDs!

MidtermReview>Decompositions

R(A,B,C,D,E)BCNFDecomp(R):FindasetofattributesXs.t.:X+ ≠XandX+≠

[allattributes]

if (notfound)then Return R

let Y=X+ - X,Z=(X+)Cdecompose R intoR1(XÈ Y)andR2(XÈ Z)

Return BCNFDecomp(R1),BCNFDecomp(R2)

Example

{A} à {B,C}{C} à {D}

MidtermReview>Decompositions

64

Example

R(A,B,C,D,E){A}+ ={A,B,C,D}≠{A,B,C,D,E}

R1(A,B,C,D){C}+ ={C,D}≠{A,B,C,D}

R2(A,E)R11(C,D) R12(A,B,C)

R(A,B,C,D,E)

{A} à {B,C}{C} à {D}

MidtermReview>Decompositions

LosslessDecompositions

65

BCNFdecompositionisalwayslossless.Why?

Note:don’tneed{A1,...,An}à {C1,...,Cp}

If {A1,...,An}à {B1,...,Bm}Thenthedecompositionislossless.{A1,...,An} isakeyforoneofR1orR2

R(A1,...,An,B1,...,Bm,C1,...,Cp)

R1(A1,...,An,B1,...,Bm) R2(A1,...,An,C1,...,Cp)

MidtermReview>Decompositions

66

AProblemwithBCNF{Unit} à {Company}{Company,Product} à {Unit}

WedoaBCNFdecompositionona“bad”FD:{Unit}+ = {Unit, Company}

WelosetheFD{Company,Product} à {Unit}!!

Unit Company Product… … …

Unit Company… …

Unit Product… …

{Unit} à {Company}

MidtermReview>Decompositions

High-Level:StorageandBuffers

• Ourmodelofthecomputer:Diskvs.RAM

• BufferPool

• ReplacementPolicies

MidtermReview>StorageandBuffers

High-level:Diskvs.MainMemory

Disk:

• Slow: Sequentialaccess• (althoughfastsequentialreads)

• Durable:Wewillassumethatonceondisk,dataissafe!

• Cheap 68

Platters

SpindleDisk head

Arm movement

Arm assembly

Tracks

Sector

Cylinder

RandomAccessMemory(RAM)orMainMemory:

• Fast: Randomaccess,byteaddressable• ~10xfasterforsequentialaccess• ~100,000xfasterforrandomaccess!

• Volatile: Datacanbelostife.g.crashoccurs,powergoesout,etc!

• Expensive: For$100,get16GBofRAMvs.2TBofdisk!

MidtermReview>StorageandBuffers

TheBuffer(Pool)

Disk

MainMemory

Buffer(Pool)• Abuffer isaregionofphysicalmemoryusedtostoretemporarydata

• Inthislecture:aregioninmainmemoryusedtostoreintermediatedatabetweendiskandprocesses

• Keyidea:Reading/writingtodiskisslow-needtocachedata!

MidtermReview>StorageandBuffers

70

BufferManager

• Memorydividedintobufferframes:slotsforholdingdiskpages• Bookkeepingperframe:

• Pincount:#usersofthepageintheframe• Pinning :Indicatethatthepageisinuse• Unpinning :Releasethepage,andalsoindicateifthepageisdirtied

• Dirtybit :Indicatesifchangesmustbepropagatedtodisk

MidtermReview>StorageandBuffers

71

BufferManager• WhenaPageisrequested:

• Inbufferpool->returnahandletotheframe.Done!• Incrementthepincount

• Notinthebufferpool:• Chooseaframeforreplacement(Onlyreplacepageswithpincount==0)

• Ifframeisdirty,writeittodisk• Readrequestedpageintochosenframe• Pinthepageandreturnitsaddress

MidtermReview>StorageandBuffers

72

BufferManager• WhenaPageisrequested:

• Inbufferpool->returnahandletotheframe.Done!• Incrementthepincount

• Notinthebufferpool:• Chooseaframeforreplacement(Onlyreplacepageswithpincount==0)

• Ifframeisdirty,writeittodisk• Readrequestedpageintochosenframe• Pinthepageandreturnitsaddress

MidtermReview>StorageandBuffers

• Howdowechooseaframeforreplacement?• LRU(LeastRecentlyUsed)• Clock• MRU(MostRecentlyUsed)• FIFO,random,…

• Thereplacementpolicyhasbigimpacton#ofI/O’s(dependsontheaccesspattern)

73

Bufferreplacementpolicy

MidtermReview>StorageandBuffers

74

LRU

• usesaqueue ofpointerstoframesthathavepincount=0

• apagerequestusesframesonlyfromthehead ofthequeue

• whenathepincountofaframegoesto0,itisaddedtotheendofthequeue

MidtermReview>StorageandBuffers

75

MRU

• usesastackofpointerstoframesthathavepincount=0

• apagerequestusesframesonlyfromthetopofthestack

• whenathepincountofaframegoesto0,itisaddedtothetopofthestack

MidtermReview>StorageandBuffers