Copyright © 2003-2012 Curt Hill. Schema Refinement II: 2nd NF to 3rd NF to BCNF


Page 1:

Schema Refinement II

2nd NF to 3rd NF to BCNF

Page 2:

Recap

• First Normal Form is just a standard rectangular table
• No repeating groups
• It is good, but may have anomalies
– Insert: extra info is needed to insert
– Delete: extra info may be lost when deleting
– Update: multiple updates may be needed

Page 3:

Functional Dependencies

• Why?
– The analysis of the FDs shows the problems with First and Second Normal Forms
– It also shows why Third and Boyce-Codd Normal Forms are better
• Notation: A → B
– This is read: A determines B
– Or: B is functionally dependent on A
• B is fully functionally dependent on A when
– B is functionally dependent on A
– B is not functionally dependent on any proper subset of A
– Notation is A ↠ B
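The two definitions above can be tried out on sample data. The following is a minimal sketch, not part of the original slides: the `enrollment` rows and the helper names `holds` and `fully_dependent` are hypothetical. Note that an FD is a rule about every legal state of a table, so a check on one set of rows can refute an FD but can never prove it.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True

# Hypothetical enrollment rows (an assumption, not data from the slides).
enrollment = [
    {"SID": 21, "Course": "DB", "SName": "Jones", "Grade": "A"},
    {"SID": 21, "Course": "OS", "SName": "Jones", "Grade": "B"},
    {"SID": 32, "Course": "DB", "SName": "Smith", "Grade": "B"},
]

print(holds(enrollment, ["SID", "Course"], ["Grade"]))            # True
print(fully_dependent(enrollment, ["SID", "Course"], ["Grade"]))  # True
print(fully_dependent(enrollment, ["SID", "Course"], ["SName"]))  # False: SID alone determines SName
```

In this sample, Grade needs both SID and Course, while SName is determined by SID alone, so SName is dependent but not fully dependent on the pair.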

Page 4:

Lossless Join Decomposition

• The goal is to project a table in a lower normal form into several tables of higher normal form
• This is done using a lossless join decomposition
• For a split into two tables, this occurs when the columns they share are a key of at least one of them
• When the tables are joined, the exact original table is reproduced

Page 5:

Second Normal Form (2nd NF)

• A table is in Second Normal Form if and only if
– It is in 1st NF and
– Every non-key attribute is fully functionally dependent on the whole key
• Eliminates partial dependencies

Page 6:

Partial Dependencies

[Diagram: the Key, with X inside it, and the attribute A]

• X → A
• X is part of the key but not all of it
• Violation of 2nd NF

Page 7:

Student File Revisited

SID  SName     LCode  Status
21   Jones     A1     1
32   Smith     A1     1
36   Ericson   A3     2
39   Williams  A2     3

• This is 2nd NF but still demonstrates anomalies, because LCode → Status
• We must know the LCode and Status before an insert
• Updating Jones's LCode could put conflicting information in the file
• Deleting Ericson loses all info about LCode A3

Page 8:

Transitivity Again

• Functional dependencies are transitive
– If A → B and B → C then A → C
– Equality, greater than, and less than are also transitive
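Transitivity is what makes it possible to compute everything a set of attributes determines. The sketch below uses the usual attribute-closure loop. The function name `closure` and the representation of an FD as a pair of sets are choices made for this illustration, and SID → SName, LCode is assumed from the slides' statement that SID is the key.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# A -> B and B -> C, so A -> C follows by transitivity.
abc = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, abc)))  # ['A', 'B', 'C']

# Student file: SID is the key and LCode -> Status.
student_fds = [({"SID"}, {"SName", "LCode"}), ({"LCode"}, {"Status"})]
print("Status" in closure({"SID"}, student_fds))  # True
```

The second call shows SID reaching Status only by way of LCode.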

Page 9:

Transitivity is the Problem

• The Status depends on the LCode
• The LCode depends on SID, which is the key
• Thus the table is in 2nd NF
• Status depends directly on a non-key
– It depends transitively on the key
• There are still anomalies because of the transitive dependency

Page 10:

Transitive Dependencies

[Diagram: SID → LCode → Status]

• LCode → Status
• LCode is not part of the key

Page 11:

Third Normal Form

• A table is in 3rd Normal Form if and only if
– The table is 2nd NF
– Every non-key item is non-transitively dependent on the key
• In other words: each item not in the key depends directly on the key and does not depend on anything not in the key

Page 12:

Another view of 3rd NF

• For each functional dependency X → A in the relation R, one of the following conditions must be true
– The FD is trivial: A is part of X
– X is a superkey: a key, possibly with additional fields
– A is part of some key for R

Page 13:

Student File Revisited

SID  SName     LCode  Status
21   Jones     A1     1
32   Smith     A1     1
36   Ericson   A3     2
39   Williams  A2     3

• Check LCode → Status against the three conditions
– The FD is trivial: No
– X is a superkey: No
– A is part of some key for R: No
• Not in 3rd Normal Form
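The three-condition check can be automated. This is a rough sketch under stated assumptions: the helper names are hypothetical, candidate keys are found by brute force (fine for a handful of attributes), only the FDs listed are examined, and SID → SName, LCode is assumed from SID being the key.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def third_nf_violations(attributes, fds):
    """List (X, A) pairs from the given FDs that fail all three 3rd NF conditions."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        if closure(lhs, fds) == attributes:
            continue  # X is a superkey
        for attr in sorted(set(rhs) - lhs):  # drop the trivial part
            if attr not in prime:            # A is not part of any key
                bad.append((tuple(sorted(lhs)), attr))
    return bad

attrs = {"SID", "SName", "LCode", "Status"}
student_fds = [({"SID"}, {"SName", "LCode"}), ({"LCode"}, {"Status"})]
print(candidate_keys(attrs, student_fds))       # [{'SID'}]
print(third_nf_violations(attrs, student_fds))  # [(('LCode',), 'Status')]
```

The only violation reported is LCode → Status, matching the three "No" answers above.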

Page 14:

So how do we fix? Projection!

SID  SName     LCode  Status
21   Jones     A1     1
32   Smith     A1     1
36   Ericson   A3     2
39   Williams  A2     3

Becomes

SID  SName     LCode
21   Jones     A1
32   Smith     A1
36   Ericson   A3
39   Williams  A2

LCode  Status
A1     1
A3     2
A2     3
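The projection can be reproduced in a few lines to confirm that joining the two smaller tables gives back the original. This is an illustrative sketch using plain Python sets of tuples; the variable names are made up for the example.

```python
# The Student file from the slide, one tuple per row.
students = [
    (21, "Jones", "A1", 1),
    (32, "Smith", "A1", 1),
    (36, "Ericson", "A3", 2),
    (39, "Williams", "A2", 3),
]

# Project onto (SID, SName, LCode) and (LCode, Status).
# Using sets removes the duplicate (A1, 1) row.
student_loc = {(sid, name, lcode) for sid, name, lcode, _ in students}
loc_status = {(lcode, status) for _, _, lcode, status in students}

# Natural join of the two projections on LCode.
rejoined = {
    (sid, name, lcode, status)
    for sid, name, lcode in student_loc
    for other_lcode, status in loc_status
    if lcode == other_lcode
}

print(len(loc_status))            # 3
print(rejoined == set(students))  # True: nothing lost, nothing spurious
```

The shared column LCode is the key of the second table, which is why the join neither drops rows nor invents new ones.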

Page 15:

Notes

• With LCode and Status in a separate table some anomalies are eliminated
• No deletion of a student loses LCode information
• No insertion of a student needs Status information, so it cannot conflict with other Status information
• Changing the Status of an LCode needs only a single update

Page 16:

Boyce-Codd Normal Form

• Slight strengthening of 3rd NF
• A table is in 3rd NF iff
– The table is 2nd NF
– Every non-key item is non-transitively dependent on the key
• A table is in Boyce-Codd NF iff
– The table is 2nd NF
– Every item is dependent only on the key

Page 17:

3rd NF and BCNF

• For each FD X → A in the relation R
• One of the following conditions must be true for 3rd NF
– The FD is trivial
– X is a superkey
– A is part of some key for R
• One of the following conditions must be true for BCNF
– The FD is trivial
– X is a superkey

Page 18:

Explanation

• Many consider 3rd NF and BCNF as identical
• They can differ only when a table has alternate (candidate) keys
• A key in the 3rd NF conditions means the primary key or any other candidate key, so 3rd NF tolerates X → A whenever A is part of some key
• BCNF has no such exception: every nontrivial dependency must have a superkey on its left side

Page 19:

3NF and BCNF

[Diagram: two arrangements of a Key, X, and A with X → A. One is labeled "Disallowed by 3NF and BCNF", the other "Disallowed by BCNF".]

Page 20:

The catchy saying

• Each item should be dependent on:
– the key (1st NF)
– the whole key (2nd NF)
– and nothing but the key (BCNF)

Page 21:

3NF or BCNF?

• Is there a practical difference?
• Yes
• BCNF is slightly stronger
– It eliminates a type of redundancy
– It may also introduce another problem

Page 22:

An example

• Relation has 6 attributes: ABCDEF
• FDs:
– A → ABCDEF (A is a key)
– CE → A (CE is also a key)
– BD → E
• Not BCNF, but is 3NF
– BD is not a key, so BD → E violates BCNF
– E is part of the key CE, so BD → E is allowed by 3NF
• Project into ABCDF and BDE to make it BCNF
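The claim "not BCNF, but is 3NF" can be checked mechanically. The sketch below is an illustration under assumptions: the helper names are hypothetical, each attribute is a single character so an FD can be written as a pair of strings, keys are found by brute force, and only the three listed FDs are tested.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

def violations(attributes, fds):
    """Return (bcnf_violations, third_nf_violations) among the given FDs."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    bcnf, third = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attributes:
            continue  # X is a superkey: fine for both forms
        for attr in sorted(set(rhs) - set(lhs)):  # drop the trivial part
            bcnf.append((lhs, attr))
            if attr not in prime:  # 3NF also needs A to be outside every key
                third.append((lhs, attr))
    return bcnf, third

fds = [("A", "ABCDEF"), ("CE", "A"), ("BD", "E")]
keys = sorted("".join(sorted(key)) for key in candidate_keys("ABCDEF", fds))
print(keys)                       # ['A', 'BCD', 'CE']
print(violations("ABCDEF", fds))  # ([('BD', 'E')], [])
```

Under these three FDs the search also finds BCD as a key (BD gives E, CE gives A), which does not change the verdict: BD → E breaks BCNF only.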

Page 23:

Problem with this Projection

• Project into ABCDF and BDE
• This is a lossless join decomposition
• Now in BCNF
• There is an integrity constraint issue
– CE → A cannot be checked without doing a join
• This was not a dependency preserving decomposition

Page 24:

Testing for Dependency Preserving Decompositions

• If we decompose relation R into S and T
• If R+ = (S ∪ T)+
– Then it is dependency preserving
– Here R+ is the closure of the FDs of R, and (S ∪ T)+ is the closure of the FDs that can be checked within S or within T alone
• This was not the case for ABCDF and BDE
– CE → A is in R+ but not in (S ∪ T)+
– Neither relation had all three fields
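One common way to run this test without listing all of R+ is to grow the left side of each FD using only what can be derived inside a single projected table at a time. The sketch below follows that approach; the function names are hypothetical, and attributes are again single characters.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, fds, parts):
    """Return True if fd can be derived from FDs checkable inside the parts."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            # What the attributes we hold inside this part determine,
            # restricted to attributes that live in the same part.
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

fds = [("A", "ABCDEF"), ("CE", "A"), ("BD", "E")]
parts = ["ABCDF", "BDE"]
lost = [fd for fd in fds if not preserved(fd, fds, parts)]
print(lost)  # [('CE', 'A')]
```

Only CE → A is reported as lost, since C and A sit in ABCDF while E sits in BDE.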

Page 25:

Decompositions May Be:

• Lossless join
• Dependency preserving
• Both (clearly the preferred)
• Neither
• There is always a lossless join and dependency preserving decomposition into 3NF
• This is not always the case with BCNF
– We can always get to BCNF
– Is it desirable?

Page 26:

Perspective

• Some redundancy is allowed in 3NF that is disallowed in BCNF
• We cannot always get to BCNF with dependency preserving decompositions
– Even though we can always get to BCNF
• We then have to decide where to stop
• We may actually settle for 2NF for other reasons
– Such as efficient queries

Page 27:

Final Thoughts

• This is as far as FDs and FFDs may be pushed
• Higher normal forms require looking at something else
• Fourth Normal Form requires consideration of multi-valued dependencies