1
CS 430 Database Theory
Winter 2005
Lecture 9: Fourth and Fifth Normal Forms
2
Decompositions
Given a relation R = {A1, … , An} (all of the Ai are unique), a set of relation schemas D = {R1, … , Rm} is a decomposition of R if R is the union of the Ri, or
$R = \bigcup_{i=1}^{m} R_i$
That is, all the attributes of R appear in the Ri.
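As a minimal sketch of this definition, the check below treats a schema as a Python set of attribute names. The function name `is_decomposition` and the attributes A to D are illustrative assumptions, not part of the lecture.

```python
def is_decomposition(r, schemas):
    """True if the union of the schemas Ri is exactly the attribute set R."""
    covered = set().union(*schemas)
    return covered == set(r)

# Hypothetical relation R(A, B, C, D)
R = {"A", "B", "C", "D"}
print(is_decomposition(R, [{"A", "B"}, {"B", "C"}, {"C", "D"}]))  # True
print(is_decomposition(R, [{"A", "B"}, {"B", "C"}]))              # False: D is missing
```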
3
Goodness of Decomposition
When is a decomposition “good”? Two standards:
• Dependency Preservation
• Lossless (Nonadditive) Join
4
Dependency Preservation
Suppose we have a set of FDs F on R and a decomposition D = {R1, … , Rm}. The projection of F on Ri is the set
$\pi_{R_i}(F) = \{ X \to Y \in F^+ \mid X \cup Y \subseteq R_i \}$
That is, $\pi_{R_i}(F)$ consists of all the FDs in the closure of F which are FDs on Ri.
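One way to compute a projection for a small schema is to take the attribute closure of every subset X of Ri and keep what falls inside Ri. This is a brute-force sketch (exponential in the size of Ri); the names `closure` and `project_fds` and the sample FDs are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def project_fds(fds, ri):
    """Map each X within ri to the largest nontrivial Y within ri with X -> Y in F+."""
    ri = frozenset(ri)
    projected = {}
    for size in range(1, len(ri) + 1):
        for xs in combinations(sorted(ri), size):
            x = frozenset(xs)
            y = (closure(x, fds) & ri) - x
            if y:
                projected[x] = frozenset(y)
    return projected

# Hypothetical F = {A -> B, B -> C} on R(A, B, C), projected on Ri = {A, C}
F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(project_fds(F, {"A", "C"}))  # only A -> C survives
```

Note that A → C is not written in F itself; it appears in the projection because it is in the closure F+.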
5
Dependency Preservation
D is Dependency Preserving with respect to F if the closure of the union of the projections of F onto the Ri is the closure of F. Or,
$(\pi_{R_1}(F) \cup \dots \cup \pi_{R_m}(F))^+ = F^+$
Or, if we project F onto the individual Ri, union the projections together, and compute the closure, we get the original closure of F. Or, no information contained in F is lost by projecting F onto the individual Ri.
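The definition can be tested without materializing F+. The sketch below uses the standard textbook technique of growing each left-hand side X through the individual Ri, using only closures restricted to one Ri at a time; it is not taken from the slides, and the function names and sample schemas are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def preserves_dependencies(fds, schemas):
    """True if every X -> Y in fds follows from the projections onto the schemas."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in schemas:
                # What the projection onto ri lets us derive from z
                gained = (closure(z & ri, fds) & ri) - z
                if gained:
                    z |= gained
                    changed = True
        if not rhs <= z:
            return False
    return True

# Hypothetical R(A, B, C) with F = {A -> B, B -> C}
F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(preserves_dependencies(F, [{"A", "B"}, {"B", "C"}]))  # True
print(preserves_dependencies(F, [{"A", "B"}, {"A", "C"}]))  # False: B -> C is lost
```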
6
Dependency Preservation Notes
• Claim: It is possible to find a 3NF decomposition of R (each of the Ri is in 3NF) which is dependency preserving
  • See Algorithm 11.2, page 340. (No proof.)
• Why do we want this?
  • When we update the database, we want to be able to verify FDs by verifying them on the individual relations
  • The alternative is having to do joins to verify that our update is good, which slows the system
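Verifying an FD on a single relation state needs no join, which is the point of the note above. A minimal sketch, assuming tuples are stored as dicts; the attribute names and sample rows are hypothetical.

```python
def satisfies_fd(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault returns the value stored for key, recording val on first sight
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical relation state: price depends on product and territory together
rows = [
    {"product": 1, "territory": 10, "price": 5},
    {"product": 1, "territory": 20, "price": 6},
]
print(satisfies_fd(rows, ["product", "territory"], ["price"]))  # True
print(satisfies_fd(rows, ["product"], ["price"]))               # False
```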
7
Lossless (Nonadditive) Join Property
• D has the Lossless (Nonadditive) Join property with respect to a set of FDs F if for every relation state r of R that satisfies F:
$\pi_{R_1}(r) \bowtie \dots \bowtie \pi_{R_m}(r) = r$
($\bowtie$ is the natural join)
• Lossless means no loss of information
• Nonadditive means that the natural join doesn’t add any information
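To see what "adding information" looks like on one relation state, the sketch below projects a small relation and joins the projections back together. Tuples are modelled as frozensets of (attribute, value) pairs; the helper names and the S/P/T sample data are assumptions. Checking one state only illustrates the property, it does not prove it for every state.

```python
def row(**values):
    """Build one tuple as a frozenset of (attribute, value) pairs."""
    return frozenset(values.items())

def project(rows, attrs):
    """Projection onto attrs; duplicates disappear because the result is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rows}

def natural_join(left, right):
    """Pair up tuples that agree on every attribute they have in common."""
    out = set()
    for lt in left:
        for rt in right:
            dl, dr = dict(lt), dict(rt)
            if all(dl[a] == dr[a] for a in dl.keys() & dr.keys()):
                out.add(lt | rt)
    return out

def rejoin(rows, schemas):
    """Natural join of the projections of rows onto each schema."""
    parts = [project(rows, s) for s in schemas]
    joined = parts[0]
    for p in parts[1:]:
        joined = natural_join(joined, p)
    return joined

r = {row(S="s1", P="p1", T="t1"), row(S="s2", P="p1", T="t2")}
print(rejoin(r, [{"S", "P"}, {"S", "T"}]) == r)  # True: nothing added
print(len(rejoin(r, [{"S", "P"}, {"P", "T"}])))  # 4: two spurious tuples appear
```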
8
Lossless (Nonadditive) Join Notes
• Algorithm 11.1, page 337, provides a way to test for this property
• If D is a binary decomposition, D = {R1, R2}, D is nonadditive if and only if:
  • $(R_1 \cap R_2) \to (R_1 - R_2)$ is in F+, or
  • $(R_1 \cap R_2) \to (R_2 - R_1)$ is in F+
• That is, $R_1 \cap R_2$ is a superkey of (at least) one of R1 or R2
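The binary test reduces to one attribute closure: compute the closure of R1 ∩ R2 and see whether it covers R1 − R2 or R2 − R1. A minimal sketch with assumed function names and a hypothetical schema R(A, B, C):

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def binary_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2) -> (R1 - R2) or (R1 ∩ R2) -> (R2 - R1)."""
    reachable = closure(r1 & r2, fds)
    return (r1 - r2) <= reachable or (r2 - r1) <= reachable

# Hypothetical R(A, B, C) with F = {A -> B}
F = [({"A"}, {"B"})]
print(binary_lossless({"A", "B"}, {"A", "C"}, F))  # True: A -> B, so A is a superkey of R1
print(binary_lossless({"A", "B"}, {"B", "C"}, F))  # False: B determines neither A nor C
```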
9
Aside: Null
• Problems with Nulls: See Figures 11.2, 11.3, Text Book
• Bottom line: If nulls are present, especially nulls in foreign keys, then
  • May have to use outer joins instead of ordinary (inner) joins
  • Have to be careful if using aggregation (e.g. sum or average)
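Both effects can be reproduced with Python's built-in `sqlite3` module. The tables and rows below are hypothetical and are not the ones in the textbook figures; they only show a null foreign key dropping a row from an inner join, and a null value being skipped by an aggregate.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE department (dnum INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE employee (
    ename  TEXT,
    salary INTEGER,
    dnum   INTEGER REFERENCES department(dnum)
);
INSERT INTO department VALUES (1, 'Research'), (2, 'Sales');
INSERT INTO employee VALUES ('Ann', 100, 1), ('Bob', NULL, 2), ('Cy', 300, NULL);
""")

# Inner join: Cy has a null foreign key and silently disappears
inner = con.execute(
    "SELECT ename FROM employee JOIN department USING (dnum) ORDER BY ename"
).fetchall()

# Left outer join: Cy is kept, with nulls for the department columns
outer = con.execute(
    "SELECT ename FROM employee LEFT JOIN department USING (dnum) ORDER BY ename"
).fetchall()

# Aggregates skip nulls: the average is over two salaries, not three rows
stats = con.execute(
    "SELECT AVG(salary), COUNT(*), COUNT(salary) FROM employee"
).fetchone()

print(inner)  # [('Ann',), ('Bob',)]
print(outer)  # [('Ann',), ('Bob',), ('Cy',)]
print(stats)  # (200.0, 3, 2)
```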
10
Multi-Valued Dependencies
If X, Y are sets of attributes of R, there is a Multi-Valued Dependency (MVD) $X \twoheadrightarrow Y$ (we let $Z = R - (X \cup Y)$) if for all states r of R and all tuples t1, t2 of r such that t1[X] = t2[X], there exist tuples t3, t4 of r such that:
• t3[X] = t4[X] = t1[X] = t2[X]
• t3[Y] = t1[Y], t4[Y] = t2[Y]
• t4[Z] = t1[Z], t3[Z] = t2[Z]
An MVD $X \twoheadrightarrow Y$ is trivial if $Y \subseteq X$, or $X \cup Y = R$
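The definition translates almost directly into a check on one relation state: for every pair t1, t2 that agree on X, the "swapped" tuple t3 must be present (t4 is covered when the pair is visited in the other order). A minimal sketch, assuming tuples are dicts; the Salesman/Product/Territory rows are hypothetical.

```python
def satisfies_mvd(rows, x, y):
    """Check X ->> Y on one relation state given as a list of dicts."""
    if not rows:
        return True
    attrs = set(rows[0])
    z = attrs - set(x) - set(y)
    present = {frozenset(t.items()) for t in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # t3 takes X and Y from t1 and Z from t2
                t3 = {a: (t2[a] if a in z else t1[a]) for a in attrs}
                if frozenset(t3.items()) not in present:
                    return False
    return True

partial = [
    {"Salesman": "s1", "Product": "p1", "Territory": "t1"},
    {"Salesman": "s1", "Product": "p2", "Territory": "t2"},
]
full = partial + [
    {"Salesman": "s1", "Product": "p1", "Territory": "t2"},
    {"Salesman": "s1", "Product": "p2", "Territory": "t1"},
]
print(satisfies_mvd(partial, ["Salesman"], ["Product"]))  # False
print(satisfies_mvd(full, ["Salesman"], ["Product"]))     # True
```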
11
Fourth Normal Form
R is in 4NF with respect to a set of FDs and MVDs F if for every non-trivial MVD $X \twoheadrightarrow Y$ in F+, X is a superkey of R.
See Figure 11.4(a, b) in Text Book.
12
Fourth Normal Form Notes
• If a relation is not in 4NF then there are update anomalies: if you add a tuple, you must also add the corresponding tuples
• D = {R1, R2} is a lossless (nonadditive) decomposition of R with respect to a set of FDs and MVDs F if and only if:
$(R_1 \cap R_2) \twoheadrightarrow (R_1 - R_2)$, which is the same as
$(R_1 \cap R_2) \twoheadrightarrow (R_2 - R_1)$
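The update anomaly can be made concrete with a hypothetical relation (Salesman, Product, Territory) in which Salesman ↠ Product is assumed to hold. Adding one product for a salesman then forces one new tuple per territory he works, whereas a decomposed Sells(Salesman, Product) relation would need a single tuple. The function name and the data are illustrative assumptions.

```python
def tuples_needed(rows, salesman, new_product):
    """Tuples that must be inserted so that Salesman ->> Product still holds."""
    territories = {t for s, p, t in rows if s == salesman}
    return {(salesman, new_product, t) for t in territories}

# s1 currently sells p1 in two territories
rows = {("s1", "p1", "t1"), ("s1", "p1", "t2")}
print(sorted(tuples_needed(rows, "s1", "p2")))
# [('s1', 'p2', 't1'), ('s1', 'p2', 't2')]
```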
13
Fifth Normal Form
• JD(R1, … , Rm) is a Join Dependency (JD) for a decomposition {R1, … , Rm} of R if for every legal state r of R:
$\pi_{R_1}(r) \bowtie \dots \bowtie \pi_{R_m}(r) = r$
• A JD is trivial if some Ri = R
• A relation R is in Fifth Normal Form (5NF) if for every non-trivial JD of R, every Ri is a superkey of R
14
Notes on Fifth Normal Form
• An MVD is a JD with m = 2
• Finding all the JDs of a database of any size is probably not feasible
• Example: See Figure 11.4 (c, d) of Text Book
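For a single small relation state, a JD can be checked by brute force: a candidate tuple belongs to the join of the projections exactly when each of its projections occurs in the corresponding projection of r. The sketch below assumes the components together cover all attributes; the function name and the S/P/T data are hypothetical and are not the textbook's figure.

```python
from itertools import product

def satisfies_jd(rows, attrs, components):
    """Check JD(components) on one relation state (tuples ordered as attrs)."""
    rows = set(rows)
    idx = {a: i for i, a in enumerate(attrs)}

    def proj(t, comp):
        return tuple(t[idx[a]] for a in comp)

    projections = [{proj(t, c) for t in rows} for c in components]
    # Candidate tuples are built from the values that actually occur in each column
    domains = [sorted({t[i] for t in rows}) for i in range(len(attrs))]
    joined = {
        t for t in product(*domains)
        if all(proj(t, c) in p for c, p in zip(components, projections))
    }
    return joined == rows

attrs = ("S", "P", "T")
jd = [("S", "P"), ("P", "T"), ("S", "T")]
r3 = {("s1", "p1", "t2"), ("s1", "p2", "t1"), ("s2", "p1", "t1")}
r4 = r3 | {("s1", "p1", "t1")}
print(satisfies_jd(r3, attrs, jd))  # False: the join also produces (s1, p1, t1)
print(satisfies_jd(r4, attrs, jd))  # True
```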
15
Products, Salesmen, Territories: A Data Design Problem
• Salesman
  • Sells specific products
  • Has specific territories
  • Has a quota: How much he is supposed to sell
• Product
  • Sold by salesmen
  • Has a price
• Territory
  • Worked by salesmen
16
ER Model Version 1
[ER diagram. Labels: Product, Salesman, Territory, Sells Product, Works Territory, Quota, Price]
• A Salesman can sell any Product he sells in any Territory he works.
• A Product has one Price for all Salesmen and all Territories.
• A Salesman has one Quota for all his sales.
• Note: Each Entity and Relationship becomes a relation in our database.
17
ER Model Version 2
[ER diagram. Labels: Product, Salesman, Territory, Sells Product, Works Territory, Quota, Price]
• A Salesman has a Quota for each product he sells.
18
ER Model Version 3
[ER diagram. Labels: Product, Salesman, Territory, Sells Product, Works Territory, Sold In, Quota, Price]
• Products are only sold in specific Territories.
• A Product has a Price set for each Territory where it is sold.
• A Salesman can sell any Product he sells in any Territory he works where that Product is sold.
• Note JD between “Sells Product”, “Sold In”, and “Works Territory”.
19
ER Model Version 4
[ER diagram. Labels: Product, Salesman, Territory, Sells Product, Sells Product in Territory, Sold In, Quota, Price]
• A Salesman is assigned to sell specific Products in specific Territories.
• A Salesman has a Quota for each Product he sells in each Territory.
• Possible Integrity Constraint: Keys of “Sells Product” and “Sold In” are projections of “Sells Product in Territory”.
20
ER Model Version 4A
[ER diagram. Labels: Product, Salesman, Territory, Sells Product in Territory, Sold In, Quota, Price]
• Possible Integrity Constraint: Key of “Sold In” is a projection of “Sells Product in Territory”. (But I might want to assign a Price even though no Salesmen have yet been assigned that Product in that Territory.)
21
Sample Fields
• Employee: Employee ID Number, Employee Name, Work Location
• Manager: Manager ID Number, Manager Name
• Territory: Territory Number, Territory Name, Territory Bonus
• Product: Product Number, Product Name, Price, Actual_Sales, Target_Sales
• Other: Quota, Commission Rate, Commission, Manager Commission
22
Possible Functional Dependencies
• {Employee ID Number} → {Employee Name, Work Location, Manager ID Number, Manager Commission(?)}
• {Manager ID Number} → {Manager Name, Manager Commission(?)}
• {Territory Number} → {Territory Name, Territory Bonus(?)}
• {Product Number} → {Product Name, Price(?), Actual Sales(?), Target Sales(?)}
23
More Possible FDs
• {Employee ID Number, Territory Number} → {Territory Bonus(?), Quota(?), Commission Rate(?)}
• {Employee ID Number, Product Number} → {Quota(?), Commission Rate(?)}
• {Territory Number, Product Number} → {Price(?), Actual Sales(?), Target Sales(?), Territory Bonus(?), Commission Rate(?), Commission(?), Manager Commission(?)}
24
More Possible FDs
• {Employee ID Number, Product Number, Territory Number} → {Quota(?), Actual Sales(?), Target Sales(?), Commission Rate(?), Commission(?), Manager Commission(?)}
• {Actual Sales, Commission Rate} → {Commission}
• {Actual Sales, Manager Commission Rate} → {Manager Commission}
25
Proposed Solution
Employee(Employee ID Number, Employee Name, Work Location, Manager ID Number)
Manager(Manager ID Number, Manager Name, Manager Commission)
Territory(Territory Number, Territory Name)
Product(Product Number, Product Name)
26
More Proposed Solution
Product_Territory(Product Number, Territory Number, Price)
Employee_Territory(Employee ID Number, Territory Number, Territory Bonus)
Employee_Product(Employee ID Number, Product Number, Commission Rate)
Employee_Product_Territory(Employee ID Number, Product Number, Territory Number, Quota, Actual Sales, Target Sales, Commission, Manager Commission)
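As a rough sketch of how part of this proposal might look as tables, the example below creates Product, Territory and Product_Territory with Python's built-in `sqlite3` module. The slides do not mark keys, so the primary keys here are assumptions read off the left-hand sides of the possible FDs; the column names, types and sample rows are also hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
con.executescript("""
CREATE TABLE product (
    product_number INTEGER PRIMARY KEY,
    product_name   TEXT NOT NULL
);
CREATE TABLE territory (
    territory_number INTEGER PRIMARY KEY,
    territory_name   TEXT NOT NULL
);
-- Assumed key: {Territory Number, Product Number} -> {Price}
CREATE TABLE product_territory (
    product_number   INTEGER NOT NULL REFERENCES product(product_number),
    territory_number INTEGER NOT NULL REFERENCES territory(territory_number),
    price            NUMERIC NOT NULL,
    PRIMARY KEY (product_number, territory_number)
);
""")

def try_insert(sql, params):
    """Run one INSERT and report whether the constraints accepted it."""
    try:
        con.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

insert_price = "INSERT INTO product_territory VALUES (?, ?, ?)"
con.execute("INSERT INTO product VALUES (1, 'Widget')")
con.execute("INSERT INTO territory VALUES (10, 'North')")

first = try_insert(insert_price, (1, 10, 12))      # accepted
duplicate = try_insert(insert_price, (1, 10, 15))  # rejected: second price for same key
orphan = try_insert(insert_price, (2, 10, 12))     # rejected: product 2 does not exist
print(first, duplicate, orphan)  # True False False
```

The composite primary key is what lets the database enforce the FD on this one table, without a join.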