View
232
Download
3
Category
Tags:
Preview:
Citation preview
Introduction to Schema Refinement
Normal Forms• Types of Normal Form
• First Normal Form (1NF)• Second Normal Form (2NF)• Third Normal Form(3NF)• Boyce-Codd Normal Form ( BCNF)• Fourth Normal Form (4NF)• Fifth Nornal Form (5NF)
Normal Forms• First Normal Form (1NF)• A row of data cannot contain repeating group
of data • Ie atomic value
• Here the student Jeet is used twice in the table and subject PHY is repeated
• Another method is to divide the relation into 2
Cid name Subject101 Jeet PHY101 Jeet CHE102 Seet PHY103 Swet SOCIAL
Cid Cname101 Jeet102 Seet
Subid Cid Subject1 101 PHY2 101 CHE3 102 PHY4 103 SOCIAL
Normal Forms
• Second normal form (2NF): • A relation that is in 1NF and every non-primary
key attribute is fully functionally dependent on the primary key.
• Does not permit partial dependency• No attribute is dependent on only to primary
key• Primary key consists of only a single attribute it
is automatically be in 2NF if it is in 1NF
Normal Forms• Second normal form (2NF): • Consider the following relation, not in 2NF
• Here Cid & Order_id is PK• It is in 1NF• Not in 2NF, there are partial dependencies of columns on Primary key• Cname is only dependent on Cid• Order_name is dependent on order_id• There is no link between Cname & Sale_details• To reduce this table into 2NF, break the table into 3 different tables
Cid Cname Order_id Order_name Sale_details101 Adam 10 Order1 Sale1101 Adam 11 order2 sale2102 Alex 12 Order3 Sale3103 Sumo 13 Order4 Sale4
Cid Cname101 Adam102 Alex103 Sumo
Order_id Order_name10 Order111 order212 Order313 Order4
Cid Order_id Sale_details
101 10 Sale1101 11 sale2102 12 Sale3103 13 Sale4
Normal Forms• Second normal form (2NF): • 1NF 2NF
• This relation is in 1NF, because the value of each domain are atomic.
• To convert into 2NF• First find the attribute that make primary key• No one attribute alone form a primary key• 2NF does not contain partial dependency• Divide the above relation into 3
Order Product Customer Address Qty Unit Prices1 p1 Anish bhopal 300 500s1 p2 Anish bhopal 100 600s2 p3 Ammu kollam 500 300s3 p5 Sumo kottayam 200 200s3 p1 Sumo kottayam 450 500
Normal Forms• R1 R2 R3
• Advantages• Insert : can insert any row in relation R1 without
using product attribute• Delete : can delete the S1 & P2 by deleting a row
from relation R3, without losing the information that in R1
• Update: address for a given customer is written once, it is not repeated many times in relation R1, so we can update the address only once
Order Customer Addresss1 Anish bhopals2 Ammu kollams3 Sumo kottayam
Order Product Qtys1 p1 300s1 p2 100s2 p3 500s3 p5 200s3 p1 450
Product Unit Pricep1 500p2 600p3 300p5 200
Normal Forms• Third normal form (3NF)• 3NF is based on the concept of transitive dependency. • Transitive dependencies are not allowed in 3NF. • Transitive dependency means, if in a relation if XY and YZ
hold, then X Z is also a functional dependency that holds on R. • Here X, Y, Z are attributes of the table and also Y should not be
a candidate key or a subset of any key (prime attribute) of the table R.
• Ie all non-prime attribute of table must be dependent on primary key
• Example. • Student3
Normal Forms• Third normal form (3NF)
• 3 FD’s here. that is• • Fd1 Stdidgrade• • Fd2 Stdid marks• • Fd3 Marks grade• • We can see that marks is not a prime attribute of student3.• Stdid grade is a transitive dependency because of Fd2 and Fd3.• • This is not allowed in 3NF.
Normal Forms• Third normal form (3NF)• A relation R is said to be in 3NF, if R is in 2NF and also no non prime
attribute of R is transitively dependent on the key of R.
• The above relation schema student3 is in 2NF, since there are no partial dependencies on a key exists.
• But it is not in 3NF because of the transitive dependency stdid grade via ‘marks’.
• We can normalize student3 by decomposing it in to two 3NF relation schemas,
• Student3A and student3B as follows.• • Student3A (stdid, branch, sem, rn, name, marks)• Student3B (marks, grade)
Normal Forms• Third normal form (3NF)• Student3A and student3B as follows.• • Student3A (stdid, branch, sem, rn, name, marks)• Student3B (marks, grade)
Normal Forms
• Boyce Codd Normal form (BCNF)• It higher form of 3NF• This is because every relation in BCNF is als ion 3NF. • However a relation in 3NF may not be in BCNF.• A relation schema R is in BCNF if whenever a
non trivial functional dependency X A holds in R, then X is a superkey of R.
• The only difference between BCNF and 3NF is that the condition (b) of 3NF is absent from BCNF.
•
Normal Forms• Boyce Codd Normal form (BCNF)
• Here we can see that the relation Lots1A is not in BCNF, but it is in 3NF.
• FD5 violates BCNF because area is not a superkey.• Fd1 and Fd2 satisfies BCNF because the LHS are super keys.• So remove the attribute (county name) and place it in another
relation.
Normal Forms• Boyce Codd Normal form (BCNF)
Normal Forms• Fourth Normal Form (4NF)• A relation is in 4NF if • It is in BCNF• And• It has no multivalued dependency
• In this relation • Each subject has a well-defined set of trainers• Eg: MCA subject has 3 trainers• Each subject has a well-defined set of textbooks• Eg: mca subject has 2 textbooks• The textbook that are used for a given subject are independent of
the trainers
Subject Trainer Textbooks
MCAAjiJinsonLisha
DBMSJAVA
Computer JK OSC++
Normal Forms• Fourth Normal Form (4NF)• The table has been converted to a realtion by filling in
all of empty rows• The relation is in 1NF
• The primary key of this relation consists of all three attributes
• Since there is no determinants other than the primary key, this relation is actually in BCNF
Subject Trainer TextbooksMCA AJI DBMSMCA AJI JAVAMCA Jinson DBMSMCA Jinson JAVAMCA Lisha DBMSMCA Lisha JAVA
Computer JK OSComputer JK C++
Normal Forms• Fourth Normal Form (4NF)• Suppose for teaching MCA a new trainer
comes, it is necessary to create 2 new tuples , one for each of the 2 text books.
• See that it is not necessary to include all faculty• Decomposition cannot be made on the basis of
functional dependencies, because there are no functional dependencies in the relation.
• So we introduce multi valued dependencies (MVDs) in the relation.
Normal Forms• Multivalued dependencies and 4NF• Subject trainer• Subjecttextbook• Double arrows are used here. • Read as• “ subject multidetermines trainer”• or “trainer is multidependent on subject”)• we know that a subject does not have a single corresponding trainer, ie..• functional dependency subject trainer does not hold. • But each subject has a well defined set of corresponding trainers. • By well defined here means that for a given subject(MCA) and a given text
book (DBMS) the set of trainers (AJI,Jinson,Lisha) matching the pair (MCA,Computer) in the relation depends on the value of MCA alone.
• It makes no difference what particular value of text book we choose.• The second MVD can also be interpreted like this.
Normal Forms• Definition of multi valued dependency• Let R be a table, and let A, B, C be arbitrary subsets
of the set of attributes of R.• Then we say that B is multidependent on A , AB.• If and only if the set of B values matching a given
( A value, C value pair) in R depends only on the A value and is independent of the C value.
• MVDs always go together in pairs. That is given the table R (A, B, C), the MVD
• AB holds if and only if A C also holds.
Normal Forms
• Fourth normal form• This is based on multivalued functional
dependencies.• A relation schema R is in 4NF with respect to a
set of dependencies F (that includes FDs and• MVDs) if, for every non trivial multivalued
dependency• X Y, X is a super key of R.• Consider the table.
Subject Trainer TextbooksMCA AJI DBMSMCA AJI JAVAMCA Jinson DBMSMCA Jinson JAVAMCA Lisha DBMSMCA Lisha JAVA
Computer JK OSComputer JK C++
Normal Forms• Fourth normal form• The table or relation CFX is not in fourth normal
form because The MVDs subject textbook and subjecttrainer are not satisfying any of the 2 conditions of fourth normal form.
• So decomposing it into tables
Subject TrainerMCA AJIMCA JinsonMCA Lisha
Computer JK
Subject TextbooksMCA DBMSMCA JAVA
Computer OSComputer C++
Normal Forms• Lossless join decomposition• Consider the example EMP
• Suppose we decompose the EMP table into Emp_projects and Emp_dependents.
• Emp_projects Emp_dependents
ename Pname DnameSmith X JohnSmith Y AnnaSmith X Annasmith Y John
ename PnameSmith XSmith Y
ename DnameSmith JohnSmith Anna
Normal Forms
• Lossless join decomposition• Suppose we again join these tables we can
see that we get the original EMP table. • So this decomposition of EMP table in to
Emp_projects and Emp_dependents is a lossless join decomposition because nothing is lost after a decomposition.
Normal Forms
• Lossless join decomposition
• Suppose we decompose the supply table in to two that is R1 and R2.
• We get
Normal Forms• Lossless join decomposition• If we again join these two tables R1 and R2 we will get
• the join of these tables will not give our original table supply. So this is a lossy join
• decomposition because after decomposing the Supply table we have lost some values.
Normal Forms• Join dependencies and fifth normal form• In some cases there may be no lossless join
decomposition of a table R into 2 tables but there may be a lossless join decomposition into more than 2 tables.
• For example in the supply table
Normal Forms• Join dependencies and fifth normal form
• If we decompose the supply table in to 3
Normal Forms
• Join dependencies and fifth normal form• Here we can see that if we again join these
tables R1, R2, R3 we will get the original table. • We can see that by joining just R1 and R2 will
not get the supply table. • But by joining all these 3 tables we will get the
supply table
Normal Forms• Join dependencies and fifth normal form• So we are moving to another type of dependency called Join dependency.• If a join dependency is present in a table we perform decomposition to fifth
normal form(5NF)• Here for the supply table the join dependency is specified by JD (R1, R2, R3)• This is because by joining R1 and R2 and R3 tables we will get the original
table ‘Supply’.• JD (R1, R2, R3) can also be written as• JD( (sname, partname), (sname, projname), (partname,projname) )• We can see that JD( R1, R2) is not valid for the supply table because on
joining R1 and R2 we will not get the Supply table.• Trivial join dependency• For a table R, a join dependency specified as JD(R1, R2, R3…) is trivial,
if any of these Ri’ s is the table R.
Normal Forms• Fifth normal form• It is also called project join normal form.• A relation schema is in fifth normal form (5NF) ,
if for every nontrivial join dependency JD( R1, R2, R3…), every Ri is a superkey of R.
• example
Normal Forms• Fifth normal form• The key of this table is (sname, partname,
projname) • We have seen that it has a join dependency• JD { (sname,partname),(sname,projname),
(partname,projname) }• Here the projections are (sname,partname),
(sname,projname) and (partname,projname).
Normal Forms• Fifth normal form• We can say that this table supply is not in 5NF
because of this join dependency • Each of these projections do not form a
superkey of supply. • Superkey of supply is (sname,projname,partname).
• (sname,partname) is not a superkey. • (sname,projname) is not a super key.• (partname,projname) is not a superkey.
Normal Forms• Fifth normal form• So we have to normalise this table supply in to
tables that satisfy 5NF. • We are decomposing the table supply by
considering the JD. • Take each of the projections in the JD and form
tables as
Normal Forms• Fifth normal form
Normal Forms• Fifth normal form• See that each of these R1, R2, R3 are in fifth
normal form because there are no non trivial join dependencies in each of these tables.
• • A join dependency is very difficult to detect in
practice. • So it is not normally applied in a database.
Normal Forms• Fifth normal form
Normal Forms
Recommended