Dr. Mohamed Osman Hegaz 1
Logical Database Design (2)
Normalization
Normalization: The process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. A bad relation contains redundancy (duplicated values) and causes update anomalies.
Normal form: A condition, stated in terms of the keys and functional dependencies (FDs) of a relation, used to certify whether a relation schema is in a particular normal form.
The Process of Normalization
Formal technique for analyzing a relation based on its primary key and functional dependencies between its attributes.
Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
Relationship Between Normal Forms
“Key” Concepts
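– Superkey - A set of attributes such that no two tuples have the same values for these attributes
– Candidate key - A minimal superkey: removing any attribute from it leaves a set that is no longer a superkey
– Primary key - A selected candidate key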
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table: transform data from the information source (e.g. a form) into table format with columns and rows.
First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
• A relation schema is in 1NF if the domains of its attributes include only atomic (simple, indivisible) values and the value of each attribute in a tuple is a single value from the domain of that attribute
1NF disallows
– having a set of values, a tuple of values, or a combination of both as an attribute value for a single tuple
– “relations within relations” and “relations as attributes of tuples”
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the unnormalized table.
Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s).
Remove the repeating group by:
– entering appropriate data into the empty columns of rows containing repeating data (‘flattening’ the table), or
– placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
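The two ways of removing a repeating group can be sketched in a few lines of Python. This is only an illustration: the table, its column names (`dnumber`, `dname`, `dlocations`) and the helper names `flatten` and `split_out` are assumptions made for the example and do not come from the slides.

```python
def flatten(rows, repeating):
    """Option 1: emit one row per repeated value, copying the other columns."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != repeating}
        for value in row[repeating]:
            flat.append({**rest, repeating: value})
    return flat


def split_out(rows, key, repeating):
    """Option 2: move the repeating data, with a copy of the key, to its own relation."""
    base = [{k: v for k, v in row.items() if k != repeating} for row in rows]
    detail = [{key: row[key], repeating: value}
              for row in rows for value in row[repeating]]
    return base, detail


# Hypothetical unnormalized table: "dlocations" is a repeating group.
unf = [
    {"dnumber": 5, "dname": "Research",
     "dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"dnumber": 4, "dname": "Administration", "dlocations": ["Stafford"]},
]

flat = flatten(unf, "dlocations")                        # one 1NF relation
base, detail = split_out(unf, "dnumber", "dlocations")   # two 1NF relations
```

Flattening keeps a single relation but repeats `dname` for every location, so its key grows to (`dnumber`, `dlocations`). Splitting avoids that redundancy at the cost of a second relation.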
Non-1NF Relation
Relations in 1NF
(a) Relation schema that is not in 1NF.
(b) Example relation instance. (c) 1NF relation with redundancy.
(a) Schema of the EMP_PROJ relation with a “nested relation” PROJS.
(b) Example extension of the EMP_PROJ relation showing nested relations within each tuple.
Decomposing EMP_PROJ into 1NF relations
EMP_PROJ1 and EMP_PROJ2 by propagating the primary key.
Second Normal Form (2NF)
A relation schema is in 2NF if it is in 1NF and every nonprime attribute is fully functionally dependent on the primary key
An FD X -> Y is termed “full” if removal of any attribute from X means that the FD no longer holds
An FD X -> Y is termed “partial” if some attribute can be removed from X and the dependency still holds
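One way to test whether a dependency is full or partial is to compute attribute closures under a given set of FDs. The sketch below assumes FDs are written as `(left, right)` pairs of Python sets. The attribute names and the three dependencies are assumed for illustration; the slides name the EMP_PROJ relation but do not list its FDs.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def holds(x, y, fds):
    """X -> Y holds if Y lies inside the closure of X."""
    return set(y) <= closure(x, fds)


def is_full(x, y, fds):
    """X -> Y is full if it holds and stops holding when any attribute leaves X."""
    x = set(x)
    return holds(x, y, fds) and not any(holds(x - {a}, y, fds) for a in x)


# Assumed dependencies for an EMP_PROJ-like relation (illustrative only).
FDS = [
    ({"ssn", "pnumber"}, {"hours"}),
    ({"ssn"}, {"ename"}),
    ({"pnumber"}, {"pname", "plocation"}),
]

hours_is_full = is_full({"ssn", "pnumber"}, {"hours"}, FDS)   # full dependency
ename_is_full = is_full({"ssn", "pnumber"}, {"ename"}, FDS)   # partial: ssn alone suffices
```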
1NF to 2NF
Identify the primary key for the 1NF relation.
Identify functional dependencies in the relation.
If partial dependencies exist on the primary key, remove them by placing the partially dependent attributes in a new relation along with a copy of their determinant.
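The steps above can be sketched as a small Python routine. It is a simplified illustration, not a general normalization algorithm: it assumes a single composite primary key and FDs given as `(left, right)` pairs of sets. The function name `to_2nf`, the attribute names and the dependencies are all assumptions made for the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def to_2nf(attributes, key, fds):
    """Move attributes that depend on only part of the key into new relations."""
    key = set(key)
    remaining = set(attributes)
    relations = {}
    # Smaller determinants are tried first, so each attribute is placed
    # with the smallest part of the key that determines it.
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            dependents = (closure(part, fds) - key) & remaining
            if dependents:
                relations[frozenset(part)] = set(part) | dependents
                remaining -= dependents
    relations[frozenset(key)] = remaining  # what is left depends on the whole key
    return relations


# Assumed attributes and dependencies for an EMP_PROJ-like relation.
ATTRS = {"ssn", "pnumber", "hours", "ename", "pname", "plocation"}
FDS = [
    ({"ssn", "pnumber"}, {"hours"}),
    ({"ssn"}, {"ename"}),
    ({"pnumber"}, {"pname", "plocation"}),
]

result = to_2nf(ATTRS, {"ssn", "pnumber"}, FDS)
```

Under these assumptions the relation splits into three: one keyed on `ssn`, one keyed on `pnumber`, and one keyed on the full key that keeps `hours`.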
Normalizing EMP_PROJ into 2NF relations
Third Normal Form (3NF)
Based on the concept of transitive dependency: A, B and C are attributes of a relation such that if A -> B and B -> C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
2NF to 3NF
Identify the primary key in the 2NF relation.
Identify functional dependencies in the relation.
If transitive dependencies exist on the primary key, remove them by placing the transitively dependent attributes in a new relation along with a copy of their determinant.
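A minimal Python sketch of these steps follows. It assumes the relation is already in 2NF and that the FDs are supplied as `(left, right)` pairs of sets; under those assumptions, any FD whose left side is not a superkey and whose right side holds non-key attributes is a transitive dependency on the key. The function name `to_3nf` and the attribute names for an EMP_DEPT-like relation are illustrative assumptions, and this is not the full 3NF synthesis algorithm.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def to_3nf(attributes, key, fds):
    """Move transitively dependent attributes out with a copy of their determinant."""
    attributes, key = set(attributes), set(key)
    remaining = set(attributes)
    relations = []
    for lhs, rhs in fds:
        determinant_is_superkey = attributes <= closure(lhs, fds)
        moved = (rhs - key - lhs) & remaining
        if not determinant_is_superkey and moved:
            relations.append(lhs | moved)  # new relation: determinant + dependents
            remaining -= moved             # the determinant itself stays behind
    relations.append(remaining)
    return relations


# Assumed attributes and dependencies for an EMP_DEPT-like relation.
ATTRS = {"ssn", "ename", "bdate", "address", "dnumber", "dname", "dmgr_ssn"}
FDS = [
    ({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgr_ssn"}),
]

result = to_3nf(ATTRS, {"ssn"}, FDS)
```

Here `dname` and `dmgr_ssn` depend on `ssn` only through `dnumber`, so they move to a department relation while `dnumber` remains in the employee relation as the link between the two.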
Normalizing EMP_DEPT into 3NF relations
General Definitions of 2NF and 3NF
Second normal form (2NF): A relation that is in 1NF and in which every non-candidate-key (nonprime) attribute is fully functionally dependent on any candidate key.
Third normal form (3NF): A relation that is in 1NF and 2NF and in which no non-candidate-key (nonprime) attribute is transitively dependent on any candidate key.
Boyce–Codd Normal Form (BCNF)
Based on functional dependencies that take into account all candidate keys in a relation. However, BCNF also imposes additional constraints compared with the general definition of 3NF.
BCNF - A relation is in BCNF if and only if every determinant is a candidate key.
The difference between 3NF and BCNF is that, for a functional dependency A -> B, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.
Every relation in BCNF is also in 3NF. However, a relation in 3NF is not necessarily in BCNF.
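The BCNF condition can be checked mechanically once the FDs are known. The sketch below reports every non-trivial FD whose determinant does not determine the whole relation. It tests the determinant for being a superkey, which is the usual way the "every determinant is a candidate key" condition is checked in practice. The relation (`student`, `course`, `instructor`) and its two dependencies are an assumed example, not taken from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """List non-trivial FDs whose determinant is not a superkey of the relation."""
    attributes = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not attributes <= closure(lhs, fds)
    ]


# Assumed example: each instructor teaches one course, and a student
# has one instructor per course.
ATTRS = {"student", "course", "instructor"}
FDS = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

violations = bcnf_violations(ATTRS, FDS)
```

In this example `instructor -> course` is reported. 3NF tolerates it because `course` belongs to a candidate key, but BCNF does not because `instructor` is not itself a key.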
Summary
2NF, 3NF and BCNF are based on the keys and FDs of a relation schema.
4NF is based on keys and multi-valued dependencies (MVDs); 5NF is based on keys and join dependencies (JDs).
Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation).
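For a decomposition into exactly two relations, the lossless-join property has a standard test: the common attributes must functionally determine all of one of the two relations. The sketch below applies that test. The attribute names, the FDs and the function name `is_lossless` are assumptions made for the example.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2) must determine all of R1 or all of R2."""
    r1, r2 = set(r1), set(r2)
    determined = closure(r1 & r2, fds)
    return r1 <= determined or r2 <= determined


# Assumed dependencies for a small employee/department relation.
FDS = [
    ({"ssn"}, {"ename", "dnumber"}),
    ({"dnumber"}, {"dname"}),
]

good = is_lossless({"ssn", "ename", "dnumber"}, {"dnumber", "dname"}, FDS)
bad = is_lossless({"ssn", "ename", "dname"}, {"dnumber", "dname"}, FDS)
```

The first split shares `dnumber`, which determines the whole department relation, so joining the parts recovers the original tuples exactly. The second shares only `dname`, which determines nothing else, so the join can produce spurious tuples.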
Summary (cont.)
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect
The database designers need not normalize to the highest possible normal form. (usually up to 3NF, BCNF or 4NF)
Denormalization: the process of storing the join of higher normal form relations as a base relation—which is in a lower normal form
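As a small illustration of denormalization, the sketch below joins two normalized relations and keeps the result as a single table. The relations, their column names and the helper `natural_join` are assumptions made for the example, with rows represented as Python dictionaries.

```python
def natural_join(left, right):
    """Join two lists of dict rows on every attribute name they share."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in right
        if all(left_row[a] == right_row[a] for a in common)
    ]


# Two hypothetical relations in a higher normal form.
employee = [
    {"ssn": "111", "ename": "Smith", "dnumber": 5},
    {"ssn": "222", "ename": "Wong", "dnumber": 5},
    {"ssn": "333", "ename": "Zelaya", "dnumber": 4},
]
department = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Administration"},
]

# Denormalized base relation: the stored join of the two.
emp_dept = natural_join(employee, department)
```

The joined table answers employee-with-department queries without a join at read time, but `dname` is now stored once per employee, which reintroduces the redundancy and update anomalies that normalization removed.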
Summary (cont.)
Definitions of Keys and Attributes Participating in Keys (1)
A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]
A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.
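These two definitions can be checked against a single relation state, as sketched below. A check on one state can only refute a superkey, never prove it, because the definition quantifies over every legal state. The rows, the column names and the function names are assumptions made for the example.

```python
def is_superkey(rows, attrs):
    """True if no two rows of this relation state agree on all of attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != removed])
        for removed in attrs
    )


# Hypothetical relation state: who works how many hours on which project.
rows = [
    {"ssn": "1", "pnumber": 1, "hours": 32.5},
    {"ssn": "1", "pnumber": 2, "hours": 7.5},
    {"ssn": "2", "pnumber": 1, "hours": 32.5},
]

ssn_alone = is_superkey(rows, ["ssn"])                 # duplicates exist
pair_is_key = is_key(rows, ["ssn", "pnumber"])         # minimal superkey
all_is_key = is_key(rows, ["ssn", "pnumber", "hours"]) # superkey, but not minimal
```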
If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.
A prime attribute is a member of some candidate key.
A nonprime attribute is not a prime attribute; that is, it is not a member of any candidate key.
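Given a set of FDs, candidate keys and prime attributes can be found by brute force for small schemas. The sketch below tries attribute sets in order of increasing size and keeps those whose closure covers the relation and that contain no smaller key. The search is exponential, so it is meant only as an illustration. The relation and its dependencies are an assumed example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if attributes <= closure(candidate, fds):
                keys.append(candidate)
    return keys


def prime_attributes(attributes, fds):
    """Prime attributes are the members of at least one candidate key."""
    return set().union(*candidate_keys(attributes, fds))


# Assumed example relation with two overlapping candidate keys.
ATTRS = {"student", "course", "instructor"}
FDS = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

keys = candidate_keys(ATTRS, FDS)
prime = prime_attributes(ATTRS, FDS)
nonprime = ATTRS - prime
```

In this example both {`course`, `student`} and {`instructor`, `student`} are candidate keys, so every attribute is prime and the set of nonprime attributes is empty.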