Upload
olivia-fields
View
221
Download
1
Embed Size (px)
Citation preview
Normalization
Are we Normal
Normalization
Normalization is the process of converting complex data structures into simple, stable data structures
It also is the process of removing from a database certain “anomalies”
Anomalies Update anomalies—you have to update
a record in a number of different places Insertion anomalies—Example: in order
to insert a new employee a project must be assigned. If there is no project yet a phantom one must be created.
Deletion anomalies—Two types: when you delete a record other vital information is lost, or must delete in several places with the possibility of leaving unattached data islands
Normal Forms
There are many Normal Forms—or stages of normalization possible, but we will only focus on the first three.
First Normal Form1. There are no duplicated rows in the
table.2. Each cell is single-valued (i.e., there
are no repeating groups or arrays).3. Entries in a column (attribute, field)
are of the same kind.
Example Table 1
CDID CDTITLE TrackTitle Artist Artist Country
1 Sergeant Pepper Sergeant Pepper, Lucy in the Sky, With a little Help
Beatles UK
2 Blood on the Tracks
Tangled up in Blue, Idiot Wind
Dylan US
CDS Table 1
Another Example Table
CDID CDTitle Track1 Track2 Track3
1 Sergeant Pepper
Sergeant Peppers lonely hearts club band
Lucy in the sky with diamonds
With a little help
2 Blood on the Tracks
Tangled up in Blue
Idiot Wind Bucket of Rain
Normalizing The sample tables have repeating
groups—ie the tracks associated with each CD.
Each column must contain only a single value
You also don’t want to find yourself numbering columns like track1, etc.
The next table puts the sample table into first normal form
First Normal Form Sample
CDID CDTITLE TrackTitle Artist Artist Country
1 Sergeant Pepper Sergeant Pepper
Beatles UK
1 Sergeant Pepper Lucy in the Sky Beatles UK
1 Sergeant Pepper With a little Help
Beatles UK
2 Blood on the Tracks
Tangled up in Blue
Dylan US
2 Blood on the Tracks
Idiot Wind Dylan US
Second Normal Form
• A table is in 2NF if it is in 1NF and if all non-key attributes are dependent on all of the key and nothing else.
• This is called functional dependency
Normalizing. . . In our sample table there are really
two separate things going on One is the CD information and one
is the track information- To get all track information creates
a lot of redundancy in the CD information
Each should be dependent on their own key
Second Normal Form Sample
CDID CDTitle
1 Sergeant Pepper
2 Blood on the Tracks
TrackID TrackTitle CDID Artist Artist Country
1 Sergeant Pepper 1 Beatles UK
2 Lucy in the Sky 1 Beatles UK
3 With a little help 1 Beatles UK
4 Tangled up in Blue
2 Dylan US
5 Idiot Wind 2 Dylan US
Third Normal Form
• A table is in 3NF if it is in 2NF and if it has no transitive dependencies.
• This means that the non primary key attributes don’t depend on each other.
• Look at our second sample table:
Sample TableTrackID TrackTitle CDID Artist Artist Country
1 Sergeant Pepper 1 Beatles UK
2 Lucy in the Sky 1 Beatles UK
3 With a little help 1 Beatles UK
4 Tangled up in Blue 2 Dylan US
5 Idiot Wind 2 Dylan US
Normalizing
There is a transitive dependency here
Artist Country is dependent on Artist, not on TrackID which is the key field of the table
The following tables resolve this:
Better
TrackID TrackTitle CDID ArtistID
1 Sergeant Pepper 1 1
2 Lucy in the Sky 1 1
3 With a little Help 1 1
4 Tangled up in Blue 2 2
5 Idiot wind 2 2
ArtistID Artist ArtistCountry
1 Beatles UK
2 Dylan US
Summary
Through the process of normalization our original table has become three tables, related by foreign keys:
CDs(CDID, CDTitle)ARTISTS(ArtistID, Artist, ArtistCountry)TRACKS(TrackID, TrackTitle, CDID, ArtistID)
MORE…
1. Boyce Codd Normal Form2. Fourth Normal Form3. Fifth Normal Form