21
Normalization Normalization

Normalization Accepted

Embed Size (px)

DESCRIPTION

About Normalization

Citation preview

Page 1: Normalization Accepted

NormalizationNormalization

Page 2: Normalization Accepted

Database Management SystemDatabase Management System

A A database database is a collection of relevant is a collection of relevant data. A data. A database management system database management system (DBMS)(DBMS) is a set of computer-based is a set of computer-based application programs that support the application programs that support the processes of storing, manipulating, processes of storing, manipulating, retrieving and presenting the data within retrieving and presenting the data within the database. The data within the the database. The data within the database are organized into database are organized into tables, tables, records, records, and and fields.fields.

Page 3: Normalization Accepted

Relational Database Management Relational Database Management SystemSystem

RDBMS is a Relational Data Base RDBMS is a Relational Data Base Management System Relational DBMS. Management System Relational DBMS. This adds the additional condition that the This adds the additional condition that the system supports a tabular structure for the system supports a tabular structure for the data, with enforced relationships between data, with enforced relationships between the tables. This excludes the databases the tables. This excludes the databases that don't support a tabular structure or that don't support a tabular structure or don't enforce relationships between tables.don't enforce relationships between tables.

Page 4: Normalization Accepted

Difference Between RDBMS and Difference Between RDBMS and DMBSDMBS

DBMS are for smaller organizations with sDBMS are for smaller organizations with small amount of data, where security of the mall amount of data, where security of the data is not of major concern and RDBMS data is not of major concern and RDBMS are designed to take care of large amountare designed to take care of large amounts of data and also the security of this data. s of data and also the security of this data.

Page 5: Normalization Accepted

Database NormalizationDatabase Normalization Database normalizationDatabase normalization is the process of removing is the process of removing

redundant data from your tables in to improve storage redundant data from your tables in to improve storage efficiency, data integrity, and scalability. efficiency, data integrity, and scalability.

In the relational model, methods exist for quantifying how In the relational model, methods exist for quantifying how efficient a database is. These classifications are called efficient a database is. These classifications are called normal formsnormal forms (or (or NFNF), and there are algorithms for ), and there are algorithms for converting a given database between them.converting a given database between them.

Normalization generally involves splitting existing tables Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked into multiple ones, which must be re-joined or linked each time a query is issued. each time a query is issued.

Page 6: Normalization Accepted

HistoryHistory

Edgar F. Codd first proposed the process of Edgar F. Codd first proposed the process of normalization and what came to be known as normalization and what came to be known as the the 1st normal form1st normal form in his paper in his paper A Relational A Relational Model of Data for Large Shared Data BanksModel of Data for Large Shared Data Banks Codd stated:Codd stated:““There is, in fact, a very simple elimination There is, in fact, a very simple elimination procedure which we shall call normalization. procedure which we shall call normalization. Through decomposition nonsimple domains are Through decomposition nonsimple domains are replaced by ‘replaced by ‘domains whose elements are domains whose elements are atomic (nondecomposable) values.’”atomic (nondecomposable) values.’”

Page 7: Normalization Accepted

Normal FormNormal Form

Edgar F. Codd originally established three Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There normal forms: 1NF, 2NF and 3NF. There are now others that are generally are now others that are generally accepted, but 3NF is widely considered to accepted, but 3NF is widely considered to be sufficient for most applications. Most be sufficient for most applications. Most tables when reaching 3NF are also in tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form). BCNF (Boyce-Codd Normal Form).

Page 8: Normalization Accepted

Table 1Table 1

TitleTitle Author1Author1 AuthorAuthor22

ISBNISBN SubjectSubject PagesPages PublisherPublisher

Database Database System System ConceptsConcepts

Abraham Abraham SilberschatzSilberschatz

Henry Henry F. KorthF. Korth

00729588630072958863 MySQL, MySQL, ComputersComputers

11681168 McGraw-HillMcGraw-Hill

Operating Operating System System ConceptsConcepts

Abraham Abraham SilberschatzSilberschatz

Henry Henry F. KorthF. Korth

04716946650471694665 ComputersComputers 944944 McGraw-HillMcGraw-Hill

Page 9: Normalization Accepted

Table 1 problemsTable 1 problems

This table is not very efficient with storage.This table is not very efficient with storage.

This design does not protect data integrity. This design does not protect data integrity.

Third, this table does not scale well. Third, this table does not scale well.

Page 10: Normalization Accepted

First Normal FormFirst Normal Form

In our Table 1, we have two violations of In our Table 1, we have two violations of First Normal Form: First Normal Form:

First, we have more than one author field, First, we have more than one author field, Second, our subject field contains more Second, our subject field contains more

than one piece of information. With more than one piece of information. With more than one value in a single field, it would be than one value in a single field, it would be very difficult to search for all books on a very difficult to search for all books on a given subject. given subject.

Page 11: Normalization Accepted

First Normal TableFirst Normal Table

Table 2Table 2

TitleTitle AuthorAuthor ISBNISBN SubjectSubject PagesPages PublisherPublisher

Database System Database System ConceptsConcepts

Abraham Abraham SilberschatzSilberschatz

00729588630072958863 MySQLMySQL 11681168 McGraw-HillMcGraw-Hill

Database System Database System ConceptsConcepts

Henry F. KorthHenry F. Korth 00729588630072958863 ComputersComputers 11681168 McGraw-HillMcGraw-Hill

Operating System Operating System ConceptsConcepts

Henry F. KorthHenry F. Korth 04716946650471694665 ComputersComputers 944944 McGraw-HillMcGraw-Hill

Operating System Operating System ConceptsConcepts

Abraham Abraham SilberschatzSilberschatz

04716946650471694665 ComputersComputers 944944 McGraw-HillMcGraw-Hill

Page 12: Normalization Accepted

We now have two rows for a single book. We now have two rows for a single book. Additionally, we would be violating the Additionally, we would be violating the Second Normal Form…Second Normal Form…

A better solution to our problem would be A better solution to our problem would be to separate the data into separate tables- to separate the data into separate tables- an Author table and a Subject table to an Author table and a Subject table to store our information, removing that store our information, removing that information from the Book table: information from the Book table:

Page 13: Normalization Accepted

Subject_IDSubject_ID SubjectSubject

11 MySQLMySQL

22 ComputersComputers

Author_IDAuthor_ID Last NameLast Name First NameFirst Name

11 SilberschatzSilberschatz AbrahamAbraham

22 KorthKorth HenryHenry

ISBNISBN TitleTitle PagesPages PublisherPublisher

00729588630072958863 Database System Database System ConceptsConcepts

11681168 McGraw-HillMcGraw-Hill

04716946650471694665 Operating System Operating System ConceptsConcepts

944944 McGraw-HillMcGraw-Hill

Subject Table

Author Table

Book Table

Page 14: Normalization Accepted

Each table has a primary key, used for Each table has a primary key, used for joining tables together when querying the joining tables together when querying the data. A primary key value must be unique data. A primary key value must be unique with in the table (no two books can have with in the table (no two books can have the same ISBN number), and a primary the same ISBN number), and a primary key is also an index, which speeds up data key is also an index, which speeds up data retrieval based on the primary key. retrieval based on the primary key.

Now to define relationships between the Now to define relationships between the tablestables

Page 15: Normalization Accepted

RelationshipsRelationships

ISBNISBN Author_IDAuthor_ID

00729588630072958863 11

00729588630072958863 22

04716946650471694665 11

04716946650471694665 22

ISBNISBN Subject_IDSubject_ID

00729588630072958863 11

00729588630072958863 22

04716946650471694665 22

Book_Author Table

Book_Subject Table

Page 16: Normalization Accepted

Second Normal FormSecond Normal Form

As the First Normal Form deals with redundancy As the First Normal Form deals with redundancy of data across a horizontal row, Second Normal of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in Form (or 2NF) deals with redundancy of data in vertical columns. vertical columns.

As stated earlier, the normal forms are As stated earlier, the normal forms are progressive, so to achieve Second Normal progressive, so to achieve Second Normal Form, the tables must already be in First Normal Form, the tables must already be in First Normal Form. Form.

The Book Table will be used for the 2NF The Book Table will be used for the 2NF exampleexample

Page 17: Normalization Accepted

2NF Table2NF Table

Publisher_IDPublisher_ID Publisher NamePublisher Name

11 McGraw-HillMcGraw-Hill

ISBNISBN TitleTitle PagesPages Publisher_IDPublisher_ID

00729588630072958863 Database System Database System ConceptsConcepts

11681168 11

04716946650471694665 Operating System Operating System ConceptsConcepts

944944 11

Publisher Table

Book Table

Page 18: Normalization Accepted

2NF2NF

Here we have a one-to-many relationship Here we have a one-to-many relationship between the book table and the publisher. A between the book table and the publisher. A book has only one publisher, and a publisher will book has only one publisher, and a publisher will publish many books. When we have a one-to-publish many books. When we have a one-to-many relationship, we place a foreign key in the many relationship, we place a foreign key in the Book Table, pointing to the primary key of the Book Table, pointing to the primary key of the Publisher Table.Publisher Table.

The other requirement for Second Normal Form The other requirement for Second Normal Form is that you cannot have any data in a table with a is that you cannot have any data in a table with a composite key that does not relate to all portions composite key that does not relate to all portions of the composite key. of the composite key.

Page 19: Normalization Accepted

Third Normal FormThird Normal Form

Third normal form (3NF) requires that Third normal form (3NF) requires that there are no functional dependencies of there are no functional dependencies of non-key attributes on something other non-key attributes on something other than a candidate key. than a candidate key.

A table is in 3NF if all of the non-primary A table is in 3NF if all of the non-primary key attributes are mutually independent key attributes are mutually independent

There should not be transitive There should not be transitive dependencies dependencies

Page 20: Normalization Accepted

Boyce-Codd Normal FormBoyce-Codd Normal Form

BCNF requires that the table is 3NF and BCNF requires that the table is 3NF and only determinants are the candidate keysonly determinants are the candidate keys

Page 21: Normalization Accepted

ENDEND