Lecture 5 - Normalization of Relational Tables


  • 8/7/2019 Lecture 5 - Normalization of Relational Tables


    Normalization of Tables

    "Between two evils, choose neither; between two goods, choose both." (Tryon Edwards)


    Steps to E-R Transformation

    1. Identify entities
    2. Identify relationships
    3. Determine relationship type
    4. Determine level of participation
    5. Assign an identifier for each entity
    6. Draw completed E-R diagram
    7. Deduce a set of preliminary skeleton tables along with a proposed primary key for each table (using rules provided)
    8. Develop a list of all attributes of interest (not already listed) and systematically assign each to a table in such a way as to achieve a 3NF design (i.e., no repeating groups, no partial dependencies, and no transitive dependencies)


    Tables

    - Database design is the process of separating information into multiple tables that are related to each other
    - Single table designs work only for the simplest of situations, in which data integrity problems are easy to correct
    - Anomalies (abnormalities) often arise in single table designs as a result of inserting, deleting, or updating records
    - Some tables are better structured than others (i.e., result in fewer anomalies)


    Redundancy

    - Unnecessary repetition or duplication of data
    - Increases likelihood of errors due to keying inconsistencies


    Multi-valued Problems

    Solution 1? Include all authors' names in a single field
    - Difficult to search for a single author's name or create an alphabetical list of authors


    Multi-valued Problems

    Solution 2? Add multiple columns, one for each value
    - Empty fields waste storage space
    - Awkward to search across fields (e.g., "Any books by Snoopy?" Must search Author1, Author2, etc.)
    - Necessitates the creation of a new column every time a book has an additional author


    Multi-valued Problems

    Solution 3? Add multiple rows, one for each value
    - Data about a book must be repeated as many times as there are authors of the book (this also creates redundancy, which leads to keying errors and unnecessarily wastes storage space with large files)
    - A count of the total number of books, or of the number from each publisher, would be wrong


    Update Anomalies

    To update an agent's telephone number, each instance must be changed.
    - If we miss an item or enter it incorrectly, we create an unreliable table
    - Sometimes previous errors propagate errors further

    An update anomaly occurs when multiple record changes for a single attribute are necessary.
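The update anomaly can be reproduced in a few lines. The sketch below uses Python's built-in sqlite3 module; the table name customer_agent, its columns, and all of the sample values are hypothetical and do not come from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design: the agent's phone is repeated on every customer row.
conn.execute("CREATE TABLE customer_agent (customer TEXT, agent TEXT, agent_phone TEXT)")
conn.executemany(
    "INSERT INTO customer_agent VALUES (?, ?, ?)",
    [
        ("Abel", "Smith", "555-0100"),
        ("Jones", "Smith", "555-0100"),
        ("Moire", "Smith", "555-0100"),
    ],
)

# A careless update that reaches only one of the three copies.
conn.execute(
    "UPDATE customer_agent SET agent_phone = '555-0199' "
    "WHERE agent = 'Smith' AND customer = 'Abel'"
)

# The same agent now has two different phone numbers on record.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT agent_phone FROM customer_agent "
        "WHERE agent = 'Smith' ORDER BY agent_phone"
    )
]
print(phones)
```

Storing the phone number once, in a separate agent table, would leave only one value to change.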


    Deletion Anomalies

    - What happens if a customer record is deleted?
    - What happens if an agent record is deleted?

    A deletion anomaly occurs when the removal of a record results in the unintended loss of important information.
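A minimal sketch of a deletion anomaly, again with sqlite3. The customer_agent table and its rows are hypothetical: agent Lee has exactly one customer, so deleting that customer also erases everything known about Lee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table design mixing customer and agent facts.
conn.execute("CREATE TABLE customer_agent (customer TEXT, agent TEXT, agent_phone TEXT)")
conn.executemany(
    "INSERT INTO customer_agent VALUES (?, ?, ?)",
    [("Abel", "Smith", "555-0100"), ("Jones", "Lee", "555-0111")],
)

# Jones is Lee's only customer; removing Jones removes Lee's phone number too.
conn.execute("DELETE FROM customer_agent WHERE customer = 'Jones'")

remaining_agents = [
    row[0]
    for row in conn.execute("SELECT DISTINCT agent FROM customer_agent ORDER BY agent")
]
print(remaining_agents)
```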


    Insertion Anomalies

    - What happens if we want to enter information regarding an agent for whom we do not have a customer?
    - Do we add null values (blanks) for the other fields?

    An insertion anomaly occurs when there is not a reasonable place to assign attributes and attribute values to records.
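The insertion anomaly can be sketched the same way. Here the hypothetical customer_agent table is assumed to be keyed on the customer, so an agent without a customer has no row to live in; the key declaration and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical design: the customer is the key, so it may not be left blank.
conn.execute(
    "CREATE TABLE customer_agent ("
    "customer TEXT NOT NULL PRIMARY KEY, agent TEXT, agent_phone TEXT)"
)

# Try to record a new agent who has no customer yet.
try:
    conn.execute("INSERT INTO customer_agent VALUES (NULL, 'Lee', '555-0111')")
    inserted = True
except sqlite3.IntegrityError:
    # The key cannot be null, so the agent's details cannot be stored at all.
    inserted = False

print(inserted)
```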


    The Problem with Nulls

    1. Nulls used in mathematical expressions
    - unknown quantity leads to unknown total value
    - misleading value of all inventory

    Product ID  Product Description          Category        Price  Quantity  Total Value
    801         Shur-Lock U-Lock             Accessories     75.00
    802         SpeedRite Cyclecomputer                      60.00        20     1,200.00
    803         SteelHead Microshell Helmet  Accessories     40.00        40     1,600.00
    804         SureStop 133-MB Brakes       Components      25.00        10       250.00
    805         Diablo ATM Mountain Bike     Bikes        1,200.00
    806         Ultravision Helmet Mount Mi                   7.45        10        74.50
                                                                      Total:     3,124.50

    2. Nulls used in aggregate functions
    - blanks exist under category
    - cannot be counted because they don't exist!

    Category     Total Occurrences
    (blank)      0
    Accessories  2
    Bikes        1
    Components   1
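Both null problems can be reproduced with the product data from this slide. The following is a sketch using Python's sqlite3 module; the table and column names are assumptions, and the description column is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, category TEXT, price REAL, quantity INTEGER)"
)
# None becomes NULL: two products lack a quantity, two lack a category.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (801, "Accessories", 75.00, None),
        (802, None, 60.00, 20),
        (803, "Accessories", 40.00, 40),
        (804, "Components", 25.00, 10),
        (805, "Bikes", 1200.00, None),
        (806, None, 7.45, 10),
    ],
)

# 1. Arithmetic with a NULL yields NULL, so two rows have no total value.
line_values = [
    row[0]
    for row in conn.execute("SELECT price * quantity FROM product ORDER BY product_id")
]

# SUM skips those NULLs, so the inventory total silently omits two products.
total = conn.execute("SELECT SUM(price * quantity) FROM product").fetchone()[0]

# 2. COUNT(column) ignores NULLs: the blank category has two rows but a count of 0.
counts = conn.execute(
    "SELECT category, COUNT(category) FROM product GROUP BY category ORDER BY category"
).fetchall()

print(line_values, total, counts)
```

The total agrees with the 3,124.50 shown on the slide, which is misleading because the lock and the mountain bike contribute nothing to it.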


    Database Design Problems

    - Use of the relational database model removes some database anomalies
    - Further removal of database anomalies relies on a structured technique called normalization
    - Presence of some of these anomalies is sometimes justified in order to enhance performance

    Thus, database design consists of balancing the art of design with the science of design


    First Normal Form

    A table is in first normal form if it meets the following criteria: the data are stored in a two-dimensional table with no two rows identical and there are no repeating groups.

    The following table is NOT in first normal form because it contains a multi-valued attribute (an attribute with more than one value in each row).

    Member_ID  Memb_FName  Memb_LName  Hobbies
    1          Rodney      Jones       hiking, cooking
    3          Francine    Moire       golf, theatre, hiking
    2          Anne        Abel        concerts


    Handling multi-valued attributes: Incorrect Solutions

    Member_ID  Memb_FName  Memb_LName  Hobby1    Hobby2   Hobby3
    1          Rodney      Jones       hiking    cooking
    3          Francine    Moire       golf      theatre  hiking
    2          Anne        Abel        concerts

    Member_ID  Memb_FName  Memb_LName  Hobbies
    1          Rodney      Jones       hiking
    1          Rodney      Jones       cooking
    3          Francine    Moire       golf
    3          Francine    Moire       theatre
    3          Francine    Moire       hiking
    2          Anne        Abel        concerts

    Member_ID  Memb_FName  Memb_LName  Hobbies
    1          Rodney      Jones       hiking, cooking
    3          Francine    Moire       golf, theatre, hiking
    2          Anne        Abel        concerts


    Handling multi-valued attributes: Correct Solution

    Original table:

    Member_ID  Memb_FName  Memb_LName  Hobbies
    1          Rodney      Jones       hiking, cooking
    3          Francine    Moire       golf, theatre, hiking
    2          Anne        Abel        concerts

    Resulting tables:

    Member_ID  Memb_FName  Memb_LName
    1          Rodney      Jones
    3          Francine    Moire
    2          Anne        Abel

    Member_ID  Hobby
    1          hiking
    1          cooking
    3          golf
    3          theatre
    3          hiking
    2          concerts

    Create another entity (table) to handle multiple instances of the repeating group. This second table is then linked to the original table with an identifier (i.e., foreign key). This solution has the following advantages:

    - no limit to the number of hobbies per member
    - no waste of disk space
    - searching becomes much easier within a column (e.g., who likes hiking?)
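The two-table solution can be sketched with Python's sqlite3 module. The table names member and member_hobby and the shortened column names are assumptions; the rows are the ones from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,
        fname     TEXT,
        lname     TEXT
    );
    -- One row per (member, hobby) pair; member_id is the foreign key.
    CREATE TABLE member_hobby (
        member_id INTEGER REFERENCES member(member_id),
        hobby     TEXT,
        PRIMARY KEY (member_id, hobby)
    );
    INSERT INTO member VALUES (1, 'Rodney', 'Jones'), (3, 'Francine', 'Moire'), (2, 'Anne', 'Abel');
    INSERT INTO member_hobby VALUES
        (1, 'hiking'), (1, 'cooking'),
        (3, 'golf'), (3, 'theatre'), (3, 'hiking'),
        (2, 'concerts');
    """
)

# "Who likes hiking?" is a search within a single column.
hikers = [
    row[0]
    for row in conn.execute(
        "SELECT m.fname FROM member m "
        "JOIN member_hobby h ON h.member_id = m.member_id "
        "WHERE h.hobby = 'hiking' ORDER BY m.member_id"
    )
]

# Counting members stays correct because each member is stored exactly once.
member_count = conn.execute("SELECT COUNT(*) FROM member").fetchone()[0]

print(hikers, member_count)
```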


    Handling Repeating Groups

    An attribute can have a group of several data entries. Repeating groups can be removed by creating another table which holds those attributes that repeat. This second table (validation table) is then linked to the original table with an identifier (i.e., foreign key).

    Advantages: fewer characters in tables; reduces miskeying and update anomalies

    Original table:

    Product_ID  Product_Name                   Category      Price
    801         Shur-Lock U-Lock               Accessory     75.00
    802         SpeedRite Cyclecomputer        Component     60.00
    803         SteelHead Microshell Helmet    Accessory     40.00
    804         SureStop 133-MB Brakes         Component     25.00
    805         Diablo ATM Mountain Bike       Bike       1,200.00
    806         Ultravision Helmet Mount Mirr  Accessory      7.45

    Resulting tables:

    Category_ID  Category
    1            Accessory
    2            Component
    3            Bike

    Product_ID  Product_Name                   Category     Price
    801         Shur-Lock U-Lock               1            75.00
    802         SpeedRite Cyclecomputer        2            60.00
    803         SteelHead Microshell Helmet    1            40.00
    804         SureStop 133-MB Brakes         2            25.00
    805         Diablo ATM Mountain Bike       3         1,200.00
    806         Ultravision Helmet Mount Mirr  1             7.45
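One way to see how a validation table reduces miskeying is to let the database enforce the foreign key. The sketch below assumes SQLite through Python's sqlite3 module, where enforcement must be switched on with PRAGMA foreign_keys; product 807 and category 9 are hypothetical values used only to trigger the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript(
    """
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category_id  INTEGER REFERENCES category(category_id),
        price        REAL
    );
    INSERT INTO category VALUES (1, 'Accessory'), (2, 'Component'), (3, 'Bike');
    INSERT INTO product VALUES
        (801, 'Shur-Lock U-Lock', 1, 75.00),
        (802, 'SpeedRite Cyclecomputer', 2, 60.00),
        (803, 'SteelHead Microshell Helmet', 1, 40.00),
        (804, 'SureStop 133-MB Brakes', 2, 25.00),
        (805, 'Diablo ATM Mountain Bike', 3, 1200.00),
        (806, 'Ultravision Helmet Mount Mirr', 1, 7.45);
    """
)

# A miskeyed category (9 is not in the validation table) is rejected.
try:
    conn.execute("INSERT INTO product VALUES (807, 'Hypothetical item', 9, 1.00)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The category name is recovered with a join on the identifier.
accessory_ids = [
    row[0]
    for row in conn.execute(
        "SELECT p.product_id FROM product p "
        "JOIN category c ON c.category_id = p.category_id "
        "WHERE c.category = 'Accessory' ORDER BY p.product_id"
    )
]

print(rejected, accessory_ids)
```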


    Second Normal Form

    A table is in second normal form if it meets the following criteria: the relation is in first normal form, and all nonkey attributes are functionally dependent on the entire primary key.

    Applies only to tables that have a composite primary key.

    In the following table, EmpID and Training together (the composite primary key) determine Date, whereas EmpID alone (part of the primary key) determines Dept.

    EmpID  Training  Date       Dept
    1      Word      12-Sep-99  Oncology
    3      Excel     14-Oct-99  Paediatrics
    2      Excel     14-Oct-99  Renal
    1      Access    23-Nov-99  Oncology


    Removing Partial Dependencies

    Remove partial dependencies by separating the relation into two relations. Reduces the problems of:

    - update anomalies
    - delete anomalies
    - insert anomalies
    - redundancies

    Original table:

    EmpID  Training  Date       Dept
    1      Word      12-Sep-99  Oncology
    3      Excel     14-Oct-99  Paediatrics
    2      Excel     14-Oct-99  Renal
    1      Access    23-Nov-99  Oncology

    Resulting tables:

    EmpID  Training  Date
    1      Word      12-Sep-99
    3      Excel     14-Oct-99
    2      Excel     14-Oct-99
    1      Access    23-Nov-99

    EmpID  Dept
    1      Oncology
    2      Renal
    3      Paediatrics
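A partial dependency can be checked against sample rows with a small helper. The function determines below is a hypothetical name, not part of any library, and a sample can only refute a functional dependency: agreement in four rows does not prove that the rule holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if no two rows agree on the lhs columns but differ on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The training table from the slides.
training = [
    {"EmpID": 1, "Training": "Word", "Date": "12-Sep-99", "Dept": "Oncology"},
    {"EmpID": 3, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Paediatrics"},
    {"EmpID": 2, "Training": "Excel", "Date": "14-Oct-99", "Dept": "Renal"},
    {"EmpID": 1, "Training": "Access", "Date": "23-Nov-99", "Dept": "Oncology"},
]

# Dept depends on EmpID alone: a partial dependency on the composite key.
dept_by_emp = determines(training, ["EmpID"], ["Dept"])

# Date does not depend on EmpID alone (employee 1 has two dates) ...
date_by_emp = determines(training, ["EmpID"], ["Date"])

# ... but it is consistent with dependence on the whole key.
date_by_key = determines(training, ["EmpID", "Training"], ["Date"])

print(dept_by_emp, date_by_emp, date_by_key)
```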


    Third Normal Form

    A table is in third normal form if it meets the following criteria: the relation is in second normal form, and a nonkey field is not functionally dependent on another nonkey field.

    The following table is in second normal form but NOT in third normal form because Member_ID (the primary key) does not directly determine every attribute: RegistrationFee is determined by Sport, a nonkey field.

    Member_ID  Memb_FName  Memb_LName  Sport     RegistrationFee
    1          Rodney      Jones       Swimming  $100
    3          Francine    Moire       Tennis    $200
    2          Anne        Abel        Tennis    $200
    4          Goro        Azuma       Skiing    $150

    Member_ID → FName, LName, Lesson; Lesson → Cost


    Removing non-key Transitive Dependencies

    Remove transitive dependencies by placing the attributes involved in a new relational table. Reduces the problems of:

    - update anomalies
    - delete anomalies
    - insert anomalies
    - redundancies

    Original table:

    MemberID  MembFName  MembLName  Sport     RegFee
    1         Rodney     Jones      Swimming  $100
    3         Francine   Moire      Tennis    $200
    2         Anne       Abel       Tennis    $200
    4         Goro       Azuma      Skiing    $150

    Resulting tables:

    MemberID  MembFName  MembLName  Sport
    1         Rodney     Jones      1
    3         Francine   Moire      2
    2         Anne       Abel       2
    4         Goro       Azuma      3

    SportID  Sport     RegFee
    1        Swimming  $100
    2        Tennis    $200
    3        Skiing    $150
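The benefit of the split can be sketched with sqlite3: a fee is stored once, so one UPDATE reaches every member of that sport. The table and column names are assumptions, the new fee of 250 is a hypothetical value, and Goro Azuma is given SportID 3 so that the join reproduces Skiing as in the unnormalized table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sport (sport_id INTEGER PRIMARY KEY, sport TEXT, reg_fee INTEGER);
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,
        fname     TEXT,
        lname     TEXT,
        sport_id  INTEGER REFERENCES sport(sport_id)
    );
    INSERT INTO sport VALUES (1, 'Swimming', 100), (2, 'Tennis', 200), (3, 'Skiing', 150);
    INSERT INTO member VALUES
        (1, 'Rodney', 'Jones', 1), (3, 'Francine', 'Moire', 2),
        (2, 'Anne', 'Abel', 2), (4, 'Goro', 'Azuma', 3);
    """
)

# The tennis fee lives in exactly one row, so one change is enough.
conn.execute("UPDATE sport SET reg_fee = 250 WHERE sport = 'Tennis'")

# A join rebuilds the original wide table, with both tennis players updated.
fees = conn.execute(
    "SELECT m.fname, s.sport, s.reg_fee FROM member m "
    "JOIN sport s ON s.sport_id = m.sport_id ORDER BY m.member_id"
).fetchall()

print(fees)
```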


    Normalization Example: Video Store

    A video rental shop tracks all of their information in one table. There are now 20,000 records in it. Is it possible to achieve a more efficient design? (They charge $10/movie/day.)

    VIDEO (Cust_name, Cust_address, Cust_phone, Rental_date, Video_1, Video_2, Video_3, VideoType_1, VideoType_2, VideoType_3, Return_date, Total_Price, Paid?)

    The table is shown in three parts; rows correspond by position.

    Cust_name       Cust_address       Cust_phone  Rental_date
    Rodney Jones    23 Richmond St.    681-9854    15-Oct-99
    Francine Moire  750-12 Kipps Lane  672-9999    4-Nov-99
    Anne Abel       5 Sarnia Road      432-1120    3-Sep-99
    Rodney Jones    23 Richmond St.    681-9854    22-Sep-99

    Video_1             Video_2               Video_3              VideoType_1  VideoType_2  VideoType_3
    Gone with the Wind  Braveheart            Mississippi Burning  Classic      Adventure    Adventure
    Manhatten                                                      Comedy
    Manhatten           The African Queen                          Comedy       Classic
    Never Say Never     Silence of the Lambs                       Adventure    Horror

    Return_date  Total_Price  Paid?
    17-Oct-99    $60.00       yes
    (blank)      (blank)      (blank)
    4-Sep-99     $20.00       yes
    26-Sep-99    $80.00       yes


    Is the Video Store in 1NF?

    No attributes should form repeating groups - remove them by creating another table. There are repeating groups for videos and customers.

    CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

    Cust_Num  Cust_Name       Cust_address       Cust_phone
    1         Rodney Jones    23 Richmond St.    681-9854
    2         Francine Moire  750-12 Kipps Lane  672-9999
    3         Anne Abel       5 Sarnia Road      432-1120

    VIDEO (VideoNum, VideoName, VideoType)

    VideoNum  VideoName             VideoType
    1         Gone with the Wind    Classic
    2         Manhatten             Comedy
    3         Never Say Never A     Adventure
    4         Braveheart            Adventure
    5         Mississippi Burning   Adventure
    6         The African Queen     Classic
    7         Silence of the Lambs  Horror

    RENTAL (Cust_num, VideoNum, Rental_date, Return_date, TotalPrice, Paid?)

    Cust_Num  VideoNum  Rental_date  Return_date  TotalPrice  Paid?
    1         1,4,5     15-Oct-99    17-Oct-99    $60.00      yes
    2         2         4-Nov-99
    3         2,6       3-Sep-99     4-Sep-99     $20.00      yes
    1         3,7       22-Sep-99    26-Sep-99    $80.00      yes


    Is the Video Store in 2NF?

    The only table that has a composite primary key has no other fields, therefore, yes.

    CUSTOMER (Cust_Num, Cust_Name, Cust_address, Cust_phone)

    Cust_Num  Cust_Name       Cust_address       Cust_phone
    1         Rodney Jones    23 Richmond St.    681-9854
    2         Francine Moire  750-12 Kipps Lane  672-9999
    3         Anne Abel       5 Sarnia Road      432-1120

    VIDEO (VideoNum, VideoName, VideoType)

    VideoNum  VideoName             VideoType
    1         Gone with the Wind    Classic
    2         Manhatten             Comedy
    3         Never Say Never A     Adventure
    4         Braveheart            Adventure
    5         Mississippi Burning   Adventure
    6         The African Queen     Classic
    7         Silence of the Lambs  Horror

    RENTAL (RentalNum, Cust_Num, Rental_date, Return_Date, TotalPrice, Paid?)

    RentalNum  Cust_Num  Rental_date  Return_date  TotalPrice  Paid?
    1          1         15-Oct-99    17-Oct-99    $60.00      yes
    2          2         4-Nov-99
    3          3         3-Sep-99     4-Sep-99     $20.00      yes
    4          1         22-Sep-99    26-Sep-99    $80.00      yes

    RENTALDETAILS (RentalNum, VideoNum)

    RentalNum  VideoNum
    1          1
    1          4
    1          5
    2          2
    3          2
    3          6
    4          3
    4          7
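As a rough check on this design, the RENTAL and RENTALDETAILS tables can be loaded into SQLite and the stored TotalPrice values recomputed from the $10/movie/day rate. This is only a sketch: the lower-case table names, the ISO date strings, and the choice to derive the price instead of storing it are all assumptions, and the CUSTOMER and VIDEO tables are left out for brevity.

```python
import sqlite3

RATE_PER_MOVIE_PER_DAY = 10  # from the example: $10/movie/day

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rental (
        rental_num  INTEGER PRIMARY KEY,
        cust_num    INTEGER,
        rental_date TEXT,   -- ISO dates (an assumption) so julianday() can parse them
        return_date TEXT
    );
    -- Composite primary key and no other fields, as on the slide.
    CREATE TABLE rental_details (
        rental_num INTEGER REFERENCES rental(rental_num),
        video_num  INTEGER,
        PRIMARY KEY (rental_num, video_num)
    );
    INSERT INTO rental VALUES
        (1, 1, '1999-10-15', '1999-10-17'),
        (2, 2, '1999-11-04', NULL),
        (3, 3, '1999-09-03', '1999-09-04'),
        (4, 1, '1999-09-22', '1999-09-26');
    INSERT INTO rental_details VALUES
        (1, 1), (1, 4), (1, 5), (2, 2), (3, 2), (3, 6), (4, 3), (4, 7);
    """
)

# movies rented * days out * daily rate; NULL while the videos are still out.
totals = conn.execute(
    """
    SELECT r.rental_num,
           COUNT(d.video_num)
             * (julianday(r.return_date) - julianday(r.rental_date)) * ?
    FROM rental r
    JOIN rental_details d ON d.rental_num = r.rental_num
    GROUP BY r.rental_num, r.rental_date, r.return_date
    ORDER BY r.rental_num
    """,
    (RATE_PER_MOVIE_PER_DAY,),
).fetchall()

print(totals)
```

The recomputed amounts agree with the $60.00, $20.00 and $80.00 recorded in the RENTAL table, and the unreturned rental has no total yet.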


    Conflicting Goals of Design

    Database design must reconcile the following requirements:

    - Design elegance requires that the design must adhere to design rules concerning nulls, derived attributes, redundancies, relationship types, etc.
    - Information requirements are dictated by the end users
    - Operational (transaction) speed requirements are also dictated by the end users

    Clearly, an elegant database design that fails to address end user information requirements, or one that forms the basis for an implementation whose use progresses at a snail's pace, has little practical use.