Lect#5 - Normalization.ppt

  • 7/23/2019 Lect#5 - Normalization.ppt

    Normalization

    Pearson Education Limited 1995, 2005

    Objectives

    What normalization is and what its purpose is
    What update anomalies are
    How relations can be transformed from lower normal forms to higher normal forms: 1NF, 2NF and 3NF

    Purpose of Normalization

    Normalization is a technique for analyzing and correcting table structure to produce a set of suitable relations that support the data requirements of an enterprise. Result: a set of relations with minimized data redundancies.

    Characteristics of a suitable set of relations include:
        the minimal number of attributes necessary to support the data requirements of the enterprise;
        attributes with a close logical relationship are found in the same relation, i.e. each table represents a single subject;
        minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys, i.e. no data item is unnecessarily stored in more than one table;
        all attributes in a table are dependent on the primary key.

    Purpose of Normalization

    The benefits of using a database that has a suitable set of relations are that the database will:
        be easier for the user to access and maintain, reducing the opportunities for data inconsistencies;
        take up minimal storage space on the computer.

    How Normalization Supports Database Design

    Data Redundancy and Update Anomalies

    A major aim of relational database design (i.e. normalization) is to group attributes into relations so as to minimize data redundancy.

    Problems associated with data redundancy are illustrated by comparing the Staff and Branch relations with the StaffBranch relation.

    Data Redundancy and Update Anomalies

    [Figure: Design 1 and Design 2]

    Data Redundancy and Update Anomalies

    The StaffBranch relation has redundant data; the details of a branch are repeated for every member of staff. (Refer to Design 2.)

    In contrast, the branch information (bAddress) appears only once for each branch in the Branch relation, and only the branch number (branchNo) is repeated in the Staff relation, to represent where each member of staff is located. (Refer to Design 1.)

    Relations that contain redundant information may potentially suffer from update anomalies. Types of update anomalies include:
        Insertion
        Deletion
        Modification

    Insertion Anomalies: Examples

    If Design 2 is used, entering the details of a new member of staff at branch no. B007 requires that the correct details of branch no. B007 are entered too, so that they are consistent with the values for branch no. B007 in other tuples. The relations in Design 1 do not suffer from this potential inconsistency.

    To insert a new branch that has no members of staff, the other attributes would have to hold null values; this can violate the primary key requirement.

    Deletion Anomalies: Example

    If Design 2 is used and we delete the tuple that represents the last member of staff located at a branch (branchNo = B007), the details of that branch are lost as well. This does not happen with the separate Staff and Branch relations in Design 1.

    Modification Anomalies: Example

    If Design 2 is used and the value of a branch attribute is to be changed, for example bAddress = 22 Deer Rd, London, all the other tuples with the same bAddress must also be updated.
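The modification anomaly can be made concrete with a small sketch. The rows are illustrative sample data: the staff numbers, branch numbers and the second and third addresses are assumptions made for this example; only the attribute names branchNo and bAddress and the address 22 Deer Rd, London come from the slides.

```python
# Design 2: one StaffBranch relation; the branch address is repeated per staff member.
# All row values below are hypothetical sample data.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B001", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S2", "branchNo": "B001", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "S3", "branchNo": "B002", "bAddress": "1 Example St"},
]

# Design 1: Staff holds only branchNo; Branch stores each address once.
staff = [{"staffNo": r["staffNo"], "branchNo": r["branchNo"]} for r in staff_branch]
branch = {r["branchNo"]: r["bAddress"] for r in staff_branch}


def addresses_per_branch(rows):
    """Map each branchNo to the set of addresses recorded for it."""
    seen = {}
    for r in rows:
        seen.setdefault(r["branchNo"], set()).add(r["bAddress"])
    return seen


# Modification anomaly: updating only one tuple in Design 2 leaves branch B001
# with two conflicting addresses.
staff_branch[0]["bAddress"] = "10 New Rd, London"
inconsistent = {
    b for b, addrs in addresses_per_branch(staff_branch).items() if len(addrs) > 1
}

# In Design 1 the same change is a single update and cannot conflict.
branch["B001"] = "10 New Rd, London"
```

After the partial update, `inconsistent` contains the branch whose tuples disagree, whereas the Branch mapping of Design 1 still holds exactly one address per branch.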

    The Need for Normalization

    Example: a company that manages building projects
        Each project has its own project number, name, and employees assigned to it
        Each employee has an employee number, name, and job classification
        The company charges its clients by billing the hours spent on each contract
        The hourly billing rate is dependent on the employee

    The Need for Normalization (continued)

    The Need for Normalization (continued)

    The structure of the data set in the previous figure does not handle data very well. It is subject to:
        Modification anomalies
        Insertion anomalies
        Deletion anomalies

    The Normalization Process

    Works through a series of stages called normal forms:
        Unnormalized form (UNF): a table that contains one or more repeating groups
        First normal form (1NF): table format; no repeating groups
        Second normal form (2NF): 1NF and no partial dependencies
        Third normal form (3NF): 2NF and no transitive dependencies

    The Process of Normalization

    Conversion to First Normal Form

    Repeating group
        Derives its name from the fact that a group of multiple entries of the same type can exist for any single key attribute occurrence
        Ex.: PROJ_NUM = 15 has 5 entries that are related because they each share the PROJ_NUM = 15 characteristics

    A relational table must not contain repeating groups, since they reflect data redundancies
    Normalizing the table structure will reduce data redundancies

    Conversion to 1NF (continued)

    Step 1: Eliminate the Repeating Groups
        Present the data in tabular format, where each cell has a single value and there are no repeating groups
        Eliminate repeating groups and nulls by making sure that each repeating group attribute contains an appropriate data value
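A minimal sketch of Step 1, assuming the unnormalized data is held as one record per project with a nested list of assignments. The project names, employee names and hours are made-up sample values, and the `assignments` field and the function name `to_1nf` are hypothetical; only the attribute names follow the slides.

```python
# Unnormalized form: each project carries a repeating group of assignments.
# All values are hypothetical sample data.
unf = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A",
     "assignments": [
         {"EMP_NUM": 101, "EMP_NAME": "Employee X", "HOURS": 10.0},
         {"EMP_NUM": 105, "EMP_NAME": "Employee Y", "HOURS": 4.5},
     ]},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B",
     "assignments": [
         {"EMP_NUM": 105, "EMP_NAME": "Employee Y", "HOURS": 7.0},
     ]},
]


def to_1nf(records):
    """Emit one flat row per (project, assignment) pair.

    The project attributes are copied into every row, so no cell is left
    empty and no cell holds more than one value.
    """
    rows = []
    for rec in records:
        project = {k: v for k, v in rec.items() if k != "assignments"}
        for assignment in rec["assignments"]:
            rows.append({**project, **assignment})
    return rows


rows_1nf = to_1nf(unf)
```

Each resulting row has a single value in every column; the price is that PROJ_NAME is now repeated in every row of the same project, which is the redundancy the later normal forms remove.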

    Conversion to 1NF (continued)

    Conversion to 1NF (continued)

    Step 2: Identify All Dependencies

    Definition: A functional dependency occurs when one attribute in a relation uniquely determines another attribute.

    This can be written A → B, which is the same as stating "B is functionally dependent upon A" or "A determines B".
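The definition can be tested against sample rows. The sketch below is one possible check, not a method prescribed by the slides; the function name `fd_holds` and the row values are assumptions. Sample data can only refute a functional dependency: if the check passes, the dependency is merely consistent with the rows seen so far.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the rows are consistent with the dependency lhs -> rhs.

    The dependency is violated when two rows agree on every attribute in lhs
    but differ on some attribute in rhs.
    """
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical sample rows using attribute names from the slides.
rows = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 101, "HOURS": 10.0},
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 105, "HOURS": 4.5},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B", "EMP_NUM": 105, "HOURS": 7.0},
]

name_depends_on_project = fd_holds(rows, ["PROJ_NUM"], ["PROJ_NAME"])
employee_depends_on_project = fd_holds(rows, ["PROJ_NUM"], ["EMP_NUM"])
```

Here PROJ_NUM → PROJ_NAME is consistent with the rows, while PROJ_NUM → EMP_NUM is refuted because project 15 appears with two different employees.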

    Conversion to 1NF (continued)

    Dependencies can be depicted with the help of a diagram (or in dependency notation, A → B).

    Dependency diagram:
        Depicts all dependencies found within a given table structure
        Helpful in getting a bird's-eye view

    Conversion to 1NF (continued)

    Partial dependency: a dependency that is based on only part of a composite primary key

    Transitive dependency: a dependency of one non-prime attribute on another non-prime attribute
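The two definitions can be turned into a small classifier. This is a sketch under the definitions given on the slide; the function name `classify` and its return labels are hypothetical, and the attribute names are taken from the EMPLOYEE_PROJECT example used elsewhere in the lecture.

```python
def classify(lhs, rhs, key):
    """Classify the dependency lhs -> rhs relative to the primary key.

    "full":       the determinant is the whole primary key
    "partial":    the determinant is only part of a composite primary key
    "transitive": determinant and dependents are all non-prime attributes
    """
    lhs, rhs, key = set(lhs), set(rhs), set(key)
    if lhs == key:
        return "full"
    if lhs and lhs < key:
        return "partial"
    if not (lhs & key) and not (rhs & key):
        return "transitive"
    return "other"


KEY = {"proj_num", "emp_num"}

kinds = {
    "proj_num -> proj_name": classify({"proj_num"}, {"proj_name"}, KEY),
    "job_class -> chg_hr": classify({"job_class"}, {"chg_hr"}, KEY),
    "proj_num, emp_num -> hours": classify(KEY, {"hours"}, KEY),
}
```

Partial dependencies are the ones removed when moving to 2NF, and transitive dependencies are the ones removed when moving to 3NF.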

    Conversion to 1NF (continued)

    Step 3: Identify the Primary Key
        The primary key must uniquely identify every attribute value. In other words, if a value of the key is given, only one answer can be returned for each of the other attributes.
        For example, PROJ_NUM in the sample schema cannot be a primary key, because PROJ_NUM = 15 can identify any one of 5 employees. PROJ_NUM alone is therefore not enough to be used as a primary key.

    Conversion to 1NF (continued)

    The primary key can be determined from the functional dependencies identified earlier. It can be:
        a single attribute, i.e. a determinant that uniquely determines all attributes, or
        a composition of several determinants that together cover all attributes.

    [Note: if there are a few possibilities, choose the ones with the biggest scope.]

    Conversion to 1NF (continued)

    The primary key is the combination of proj_num and emp_num.
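One way to confirm this choice is to compute the attribute closure of a candidate key: a set of attributes qualifies as a key when its closure covers every attribute of the relation. The sketch below assumes the functional dependencies implied by the relations shown in this lecture; the function name `closure` and the constants `ALL` and `FDS` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs.

    fds is a list of (determinant, dependents) pairs, each a set of names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole determinant is available.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ALL = {"proj_num", "proj_name", "emp_num", "emp_name", "job_class", "chg_hr", "hours"}

# Dependencies assumed from the lecture's EMPLOYEE_PROJECT example.
FDS = [
    ({"proj_num"}, {"proj_name"}),
    ({"emp_num"}, {"emp_name", "job_class", "chg_hr"}),
    ({"job_class"}, {"chg_hr"}),
    ({"proj_num", "emp_num"}, {"hours"}),
]

proj_only = closure({"proj_num"}, FDS)
composite = closure({"proj_num", "emp_num"}, FDS)
```

The closure of proj_num alone reaches only proj_name, whereas the closure of the pair (proj_num, emp_num) reaches every attribute, which is why the pair serves as the primary key.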

    Result of 1NF

    The result of the 1NF normalization process is one relation with all attributes listed. The primary/composite key is underlined (here: proj_num, emp_num).

    Example:

        EMPLOYEE_PROJECT (proj_num, proj_name, emp_num, emp_name, job_class, chg_hr, hours)

        Relation name: EMPLOYEE_PROJECT
        Primary/composite key: proj_num, emp_num
        All attributes: the list in parentheses

    1NF

    In first normal form:
        All key attributes are defined
        There are no repeating groups in the table, that is, each row/column intersection contains one and only one value, not a set of values
        All attributes are dependent on the primary key

    Problem: a 1NF table structure may still contain partial dependencies
        Sometimes used for performance reasons, but should be used with caution
        Still subject to data redundancies. Ex.: what happens if EMP_NUM = 105 changes JOB_CLASS?

    Conversion to Second Normal Form

    Relational database design can be improved by converting the database into 2NF.

    2NF removes partial dependencies.

    If a 1NF relation has a single attribute as its primary key, then the relation is automatically in 2NF as well.

    A partial dependency can only occur when a composite key exists. If the key has more than one attribute, some attributes may depend on only a portion of the key.

    Conversion to 2NF (continued)

    Step 1: Write Each Key Component on a Separate Line

    Write each key component on a separate line, then write the original (composite) key on the last line:

        PROJ_NUM
        EMP_NUM
        PROJ_NUM EMP_NUM

    Each component will become the key of a new table/relation.

    NOTE: If the key has 2 attributes (A, B), then the possible components are A, B and AB. If it has 3 attributes (A, B, C), then they are A, B, C, AB, AC, BC and ABC.
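The components listed in the note are the non-empty subsets of the key. A short sketch of that enumeration, using the standard library's `itertools.combinations` (the function name `key_components` is hypothetical):

```python
from itertools import combinations


def key_components(key):
    """List every non-empty subset of a composite key, smallest first,
    ending with the full key."""
    return [
        combo
        for size in range(1, len(key) + 1)
        for combo in combinations(key, size)
    ]


two_part = key_components(["PROJ_NUM", "EMP_NUM"])
three_part = ["".join(c) for c in key_components("ABC")]
```

A key with n attributes yields 2^n - 1 components: 3 for the two-attribute key and 7 for (A, B, C), matching the note.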

    Conversion to 2NF (continued)

    Step 2: Assign Corresponding Dependent Attributes

    Determine the attributes that are dependent on each key component:

        (PROJ_NUM, PROJ_NAME)
        (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
        (PROJ_NUM, EMP_NUM, HOURS)

    At this point, most anomalies have been eliminated:

        PROJECT(PROJ_NUM, PROJ_NAME)
        EMPLOYEE(EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
        ASSIGNMENT(PROJ_NUM, EMP_NUM, HOURS)
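In terms of data, the three relations are projections of the 1NF table onto each attribute group, with duplicate tuples dropped. The sketch below assumes a handful of made-up 1NF rows; the helper name `project` and all row values are hypothetical, and only the attribute and relation names come from the slide.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples,
    preserving first-seen order."""
    result = []
    for row in rows:
        tup = tuple(row[a] for a in attrs)
        if tup not in result:
            result.append(tup)
    return result


# Hypothetical 1NF rows.
rows_1nf = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 101,
     "EMP_NAME": "Employee X", "JOB_CLASS": "Class 1", "CHG_HOUR": 50.0, "HOURS": 10.0},
    {"PROJ_NUM": 15, "PROJ_NAME": "Project A", "EMP_NUM": 105,
     "EMP_NAME": "Employee Y", "JOB_CLASS": "Class 2", "CHG_HOUR": 40.0, "HOURS": 4.5},
    {"PROJ_NUM": 18, "PROJ_NAME": "Project B", "EMP_NUM": 105,
     "EMP_NAME": "Employee Y", "JOB_CLASS": "Class 2", "CHG_HOUR": 40.0, "HOURS": 7.0},
]

PROJECT = project(rows_1nf, ["PROJ_NUM", "PROJ_NAME"])
EMPLOYEE = project(rows_1nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"])
ASSIGNMENT = project(rows_1nf, ["PROJ_NUM", "EMP_NUM", "HOURS"])
```

Each project name and each employee's details are now stored once, while ASSIGNMENT keeps one tuple per (project, employee) pair.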

    Conversion to 2NF (continued)

    If, at the end, there are relations that contain only their key attributes (except for the relation that holds the whole composite key), those relations can be eliminated.

    A relation is in second normal form (2NF) when it is in 1NF and includes no partial dependencies.

    Another technique to convert to 2NF

    Step 1: Write the original 1NF relation.

        EMPLOYEE_PROJECT (proj_num, proj_name, emp_num, emp_name, job_class, chg_hr, hours)

    Step 2: For each partial dependency, create a new relation with the determinant as its key. In the 1NF relation, delete the dependents and mark the determinant (circle/italic/colored) as a foreign key.

    For the partial dependency proj_num → proj_name:

        EMPLOYEE_PROJECT (proj_num, emp_num, emp_name, job_class, chg_hr, hours)
        PROJECT (proj_num, proj_name)

    Repeat for the other partial dependencies.

    Result of 2NF

    The result of the 2NF normalization process is multiple relations. Each primary/composite key is underlined. Each foreign key is identified (circle/italic/colored).

    Example:

        EMPLOYEE_PROJECT (proj_num, emp_num, hours)
        PROJECT (proj_num, proj_name)
        EMPLOYEE (emp_num, emp_name, job_class, chg_hr)

    Conversion to 2NF (continued)

    Conversion to Third Normal Form

    3NF removes transitive dependencies.

    Step 1: Write the previous 2NF relations.

    Step 2: For each transitive dependency (a non-key attribute that depends on another non-key attribute), create a new relation with the determinant as its key.

    Step 3: In the original 2NF relation, delete the dependents. Make the determinant a foreign key (circle/italic/colored).

    Result of 3NF

    The result of the 3NF normalization process is multiple relations. Each primary/composite key is underlined. Each foreign key is identified (circle/italic/colored).

    Example:

        EMPLOYEE_PROJECT (proj_num, emp_num, hours)
        PROJECT (proj_num, proj_name)
        EMPLOYEE (emp_num, emp_name, job_class)
        JOB (job_class, chg_hr)

    See the difference: in the 2NF step, the determinant left in the original relation is both a foreign key and part of the primary key (e.g. proj_num in EMPLOYEE_PROJECT); in the 3NF step, it is a foreign key only (job_class in EMPLOYEE).
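The 3NF split of EMPLOYEE can be sketched on a few made-up tuples. The employee names, job classes and rates are hypothetical sample values, and `rejoin` is a helper invented for this example; the relation and attribute names follow the slide.

```python
# 2NF relation EMPLOYEE(emp_num, emp_name, job_class, chg_hr), hypothetical rows.
employee_2nf = [
    (101, "Employee X", "Class 1", 50.0),
    (105, "Employee Y", "Class 2", 40.0),
    (106, "Employee Z", "Class 2", 40.0),
]

# Split on the transitive dependency job_class -> chg_hr.
employee_3nf = [(num, name, job_class) for num, name, job_class, _ in employee_2nf]
job = {job_class: chg_hr for _, _, job_class, chg_hr in employee_2nf}


def rejoin(employees, jobs):
    """Join EMPLOYEE with JOB on the foreign key job_class."""
    return [(num, name, jc, jobs[jc]) for num, name, jc in employees]


# The decomposition loses nothing: the join reproduces the 2NF relation.
lossless = rejoin(employee_3nf, job) == employee_2nf

# A rate change is now a single update in JOB instead of one per employee.
job["Class 2"] = 42.0
rates_after = {row[3] for row in rejoin(employee_3nf, job) if row[2] == "Class 2"}
```

Because the hourly rate is stored once per job class, every employee of that class sees the new rate after a single update, which removes the modification anomaly that remained in 2NF.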

    Conversion to 3NF (continued)

    Note: Original EMPLOYEE table

    Sample Exercise 1

    staffNo  branchNo  branchAddress                         name           position   hoursPerWeek
    S4552    B001      City South Plaza, Seattle, WA 98122   Ellen London   Assistant  16
    S4555    B004      16 14th Avenue, Seattle, WA 98128     Ellen Layman   Assistant  9
    S4612    B002      City Center Plaza, Seattle, WA 98122  Dave Sinclair  Clerk      14
    S4612    B004      16 14th Avenue, Seattle, WA 98128     Dave Sinclair  Clerk      10

    Examine the table shown above. This table represents the hours worked per week for temporary staff at each branch of a company.

    1. Identify the functional dependencies represented by the data shown in the table.
    2. Using the functional dependencies identified in part 1, describe and illustrate the process of normalization by converting the table to Third Normal Form (3NF) relations. Identify the primary and foreign keys in your 3NF relations.

    Sample Exercise 2

    Given the following relational schema:

    MOVIE(cinemaID, cinemaCapacity, movieID, movieTitle, movieDuration, showDate, showTime, actorID, actorName, ticketPrice, ticketSold, totalCollection)

    1. Sketch a table with the attributes of the above schema as column headers. Populate the table with 10 records of data.
    2. Identify the primary key and the functional dependencies represented by the data shown in the table.
    3. Using the functional dependencies identified in part 2, describe and illustrate the process of normalization by converting the table to Third Normal Form (3NF) relations. Identify the primary and foreign keys in your 3NF relations.

    Sample Exercise 3

    Given the following incomplete dependency diagram:

    a) Draw an arrow for each functional dependency.
    b) State the primary key based on the identified functional dependencies.
    c) Write the 1NF relational schema for the diagram.
    d) Identify any partial dependency.
    e) Normalize the relation in (c) into 2NF relations.
    f) Identify any transitive dependency.
    g) Normalize the relation in (e) into 3NF relations.

    [Figure: incomplete dependency diagram. Its attribute labels are only partly legible; they include TEAMID, MATCHID, DATE, TIME and STADIUM.]

    Learning Outcomes

    Students should now be able to:
        Explain what normalization is and the purpose of normalization
        Perform the normalization process from lower normal forms to higher normal forms: 1NF, 2NF and 3NF

    References

    Database Systems: A Practical Approach to Design, Implementation and Management.

    Thomas Connolly, Carolyn Begg (2010), Addison-Wesley, Fifth Edition.

    Chapter 1

    Database Systems: Design, Implementation & Management.

    Peter Rob, Carlos Coronel (200$), Thomson Course Technology, Seventh edition.

    Chapter (pg 1* + 1*)