ch7_final


  • 8/7/2019 ch7_final


    Database System Concepts, 5th Ed.

    Silberschatz, Korth and Sudarshan

    See www.db-book.com for conditions on re-use

Chapter 7: Relational Database Design


Silberschatz, Korth and Sudarshan. Database System Concepts - 5th Edition, July 28, 2005.

First Normal Form

A domain is atomic if its elements are considered to be indivisible units.

Examples of non-atomic domains:

Sets of names, composite attributes

Identification numbers like CS101 that can be broken up into parts

A relational schema R is in first normal form if the domains of all attributes of R are atomic.

Non-atomic values complicate storage and encourage redundant (repeated) storage of data.

Example: a set of accounts stored with each customer, and a set of owners stored with each account

We assume all relations are in first normal form (and revisit this in Chapter 9).


First Normal Form (Contd.)

Example

SupplierPart (S#, P#, status, city, Qty)

S#, P# → Qty

S# → city

city → status

S# → status

Insertion anomalies

Can't insert information about a supplier until that supplier supplies at least one part.

Deletion anomalies

If we delete the only tuple for a supplier, we delete not only the shipment information but also the information about the supplier.

Update anomalies

The city value for a particular supplier appears many times, which can lead to inconsistency when it is updated.


Functional Dependencies

Constraints on the set of legal relations.

Require that the value for a certain set of attributes determines uniquely the value for another set of attributes.

Let R be a relation schema, with

α ⊆ R and β ⊆ R

The functional dependency

α → β

holds on R if and only if for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,

t1[α] = t2[α] ⇒ t1[β] = t2[β]


Functional Dependencies (Cont.)

Example 1:

Consider r(A, B) with the following instance of r.

A   B
1   4
1   5
3   7

On this instance, A → B does NOT hold, but B → A does hold.

Example 2:

Consider r(A, B, C, D) with the following instance of r.

A    B    C    D
a1   b1   c1   d1
a1   b2   c1   d2
a2   b2   c2   d2
a2   b3   c2   d3
a3   b3   c2   d4

On this instance, A → C holds, but C → A does NOT hold.
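The definition of a functional dependency can be checked mechanically on a single relation instance. The sketch below is a minimal illustration, assuming tuples are stored as Python dicts; the function name `fd_holds` is hypothetical. Note that an instance can only refute a dependency: a dependency holds on the schema only if it holds on every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds on this instance.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}  # maps each lhs value combination to the rhs values seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# Instance of r(A, B, C, D) from Example 2
r = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},
    {"A": "a2", "B": "b2", "C": "c2", "D": "d2"},
    {"A": "a2", "B": "b3", "C": "c2", "D": "d3"},
    {"A": "a3", "B": "b3", "C": "c2", "D": "d4"},
]

print(fd_holds(r, ["A"], ["C"]))  # True: A -> C holds on this instance
print(fd_holds(r, ["C"], ["A"]))  # False: c2 appears with both a2 and a3
```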


Functional Dependencies (Cont.)

A functional dependency can be trivial

α → β is trivial if β ⊆ α

or non-trivial

α → β is non-trivial if β ⊈ α

A functional dependency is a generalization of the notion of a key.

K is a superkey for relation schema R if and only if K → R

K is a candidate key for R if and only if

K → R, and

for no α ⊂ K, α → R


Closure of a Set of Functional Dependencies

Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.

For example: if A → B and B → C, then we can infer that A → C

The set of all functional dependencies logically implied by F is the closure of F.

We denote the closure of F by F+.

F+ is a superset of F.


Closure of a Set of Functional Dependencies

We can find all of F+ by applying Armstrong's Axioms:

if β ⊆ α, then α → β (reflexivity)

if α → β, then γα → γβ (augmentation)

if α → β and β → γ, then α → γ (transitivity)

Additional rules that can be derived from the axioms:

if α → β holds and α → γ holds, then α → βγ holds (union)

if α → βγ holds, then α → β holds and α → γ holds (decomposition)

if α → β holds and γβ → δ holds, then αγ → δ holds (pseudotransitivity)


Example

R = (A, B, C, G, H, I)

F = { A → B
      A → C
      CG → H
      CG → I
      B → H }

Some members of F+:

A → H
    by transitivity from A → B and B → H

AG → I
    by augmenting A → C with G, to get AG → CG, and then transitivity with CG → I

CG → HI
    by augmenting CG → I to infer CG → CGI, and augmenting CG → H to infer CGI → HI, and then transitivity


Procedure for Computing F+

To compute the closure of a set of functional dependencies F:

F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
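The procedure above can be sketched directly as a fixpoint loop. This is a naive illustration, not an efficient implementation: it enumerates every subset of the schema, so it is only usable on tiny schemas. The names `subsets` and `fd_closure`, and the representation of a dependency as a pair of frozensets, are assumptions made for the example. Empty attribute sets are included for simplicity.

```python
from itertools import combinations


def subsets(attrs):
    """All subsets of attrs (including the empty set) as frozensets."""
    items = sorted(attrs)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]


def fd_closure(schema, fds):
    """Naive fixpoint computation of F+ using Armstrong's axioms.

    schema is a set of attribute names; fds is a set of (lhs, rhs) pairs of
    frozensets. Exponential in len(schema).
    """
    all_sets = subsets(schema)
    # Reflexivity: alpha -> beta for every beta that is a subset of alpha.
    closure = set(fds) | {(a, b) for a in all_sets for b in subsets(a)}
    while True:
        new = set()
        # Augmentation: from alpha -> beta infer (gamma alpha) -> (gamma beta).
        for lhs, rhs in closure:
            for gamma in all_sets:
                new.add((lhs | gamma, rhs | gamma))
        # Transitivity: from alpha -> beta and beta -> gamma infer alpha -> gamma.
        for lhs1, rhs1 in closure:
            for lhs2, rhs2 in closure:
                if rhs1 == lhs2:
                    new.add((lhs1, rhs2))
        if new <= closure:
            return closure  # F+ did not change any further
        closure |= new


fs = frozenset
F = {(fs("A"), fs("B")), (fs("B"), fs("C"))}
F_plus = fd_closure({"A", "B", "C"}, F)
print((fs("A"), fs("C")) in F_plus)  # True: A -> C follows by transitivity
print((fs("C"), fs("A")) in F_plus)  # False: C -> A is not implied
```

In practice F+ is rarely materialized; testing individual dependencies with attribute closure is far cheaper.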


Second Normal Form

A relation R is in second normal form if and only if it is in 1NF and every non-key attribute is irreducibly dependent on the primary key.

So we decompose the 1NF relation as

(S#, P#, Qty)

(S#, City, Status)

The relation (S#, City, Status) still suffers from anomalies:

Insertion anomalies

Can't insert information about the status of a city until a supplier resides in that city.

Deletion anomalies

If we delete the only supplier tuple for a city, we delete not only the supplier information but also the information about the status of that city.

Update anomalies

The status of a city appears once for every supplier in that city, which can lead to inconsistency when it is updated.


Example (Convert to 2NF)

ProjectNo   ProjectName              EmployeeNo   EmployeeName      RateCategory   Rate
1203        Madagascar travel site   11           Jessica Brookes   A              90
1203        Madagascar travel site   12           Andy Evans        B              80
1203        Madagascar travel site   16           Max Fat           C              70
1506        Online estate agency     11           Jessica Brookes   A              90
1506        Online estate agency     17           Alex Branton      B              70


We look for partial dependencies

We look for fields that depend on only part of the key and not the entire key.

Field            Project No   Employee No
Project Name     ✓
Employee Name                 ✓
Rate Category                 ✓
Rate                          ✓

We remove partial dependencies

The fields listed depend on only part of the key, so we remove them from the table.
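The search for partial dependencies can be sketched in code. The example below assumes the dependencies marked in the table above (Project No determines Project Name; Employee No determines Employee Name, Rate Category and Rate) and the composite key (Project No, Employee No). The function names and the field names written without spaces are illustrative choices.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper, non-empty part of the key to the non-key fields it determines."""
    key = set(key)
    found = {}
    for size in range(1, len(key)):
        for part in combinations(sorted(key), size):
            determined = attribute_closure(part, fds) - key
            if determined:
                found[part] = determined
    return found


# Dependencies read off the table above.
fds = [
    ({"ProjectNo"}, {"ProjectName"}),
    ({"EmployeeNo"}, {"EmployeeName", "RateCategory", "Rate"}),
]

for part, fields in partial_dependencies({"ProjectNo", "EmployeeNo"}, fds).items():
    print(part, "->", sorted(fields))
```

Each entry that is reported corresponds to one new table: the part of the key becomes the key of that table, and the fields it determines move there.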


We create new tables

Clearly we can't take the data out and leave it out of our database. We put it into a new table consisting of the field that has the partial dependency and the field it is dependent on.

Looking at our example we will need to create two new tables:

Dependent On   Partially Dependent
Project No     Project Name

Dependent On   Partially Dependent
Employee No    Employee Name, Rate Category, Rate


EmployeeNo   EmployeeName      RateCategory
11           Jessica Brookes   A
12           Andy Evans        B
16           Max Fat           C
17           Alex Branton      A

RateCategory   Rate
A              90
B              80
C              70


Closure of Attribute Sets

Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.

Algorithm to compute α+, the closure of α under F:

result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
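The algorithm translates almost line for line into Python. The sketch below assumes dependencies are stored as (left side, right side) pairs of sets; the function name `attribute_closure` is illustrative. It is run on the dependency set F over R = (A, B, C, G, H, I) used in the earlier example.

```python
def attribute_closure(attrs, fds):
    """Closure of the attribute set attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]

print(sorted(attribute_closure({"A", "G"}, F)))  # ['A', 'B', 'C', 'G', 'H', 'I']
```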


Example of Attribute Set Closure

R = (A, B, C, G, H, I)

F = { A → B
      A → C
      CG → H
      CG → I
      B → H }

(AG)+

1. result = AG
2. result = ABCG (A → C and A → B)
3. result = ABCGH (CG → H and CG ⊆ AGBC)
4. result = ABCGHI (CG → I and CG ⊆ AGBCH)

Is AG a candidate key?

1. Is AG a superkey?
    1. Does AG → R? == Is (AG)+ ⊇ R
2. Is any subset of AG a superkey?
    1. Does A → R? == Is (A)+ ⊇ R
    2. Does G → R? == Is (G)+ ⊇ R
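The two questions above can be answered with attribute closure. The following is a minimal sketch under the same F and R; the helper names `is_superkey` and `is_candidate_key` are illustrative.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """A candidate key is a superkey none of whose proper subsets is a superkey."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    for size in range(len(attrs)):
        for subset in combinations(sorted(attrs), size):
            if is_superkey(subset, schema, fds):
                return False
    return True


R = {"A", "B", "C", "G", "H", "I"}
F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]

print(is_superkey({"A", "G"}, R, F))       # True: (AG)+ = R
print(is_superkey({"A"}, R, F))            # False: A+ = ABCH
print(is_superkey({"G"}, R, F))            # False: G+ = G
print(is_candidate_key({"A", "G"}, R, F))  # True
```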


Uses of Attribute Closure

There are several uses of the attribute closure algorithm:

Testing for superkey

To test if α is a superkey, we compute α+ and check if α+ contains all attributes of R.

Testing functional dependencies

To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.

That is, we compute α+ by using attribute closure, and then check if it contains β.

This is a simple and cheap test, and very useful.

Computing closure of F

For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
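The last two uses can be sketched as follows. The names `implies` and `fd_closure` are illustrative, and restricting the enumeration to non-empty attribute sets is a simplifying assumption of this example.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, i.e. rhs is a subset of lhs+."""
    return set(rhs) <= attribute_closure(lhs, fds)


def nonempty_subsets(attrs):
    items = sorted(attrs)
    return [frozenset(c) for n in range(1, len(items) + 1)
            for c in combinations(items, n)]


def fd_closure(schema, fds):
    """Enumerate F+: for each gamma in R, output gamma -> S for every S in gamma+."""
    return [(gamma, s)
            for gamma in nonempty_subsets(schema)
            for s in nonempty_subsets(attribute_closure(gamma, fds))]


F = [
    ({"A"}, {"B"}),
    ({"A"}, {"C"}),
    ({"C", "G"}, {"H"}),
    ({"C", "G"}, {"I"}),
    ({"B"}, {"H"}),
]
print(implies(F, {"A", "G"}, {"I"}))  # True: I is in (AG)+
print(implies(F, {"C", "G"}, {"A"}))  # False: (CG)+ = CGHI

small = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(len(fd_closure({"A", "B", "C"}, small)))  # number of dependencies enumerated
```

The enumeration is exponential in the number of attributes, which is why the membership test with `implies` is preferred whenever only specific dependencies need checking.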


Extraneous Attributes

Consider a set F of functional dependencies and the functional dependency α → β in F.

Attribute A is extraneous in α if A ∈ α and F logically implies (F − {α → β}) ∪ {(α − A) → β}.

Attribute A is extraneous in β if A ∈ β and the set of functional dependencies (F − {α → β}) ∪ {α → (β − A)} logically implies F.

Example: Given F = {A → C, AB → C}

B is extraneous in AB → C because {A → C, AB → C} logically implies A → C (i.e. the result of dropping B from AB → C).

Example: Given F = {A → C, AB → CD}

C is extraneous in AB → CD since AB → C can be inferred even after deleting C.
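Both tests reduce to an attribute closure computation. The sketch below is one possible way to write them; the function names are illustrative, and dependencies are assumed to be (left side, right side) pairs of sets.

```python
def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def extraneous_in_lhs(attr, lhs, rhs, fds):
    """attr in lhs is extraneous if (lhs - attr) -> rhs already follows from fds."""
    reduced = set(lhs) - {attr}
    return set(rhs) <= attribute_closure(reduced, fds)


def extraneous_in_rhs(attr, lhs, rhs, fds):
    """attr in rhs is extraneous if lhs -> attr still follows after dropping it."""
    lhs, rhs = set(lhs), set(rhs)
    # Build (F - {lhs -> rhs}) U {lhs -> (rhs - attr)}.
    others = [(l, r) for l, r in fds if not (set(l) == lhs and set(r) == rhs)]
    others.append((lhs, rhs - {attr}))
    return attr in attribute_closure(lhs, others)


F1 = [({"A"}, {"C"}), ({"A", "B"}, {"C"})]
print(extraneous_in_lhs("B", {"A", "B"}, {"C"}, F1))       # True

F2 = [({"A"}, {"C"}), ({"A", "B"}, {"C", "D"})]
print(extraneous_in_rhs("C", {"A", "B"}, {"C", "D"}, F2))  # True
print(extraneous_in_rhs("D", {"A", "B"}, {"C", "D"}, F2))  # False
```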


Canonical Cover

A canonical cover for F is a set of dependencies Fc such that

F logically implies all dependencies in Fc, and

Fc logically implies all dependencies in F, and

No functional dependency in Fc contains an extraneous attribute, and

Each left side of a functional dependency in Fc is unique.

To compute a canonical cover for F:

repeat
    Use the union rule to replace any dependencies in F of the form α1 → β1 and α1 → β2 with α1 → β1 β2
    Find a functional dependency α → β with an extraneous attribute either in α or in β
    If an extraneous attribute is found, delete it from α → β
until F does not change

Note: The union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied.


Computing a Canonical Cover

R = (A, B, C)

F = { A → BC
      B → C
      A → B
      AB → C }

Combine A → BC and A → B into A → BC

Set is now {A → BC, B → C, AB → C}

A is extraneous in AB → C

Check if the result of deleting A from AB → C is implied by the other dependencies

Yes: in fact, B → C is already present!

Set is now {A → BC, B → C}

C is extraneous in A → BC

Check if A → C is logically implied by A → B and the other dependencies

Yes: using transitivity on A → B and B → C.

Can use attribute closure of A in more complex cases

The canonical cover is:

A → B
B → C
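The steps of this example can be automated. The sketch below is one possible implementation, not the only one: it keeps dependencies in a dict keyed by left side, so the union rule is applied implicitly, and it removes one extraneous attribute at a time. The function names are illustrative. It may remove attributes in a different order than the walkthrough above, and in general a canonical cover need not be unique.

```python
def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    """Return a canonical cover as a dict mapping each left side to its right side."""
    # Union rule: keying by left side merges dependencies with equal left sides.
    cover = {}
    for lhs, rhs in fds:
        cover.setdefault(frozenset(lhs), set()).update(rhs)

    changed = True
    while changed:
        changed = False
        for lhs in list(cover):
            rhs = cover[lhs]
            pairs = list(cover.items())
            # Look for an extraneous attribute on the left side.
            for attr in sorted(lhs):
                reduced = lhs - {attr}
                if reduced and rhs <= attribute_closure(reduced, pairs):
                    del cover[lhs]
                    cover.setdefault(reduced, set()).update(rhs)
                    changed = True
                    break
            if changed:
                break
            # Look for an extraneous attribute on the right side.
            for attr in sorted(rhs):
                others = [(l, r) for l, r in pairs if l != lhs]
                others.append((lhs, rhs - {attr}))
                if attr in attribute_closure(lhs, others):
                    rhs.discard(attr)
                    if not rhs:
                        del cover[lhs]
                    changed = True
                    break
            if changed:
                break
    return {lhs: frozenset(rhs) for lhs, rhs in cover.items()}


F = [
    ({"A"}, {"B", "C"}),
    ({"B"}, {"C"}),
    ({"A"}, {"B"}),
    ({"A", "B"}, {"C"}),
]
for lhs, rhs in canonical_cover(F).items():
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

On this input the result is A → B and B → C, matching the canonical cover derived by hand.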


Lossless-join Decomposition

A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+:

R1 ∩ R2 → R1

R1 ∩ R2 → R2
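This condition is easy to test with attribute closure: compute the closure of R1 ∩ R2 and check whether it covers R1 or R2. The sketch below assumes a binary decomposition; the function name `is_lossless` and the sample schema R = (A, B, C) with F = {A → B, B → C} are chosen for illustration.

```python
def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (R1 & R2) -> R1 or (R1 & R2) -> R2 is in F+."""
    r1, r2 = set(r1), set(r2)
    closure = attribute_closure(r1 & r2, fds)
    return r1 <= closure or r2 <= closure


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(is_lossless({"A", "B"}, {"B", "C"}, F))  # True: B -> BC
print(is_lossless({"A", "B"}, {"A", "C"}, F))  # True: A -> AB
print(is_lossless({"A", "C"}, {"B", "C"}, F))  # False: C determines only C
```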


Dependency Preservation

Let Fi be the set of dependencies in F+ that include only attributes in Ri.

A decomposition is dependency preserving if

(F1 ∪ F2 ∪ … ∪ Fn)+ = F+


Example

R = (A, B, C)

F = { A → B
      B → C }

Key = {A}

Decomposition R1 = (A, B), R2 = (B, C)

Lossless-join decomposition (R1 ∩ R2 = {B} and B → BC)

Dependency preserving
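Dependency preservation can be tested without materializing F+. The sketch below uses a standard polynomial-time check that is not shown on these slides: for each α → β in F, grow a result set from α using only what can be derived inside one Ri at a time, then check that β was reached. The function name `preserves` is illustrative.

```python
def attribute_closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fds, decomposition):
    """True if every dependency in fds can be enforced within the decomposition."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                schema = set(schema)
                # Attributes derivable from result while staying inside this schema.
                gained = attribute_closure(result & schema, fds) & schema
                if not gained <= result:
                    result |= gained
                    changed = True
        if not set(rhs) <= result:
            return False
    return True


F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(preserves(F, [{"A", "B"}, {"B", "C"}]))  # True: the decomposition above
print(preserves(F, [{"A", "B"}, {"A", "C"}]))  # False: B -> C is lost
```

The second call shows why the choice of decomposition matters: (A, B), (A, C) is also lossless, but B → C can no longer be checked without a join.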


A Lossy Decomposition


BCNF (Boyce/Codd Normal Form)

Insertion anomalies

Can't insert the information that a teacher teaches a subject until a student is enrolled for that subject.

Deletion anomalies

If we delete the only tuple in which a student has opted for a subject, we delete not only the student information but also the information about which teacher teaches that subject.

Update anomalies

The subject value for a teacher appears many times, which can lead to inconsistency when it is updated.

So we decompose the relation as

(Student, Teacher)

(Teacher, Subject)


Fourth Normal Form


The above relation is in BCNF.

Still, the above relation has a lot of redundancy, which can lead to update anomalies.

The reason is the presence of multivalued dependencies.

Multivalued Dependency

Given a relation R with subsets of attributes A, B and C, we say that B is multi-dependent on A, written A →→ B, if and only if the set of B values matching a given (A, C) value pair depends only on the A value and is independent of the C value.

Course →→ teacher

Each course has a set of teachers that does not depend on the text.

Course →→ text

Each course has a set of texts that does not depend on the teacher.

Fourth Normal Form

A relation R is in 4NF if, whenever there exists a non-trivial multivalued dependency A →→ B, all attributes of R are functionally dependent on A.
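The definition of a multivalued dependency can be checked on a relation instance. The sketch below is a minimal illustration: for each A value it verifies that every combination of a B value and a C value is present. The function name `mvd_holds` and the sample rows are hypothetical; they are not the data from the slides.

```python
def mvd_holds(rows, lhs, mid, rest):
    """Check lhs ->> mid on a relation instance.

    rows is a list of dicts; lhs, mid and rest are disjoint lists of
    attribute names that together cover the whole schema.
    """
    groups = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in mid)
        c = tuple(row[x] for x in rest)
        bs, cs, pairs = groups.setdefault(a, (set(), set(), set()))
        bs.add(b)
        cs.add(c)
        pairs.add((b, c))
    # The mid values are independent of the rest values exactly when every
    # (mid, rest) combination occurs for each lhs value.
    return all(len(pairs) == len(bs) * len(cs)
               for bs, cs, pairs in groups.values())


# Hypothetical data: one course, two teachers, two texts.
ctx = [
    {"course": "Physics", "teacher": "Green", "text": "Mechanics"},
    {"course": "Physics", "teacher": "Green", "text": "Optics"},
    {"course": "Physics", "teacher": "Brown", "text": "Mechanics"},
    {"course": "Physics", "teacher": "Brown", "text": "Optics"},
]

print(mvd_holds(ctx, ["course"], ["teacher"], ["text"]))      # True
print(mvd_holds(ctx[:3], ["course"], ["teacher"], ["text"]))  # False: (Brown, Optics) missing
```

The four rows needed to record two teachers and two texts for a single course show the redundancy that 4NF removes by splitting the relation into (course, teacher) and (course, text).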


Fifth Normal Form
