Handout - Tarleton State University

Handout

Combining schemas

Problems: redundancy, hard to update, possible NULLs

Problems?

Conclusion: Whether the join attribute is a PK or not makes a great difference when combining schemas!

Splitting schemas, a.k.a. decomposition (reverse the arrows below)

Functional dependency: loan_number → amount

Coincidence or not? (And why it matters …)

An even worse decomposition: lossy!

Why do we say lossy when in fact we end up with more data?
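A small Python sketch can make the "more data, less information" point concrete. The relation r, its attribute names, and its two rows are hypothetical, chosen only so that the shared attribute (name) is not a key.

```python
# Hypothetical relation r(id, name, city): two different people share a name.
r = {(1, "Kim", "Austin"), (2, "Kim", "Dallas")}

# Decompose on the non-key attribute `name`.
r1 = {(i, n) for (i, n, c) in r}  # r1(id, name)
r2 = {(n, c) for (i, n, c) in r}  # r2(name, city)

# Natural join of r1 and r2 on `name`.
joined = {(i, n, c) for (i, n) in r1 for (n2, c) in r2 if n == n2}

# Tuples that the join produces but that were never in r.
spurious = joined - r

print(len(r), len(joined))
print(sorted(spurious))
```

The join returns every original tuple plus two spurious ones, so we can no longer tell which Kim lives in which city. That lost information is what "lossy" refers to.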

7.3 Decomposition using FDs

FD algebra

Example:

--------------------------------------------------------------------------------------------------

The most useful normal form:

loan = (loan_number, amount)

borrower = (customer_id, loan_number)

Find the set of all (non-trivial) FDs for the relation bor_loan

Another example:

Is this schema in BCNF?

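In bor_loan, the violating FD is loan_number → amount, so we set α = loan_number and β = amount, and replace bor_loan by (α ∪ β) = (loan_number, amount) and (R − (β − α)) = (customer_id, loan_number).

Why not simply say R − β?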
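The decomposition step for the violating FD loan_number → amount in bor_loan can be sketched in Python. This is a minimal sketch, assuming the usual textbook rule of replacing R by α ∪ β and R − (β − α); the function names closure and bcnf_split are illustrative, not from the text.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_split(schema, alpha, beta):
    """One decomposition step for a violating FD alpha -> beta on `schema`."""
    return alpha | beta, schema - (beta - alpha)


bor_loan = frozenset({"customer_id", "loan_number", "amount"})
fds = [(frozenset({"loan_number"}), frozenset({"amount"}))]
alpha, beta = fds[0]

# alpha -> beta violates BCNF: it is non-trivial and alpha is not a superkey.
violates = not beta <= alpha and closure(alpha, fds) != bor_loan

loan, borrower = bcnf_split(bor_loan, alpha, beta)
print(violates, sorted(loan), sorted(borrower))
```

One thing to try with bcnf_split is an FD whose two sides overlap, such as A → AB on (A, B, C), and to compare the result with what R − β would leave.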

Another example:

It was found earlier that this schema is not in BCNF. The violating FD is B → C.

Apply the BCNF decomposition algorithm!

Is this relation in BCNF?

If no, decompose it!

To do for next time: Rework all the BCNF examples!

----------------------------------------------------------------------------------------------

BCNF and preservation of dependencies

E-R design from Ch.6: a customer can have at most 1 personal banker.

A customer can have more than 1 personal banker, but at most one at any given branch. (?)

A ternary relationship-set is needed:

Implementation:

R = cust_banker_branch = (customer_id, employee_id, branch_name, type)

FDs: FD1: employee_id → branch_name

FD2: (customer_id, branch_name) → (employee_id, type)

Is cust_banker_branch in BCNF?

No.

Apply the decomposition algorithm!

Decomposition: R1 = (employee_id, branch_name)

R2 = (customer_id, employee_id, type)

Problem: FD2 is now “spread” across two relations!

Conclusion: BCNF decomposition is not always dependency preserving

R = cust_banker_branch = (customer_id, employee_id, branch_name, type)



Extra-credit: What if we started BCNF decomposition with FD2 instead of FD1?

Time: 2’

Because it is not always possible to achieve both BCNF and dependency preservation, we consider a weaker NF, known as third normal form (3NF).

Show that cust_banker_branch is in 3NF:

R = cust_banker_branch = (customer_id, employee_id, branch_name, type)
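The 3NF claim for cust_banker_branch can be checked mechanically. The sketch below is one possible approach, not the handout's own method: it finds candidate keys by brute force and then tests each given FD. It assumes that checking the FDs listed in F (rather than all of F+) is enough for this test; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(schema, fds):
    """Brute force: try attribute sets by increasing size, keep minimal superkeys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = frozenset(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == schema:
                keys.append(attrs)
    return keys


def violations(schema, fds, third_nf):
    """FDs in `fds` that violate BCNF (third_nf=False) or 3NF (third_nf=True)."""
    prime = frozenset().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) == schema
        rhs_prime = third_nf and (rhs - lhs) <= prime  # the extra 3NF escape clause
        if not (trivial or superkey or rhs_prime):
            bad.append((lhs, rhs))
    return bad


R = frozenset({"customer_id", "employee_id", "branch_name", "type"})
FD1 = (frozenset({"employee_id"}), frozenset({"branch_name"}))
FD2 = (frozenset({"customer_id", "branch_name"}), frozenset({"employee_id", "type"}))
F = [FD1, FD2]

print(candidate_keys(R, F))
print(violations(R, F, third_nf=False))
print(violations(R, F, third_nf=True))
```

FD1 is reported as a BCNF violation but not as a 3NF violation, because branch_name belongs to the candidate key (customer_id, branch_name).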



Whatever happened to 2NF?

In a nutshell, it forbids non-key attributes from depending on only part of a candidate key.

See Second normal form - Wikipedia, the free encyclopedia for more details.

Another BCNF/3NF example:

books (B-Name, Ed, A-Name, A-SSN, Nr-pag)

A_Name → A_SSN

Is it in

BCNF?

3NF?

To do for next time: Rework all the BCNF & 3NF examples!

-----------------------------------------------------------------------------------------------------

http://en.wikipedia.org/wiki/2NF

Higher NFs

Consider this relation:

classes (course, teacher, book)

A tuple (c, t, b) ∈ classes means that t is qualified to teach c, and b is a required textbook for c.

What are the FDs for this relation?

Is it in BCNF?

Is it in 3NF?

We still have redundancies and insertion anomalies – e.g., if Marilyn is a new teacher who can teach database, two tuples need to be inserted:

(database, Marilyn, DB Concepts)

(database, Marilyn, Ullman)
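The insertion anomaly can be illustrated with a few lines of Python. The existing teacher "Avi" and the helper name tuples_for_new_teacher are hypothetical; only the course, the two books, and Marilyn come from the example above.

```python
# Hypothetical current contents of classes(course, teacher, book).
classes = {
    ("database", "Avi", "DB Concepts"),
    ("database", "Avi", "Ullman"),
}


def tuples_for_new_teacher(rel, course, teacher):
    """One tuple per required book of the course is needed for a new teacher."""
    books = {b for (c, t, b) in rel if c == course}
    return {(course, teacher, b) for b in books}


new_rows = tuples_for_new_teacher(classes, "database", "Marilyn")
print(sorted(new_rows))
```

The number of rows to insert grows with the number of required books, even though the only new fact is "Marilyn can teach database".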


The big picture

7.4 FD Theory

7.4.1 The Closure of a set of FDs

Yes, this is a trivial FD!

Algorithm to compute F+

Although Armstrong’s axioms are sufficient to obtain the closure …

… in practice we want more “tools”

How about these?

• Idempotency: XX = X

• Commutativity: XY = YX

They are true, but it is customary to write all attributes as sets with no repeating values, sorted in alphabetical order.

Important lemma:

if and only if

Proof: Left as individual work for next time. Use the definition of a FD from p.271:

Practice exercise 7.4

Example:

Quiz: Generate 4 more FDs that are in F+

7.4.2 The Closure of a set of attributes (under the set of FDs)

Compare to the inefficient algorithm, based on F+ …

For next time: Read and understand the example on p.281

--------------------------------------------------------------------------------------------------------

Applications of attribute closure:

• Check if a set of attributes is a superkey

• Check if a set of attributes is a candidate key (i.e. superkey + minimal)

• Check if a functional dependency α → β holds (i.e. if it is in F+)

o Find α+ and then check if β ⊆ α+

• Computing the closure F+ of F

o For each set of attributes γ ⊆ R, find the closure γ+, and for each S ⊆ γ+ output a functional dependency γ → S
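The applications listed above can be sketched in Python on top of a single attribute-closure function. The schema R = (A, B, C, G, H, I) with F = {A → B, A → C, CG → H, CG → I, B → H} is an assumed textbook-style example, and all function names are illustrative.

```python
from itertools import chain, combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_superkey(attrs, schema, fds):
    return closure(attrs, fds) == schema


def is_candidate_key(attrs, schema, fds):
    # Superkey, and dropping any single attribute loses the superkey property.
    return is_superkey(attrs, schema, fds) and not any(
        is_superkey(attrs - {a}, schema, fds) for a in attrs)


def fd_holds(lhs, rhs, fds):
    """lhs -> rhs is in F+ exactly when rhs is contained in lhs+."""
    return rhs <= closure(lhs, fds)


def subsets(attrs):
    items = sorted(attrs)
    return (frozenset(c) for c in chain.from_iterable(
        combinations(items, n) for n in range(len(items) + 1)))


def fd_closure(schema, fds):
    """F+ as a set of (lhs, rhs) pairs; exponential, so only for tiny schemas."""
    return {(g, s) for g in subsets(schema) for s in subsets(closure(g, fds))}


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


schema = frozenset("ABCGHI")
F = [fd("A", "B"), fd("A", "C"), fd("CG", "H"), fd("CG", "I"), fd("B", "H")]

print(sorted(closure(frozenset("AG"), F)))
print(is_candidate_key(frozenset("AG"), schema, F))
print(fd_holds(frozenset("A"), frozenset("H"), F))
```

Note that fd_closure enumerates every subset of the schema, while the other three checks each need only one closure computation.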

Attribute closure gives another algorithm to find the FD closure F+! Compare it with the first alg. from fig. 7.8. Which one do you think is more efficient? Explain!

Example:

In general, a FD is of the form α → β, with α and β sets of attributes, e.g. EFG → KL.

Food for thought:

Can α be the empty set? (“nothing” → β)

Can α be the total set? (“everything” → β)

Can β be the empty set? (α → “nothing”)

Can β be the total set? (α → “everything”)

Extraneous attributes

In English:

If we remove the attribute, the closure F+ does not change

Why is this of practical importance?

This part is trivial, so it doesn’t need to be checked (it was included just for symmetry)

Examples:

Given F = {A → C, AB → C}

B is extraneous in AB → C because dropping B leaves A → C, which F already implies; conversely, AB → C can be derived from A → C (How?)

As seen in this example, sometimes removal of extraneous attributes makes an entire FD disappear (b/c it’s a duplicate)

Given F = {A → C, AB → CD}

C is extraneous in AB → CD since AB → C can be derived even after deleting C (How?)
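Both examples can be verified with attribute closure. The following is a minimal sketch, assuming the usual closure-based tests: an attribute on the left side is extraneous if the remaining left side still determines the right side under F, and an attribute on the right side is extraneous if it is still in the closure of the left side after being deleted from that FD. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def extraneous_in_lhs(attr, lhs, rhs, fds):
    """Test under the original F: does (lhs - attr)+ still contain rhs?"""
    return rhs <= closure(lhs - {attr}, fds)


def extraneous_in_rhs(attr, lhs, rhs, fds):
    """Test under F' = F with lhs -> rhs replaced by lhs -> (rhs - attr)."""
    reduced = [f for f in fds if f != (lhs, rhs)] + [(lhs, rhs - {attr})]
    return attr in closure(lhs, reduced)


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


F1 = [fd("A", "C"), fd("AB", "C")]
F2 = [fd("A", "C"), fd("AB", "CD")]

print(extraneous_in_lhs("B", *fd("AB", "C"), F1))   # is B extraneous in AB -> C?
print(extraneous_in_rhs("C", *fd("AB", "CD"), F2))  # is C extraneous in AB -> CD?
print(extraneous_in_rhs("D", *fd("AB", "CD"), F2))  # is D extraneous in AB -> CD?
```

The same two functions can be used to check answers to the exercises that follow.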

Algorithm:

Exercise

Add to the list of applications of attribute closure!

Answer:

-------------------------------------------------------------------------------------

Exercise

Answer:

A+ = {A, B, C, D}, so A+ contains C, so C is extraneous in A → CD

Exercise

Same scenario as above.

Is D extraneous in A → CD?

Exercise

F = {A → B, B → C, A → C}.

Is C extraneous in A → C?

So what do we do about A → C?

For next time: solve all the exercises above, plus the one on p.283!

Why is this of practical importance?

Algorithm:

?

Solve for practice!

Example not from text:

Solution:

Two things must be preserved when we perform decompositions:

Data (tuples)

FDs

Efficient algorithm (uses only attribute closure, not FD closure!)

How much of Ri can we recover, based on the current result?

Example (not in text, but in text slides):

----------------------------------------------------------------------------------

Apply the algorithm above to prove this!

Trivial, don’t need algorithm!

Solution:

Prove that the decomposition R1 = (A, B), R2 = (A, C) is not dependency preserving.

The FD that needs to be recovered is B→C. Apply algorithm:

result = {B}

Consider R1; result ∩ R1 = {B}; {B}+ = {B, C}; {B, C} ∩ R1 = {B}; result ∪ {B} = {B}

Consider R2; result ∩ R2 = Ø; Ø+ = Ø; … result = {B}

No progress, so algorithm stops.

We could not obtain the RHS of B→C, so FD cannot be recovered.
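The trace above can be reproduced with a short Python sketch of the closure-based preservation test. The set F is not restated in the example, so the sketch assumes F = {A → B, B → C} on R = (A, B, C), which is consistent with {B}+ = {BC} in the trace; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def preserved(lhs, rhs, decomposition, fds):
    """Is lhs -> rhs recoverable from the projections of F onto each Ri?"""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            # How much of Ri can we recover, based on the current result?
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return rhs <= result


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


F = [fd("A", "B"), fd("B", "C")]      # assumed F, see lead-in
R1, R2 = frozenset("AB"), frozenset("AC")

print(preserved(*fd("A", "B"), [R1, R2], F))
print(preserved(*fd("B", "C"), [R1, R2], F))
```

Only attribute closures under F are computed here; F+ is never built, which is what makes this test efficient.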

Week 12, Lect 1

7.5 Algorithms for Decomposition

Problem: The definitions of both BCNF and 3NF require F+ → expensive!

FYI there is a sketched proof for this on p.289 (not required for final)

Can you find super-keys?

Intuitively, we can “feel” that AC → BDE … but how to prove it?

Hint: Armstrong’s axioms (and theorems)

So AC is a super-key. But is it a candidate key? (What’s the difference?)

Do you think there are other candidate keys? Why or why not?

Are there any BCNF violations?

Hint: To find BCNF violations, do we need to check F or F+? Why?

Which one do we choose to start the decomposition?

Now write down the two relations resulting from decomposition, including their FDs F1 and F2 and their candidate keys:

SKIP the remainder of Section 7.5, starting with 7.5.1.2. (p.289)

SKIP 7.6, 7.7

Read and take notes: Sections 7.8, 7.9

Homework for Ch.7: 1, 3, 5, 6, 7, 11
