Upload
tobias-fletcher
View
216
Download
0
Tags:
Embed Size (px)
Citation preview
Informal Design Guidelines for Relational Databases
Relational database design: The grouping of attributes to form "good" relation schemas
Two levels of relation schemas:
– The logical "user view" level
– The storage "base relation" level
Design is concerned mainly with base relations
2
Informal Design Guidelines for Relational Databases
Four informal measures of quality for relation schema design:
1. Semantics of the Relation Attributes
2. Reducing the redundant information in tuples
3. Reducing Null values in tuples
4. Disallowing the possibility of one generating spurious tuples.
3
1- Semantics of the Relation Attributes
Each tuple in a relation should represent one entity or
relationship instance
Guideline #1: Design a schema that can be explained easily
relation by relation. The semantics of attributes should be
easy to interpret.
4
2- Redundant Information in Tuples and Update Anomalies
Mixing attributes of multiple entities may cause problems:
– Information is stored redundantly wasting storage
– Problems with update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
Guideline #2: Design a schema that does not suffer from the insertion,
deletion and update anomalies. If there are any present, then note
them so that applications can be made to take them into account5
3- Null Values in Tuples
Reasons for nulls:
a. attribute not applicable or invalid
b. attribute value unknown (may exist)
c. value known to exist, but unavailable
Guideline #3: Relations should be designed such that their tuples will have as few NULL values as possible
Attributes that are NULL frequently could be placed in separate relations (with the primary key)
7
4- Spurious Tuples
Bad designs for a relational database may result in
erroneous results for certain JOIN operations
Guideline #4: The relations should be designed to satisfy
the lossless join condition. No spurious tuples should be
generated by doing join of any relations.
8
Functional Dependencies
Functional dependencies (FDs) are used to specify
formal measures of the "goodness" of relational designs
FDs and keys are used to define normal forms for
relations
FDs are constraints that are derived from the meaning
and interrelationships of the data attributes
9
Examples of FD constraints
Social security number determines employee name
SSN -> ENAME Project number determines project name and location
PNUMBER -> {PNAME, PLOCATION} Employee ssn and project number determines the hours per week that
the employee works on the project
{SSN, PNUMBER} -> HOURS10
Functional Dependencies
Describes the relationship between attributes in a relation.
If A and B are attributes of relation R,
B is functionally dependent on A, denoted by A B, if each value of A is associated with exactly one value of B. B may have several values of A.
Determinant Dependent
A BB is functionallydependent on A
11
Functional Dependencies
Example
StaffNo positionPosition is functionallydependent on Staffno
position StaffNoStaffNo is NOT functionallydependent on position
SL21 Manager
Manager SL21 SG5
1:1 or M:1 relationship
between attributes in a
relation
1:M relationship
between attributes in a
relation
12
Trivial Functional Dependencies
A B is trivial if B A
StaffNo, Sname SName
StaffNo, SName StaffNo
We are not interested in trivial functional dependencies as it provides no genuine integrity constraints on the value held by these attributes.
13
QuestionQuestion
Find FDs of the relation shown below that lists dentist/patient appointment data; known that:
• A patient is given an appointment at a specific time and date with a dentist located at a particular surgery.
• On each day of patient appointments, a dentist is allocated to a specific surgery for that day.
Dentist-patient (staffNo, aDate, aTime, dentistName, patNo, patName, surgeryNo)
14
QuestionQuestion
Dentist-patient (staffNo, aDate, aTime, dentistName, patNo, patName, surgeryNo)
FDs list
FD1: staffNo, aDate, aTime patNo, patName
FD2: staffNo dentistName
FD3: patNo patName, surgeryNo
FD4: staffNo, aDate surgeryNo
FD5: aDate, aTime, patNo dentistName, staffNo15
Introduction to Normalization
Normalization: Process of decomposing unsatisfactory
"bad" relations by breaking up their attributes into smaller
relations
Normal form: Condition using keys and FDs of a relation to
certify whether a relation schema is in a particular normal
form
16
ExamplesFirst Normal Form
EMP_PROJ (Ssn, Ename, {Phone#}) { } Mulitvalue attribute
EMP_PROJ1 (Ssn, Ename)
EMP_PROJ2 (Ssn, Phone#)
EMP_PROJ (Ssn, Ename (Fname, Lname)) ( ) composite attribute
EMP_PROJ (Ssn, Fname,Lname)
EMP_PROJ (Ssn, Ename, {PROJS (Pnamber, Hours)})
EMP_PROJ1 (Ssn, Ename)
EMP_PROJ2 (Ssn, Pnamber, Hours)
18
Second Normal Form
Uses the concepts of FDs, primary key
Definitions:
– Prime attribute - attribute that is member of the
primary key K
– Full functional dependency - a FD Y Z where
removal of any attribute from Y means the FD does
not hold any more
19
Full Functional DependencyFull Functional Dependency
If A and B are attributes of a relation.
B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
B is partial functional dependent on A if some attributes can be removed from A & the dependency still holds.
StaffNo, Sname BranchNo Partial dependency
ClientNo, PropertyNo RentDate Full dependency
20
1NF 2NF1NF 2NF
1. Start with 1NF relation.
2. Find the FDs of a relation.
3. Test the FDs whose determinant attribute is part of the PK.
21
ExamplesSecond Normal Form
{SSN, PNUMBER} HOURS is a full FD since neither
SSN HOURS nor PNUMBER HOURS hold
{SSN, PNUMBER} ENAME is not a full FD (it is called a partial
dependency ) since SSN ENAME also holds
A relation schema R is in second normal form (2NF) if every non-
prime attribute A in R is fully functionally dependent on the
primary key
R can be decomposed into 2NF relations via the process of 2NF
normalization
22
Second Normal Form
Note: The test for 2NF involves testing for functional
dependencies whose left-hand side attributes are part of
the primary key. If the primary key contains a single
attribute, the test need not be applied at all.
24
Third Normal Form
Definition
– Transitive functional dependency – a FD X Y in R is a
transitive dependency if there is a set of attributes Z that are
neither a primary or candidate key and both X Z and Z Y
holds.
Examples:
– SSN DMGRSSN is a transitive FD since
SSN DNUMBER and DNUMBER DMGRSSN hold
– SSN ENAME is non-transitive since there is no set of
attributes X where SSN X and X ENAME
25
3rd Normal Form
A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key
26
BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal Form (BCNF) if
whenever an FD X A holds in R, then X is a superkey of R
– Each normal form is strictly stronger than the previous one:
Every 2NF relation is in 1NF
Every 3NF relation is in 2NF
Every BCNF relation is in 3NF
– There exist relations that are in 3NF but not in BCNF
– The goal is to have each relation in BCNF (or 3NF)
29
BCNF
FDs:
{Student,course} Instructor
Instructor Course
It is in 3NF not in BCNF
Decomposing into 2 schemas
{Student, Instructor} {Instructor, Course}
32
ExamplesBCNF
R ( Client#, Problem, Consultant _name)
R1 (Client#, Consultant _name)
R2 (Consultant _name, Problem)
■ R (Stud#, Class#, Instructor, Grade)
R1 (Stud#, Instructor, Grade)
R2 (Instructor, Class#)33
Example
Consider the following relation for published books:
BOOK (Book_title, Author_name, Book_type, Listprice, Author_affil, Publisher)
- Author_affil referes to the affiliation of the author.
Suppose thefollowing dependencies exist:
Book_title -> Publisher, Book_type
Book_type -> Listprice
Author_name -> Author-affil
(a) What normal form is the relation in? Explain your answer.
(b) Apply normalization until you cannot decompose the relations further. State the reasons
behind each decomposition.
34
Answer
BOOK (Book_title, Authorname, Book_type, Listprice, Author_affil, Publisher)
(a) The key for this relation is (Book_title, Authorname). This relation is in 1NF and not in 2NF as no attributes are Full FD on the key. It is also not in 3NF.
(b) 2NF decomposition:Book0(Book_title, Authorname)Book1(Book_title, Publisher, Book_type, Listprice)Book2(Authorname, Author_affil)
This decomposition eliminates the partial dependencies.3NF decomposition:
Book0(Book_title, Authorname)Book1-1(Book_title, Publisher, Book_type)Book1-2(Book_type, Listprice)Book2(Authorname, Author_affil)
This decomposition eliminates the transitive dependency of Listprice
35
Example
Given the relation schemaCar_Sale (Car#, Salesman#, Date_sold, Commission%, Discount_amt)
with the functional dependenciesDate_sold -> Discount_amtSalesman# -> Commission%Car# -> Date_sold
This relation satisfies 1NF but not 2NF (Car# -> Date_sold and Salesman# -> Commission%)so these two attributes are not Full FD on the primary key and not 3NF
36
To normalize,2NF:Car_Sale1 (Car#, Salesman#)Car_Sale2 (Car#, Date_sold, Discount_amt)Car_Sale3 (Salesman#,Commission%)3NF:Car_Sale1(Car#, Salesman#)Car_Sale2-1(Car#, Date_sold)Car_Sale2-2(Date_sold, Discount_amt)Car_Sale3(Salesman#,Commission%)
Answer
37
Given the following Dentist-patient database schema:
Dentist-patient (staffNo, aDate, aTime, dentistName, patNo, patName, surgeryNo)
Normalize the above relation, showing appropriate dependency diagrams to justify decomposition.
QuestionQuestion
FDs List:
FD1: staffNo, aDate, aTime patNo, patName
FD2: staffNo dentistName
FD3: patNo patName, surgeryNo
FD4: staffNo, aDate surgeryNo
FD5: aDate, aTime, patNo dentistName, staffNo
38
1NF
Dentist-patient (staffNo, aDate, aTime, dentistName, patNo, patName, surgeryNo)
2NF (fd2 and fd4 violates 2NF)
Dentist-patient (staffNo, aDate, aTime, patNo, patName)
Surgery (staffNo, aDate, surgeryNo)
Dentist (staffNo, dentistName)
3NF (Fd3’ violates 3NF)
Dentist-patient (staffNo, aDate, aTime, patNo)
Surgery (staffNo, aDate, surgeryNo)
Dentist (staffNo, dentistName)
Patient (patNo, patName)
Answer
39