TERMINOLOGY & NORMALIZATION DBS201


Equivalent Terms:

Relational Model | Table-Oriented DBMS | Conventional File Systems | Conceptually Represents
Relation         | Table               | File                      | Entity Type
Tuple            | Row                 | Record                    | Entity Instance
Attribute        | Column              | Field                     | Property
Domain           | Column Type         | Data Type                 | Allowable Values
Element          | Column Value        | Field Value               | Property Value

What is a Relation?

• Rows contain data about an entity
• Columns contain data about attributes of the entity
• Cells of the table hold a single value
• All entries in a column are of the same kind
• Each column has a unique name
• The order of the rows and columns is unimportant
• No two rows may be identical
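Some of these properties can be checked mechanically. The following is a minimal sketch, assuming a table is held in memory as a tuple of column names plus a list of row tuples; the function name `is_relation` and the sample Employee data are illustrative only. It does not test that all entries in a column are of the same kind.

```python
def is_relation(column_names, rows):
    """Return True if the table passes the structural checks listed above."""
    # Each column has a unique name.
    if len(set(column_names)) != len(column_names):
        return False
    for row in rows:
        # Every row supplies exactly one entry per column.
        if len(row) != len(column_names):
            return False
        # Cells hold a single value, not a nested collection.
        if any(isinstance(cell, (list, tuple, set, dict)) for cell in row):
            return False
    # No two rows may be identical.
    return len(set(rows)) == len(rows)


columns = ("Emp_No", "Name", "Department")  # illustrative column names
employees = [(101, "Ann", "Sales"), (102, "Raj", "Sales")]  # made-up rows

valid = is_relation(columns, employees)
duplicated = is_relation(columns, employees + [(101, "Ann", "Sales")])
multi_valued = is_relation(columns, [(103, "Lee", ["Sales", "IT"])])
```

Here `valid` is true, while the duplicated row and the multi-valued cell each cause the check to fail.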

Types of Keys?

• A key is one or more columns of a relation that is used to identify a row

• A key can be unique or nonunique
• In an Employee relation: Emp_No (unique) vs Department (nonunique)
• Composite Key
• Primary Key
• Candidate Key
• Surrogate Key
• Foreign Key

Keys

• A key that contains two or more attributes
• Is a composite key
• Keys that uniquely identify each row in a relation
• Are candidate keys
• The candidate key that is chosen as the key that will actually be used by the DBMS to uniquely identify each row in a relation
• Is a primary key

Keys

• An attribute that is a key of one or more relations other than the one in which it appears

• Is a foreign key

• Foreign keys are one or more fields in a dependent file that reference the primary key in a parent file
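The parent/dependent relationship can be illustrated with a small referential check. This is only a sketch: the DEPARTMENT and EMPLOYEE rows, the `DeptNo` column, and the helper `orphan_rows` are hypothetical and not part of the course material.

```python
def orphan_rows(child_rows, fk_column, parent_rows, pk_column):
    """Return child rows whose foreign key matches no parent primary key."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return [row for row in child_rows if row[fk_column] not in parent_keys]


# Parent file: DeptNo is the primary key (hypothetical data).
departments = [{"DeptNo": 10, "DeptName": "Sales"}, {"DeptNo": 20, "DeptName": "IT"}]
# Dependent file: DeptNo is a foreign key referencing the parent.
employees = [
    {"Emp_No": 101, "DeptNo": 10},
    {"Emp_No": 102, "DeptNo": 30},  # 30 does not exist in the parent
]

orphans = orphan_rows(employees, "DeptNo", departments, "DeptNo")
```

A DBMS that enforces foreign keys would reject the second employee row; here it simply shows up in `orphans`.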

Surrogate Keys

• A column with a unique, DBMS assigned identifier that has been added to a table to be the primary key

• The unique values of the surrogate key are assigned by the DBMS each time a row is added and the values never change

• PROPERTY(Street,City,Prov,Pcode,OwnerID)

• PROPERTY(PropertyID,Street,City,Prov,Pcode,OwnerID)

• Surrogate keys are short, numeric and never change
• Ideal as a primary key
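The idea of a DBMS-assigned identifier can be imitated in a few lines. This is a sketch under stated assumptions: the class `PropertyTable` and the sample addresses are invented for illustration, and a real DBMS would use its own identity or sequence mechanism.

```python
from itertools import count


class PropertyTable:
    """Toy PROPERTY table that assigns PropertyID itself on every insert."""

    def __init__(self):
        self._next_id = count(1)  # 1, 2, 3, ... never reused
        self.rows = {}            # PropertyID -> remaining attributes

    def insert(self, street, city, prov, pcode, owner_id):
        property_id = next(self._next_id)
        self.rows[property_id] = (street, city, prov, pcode, owner_id)
        return property_id


table = PropertyTable()
first = table.insert("1 Main St", "Toronto", "ON", "M1M 1M1", 7)
# Same address entered again: the surrogate key still tells the rows apart.
second = table.insert("1 Main St", "Toronto", "ON", "M1M 1M1", 7)
```

The caller never supplies `PropertyID`, which mirrors how the surrogate key is added to PROPERTY rather than derived from the address columns.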

Primary Key?
Example of Hockey Awards

Award         | PlayerName | PNumber | Position      | Team     | Year
Best Defense  | Joe Wall   | 17      | Left Defense  | Toronto  | 1999
              | Sy Stopp   | 7       | Right Defense | Detroit  | 2000
              | Pete Puck  | 22      | Left Defense  | Montreal | 2001
              | Joe Wall   | 17      | Left Wing     | Toronto  | 2002
Most Valuable | Sam Scores | 18      | Center        | Chicago  | 1999
              | Wayne Gret | 99      | Center        | New York | 2000
              | Joe Wall   | 17      | Left Wing     | Toronto  | 2002

Primary Key? Fill in all attributes
Example of Hockey Awards

Award         | PlayerName | PNumber | Position      | Team     | Year
Best Defense  | Joe Wall   | 17      | Left Defense  | Toronto  | 1999
Best Defense  | Sy Stopp   | 7       | Right Defense | Detroit  | 2000
Best Defense  | Pete Puck  | 22      | Left Defense  | Montreal | 2001
Best Defense  | Joe Wall   | 17      | Left Wing     | Toronto  | 2002
Most Valuable | Sam Scores | 18      | Center        | Chicago  | 1999
Most Valuable | Wayne Gret | 99      | Center        | New York | 2000
Most Valuable | Joe Wall   | 17      | Left Wing     | Toronto  | 2002

Step 2: Look for any column that has no value that is used in more than one row. We are looking for a column whose value is UNIQUE. We first check whether a single-attribute primary key is possible.

No one column meets this criterion.

Step 3: Look for any pairs of columns which when concatenated produce a unique value.

Pairs to check:

Team + Year
Position + Team
Position + Year
PNumber + Position
PNumber + Team
PNumber + Year
PlayerName + PNumber
PlayerName + Position
PlayerName + Team
PlayerName + Year
Award + PlayerName
Award + PNumber
Award + Position
Award + Team
Award + Year

Concatenated Primary Key

• So Award + Year will be selected as the PK.
• and the relation will be:

• (Award, Year, PlayerName, PNumber, Position, Team)
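The search in Steps 2 and 3 can be sketched as a brute-force test over the sample rows. The helper name `unique_column_sets` is an assumption made for this example. Note that uniqueness in sample data only suggests a key; whether Award + Year is unique in general depends on the business rule that each award is given once per year.

```python
from itertools import combinations

COLUMNS = ("Award", "PlayerName", "PNumber", "Position", "Team", "Year")
ROWS = [
    ("Best Defense", "Joe Wall", 17, "Left Defense", "Toronto", 1999),
    ("Best Defense", "Sy Stopp", 7, "Right Defense", "Detroit", 2000),
    ("Best Defense", "Pete Puck", 22, "Left Defense", "Montreal", 2001),
    ("Best Defense", "Joe Wall", 17, "Left Wing", "Toronto", 2002),
    ("Most Valuable", "Sam Scores", 18, "Center", "Chicago", 1999),
    ("Most Valuable", "Wayne Gret", 99, "Center", "New York", 2000),
    ("Most Valuable", "Joe Wall", 17, "Left Wing", "Toronto", 2002),
]


def unique_column_sets(columns, rows, size):
    """Return every combination of `size` columns whose values never repeat."""
    found = []
    for combo in combinations(range(len(columns)), size):
        projected = {tuple(row[i] for i in combo) for row in rows}
        if len(projected) == len(rows):  # no two rows share these values
            found.append(tuple(columns[i] for i in combo))
    return found


single_columns = unique_column_sets(COLUMNS, ROWS, 1)  # Step 2
column_pairs = unique_column_sets(COLUMNS, ROWS, 2)    # Step 3
print(single_columns)  # []
print(column_pairs)    # [('Award', 'Year')]
```

No single column qualifies, and Award + Year is the only pair that is unique across the seven rows.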

NORMALIZATION REVIEW

SubjectCode | Section | InstNo | InstName      | SubjectName | StudentNo | StudentName
DBS201      | A       | 122    | Russ Pangborn | Intro to DB | 111111111 | Terry Adams
            |         |        |               |             | 222222222 | Jack Chan
DBS201      | B       | 323    | Bill Gates    | Intro to DB | 121212121 | Frank Brown
            |         |        |               |             | 323233232 | Mary Wong
RPG544      | A       | 122    | Russ Pangborn | RPGIV       | 444444444 | Wendy Clark
            |         |        |               |             | 143211222 | Peter Lind

• Write CLASSLIST in UNF

• CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName, {StudentNumber, StudentName} ]

• A relation is in 1st normal form when the primary key determines a single value of each attribute for all attributes in the relation (i.e. the relation contains no repeating groups)

• Two ways to get to 1NF – how did we do it last week?

CLASSLIST

SubjectCode | Section | InstNo | InstName      | SubjectName | StudentNo | StudentName
DBS201      | A       | 122    | Russ Pangborn | Intro to DB | 111111111 | Terry Adams
DBS201      | A       | 122    | Russ Pangborn | Intro to DB | 222222222 | Jack Chan
DBS201      | B       | 323    | Bill Gates    | Intro to DB | 121212121 | Frank Brown
DBS201      | B       | 323    | Bill Gates    | Intro to DB | 323233232 | Mary Wong
RPG544      | A       | 122    | Russ Pangborn | RPGIV       | 444444444 | Wendy Clark
RPG544      | A       | 122    | Russ Pangborn | RPGIV       | 143211222 | Peter Lind

• Add to the key of the unnormalized relation to ensure the primary key identifies 1 and only 1 value of each attribute in the relation

• CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName, StudentNumber, StudentName ]

• CLASSLIST [ SubjectCode, SectionCode, StudentNumber, InstructorNo, InstructorName, SubjectName, StudentName ]
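This first way of reaching 1NF amounts to repeating the non-repeating attributes once per student. A minimal sketch follows, assuming the unnormalized class list is held as dictionaries with a nested `Students` list; that representation and the function name `flatten` are choices made for this example.

```python
# Two of the class sections from the CLASSLIST example, in unnormalized form.
unf = [
    {"SubjectCode": "DBS201", "SectionCode": "A", "InstructorNo": 122,
     "InstructorName": "Russ Pangborn", "SubjectName": "Intro to DB",
     "Students": [("111111111", "Terry Adams"), ("222222222", "Jack Chan")]},
    {"SubjectCode": "DBS201", "SectionCode": "B", "InstructorNo": 323,
     "InstructorName": "Bill Gates", "SubjectName": "Intro to DB",
     "Students": [("121212121", "Frank Brown"), ("323233232", "Mary Wong")]},
]


def flatten(unf_rows):
    """Emit one flat row per member of the repeating group."""
    flat = []
    for row in unf_rows:
        fixed = {k: v for k, v in row.items() if k != "Students"}
        for number, name in row["Students"]:
            flat.append({**fixed, "StudentNumber": number, "StudentName": name})
    return flat


first_nf = flatten(unf)
# The extended key must now identify each row exactly once.
keys = {(r["SubjectCode"], r["SectionCode"], r["StudentNumber"]) for r in first_nf}
```

Each flat row carries a single value for every attribute, and `keys` has as many entries as there are rows.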

CLASSLIST

Method 2

• Restate the original un-normalized relation without the repeating group

• CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName ]

• Create a new relation consisting of key of original relation and attributes within repeating group and add to key to ensure uniqueness

• CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber, StudentName ]
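Method 2 can be sketched as splitting each unnormalized row into one CLASSLIST row plus one CLASSLISTSTUDENT row per student. The nested `Students` list and the function name `split_repeating_group` are assumptions of this example, not notation from the slides.

```python
unf = [
    {"SubjectCode": "DBS201", "SectionCode": "A", "InstructorNo": 122,
     "InstructorName": "Russ Pangborn", "SubjectName": "Intro to DB",
     "Students": [("111111111", "Terry Adams"), ("222222222", "Jack Chan")]},
    {"SubjectCode": "RPG544", "SectionCode": "A", "InstructorNo": 122,
     "InstructorName": "Russ Pangborn", "SubjectName": "RPGIV",
     "Students": [("444444444", "Wendy Clark"), ("143211222", "Peter Lind")]},
]


def split_repeating_group(unf_rows):
    """Return (CLASSLIST, CLASSLISTSTUDENT) as lists of tuples."""
    classlist, classliststudent = [], []
    for row in unf_rows:
        key = (row["SubjectCode"], row["SectionCode"])
        # Original relation restated without the repeating group.
        classlist.append(
            key + (row["InstructorNo"], row["InstructorName"], row["SubjectName"])
        )
        # New relation: original key plus the repeating-group attributes.
        for number, name in row["Students"]:
            classliststudent.append(key + (number, name))
    return classlist, classliststudent


classlist, classliststudent = split_repeating_group(unf)
```

Compared with repeating every attribute on each student row, this keeps the instructor and subject data once per section.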

2nd Normal Form

• A 1NF relation is in 2NF when the entire primary key is needed to determine the value of each non-key attribute (i.e. relation has no partial dependencies – attributes whose values can be determined by knowing only part of the key)

• 1NF Relations: CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName ]

• contains the partial dependency SubjectCode -> SubjectName

• CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber, StudentName ]

• contains the partial dependency StudentNumber-> StudentName

1st Normal Form -> 2nd Normal Form

• Create new relation(s) consisting of part of the primary key and all attributes whose values are determined by this part of the primary key:

• SUBJECT [SubjectCode, SubjectName ]

• STUDENT [StudentNumber, StudentName ]

• Restate original relation(s) without partially dependent attributes:

• CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber ]

• CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName ]
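A partial dependency can be tested on sample rows, and the decomposition is then a pair of projections. This is a sketch only: `determines` and `project` are helper names invented here, and a dependency that holds in three sample rows is evidence, not proof, that it holds in general.

```python
def determines(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, columns):
    """Project rows onto the given columns and drop duplicate tuples."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


COLUMNS = ("SubjectCode", "SectionCode", "InstructorNo", "InstructorName", "SubjectName")
classlist = [dict(zip(COLUMNS, r)) for r in [
    ("DBS201", "A", 122, "Russ Pangborn", "Intro to DB"),
    ("DBS201", "B", 323, "Bill Gates", "Intro to DB"),
    ("RPG544", "A", 122, "Russ Pangborn", "RPGIV"),
]]

# SubjectName depends on only part of the key (SubjectCode) ...
partial = determines(classlist, ["SubjectCode"], ["SubjectName"])
# ... whereas InstructorNo needs the whole key.
needs_full_key = not determines(classlist, ["SubjectCode"], ["InstructorNo"])

subject = project(classlist, ["SubjectCode", "SubjectName"])
classlist_2nf = project(
    classlist, ["SubjectCode", "SectionCode", "InstructorNo", "InstructorName"]
)
print(subject)  # [('DBS201', 'Intro to DB'), ('RPG544', 'RPGIV')]
```

The SUBJECT projection stores each subject name once, which is the point of removing the partial dependency.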

2NF

3rd Normal Form

• A 2NF relation is in 3NF when the primary key and nothing but the primary key can be used to determine the value of each non-key attribute (i.e. relation has no transitive dependencies – attributes whose values can be determined by knowing something other than the key)

2NF -> 3NF

• 2NF Relations:
• CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber ]
• CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName ]
• SUBJECT [ SubjectCode, SubjectName ]
• STUDENT [ StudentNumber, StudentName ]

• In CLASSLIST the InstructorName is determined by InstructorNo, so create the new relation:
• INSTRUCTOR [ InstructorNo, InstructorName ]
• Remove the transitive dependency
• CLASSLIST [ SubjectCode, SectionCode, InstructorNo ]
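The transitive dependency can be seen in the sample rows: InstructorNo determines InstructorName even though InstructorNo is not a key of CLASSLIST. The sketch below assumes the helper names `determines`, `project` and `is_key`, which are illustrative and not part of the slides.

```python
def determines(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, columns):
    """Project rows onto the given columns and drop duplicate tuples."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


def is_key(rows, columns):
    """True if the columns take a different combination of values in every row."""
    return len(project(rows, columns)) == len(rows)


COLUMNS = ("SubjectCode", "SectionCode", "InstructorNo", "InstructorName")
classlist = [dict(zip(COLUMNS, r)) for r in [
    ("DBS201", "A", 122, "Russ Pangborn"),
    ("DBS201", "B", 323, "Bill Gates"),
    ("RPG544", "A", 122, "Russ Pangborn"),
]]

# A non-key attribute determines another non-key attribute.
transitive = (
    determines(classlist, ["InstructorNo"], ["InstructorName"])
    and not is_key(classlist, ["InstructorNo"])
)

instructor = project(classlist, ["InstructorNo", "InstructorName"])
classlist_3nf = project(classlist, ["SubjectCode", "SectionCode", "InstructorNo"])
print(instructor)  # [(122, 'Russ Pangborn'), (323, 'Bill Gates')]
```

After the split, an instructor's name is stored once in INSTRUCTOR instead of once per section taught.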

Resulting 3NF Relations for ClassList Userview

• Set of 3NF Relations for the Class List Userview:

CLASSLIST [ SubjectCode, SectionCode, InstructorNo ]

CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber ]

SUBJECT [SubjectCode, SubjectName ]

STUDENT [StudentNumber, StudentName ]

INSTRUCTOR [InstructorNo, InstructorName ]

• 1 unnormalized user view will always result in 1 or more relations in 1NF
• Each 1NF relation will result in 1 or more 2NF relations
• Each 2NF relation will result in 1 or more 3NF relations
• You can never lose (i.e. not include) an attribute; it must always be found in one of the relations at each step
• You can never lose a relation
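One way to convince yourself that no attribute was lost is to join the five 3NF relations back together and compare the result with the 1NF class list. The sketch below is an illustration under assumptions: `table` and `natural_join` are helper names made up here, and `natural_join` expects non-empty inputs.

```python
def table(columns, *rows):
    """Build a list of dict rows from column names and value tuples."""
    return [dict(zip(columns, row)) for row in rows]


def natural_join(left, right):
    """Join two non-empty tables on every column name they share."""
    shared = set(left[0]) & set(right[0])
    return [{**l, **r} for l in left for r in right
            if all(l[c] == r[c] for c in shared)]


classlist = table(("SubjectCode", "SectionCode", "InstructorNo"),
                  ("DBS201", "A", 122), ("DBS201", "B", 323), ("RPG544", "A", 122))
classliststudent = table(("SubjectCode", "SectionCode", "StudentNumber"),
                         ("DBS201", "A", "111111111"), ("DBS201", "A", "222222222"),
                         ("DBS201", "B", "121212121"), ("DBS201", "B", "323233232"),
                         ("RPG544", "A", "444444444"), ("RPG544", "A", "143211222"))
subject = table(("SubjectCode", "SubjectName"),
                ("DBS201", "Intro to DB"), ("RPG544", "RPGIV"))
student = table(("StudentNumber", "StudentName"),
                ("111111111", "Terry Adams"), ("222222222", "Jack Chan"),
                ("121212121", "Frank Brown"), ("323233232", "Mary Wong"),
                ("444444444", "Wendy Clark"), ("143211222", "Peter Lind"))
instructor = table(("InstructorNo", "InstructorName"),
                   (122, "Russ Pangborn"), (323, "Bill Gates"))

rebuilt = classliststudent
for relation in (classlist, subject, student, instructor):
    rebuilt = natural_join(rebuilt, relation)

attributes = set(rebuilt[0])  # all seven original attributes should reappear
```

The rebuilt table has the same six rows and seven attributes as the 1NF CLASSLIST, so the decomposition dropped nothing.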

Normalize Remaining User views

• The normalization process is then applied to each remaining user view (e.g. grade sheet, timetable request, …)

• A set of 3NF relations is produced for each user view

• The 3NF relations from each user view are then integrated to form one complete set of relations for the application

Writing a relation from a verbal or written description

• Write the DBDL for the following description:
• Each dentist’s office has a unique identifier for insurance companies. There is a mailing address for the office as well as the name of the head dentist. There are many patients and each patient has a unique identifier number.

• 1. List Attributes
• OfficeNo, MailAddress, HeadDentist, PatientNo, PatientName


• 2. Select Primary Key (unique identifier for each row)

• OfficeNo, MailAddress, HeadDentist, PatientNo, PatientName
• 3. Show multi-valued dependencies
• OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName)
• Give the table a name
• DentistsOffice [ OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName) ]
• We call this 0NF or UNF (Unnormalized Form) because there is a multi-valued dependency

Change from UNF to 1NF

• DentistsOffice [OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName) ]

• Select the Primary Key for the multi-valued dependency.

• Create a two-part primary key by concatenating the original PK with the PK of the multi-valued dependency

• DentistsOffice [ OfficeNo, PatientNo, MailAddress, HeadDentist, PatientName ]

Change from 1NF to 2NF

• DentistsOffice [ OfficeNo, PatientNo, MailAddress, HeadDentist, PatientName ]

• Look for partial dependencies
• MailAddress is dependent on OfficeNo
• PatientName is dependent on PatientNo

• OfficePatient [ OfficeNo, PatientNo ]
• DentistsOffice [ OfficeNo, MailAddress, HeadDentist ]
• Patient [ PatientNo, PatientName ]
• Already in 3NF

Purchases at Shoppers Drug Mart-1111 Young Street Toronto are identified by a unique purchase # and a date on the bill. There can be several items and the purchase must record the item #, the quantity, the unit price, a tax code for each item, and the total price.

• UNF
• [ Purchase#, date, (item#, quantity, unit_price, tax code) ]
• 1NF
• Item# is the Primary Key of the multi-valued dependency.
• [ Purchase#, item#, date, quantity, unit_price, tax code ]


• 2NF
• PurchaseItem [ Purchase#, item#, quantity ]
• Purchase [ Purchase#, date ]
• Item [ item#, unit_price, tax code ]

ACME Taxi Company has several vehicles. The company is very interested in safety and they have their own mechanics. They keep track of the tire usage on each vehicle. The mechanics try to ensure that tires are rotated so as to get the full usage while achieving maximum safety. Using the following TIRE USAGE HISTORY REPORT , you are asked to produce the UNF, 1NF & 2NF tables to support this user view

ACME VEHICLE | MAKE  | YEAR | TIRE NUM | TIRE MANUF. CO# | MANUF. COMPANY | TIRE SIZE | KM THIS VEH.
11           | FORD  | 2007 | 1327     | 119             | Goodyear       | 15R7      | 43000
             |       |      | 1328     | 119             | Goodyear       | 15R7      | 43000
             |       |      | 1329     | 119             | Goodyear       | 15R7      | 25000
             |       |      | 1330     | 119             | Goodyear       | 15R7      | 24000
             |       |      | 1800     | 099             | Komoto         | 12R6      | 1000
             |       |      | 2013     | 121             | BF Goodrich    | 15R7      | 18000
             |       |      | 2014     | 121             | BF Goodrich    | 15R7      | 18000
15           | LEXUS | 2008 | 2013     | 121             | BF Goodrich    | 15R7      | 29000
             |       |      | 2014     | 121             | BF Goodrich    | 15R7      | 29000
etc.

Acme Taxi Company

• WRITING THE RELATION:
• UNF
• VEHICLE [ Vehicle#, Make, Year, (Tire#, Manuf#, ManufName, Size, KM/This_Veh) ]
• 1NF
• VEHICLE [ Vehicle#, Tire#, Make, Year, Manuf#, ManufName, Size, KM/This_Veh ]

Acme Taxi Company

• 2NF
• VEHICLE [ Vehicle#, Make, Year ]
• TIRE [ Tire#, Manuf#, ManufName, Size ]
• VEHICLE-TIRE [ Vehicle#, Tire#, KM/This_Veh ]

• 3NF ?
• TIREMANUF [ Manuf#, ManufName ]
• TIRE [ Tire#, Manuf#, Size ]
• VEHICLE [ Vehicle#, Make, Year ]
• VEHICLE-TIRE [ Vehicle#, Tire#, KM/This_Veh ]
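The dependencies behind this decomposition can be checked against a subset of the report rows. The following is a sketch: the helper `determines` and the short column names are assumptions of this example, and passing on sample data does not prove a dependency holds in general.

```python
def determines(rows, lhs, rhs):
    """True if equal lhs values always come with equal rhs values in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


COLUMNS = ("Vehicle", "Make", "Year", "Tire", "Manuf", "ManufName", "Size", "KM")
report = [dict(zip(COLUMNS, r)) for r in [
    (11, "FORD", 2007, 1327, "119", "Goodyear", "15R7", 43000),
    (11, "FORD", 2007, 1800, "099", "Komoto", "12R6", 1000),
    (11, "FORD", 2007, 2013, "121", "BF Goodrich", "15R7", 18000),
    (15, "LEXUS", 2008, 2013, "121", "BF Goodrich", "15R7", 29000),
    (15, "LEXUS", 2008, 2014, "121", "BF Goodrich", "15R7", 29000),
]]

# Partial dependencies on one part of the key (Vehicle#, Tire#):
vehicle_part = determines(report, ["Vehicle"], ["Make", "Year"])
tire_part = determines(report, ["Tire"], ["Manuf", "ManufName", "Size"])
# KM/This_Veh needs the whole key: tire 2013 has different KM on each vehicle.
km_needs_both = (
    not determines(report, ["Tire"], ["KM"])
    and determines(report, ["Vehicle", "Tire"], ["KM"])
)
# Transitive dependency that motivates TIREMANUF in 3NF:
manuf_name = determines(report, ["Manuf"], ["ManufName"])
```

All four checks come out true for these rows, which matches the VEHICLE, TIRE, VEHICLE-TIRE and TIREMANUF relations listed above.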

GTA Landscaping Inc.

• GTA Landscaping Description
• GTA Landscaping Invoice
• GTA Landscaping ERD
• Identify the many to many Service relationship