
R. Ching, Ph.D. • MIS • California State University, Sacramento

Week 10
October 31

• Extended ERD
• Data Normalization

Problems with ER modeling

• Fan traps - Pathway between two entities is ambiguous
• Chasm traps - Pathway does not exist between certain entity occurrences
• Inheritance - An entity receives its attributes from a class of attributes

Extended Entity Relationship (EER) modeling

Connection Trap: Fan Trap

Merchandise Lines (Merchandise_line, Description)
Product Categories (Product_category, Merchandise_line)
Products (Product_code, Description, Merchandise_line)

Relationships: Have, Classify

What products belong to which product categories?
Which products have restricted use aboard a plane?

Connection Trap: Fan Trap

Merchandise Lines (Merchandise_line, Description)
Product Categories (Product_category, Merchandise_line)
Products (Product_code, Description, Merchandise_line)

Relationships: Have, Classify

What products belong to which product categories?
Which products have restricted use aboard a plane?

To satisfy these queries, we need to form a relationship
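The ambiguity in the fan trap can be illustrated with a small sketch. Only the attribute names come from the slide; the category names, the "Audio" merchandise line, and the helper `categories_for` are hypothetical. Because Products and Product Categories are linked only through Merchandise_line, tracing a product to its category returns every category in the line.

```python
# Hypothetical sample rows; only the attribute names come from the slide.
product_categories = [          # (Product_category, Merchandise_line)
    ("Receivers", "Audio"),
    ("CD players", "Audio"),
]
products = [                    # (Product_code, Description, Merchandise_line)
    ("SAGX730", "Pioneer Remote A/V Receiver", "Audio"),
    ("CDPC725", "Sony Disc-Jockey CD Changer", "Audio"),
]

def categories_for(product_code):
    """Follow Products -> Merchandise Lines -> Product Categories."""
    lines = {line for code, _, line in products if code == product_code}
    return sorted(cat for cat, line in product_categories if line in lines)

# The receiver comes back with both categories: the pathway is ambiguous.
candidates = categories_for("SAGX730")
print(candidates)
```

A direct relationship between Products and Product Categories would remove the ambiguity, which is the point the slide makes.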

Connection Trap: Chasm Trap

Merchandise Lines (Merchandise_line, Description, UL_listing)
Product Categories (Product_category, Merchandise_line)
Products (Product_code, Description, Product_category)

Relationships: Have, Classify

What products belong to the same merchandise line?
Which products require a UL listing?

To satisfy these queries, we need to form a relationship

Known: What merchandise lines are composed of what products
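A chasm trap can be sketched the same way. The sample values and the function `merchandise_line_of` are hypothetical, and the sketch assumes that a product may exist without a product category. For such a product the pathway to its merchandise line (and so to UL_listing) does not exist.

```python
# Hypothetical sample data; attribute names follow the slide.
product_categories = {"Receivers": "Audio"}      # Product_category -> Merchandise_line
products = {
    "SAGX730": "Receivers",                      # Product_code -> Product_category
    "AT10": None,                                # assumed: not yet classified
}

def merchandise_line_of(product_code):
    """Follow Products -> Product Categories -> Merchandise Lines."""
    category = products[product_code]
    if category is None:
        return None                              # the chasm: no pathway exists
    return product_categories.get(category)

reachable = merchandise_line_of("SAGX730")
unreachable = merchandise_line_of("AT10")
print(reachable, unreachable)
```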

EER modeling
Superclass and Subclass Entity Types

• Superclass - Higher order of classification or categorization
• Subclass - A member of a superclass that provides specification

[Diagram: class hierarchy. Superclasses: Electronic Merchandise, Music and Videos. Under Electronic Merchandise: Audio, Visual. Under Audio: CD, Receiver, Cassette, carrying the attributes of CD, attributes of receivers, and attributes of cassette decks.]

EER modeling
Superclass and Subclass Entity Types

• Specialization - top-down
  – Maximizing differences between members by identifying distinguishing characteristics
• Generalization - bottom-up

[Diagram: hierarchy running from General to Specific. Electronic Merchandise; Audio, Visual; CD, Receiver, Cassette (attributes of CD, attributes of receivers, attributes of cassette decks).]

EER modeling
Superclass and Subclass Entity Types

• Specialization - top-down
• Generalization - bottom-up
  – Minimizing differences between entities by identifying common features

[Diagram: hierarchy running from General to Specific. Electronic Merchandise; Audio, Visual; CD, Receiver, Cassette (attributes of CD, attributes of receivers, attributes of cassette decks).]

EER modeling
Attribute Inheritance

[Diagram: hierarchy running from General to Specific. Electronic Merchandise; Audio, Visual; CD, Receiver, Cassette (attributes of CD, attributes of receivers, attributes of cassette decks). Example instance: CDP-325 (Sony CD Changer).]

Attributes common to all audio merchandise are inherited
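Attribute inheritance maps naturally onto class inheritance. The sketch below is only an illustration: the class names mirror the hierarchy in the diagram, while the attributes `channels` and `db_range` and their values are assumptions made for the example.

```python
from dataclasses import dataclass, fields

@dataclass
class ElectronicMerchandise:          # superclass
    product_code: str
    description: str

@dataclass
class Audio(ElectronicMerchandise):   # subclass of Electronic Merchandise
    channels: int                     # hypothetical attribute common to audio

@dataclass
class CD(Audio):                      # subclass of Audio
    db_range: int                     # assumed CD-specific attribute

cdp325 = CD(product_code="CDP-325", description="Sony CD Changer",
            channels=2, db_range=96)

# The CD entity carries its own attribute plus everything above it.
inherited = [f.name for f in fields(cdp325)]
print(inherited)
```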

EER Diagram

[Diagram: superclass Products (Product_code, Prod_descip) with subclasses CD, Receiver, and Cassette joined through a disjoint constraint (d). Subclass attributes shown: Db range, Flutter, Watts. Manufacturers is connected through the relationships Sells and Produces, with cardinalities 1 and M. Callouts: Disjoint constraint, Superclass/subclass.]

Constraints

• Disjoint (d, o)
  – Entity can be a member of only one of the subclasses of specialization
  – Under non-disjoint, an entity can be a member of more than one subclass of specialization
• Participation (partial or total)
  – Total - every entity in the superclass must be a member of a subclass in specialization
  – Partial - An entity need not belong to any of the subclasses of specialization
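The two constraints can be read as simple checks on subclass membership. This is a minimal sketch, not part of the original slides: the function `check_specialization` and the entities "COMBO-1" and "MISC-9" are hypothetical.

```python
def check_specialization(memberships, disjoint, total):
    """memberships maps each superclass entity to the set of subclasses it belongs to."""
    violations = []
    for entity, subclasses in memberships.items():
        if disjoint and len(subclasses) > 1:
            violations.append((entity, "disjoint"))   # member of more than one subclass
        if total and len(subclasses) == 0:
            violations.append((entity, "total"))      # member of no subclass
    return violations

memberships = {
    "CDP-325": {"CD"},
    "COMBO-1": {"CD", "Cassette"},   # hypothetical CD/cassette combination unit
    "MISC-9": set(),                 # hypothetical product in no subclass
}

strict = check_specialization(memberships, disjoint=True, total=True)
relaxed = check_specialization(memberships, disjoint=False, total=False)
print(strict, relaxed)
```

Under disjoint and total participation two entities violate the model; under non-disjoint and partial participation the same data is acceptable.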

Data Normalization

The process of decomposing complex data structures into simple relations according to a set of dependency rules.

McFadden and Hoffer

Data Normalization

• The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise.
  – Achieve a design that is highly flexible
  – Reduce redundancy
  – Ensure that the design is free of certain update, insertion and deletion anomalies

Catherine Richardo, 1990

Normalization

1NF → 2NF → 3NF → BCNF → 4NF

Progressively putting the relation into a higher normal form

Stereos To Go
Invoice

Order No.:       10001
Date:            6/15/99
Account No.:     0000-000-0000-0
Customer:        John Smith
Address:         2036-26 Street
City/State/Zip:  Sacramento, CA 95819
Date Shipped:    6/18/99

Item Number  Product Code  Product Description/Manufacturer  Qty   Price
1            SAGX730       Pioneer Remote A/V Receiver       1     569.95
2            AT10          Cerwin Vega Loudspeakers          1     359.95
3            CDPC725       Sony Disc-Jockey CD Changer       1     399.95
4
5

Subtotal               1329.85
Shipping & Handling     100.00
Sales Tax               103.06
Total                  1532.91

[Image: Stereos To Go account card. 0000 000 0000 0, John Smith, 1/05, "Go, Hogs"]

File-Based System

[Diagram: three separate programs, each with its own file and its own output.
 Invoice Program: Customer Orders file, produces Invoices
 Customer Mailings Program: Customer Mailing List file, produces Mailing List
 Customer Account Program: Customer Accounts file, produces Account Report]

Data Redundancy

• Customer Order File
  – PO number
  – Customer account number
  – Customer name, address, city, state, zip code
  – Order date
  – Product code, product description, price, unit
• Customer Account File
  – Account Number
  – Customer name, mailing address, city, state, zip code
• Customer Mailing List File
  – Customer name, mailing address, city, state, zip code

Unnormalized Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,
 Item1, Item1_descrip, Item1_qty, Item1_price,
 Item2, Item2_descrip, Item2_qty, Item2_price,
 . . . ,
 Item7, Item7_descrip, Item7_qty, Item7_price)   [key: Invoice_number]

How would a program process the data to recreate the invoice?

First Normal Form (1NF)

• A relation is in first normal form if and only if every attribute is single-valued for each tuple.
  – Remove all repeating groups
  – Create a flat file

Unnormalized to 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,
 Item1, Item1_descrip, Item1_qty, Item1_price,
 Item2, Item2_descrip, Item2_qty, Item2_price,
 . . . ,
 Item7, Item7_descrip, Item7_qty, Item7_price)   [key: Invoice_number]

Repeating groups: Item1 through Item7

A flat file places all the data of a transaction into a single record.

This is reminiscent of a COBOL or BASIC program processing a single transaction with one read statement.

Unnormalized to 1NF

• Eliminate the repeating groups.
• Each row retains data for one item.
• If a person bought 5 items, we would have five tuples

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price)

Nominated group of attributes to serve as the key (form a unique combination)
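Removing the repeating groups can be sketched as a small transformation. The function `to_1nf` is hypothetical, and only a few of the header attributes are carried to keep the example short. The item values are taken from the sample invoice.

```python
# One unnormalized record: header attributes plus repeating Item1..Item7 groups.
unnormalized = {
    "Invoice_number": 10001,
    "Cust_name": "John Smith",
    "Item1": "SAGX730", "Item1_descrip": "Pioneer Remote A/V Receiver",
    "Item1_qty": 1, "Item1_price": 569.95,
    "Item2": "AT10", "Item2_descrip": "Cerwin Vega Loudspeakers",
    "Item2_qty": 1, "Item2_price": 359.95,
    "Item3": "CDPC725", "Item3_descrip": "Sony Disc-Jockey CD Changer",
    "Item3_qty": 1, "Item3_price": 399.95,
}

def to_1nf(record, max_items=7):
    """Return one flat row per purchased item; header attributes repeat in each row."""
    header = {k: v for k, v in record.items() if not k.startswith("Item")}
    rows = []
    for n in range(1, max_items + 1):
        prefix = f"Item{n}"
        if record.get(prefix) is None:      # unused slot in the repeating group
            continue
        rows.append({**header,
                     "Item": record[prefix],
                     "Item_descrip": record[f"{prefix}_descrip"],
                     "Item_qty": record[f"{prefix}_qty"],
                     "Item_price": record[f"{prefix}_price"]})
    return rows

rows_1nf = to_1nf(unnormalized)
print(len(rows_1nf))
```

Three items produce three tuples, each repeating the invoice and customer data. That redundancy is what the later normal forms remove.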

1NF

Flat File

Invoice number  Account number  Customer name  •••  Item     Description               Item Quantity  Item Price
10001           123456          John Smith     •••  SAGX730  Pioneer Remote A/V Rec    1              569.95
10001           123456          John Smith     •••  AT10     Cerwin Vega Loudspeakers  1              359.95
10001           123456          John Smith     •••  CDPC725  Sony Disc Jockey CD       1              399.95
10001           123456          John Smith     •••  S/H      Shipping                  1              100.00
10001           123456          John Smith     •••  Tax      Sales Tax                 1              103.06

Second Normal Form (2NF)

• A relation is in second normal form if and only if it is in first normal form and the nonkey attributes are fully functionally dependent on the key.
• Full functional dependency
  – B is functionally dependent on A if each value of A is associated with exactly one value of B

Attribute A (determinant) → Attribute B
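The definition of functional dependency can be tested against sample data. The sketch below uses a hypothetical helper `holds` and a hypothetical second invoice (10002). Note that sample rows can only refute a dependency, not prove it for all future data.

```python
def holds(rows, determinant, dependent):
    """True if each value of the determinant maps to exactly one value of the dependent."""
    seen = {}
    for row in rows:
        a = tuple(row[c] for c in determinant)
        b = tuple(row[c] for c in dependent)
        if seen.setdefault(a, b) != b:      # same A, different B: dependency fails
            return False
    return True

rows = [
    {"Invoice_number": 10001, "Item": "SAGX730", "Item_descrip": "Pioneer Remote A/V Rec"},
    {"Invoice_number": 10001, "Item": "AT10", "Item_descrip": "Cerwin Vega Loudspeakers"},
    {"Invoice_number": 10002, "Item": "AT10", "Item_descrip": "Cerwin Vega Loudspeakers"},  # hypothetical
]

item_determines_descrip = holds(rows, ["Item"], ["Item_descrip"])
invoice_determines_item = holds(rows, ["Invoice_number"], ["Item"])
print(item_determines_descrip, invoice_determines_item)
```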

2NF

• Second normal form applies to relations with composite keys (i.e., a primary key composed of two or more attributes)
• A relation with a single attribute primary key is automatically in at least 2NF

From 1NF to 2NF

What attribute(s) can be used to uniquely identify a tuple?

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price)

If the primary key consisted of invoice_number and item (i.e., composite key), we would need to remove the partial dependencies.

From 1NF to 2NF

Using Invoice number and Item as the key...

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price)   [key: Invoice_number, Item]

Some of the attributes are dependent upon invoice_number for their values and others on item. In either case, they are not functionally dependent on the entire key.

From 1NF to 2NF

Which attributes are functionally dependent on which keys?

Invoice_number vs. Item

Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item_descrip, Item_qty, Item_price

?

From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)   [key: Invoice_number]

(Item, Item_descrip, Item_qty, Item_price)   [key: Item]

Is this unique by itself?
What happens if the item is purchased more than once?

From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)   [key: Invoice_number]

(Invoice_number, Item, Item_descrip, Item_qty, Item_price)   [key: Invoice_number, Item]

Composite key (forms a unique combination)

Partial dependency (Item_descrip depends on Item alone)

From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)   [key: Invoice_number]

(Invoice_number, Item, Item_qty, Item_price)   [key: Invoice_number, Item]

(Item, Item_descrip)   [key: Item]
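The decomposition into 2NF amounts to projecting the flat file onto each group of attributes and discarding duplicate tuples. In this sketch the helper `project` and the second invoice (10002, account 654321, "Jane Doe") are hypothetical, and only a few customer attributes are carried.

```python
def project(rows, columns):
    """Project rows onto the given columns, keeping each distinct tuple once."""
    result = []
    for row in rows:
        t = tuple(row[c] for c in columns)
        if t not in result:
            result.append(t)
    return result

rows_1nf = [
    {"Invoice_number": 10001, "Cust_account": "123456", "Cust_name": "John Smith",
     "Item": "SAGX730", "Item_descrip": "Pioneer Remote A/V Rec", "Item_qty": 1, "Item_price": 569.95},
    {"Invoice_number": 10001, "Cust_account": "123456", "Cust_name": "John Smith",
     "Item": "AT10", "Item_descrip": "Cerwin Vega Loudspeakers", "Item_qty": 1, "Item_price": 359.95},
    # Hypothetical second invoice buying an item that was already sold once.
    {"Invoice_number": 10002, "Cust_account": "654321", "Cust_name": "Jane Doe",
     "Item": "AT10", "Item_descrip": "Cerwin Vega Loudspeakers", "Item_qty": 2, "Item_price": 359.95},
]

invoices = project(rows_1nf, ["Invoice_number", "Cust_account", "Cust_name"])
invoice_items = project(rows_1nf, ["Invoice_number", "Item", "Item_qty", "Item_price"])
items = project(rows_1nf, ["Item", "Item_descrip"])
print(len(invoices), len(invoice_items), len(items))
```

Each invoice header and each item description is now stored once, while the composite key (Invoice_number, Item) still identifies every line of every invoice.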

From 1NF to 2NF

In contrast...

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code, Item, Item_descrip, Item_qty, Item_price)   [key: Invoice_number, Invoice_date]

If the primary key consisted of invoice_number and Invoice_date (i.e., composite key), we would NOT have partial dependencies. Thus, the relation would be in 2NF.

Third Normal Form (3NF)

• A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key.
  – Remove transitive dependencies
  – “Each nonkey attribute must depend upon the key, the whole key, and nothing but the key.”

Kent, 1978

From 2NF to 3NF

Which attributes are dependent on others?
Is there a problem?

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)   [key: Invoice_number]

(Invoice_number, Item, Item_qty, Item_price)   [key: Invoice_number, Item]

(Item, Item_descrip)   [key: Item]

Transitive Dependencies and Anomalies

• Insertion anomalies
  – To add a new row, all customer (name, address, city, state, zip code, phone) and products (description) must be consistent with previous entries
• Deletion anomalies
  – By deleting a row, a customer or product may cease to exist
• Modification anomalies
  – To modify a customer’s or product’s data in one row, all modifications must be carried out to all others

Insertion and Modification Anomalies
For example…

Product_code  Manufacturer_name
DVD-A110      Panasonic
PV-4210       Panasonic
PV-4250       Panasonic

Insert a new Panasonic product

CT-32S35      PAN               (Inconsistency)

Change all Panasonic products’ manufacturer name to “Panasonic USA”

Product_code  Manufacturer_name
DVD-A110      Panasonic
PV-4210       PanaSonic
PV-4250       Pana Sonic
CT-32S35      PAN
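The modification anomaly in this example can be reproduced in a few lines. The rows are the ones shown on the slide; the manufacturer code "PAN" used as a key in the second half is an assumption made for the sketch.

```python
# Flat design: the manufacturer name is retyped on every product row.
flat = [
    ("DVD-A110", "Panasonic"),
    ("PV-4210", "PanaSonic"),
    ("PV-4250", "Pana Sonic"),
    ("CT-32S35", "PAN"),
]

# Renaming by exact match misses every inconsistent spelling.
renamed = [(code, "Panasonic USA" if name == "Panasonic" else name)
           for code, name in flat]
missed = [code for code, name in renamed if name != "Panasonic USA"]
print(missed)

# Separate relation: the name is stored once and referenced by a code.
manufacturers = {"PAN": "Panasonic"}                  # Manuf_code -> Manuf_name
product_manuf = {code: "PAN" for code, _ in flat}     # Product_code -> Manuf_code
manufacturers["PAN"] = "Panasonic USA"                # a single update
names = {code: manufacturers[m] for code, m in product_manuf.items()}
print(names)
```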

Deletion Anomaly
For Example…

4377182  John Smith  Sacramento  CA  95831
4398711  Arnold S    Davis       CA  95691
4578461  Gray Davis  Sacramento  CA  95831
4873179  Lisa Carr   Reno        NV  89557

By deleting customer Arnold S, we would also be deleting Davis, California.

Transitive Dependencies

[Dependency diagram: Invoice_number determines Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code; Item determines Item_descrip; Invoice_number+Item determines Item_qty, Item_price]

A condition where A, B, C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).

Why Should City and State Be Separated from Customer Relation?

• City and state are dependent on zip code for their values and not the customer’s identifier (i.e., key).

  Zip_code → City, State

• Otherwise,

  Cust_account → Cust_addr, Zip_code → City, State

In which case, you have transitive dependency.
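Separating city and state can be sketched with the customer rows from the deletion anomaly example. The variable names are hypothetical and Cust_addr is omitted for brevity. Once Zip_code → City, State lives in its own relation, deleting a customer no longer deletes the city.

```python
customers = [
    {"Cust_account": "4377182", "Cust_name": "John Smith", "Zip_code": "95831", "City": "Sacramento", "State": "CA"},
    {"Cust_account": "4398711", "Cust_name": "Arnold S", "Zip_code": "95691", "City": "Davis", "State": "CA"},
    {"Cust_account": "4578461", "Cust_name": "Gray Davis", "Zip_code": "95831", "City": "Sacramento", "State": "CA"},
    {"Cust_account": "4873179", "Cust_name": "Lisa Carr", "Zip_code": "89557", "City": "Reno", "State": "NV"},
]

# Zip_code relation: each zip code stores its city and state exactly once.
zip_codes = {r["Zip_code"]: (r["City"], r["State"]) for r in customers}

# Customer relation: keeps only the attributes that depend directly on the key.
customer_rel = [{k: r[k] for k in ("Cust_account", "Cust_name", "Zip_code")}
                for r in customers]

# Deleting Arnold S now leaves Davis, CA on record.
customer_rel = [r for r in customer_rel if r["Cust_name"] != "Arnold S"]
print(zip_codes["95691"])
```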

3NF

Invoice Relation
(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation
(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation
(Zip_code, City, State)

Invoice_items Relation
(Invoice_number, Item, Item_qty, Item_price)

Items Relation
(Item, Item_descrip)

3NF

Invoice Relation
(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation
(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation
(Zip_code, City, State)

Invoice_items Relation
(Invoice_number, Item, Item_qty, Item_price)

Items Relation
(Item, Item_descrip)

Since the Items relation contains the manufacturer’s name in the description, a separate Manufacturers relation can be created

Manufacturers Relation
(Manuf_code, Manuf_name)
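As an illustration of how a program could recreate the invoice from the 3NF relations, the sketch below loads them into an in-memory SQLite database and joins them back together. The table names, column types, and date format are assumptions. The data values come from the sample invoice, and the Manufacturers relation is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Zip_codes (Zip_code TEXT PRIMARY KEY, City TEXT, State TEXT);
CREATE TABLE Customers (Cust_account TEXT PRIMARY KEY, Cust_name TEXT, Cust_addr TEXT,
                        Zip_code TEXT REFERENCES Zip_codes(Zip_code));
CREATE TABLE Invoices (Invoice_number INTEGER PRIMARY KEY, Invoice_date TEXT,
                       Date_delivered TEXT,
                       Cust_account TEXT REFERENCES Customers(Cust_account));
CREATE TABLE Items (Item TEXT PRIMARY KEY, Item_descrip TEXT);
CREATE TABLE Invoice_items (Invoice_number INTEGER REFERENCES Invoices(Invoice_number),
                            Item TEXT REFERENCES Items(Item),
                            Item_qty INTEGER, Item_price REAL,
                            PRIMARY KEY (Invoice_number, Item));
""")

conn.execute("INSERT INTO Zip_codes VALUES ('95819', 'Sacramento', 'CA')")
conn.execute("INSERT INTO Customers VALUES ('123456', 'John Smith', '2036-26 Street', '95819')")
conn.execute("INSERT INTO Invoices VALUES (10001, '1999-06-15', '1999-06-18', '123456')")
conn.executemany("INSERT INTO Items VALUES (?, ?)", [
    ("SAGX730", "Pioneer Remote A/V Receiver"),
    ("AT10", "Cerwin Vega Loudspeakers"),
    ("CDPC725", "Sony Disc-Jockey CD Changer"),
])
conn.executemany("INSERT INTO Invoice_items VALUES (?, ?, ?, ?)", [
    (10001, "SAGX730", 1, 569.95),
    (10001, "AT10", 1, 359.95),
    (10001, "CDPC725", 1, 399.95),
])

# Rebuild the invoice lines by joining the five relations on their keys.
rows = conn.execute("""
    SELECT i.Invoice_number, c.Cust_name, z.City,
           ii.Item, it.Item_descrip, ii.Item_qty, ii.Item_price
    FROM Invoices i
    JOIN Customers c      ON c.Cust_account = i.Cust_account
    JOIN Zip_codes z      ON z.Zip_code = c.Zip_code
    JOIN Invoice_items ii ON ii.Invoice_number = i.Invoice_number
    JOIN Items it         ON it.Item = ii.Item
    WHERE i.Invoice_number = ?
    ORDER BY ii.Item
""", (10001,)).fetchall()

subtotal = round(sum(row[5] * row[6] for row in rows), 2)
for row in rows:
    print(row)
print("Subtotal:", subtotal)
```

The customer's name and city appear once in storage, yet every invoice line can still be reassembled through the keys.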

First to Third Normal Form
(1NF - 3NF)

• 1NF: A relation is in first normal form if and only if every attribute is single-valued for each tuple (remove the repeating or multi-value attributes and create a flat file)
• 2NF: A relation is in second normal form if and only if it is in first normal form and the nonkey attributes are fully functionally dependent on the key (remove partial dependencies)
• 3NF: A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key (remove transitive dependencies)
