CMSC424, Spring 2005 CMSC424: Database Design Lecture 3



Database Design Steps

Three Levels of Modeling

[Figure: info → Conceptual DB design → Conceptual Data Model → Logical DB design → Logical Data Model → Physical DB design → Physical Data Model]

• Entity-relationship Model: typically used for conceptual database design
• Relational Model: typically used for logical database design


Review: Entity-Relationship Model

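Basics

• E1: entity set
• R: relationship set
• a: attribute (primary key if underlined)

[Figure: entity set E1 (attributes a1, …, an) linked by relationship set R (attributes c1, …, ck) to entity set E2 (attributes b1, …, bm)]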

Relationship Cardinalities

• One-to-One
• One-to-Many
  – One customer can be associated with many accounts.
• Many-to-One
  – Many customers can be associated with one account.
• Many-to-Many

[Figure: one "customer has account" diagram per cardinality]


Relationship Set Keys

[Figure: customer has account, with attributes cust-id, account-number, access-date]

• Is {cust-id, account-number} a candidate key?
  – Depends
• If one-to-many relationship (as shown), {account-number} is a candidate key
  – A given customer can have many accounts, but at most one account holder per account is allowed
  – So the account number is sufficient to uniquely identify a relationship
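
The key argument above can be checked on a small instance. This is only a minimal Python sketch: the helper name `identifies_uniquely` and all sample values (customer ids, access dates) are made up for illustration and do not come from the lecture.

```python
# Hypothetical instance of the `has` relationship set:
# (cust-id, account-number, access-date). All values are made up.
has = [
    ("C1", "A-101", "2005-01-10"),
    ("C1", "A-201", "2005-01-12"),  # same customer, second account
    ("C2", "A-217", "2005-02-01"),
]

def identifies_uniquely(rows, positions):
    """True if no two rows agree on all of the given column positions."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in positions)
        if key in seen:
            return False
        seen.add(key)
    return True

CUST_ID, ACCOUNT_NUMBER = 0, 1

# One-to-many: account-number alone identifies a relationship ...
account_alone = identifies_uniquely(has, [ACCOUNT_NUMBER])
# ... while cust-id alone does not, since C1 holds two accounts.
customer_alone = identifies_uniquely(has, [CUST_ID])
# The pair is unique too, but it is not minimal in this case.
pair = identifies_uniquely(has, [CUST_ID, ACCOUNT_NUMBER])
print(account_alone, customer_alone, pair)
```

In the many-to-many case neither attribute alone would be enough, and the pair {cust-id, account-number} would be the candidate key.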


Thoughts…

• Nothing about actual data
  – How is it stored?
• No talk about the query languages
  – How do we access the data?
• Semantic vs Syntactic Data Models
  – Remember: E/R Model is used for conceptual modeling
  – Many conceptual models have the same properties
• They are much more about representing the knowledge than about database storage/querying


Thoughts…

Basic design principles

• Faithful
  – Must make sense
• Satisfies the application requirements
• Models the requisite domain knowledge
  – If not modeled, lost afterwards
• Avoid redundancy
  – Potential for inconsistencies
• Go for simplicity

Typically an iterative process that goes back and forth


Relational Data Model

Introduced by Ted Codd (late 60’s – early 70’s)

• Before = “Network Data Model” (Cobol as DDL, DML)
• Very contentious: Database Wars (Charlie Bachman vs. Mike Stonebraker)

Relational data model contributes:

1. Separation of logical, physical data models (data independence)
2. Declarative query languages
3. Formal semantics
4. Query optimization (key to commercial success)

1st prototypes:

• Ingres → CA
• Postgres → Illustra → Informix → IBM
• System R → Oracle, DB2


Key Abstraction: Relation

Account =

bname     acct_no  balance
Downtown  A-101    500
Brighton  A-201    900
Brighton  A-217    500

Terms:

• Tables (aka: Relations)

Why called Relations?


Why Called Relations?

Mathematical relations

Given sets: R = {1, 2, 3}, S = {3, 4}

• R × S = { (1, 3), (1, 4), (2, 3), (2, 4), (3, 3), (3, 4) }
• A relation on R, S is any subset (⊆) of R × S (e.g.: { (1, 4), (3, 4) })

Database relations

Given attribute domains

Branches = { Downtown, Brighton, … }
Accounts = { A-101, A-201, A-217, … }
Balances = ℝ (the real numbers)

Account ⊆ Branches × Accounts × Balances

{ (Downtown, A-101, 500), (Brighton, A-201, 900), (Brighton, A-217, 500) }
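
The definition above can be tried out directly with sets. The following is a small Python sketch, not part of the lecture; the finite `balances` set is an assumption standing in for the real numbers, which cannot be enumerated.

```python
from itertools import product

R = {1, 2, 3}
S = {3, 4}

# Cartesian product R x S: every pair (r, s) with r in R and s in S.
R_x_S = set(product(R, S))

# Any subset of R x S is a relation on R, S.
example = {(1, 4), (3, 4)}
is_relation = example <= R_x_S

# Database case, with finite stand-ins for the attribute domains.
branches = {"Downtown", "Brighton"}
accounts = {"A-101", "A-201", "A-217"}
balances = {500, 900}  # assumption: a finite sample instead of the reals

account = {
    ("Downtown", "A-101", 500),
    ("Brighton", "A-201", 900),
    ("Brighton", "A-217", 500),
}
account_is_relation = account <= set(product(branches, accounts, balances))
print(len(R_x_S), is_relation, account_is_relation)
```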

Relations

Account =

bname     acct_no  balance
Downtown  A-101    500
Brighton  A-201    900
Brighton  A-217    500

Relational database semantics defined in terms of mathematical relations

Considered equivalent to…

{ (Downtown, A-101, 500), (Brighton, A-201, 900), (Brighton, A-217, 500) }

Terms:

• Tables (aka: Relations)
• Rows (aka: tuples)
• Columns (aka: attributes)
• Schema (e.g.: Acct_Schema = (bname, acct_no, balance))


Definitions

1. Relation Schema (or Schema): a list of attributes and their domains
   – We will require the domains to be atomic
   – E.g. account(account-number, branch-name, balance)
   – Programming language equivalent: A variable (e.g. x)

2. Relation Instance: a particular instantiation of a relation with actual values
   – Will change with time
   – Programming language equivalent: Value of a variable

bname     acct_no  balance
Downtown  A-101    500
Brighton  A-201    900
Brighton  A-217    500
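
The schema/instance distinction can be illustrated in a few lines of Python. This is a loose sketch, not the lecture's own notation: modeling the schema as a `NamedTuple` class is one possible reading of the analogy, and the updated balance of 650 is a made-up value.

```python
from typing import NamedTuple

class Account(NamedTuple):
    """Relation schema: attribute names with atomic domains."""
    account_number: str
    branch_name: str
    balance: int

# Relation instance: the set of tuples held at one moment.
instance = {
    Account("A-101", "Downtown", 500),
    Account("A-201", "Brighton", 900),
    Account("A-217", "Brighton", 500),
}

# The instance changes with time; the schema does not.
later_instance = (instance - {Account("A-101", "Downtown", 500)}) | {
    Account("A-101", "Downtown", 650)  # hypothetical new balance
}

print(Account._fields)
print(len(instance), len(later_instance))
```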


Rest of the Class

• Converting from an E/R diagram to a relational schema
  – Remember: We still use E/R models for conceptual modeling of the database
• Relational Algebra
  – Data retrieval language


E/R Diagrams → Relations

Convert entity sets into a relational schema with the same set of attributes

• Customer (cname, ccity, cstreet) → Customer_Schema(cname, ccity, cstreet)
• Branch (bname, bcity, assets) → Branch_Schema(bname, bcity, assets)


E/R Diagrams → Relations

Convert relationship sets also into a relational schema

Remember: A relationship is completely described by the primary keys of the associated entities and its own attributes

[Figure: Customer (cname, ccity, cstreet) and Account (acct-no, balance) linked by Depositor (access-date)]

Customer_Schema(cname, ccity, cstreet)
Account_Schema(acct-no, balance)
Depositor_Schema(cname, acct-no, access-date)

Well… Not quite. We can do better. It depends on the relationship cardinality


E/R Diagrams → Relations

Say One-to-Many Relationship from Customer to Account

• Many accounts per customer

[Figure: Customer (cname, ccity, cstreet) and Account (acct-no, balance) linked by Depositor (access-date)]

Customer_Schema(cname, ccity, cstreet)
Account_Schema(acct-no, balance, cname, access-date)

Exactly same information, fewer tables
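
To see why the two forms carry the same information, the Depositor rows can be folded into Account mechanically. This is a minimal Python sketch under stated assumptions: the function name `fold_depositor`, the access dates, and the choice of `None` for an account without a depositor row are all illustrative, not from the lecture.

```python
# Three-table form: Account and Depositor (Customer is unchanged, so omitted).
account = {"A-101": 500, "A-201": 900}  # acct-no -> balance
depositor = [                           # (cname, acct-no, access-date)
    ("Johnson", "A-101", "2005-02-01"),  # dates are made up
    ("Johnson", "A-201", "2005-02-03"),
]

def fold_depositor(account, depositor):
    """Build Account_Schema(acct-no, balance, cname, access-date) rows."""
    owner = {}
    for cname, acct_no, access_date in depositor:
        if acct_no in owner:
            # Two holders for one account would make the relationship many-to-many.
            raise ValueError(f"{acct_no} has two holders: not one-to-many")
        owner[acct_no] = (cname, access_date)
    merged = []
    for acct_no, balance in sorted(account.items()):
        # Assumption: an account with no depositor row gets None (NULL).
        cname, access_date = owner.get(acct_no, (None, None))
        merged.append((acct_no, balance, cname, access_date))
    return merged

merged_account = fold_depositor(account, depositor)
print(merged_account)
```

The fold only works because each account has at most one holder; the `ValueError` branch marks exactly the point where a many-to-many relationship would need its own table.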

E/R Diagrams → Relations

Entity Sets

E/R: [Figure: entity set E with attributes a1, …, an]
Relational Schema: E = (a1, …, an)

Relationship Sets

E/R: [Figure: entity set E1 (a1, …, an) linked by relationship set R (c1, …, ck) to entity set E2 (b1, …, bm)]
Relational Schema: R = (a1, b1, c1, …, ck)

• a1: E1’s key
• b1: E2’s key
• c1, …, ck: attributes of R

Not the whole story for Relationship Sets …

E/R Diagrams → Relations

[Figure: entity set E1 (a1, …, an) linked by relationship set R (c1, …, ck) to entity set E2 (b1, …, bm)]

Relationship Cardinality   Relational Schema

n:m                        E1 = (a1, …, an)
                           E2 = (b1, …, bm)
                           R = (a1, b1, c1, …, ck)

n:1                        E1 = (a1, …, an, b1, c1, …, ck)
                           E2 = (b1, …, bm)

1:n                        E1 = (a1, …, an)
                           E2 = (b1, …, bm, a1, c1, …, ck)

1:1                        Treat as n:1 or 1:n
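
The cardinality rules above amount to a small case analysis. The sketch below encodes them in Python; the function name `translate` is hypothetical, and it assumes the first attribute of each entity set is its key (a1 for E1, b1 for E2), following the slide's notation. For 1:1 it arbitrarily picks the 1:n treatment.

```python
def translate(e1_attrs, e2_attrs, r_attrs, cardinality):
    """Return {relation name: attribute list} for E1 -R- E2.

    Assumes the first attribute of each entity set is its key
    (a1 for E1, b1 for E2).
    """
    a1, b1 = e1_attrs[0], e2_attrs[0]
    if cardinality == "n:m":
        # R gets its own table: both keys plus R's own attributes.
        return {"E1": list(e1_attrs), "E2": list(e2_attrs), "R": [a1, b1, *r_attrs]}
    if cardinality == "n:1":
        # Fold R into E1 (the "many" side).
        return {"E1": [*e1_attrs, b1, *r_attrs], "E2": list(e2_attrs)}
    if cardinality in ("1:n", "1:1"):
        # Fold R into E2; 1:1 could equally be treated as n:1.
        return {"E1": list(e1_attrs), "E2": [*e2_attrs, a1, *r_attrs]}
    raise ValueError(f"unknown cardinality: {cardinality}")

# Customer (one) to Account (many), with Depositor's access-date.
schemas = translate(
    ["cname", "ccity", "cstreet"], ["acct-no", "balance"], ["access-date"], "1:n"
)
print(schemas["E2"])
```

With these inputs the E2 schema comes out as (acct-no, balance, cname, access-date), matching the Account_Schema obtained for the one-to-many Customer/Account case.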


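Translating E/R Diagrams to Relations

[Figure: entity sets Account (acct_no, balance), Branch (bname, bcity, assets), Customer (cname, ccity, cstreet), Loan (lno, amt); relationship sets Acct-Branch, Loan-Branch, Depositor, Borrower]

Q. How many tables does this get translated into?

A. 6 (account, branch, customer, loan, depositor, borrower)

Acct-Branch and Loan-Branch get no table of their own: the branch key bname is folded into Account and Loan.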


Bank Database

Account(bname, acct_no, balance)
Depositor(cname, acct_no)
Customer(cname, cstreet, ccity)
Branch(bname, bcity, assets)
Borrower(cname, lno)
Loan(bname, lno, amt)


Bank Database

Account

bname     acct_no  balance
Downtown  A-101    500
Mianus    A-215    700
Perry     A-102    400
R.H.      A-305    350
Brighton  A-201    900
Redwood   A-222    700
Brighton  A-217    750

Depositor

cname    acct_no
Johnson  A-101
Smith    A-215
Hayes    A-102
Turner   A-305
Johnson  A-201
Jones    A-217
Lindsay  A-222

Customer

cname     cstreet    ccity
Jones     Main       Harrison
Smith     North      Rye
Hayes     Main       Harrison
Curry     North      Rye
Lindsay   Park       Pittsfield
Turner    Putnam     Stanford
Williams  Nassau     Princeton
Adams     Spring     Pittsfield
Johnson   Alma       Palo Alto
Glenn     Sand Hill  Woodside
Brooks    Senator    Brooklyn
Green     Walnut     Stanford

Branch

bname     bcity       assets
Downtown  Brooklyn    9M
Redwood   Palo Alto   2.1M
Perry     Horseneck   1.7M
Mianus    Horseneck   0.4M
R.H.      Horseneck   8M
Pownel    Bennington  0.3M
N. Town   Rye         3.7M
Brighton  Brooklyn    7.1M

Borrower

cname     lno
Jones     L-17
Smith     L-23
Hayes     L-15
Jackson   L-14
Curry     L-93
Smith     L-11
Williams  L-17
Adams     L-16

Loan

bname     lno   amt
Downtown  L-17  1000
Redwood   L-23  2000
Perry     L-15  1500
Downtown  L-14  1500
Mianus    L-93  500
R.H.      L-11  900
Perry     L-16  1300


E/R Diagrams & Relations

Weak Entity Sets

E/R: [Figure: entity set E1 (a1, …, an) linked by relationship IR to weak entity set E2 (b1, …, bm)]

Relational Schema:

E1 = (a1, …, an)
E2 = (a1, b1, …, bm)


E/R Diagrams & Relations

Multivalued Attributes

E/R: [Figure: entity set Employee with attributes ssn, name, phone (multivalued)]

Relational Schema:

Emp = (ssn, name)
Emp-Phones = (ssn, phone)

Emp

ssn  name
001  Smith

Emp-Phones

ssn  phone
001  4-1234
001  4-5678
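
The multivalued-attribute rule can be sketched as a small transformation. In the Python below, the helper name `split_multivalued` and the dictionary representation of an entity are assumptions made for illustration; the sample employee is the one from the slide.

```python
# Employee entity with a multivalued `phone` attribute (data from the slide).
employees = [
    {"ssn": "001", "name": "Smith", "phone": ["4-1234", "4-5678"]},
]

def split_multivalued(entities, key, multi_attr):
    """Return (base rows without multi_attr, (key, value) rows for multi_attr)."""
    base, values = [], []
    for entity in entities:
        # Base relation keeps every single-valued attribute.
        base.append({k: v for k, v in entity.items() if k != multi_attr})
        # One row per value, tagged with the owning entity's key.
        for value in entity[multi_attr]:
            values.append((entity[key], value))
    return base, values

emp, emp_phones = split_multivalued(employees, "ssn", "phone")
print(emp)
print(emp_phones)
```

Each resulting row holds only atomic values, which is what the requirement of atomic domains asks for.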

E/R Diagrams & Relations

Subclasses

E/R: [Figure: entity set E (a1, …, an) with subclasses E1 (b1, …, bm) and E2 (c1, …, ck), connected through ISA]

Relational Schema:

Method 1:

E = (a1, …, an)
E1 = (a1, b1, …, bm)
E2 = (a1, c1, …, ck)

Method 2:

E1 = (a1, …, an, b1, …, bm)
E2 = (a1, …, an, c1, …, ck)


E/R Diagrams & Relations

Subclasses example:

Method 1:

Account = (acct_no, balance)
SAccount = (acct_no, interest)
CAccount = (acct_no, overdraft)

Method 2:

SAccount = (acct_no, balance, interest)
CAccount = (acct_no, balance, overdraft)

Q: When is method 2 not possible?

A: When subclassing is partial (an account that belongs to neither subclass has no table to live in)
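
The answer can be made concrete by running both methods over accounts where one belongs to neither subclass. This Python sketch uses made-up account numbers and values, and the `kind` field is a hypothetical way of recording subclass membership.

```python
# Hypothetical accounts; `kind` is None for an account in neither subclass.
accounts = [
    {"acct_no": "A-1", "balance": 500, "kind": "savings", "interest": 0.02},
    {"acct_no": "A-2", "balance": 900, "kind": "checking", "overdraft": 100},
    {"acct_no": "A-3", "balance": 700, "kind": None},  # partial subclassing
]

def method1(rows):
    """Account holds shared attributes; subclass tables hold key + extras."""
    account = [(r["acct_no"], r["balance"]) for r in rows]
    saccount = [(r["acct_no"], r["interest"]) for r in rows if r["kind"] == "savings"]
    caccount = [(r["acct_no"], r["overdraft"]) for r in rows if r["kind"] == "checking"]
    return account, saccount, caccount

def method2(rows):
    """Only subclass tables exist; each repeats the shared attributes."""
    saccount = [(r["acct_no"], r["balance"], r["interest"])
                for r in rows if r["kind"] == "savings"]
    caccount = [(r["acct_no"], r["balance"], r["overdraft"])
                for r in rows if r["kind"] == "checking"]
    return saccount, caccount

# Account numbers that each method manages to store somewhere.
stored1 = {row[0] for table in method1(accounts) for row in table}
stored2 = {row[0] for table in method2(accounts) for row in table}
lost = stored1 - stored2
print(lost)
```

Method 1 keeps A-3 in the Account table, while method 2 has nowhere to put it.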


Keys and Relations

As in the E/R Model:

1. Superkeys
   • set of attributes of a table for which every row has a distinct set of values
2. Candidate keys
   • “minimal” superkeys
3. Primary keys
   • DBA-chosen candidate keys

Act as Integrity Constraints

i.e., guard against illegal/invalid instances of a given schema

e.g., Branch = (bname, bcity, assets)

bname     bcity     assets
Brighton  Brooklyn  5M
Brighton  Boston    3M

Invalid!! (two rows share bname = Brighton, so a key on bname is violated)
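
The three definitions can be checked against an instance with a short Python sketch. The function names `is_superkey` and `candidate_keys` are hypothetical. Note the limitation: keys are declared on the schema, so a single instance can only refute a key, never prove one; the candidate keys computed here hold for this instance only.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if every row has a distinct combination of values on attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)

def candidate_keys(rows, all_attrs):
    """Minimal superkeys of this instance, smallest first."""
    found = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(k) <= set(attrs) for k in found):
                continue  # contains a smaller key, so not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found

# The invalid Branch instance from the slide.
branch = [
    {"bname": "Brighton", "bcity": "Brooklyn", "assets": "5M"},
    {"bname": "Brighton", "bcity": "Boston", "assets": "3M"},
]
violates_key = not is_superkey(branch, ["bname"])
print(violates_key)
```

A DBMS performs the same kind of check when a primary key is declared: an insert that would duplicate the key value is rejected.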