
Introduction to Database Design Entity Relationship Model


Page 1: Introduction to Database Design Entity Relationship Model

Introduction to Database Design

Entity Relationship Model

Page 2: Introduction to Database Design Entity Relationship Model

Design of a Database

Design phases:

• Requirement Analysis
  • Talk to people and figure out what they want
• Conceptual Database Design
  • Do the design
  • Many tools/modeling techniques: ER, UML, Rumbaugh, Booch, Yourdon
• Logical Database Design
  • Actual database tables in the relational model, OO model, or XML model
  • Here: only the relational model

Page 3: Introduction to Database Design Entity Relationship Model

Overview of Database Design

• Conceptual design (the ER Model is used at this stage):
  • What are the entities and relationships in the enterprise?
  • What information about these entities and relationships should we store in the database?
  • What are the integrity constraints or business rules that hold?
• A database "schema" in the ER Model can be represented pictorially (ER diagrams).
• An ER diagram can be mapped into a relational schema.

Page 4: Introduction to Database Design Entity Relationship Model

Entity-Relationship Model

• Entity Sets
• Relationship Sets
• Mapping Constraints
• Keys
• E-R Diagram
• Extended E-R Features
• Design Issues
• Design of an E-R Database Schema
• Reduction of an E-R Schema to Tables

Page 5: Introduction to Database Design Entity Relationship Model

Entity Sets

• A database can be modeled as:
  • a collection of entities,
  • relationships among entities.
• An entity is an object that exists and is distinguishable from other objects.
  • Example: specific person, company, event, plant
• Entities are described using attributes.
  • Example: people have names and addresses
• An entity set is a set of entities of the same type that share the same properties.
  • Example: set of all persons, companies, trees, holidays

Page 6: Introduction to Database Design Entity Relationship Model

Entity Sets customer and loan

[Figure: sample instances of the customer entity set (customer-id, customer-name, customer-street, customer-city) and the loan entity set (loan-number, amount)]

Page 7: Introduction to Database Design Entity Relationship Model

Attributes

• An entity is represented by a set of attributes, that is, descriptive properties possessed by all members of an entity set.
  • Example:
    customer = (customer-id, customer-name, customer-street, customer-city)
    loan = (loan-number, amount)
• Domain: the set of permitted values for each attribute
• Key: a minimal set of attributes whose values uniquely identify an entity in the set
  • Candidate keys: all sets of attributes that can potentially be a key.
  • Primary key: one of the candidate keys is chosen to be the "primary" key.
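As a rough sketch of how these two entity sets might look once mapped to relational tables, the following uses Python's built-in sqlite3 module. The column types, the non-negative rule on amount, and the sample rows (other than the name Hayes) are assumptions made for illustration; the slides name only the attributes.

```python
import sqlite3

# In-memory database; column types and the CHECK rule are assumed, not from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id     TEXT PRIMARY KEY,   -- the chosen primary key
        customer_name   TEXT NOT NULL,
        customer_street TEXT,
        customer_city   TEXT
    );
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        amount      REAL NOT NULL CHECK (amount >= 0)  -- a domain restriction (assumed)
    );
""")

conn.execute("INSERT INTO customer VALUES ('C-1', 'Hayes', 'Main', 'Harrison')")


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with the same key value is rejected.
duplicate_key_ok = try_insert(
    "INSERT INTO customer VALUES ('C-1', 'Jones', 'North', 'Rye')")
# A value outside the (assumed) domain of amount is rejected.
bad_domain_ok = try_insert("INSERT INTO loan VALUES ('L-17', -5)")
good_loan_ok = try_insert("INSERT INTO loan VALUES ('L-17', 1000)")
```

The primary key makes the database enforce "uniquely identifies an entity", and the CHECK clause plays the role of a domain.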

Page 8: Introduction to Database Design Entity Relationship Model

Relationship Sets

• A relationship is an association among several entities.
  • Example: Hayes (customer entity), depositor (relationship set), A-102 (account entity)
• A relationship set is a mathematical relation among $n \geq 2$ entities, each taken from entity sets:
  $\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$
  where $(e_1, e_2, \ldots, e_n)$ is a relationship.
  • Example: (Hayes, A-102) $\in$ depositor
• There can be multiple relationship sets between the same two entity sets.
• A relationship must be uniquely identified by the participating entities.
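The set definition can be read almost literally as code. This minimal sketch treats entity sets as Python sets and a relationship set as a set of tuples drawn from their Cartesian product. The entities Jones and A-215 and the helper name is_relationship_set are made up for the example.

```python
from itertools import product

customer = {"Hayes", "Jones"}    # entity set E1 (Jones is a made-up extra)
account = {"A-102", "A-215"}     # entity set E2 (A-215 is made up)

# A relationship set is a subset of E1 x E2.
depositor = {("Hayes", "A-102"), ("Jones", "A-215")}


def is_relationship_set(rel, *entity_sets):
    """True if every tuple in rel takes its i-th component from the i-th entity set."""
    return rel <= set(product(*entity_sets))


valid = is_relationship_set(depositor, customer, account)
# Smith is not in the customer entity set, so this is not a valid relationship.
invalid = is_relationship_set({("Smith", "A-102")}, customer, account)
```

Because depositor is a set, the same (customer, account) pair cannot occur twice, which matches the rule that a relationship is identified by its participating entities.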

Page 9: Introduction to Database Design Entity Relationship Model

Relationship Set borrower

Page 10: Introduction to Database Design Entity Relationship Model

Descriptive Attributes

• Descriptive attributes: used to record information about the relationship.
  • Example: when was the last time that the customer accessed his/her account?

Page 11: Introduction to Database Design Entity Relationship Model

E-R Diagrams

Rectangles represent entity sets.

Diamonds represent relationship sets.

Lines link attributes to entity sets and entity sets to relationship sets.

Ellipses represent attributes

Underline indicates primary key attributes (coming up)

Page 12: Introduction to Database Design Entity Relationship Model

Ternary Relationships

• Ternary relationships: used to record associations between three entity sets.
• Example: each branch has several jobs that can be worked on by employees.
  • For this we need to record the association between employees, branches, and jobs.

Page 13: Introduction to Database Design Entity Relationship Model

Roles/Self-Referential Relationships

• Entity sets of a relationship need not be distinct.
• The labels "manager" and "worker" are called roles; they specify how employee entities interact via the works-for relationship set.
• Roles are indicated in E-R diagrams by labeling the lines that connect diamonds to rectangles.
• Role labels are optional, and are used to clarify the semantics of the relationship.

Page 14: Introduction to Database Design Entity Relationship Model

Constraints in ER

• Key Constraints
• Cardinality Constraints
• Participation Constraints
• Overlapping Constraints (ISA)
• Coverage Constraints (ISA)

Page 15: Introduction to Database Design Entity Relationship Model

Key Constraints

• Consider the depositor relationship: a customer can deposit into many accounts; an account can have many depositors.
• Compare with: each department has at most one manager.
• Contrast with: each customer can be the borrower on only one loan. However, each loan can have many borrowers. The restriction that each customer can be the borrower on only one loan is a key constraint.

Page 16: Introduction to Database Design Entity Relationship Model

Key Constraint II

• A relationship set like borrower is sometimes said to be one-to-many.
• The relationship set between customers and accounts -> many-to-many

Page 17: Introduction to Database Design Entity Relationship Model

Key Constraint III

• Additional restriction: a loan may be borrowed by only one customer -> one-to-one
• Textbook clarification: the arrow is shown going from customer to borrower.
  • Means the same thing!
  • Implies that a customer entity participates in the borrower relationship set only once.
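A key constraint can be checked on a concrete relationship set by counting how often each entity appears. The sketch below is illustrative only: the helper has_key_constraint and the sample data (loan and account numbers, the customer Jones) are assumptions, and checking one instance does not prove the constraint holds for the schema.

```python
from collections import Counter


def has_key_constraint(rel, position):
    """True if every entity at the given tuple position appears in at most one relationship."""
    counts = Counter(t[position] for t in rel)
    return all(c == 1 for c in counts.values())


# depositor as (customer, account): no key constraint on either side (many-to-many).
depositor = {("Hayes", "A-102"), ("Hayes", "A-201"), ("Jones", "A-102")}

# borrower as (customer, loan): each customer borrows once, a loan has many borrowers.
borrower = {("Hayes", "L-15"), ("Jones", "L-15")}

# With the additional restriction on loan, both sides are constrained (one-to-one).
borrower_1_1 = {("Hayes", "L-15"), ("Jones", "L-16")}

results = {
    "depositor": (has_key_constraint(depositor, 0), has_key_constraint(depositor, 1)),
    "borrower": (has_key_constraint(borrower, 0), has_key_constraint(borrower, 1)),
    "borrower_1_1": (has_key_constraint(borrower_1_1, 0), has_key_constraint(borrower_1_1, 1)),
}
```

Each result pair reads (constraint on customer, constraint on the other entity set).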

Page 18: Introduction to Database Design Entity Relationship Model

Key Constraints for Ternary Relationships

Key constraints in binary relationships can be easily extended to ternary.

Page 19: Introduction to Database Design Entity Relationship Model

Alternative Notation for Cardinality Limits

Cardinality limits can also express participation constraints

Page 20: Introduction to Database Design Entity Relationship Model

Participation Constraints

• Total participation (indicated by a double/thick line): every entity in the entity set participates in at least one relationship in the relationship set.
  • E.g., participation of loan in borrower is total: every loan must have a customer associated with it via borrower.
• Partial participation: some entities may not participate in any relationship in the relationship set.
  • E.g., participation of customer in borrower is partial: not every customer has a loan.
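The two cases can be told apart on sample data by comparing an entity set with the entities that actually occur in the relationship set. This is a small sketch; the function name participation and all sample values apart from Hayes are hypothetical.

```python
def participation(entity_set, rel, position):
    """Return 'total' if every entity occurs at the given position of some relationship."""
    participating = {t[position] for t in rel}
    return "total" if entity_set <= participating else "partial"


customers = {"Hayes", "Jones", "Smith"}
loans = {"L-15", "L-16"}
borrower = {("Hayes", "L-15"), ("Jones", "L-16")}   # (customer, loan) pairs

loan_side = participation(loans, borrower, 1)           # every loan has a borrower
customer_side = participation(customers, borrower, 0)   # Smith has no loan
```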

Page 21: Introduction to Database Design Entity Relationship Model

Keys

• A super key of an entity set is a set of one or more attributes whose values uniquely determine each entity.
• A candidate key of an entity set is a minimal super key.
  • customer-id is a candidate key of customer
  • account-number is a candidate key of account
• Although several candidate keys may exist, one of the candidate keys is selected to be the primary key.
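The difference between a super key and a candidate key can be explored on a small sample. The sketch below searches attribute subsets by increasing size and keeps only the minimal ones. Keep in mind that keys are a property of the schema (all legal instances), so a check against one sample can only rule keys out, never confirm them; the sample rows and function names are made up.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows agree on all attributes in attrs (for this sample only)."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attrs):
    """Minimal super keys of the sample, found by trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if not contains_smaller_key and is_super_key(rows, attrs):
                keys.append(attrs)
    return keys


customers = [
    {"customer_id": "C-1", "customer_name": "Hayes", "customer_city": "Harrison"},
    {"customer_id": "C-2", "customer_name": "Jones", "customer_city": "Harrison"},
    {"customer_id": "C-3", "customer_name": "Hayes", "customer_city": "Rye"},
]
attrs = ("customer_id", "customer_name", "customer_city")
found = candidate_keys(customers, attrs)
```

On this sample the search also reports (customer_name, customer_city) as minimal and unique, which is an accident of the three rows rather than a real key. That is why the designer, not the data, decides which candidate keys exist.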

Page 22: Introduction to Database Design Entity Relationship Model

Weak Entity Sets

• Assumption so far: the attributes associated with an entity contain a key (to uniquely identify the entities).
• Not always the case!
• Example:
  • Employees can purchase policies to cover their dependents.
  • We need to record information about policies: who is covered, and who owns the policy.
  • We don't really care about the dependents beyond that.
  • If an employee quits, the policy is deleted and coverage for the dependents stops!
• The above is modeled via a weak entity set.
  • An entity set that does not have a primary key is referred to as a weak entity set.
  • A weak entity is uniquely identified by a combination of some of its attributes and the primary key of another entity, which belongs to the identifying entity set.

Page 23: Introduction to Database Design Entity Relationship Model

Weak Entity Sets

• Restrictions:
  • It must relate to the identifying entity set via a one-to-many relationship set from the identifying to the weak entity set.
  • It must have total participation in the identifying relationship set.

Page 24: Introduction to Database Design Entity Relationship Model

Weak Entity Sets (Cont.)

• We depict a weak entity set by double rectangles.
• We underline the discriminator of a weak entity set with a dashed line.
• payment-number: discriminator of the payment entity set
• Primary key for payment: (loan-number, payment-number)
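One common way to carry a weak entity set into tables is a composite primary key plus a cascading foreign key to the identifying entity set. The following sqlite3 sketch assumes that mapping; the payment_amount column, the column types, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        amount      REAL
    );
    CREATE TABLE payment (
        -- key of the identifying entity set; deleting the loan deletes its payments
        loan_number    TEXT NOT NULL REFERENCES loan(loan_number) ON DELETE CASCADE,
        payment_number INTEGER NOT NULL,   -- discriminator, unique only within one loan
        payment_amount REAL,               -- hypothetical extra attribute
        PRIMARY KEY (loan_number, payment_number)
    );
    INSERT INTO loan VALUES ('L-15', 1500), ('L-16', 900);
    INSERT INTO payment VALUES ('L-15', 1, 100), ('L-15', 2, 100), ('L-16', 1, 50);
""")

payments_before = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
conn.execute("DELETE FROM loan WHERE loan_number = 'L-15'")
payments_after = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```

Payment number 1 appears under both loans, which is allowed because the discriminator only has to be unique together with loan_number. Deleting loan L-15 removes its two payments, mirroring the dependence of the weak entity on its identifying entity.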

Page 25: Introduction to Database Design Entity Relationship Model

Conceptual Design Using the ER Model

• Design choices:
  • Should a concept be modeled as an entity or an attribute?
  • Should a concept be modeled as an entity or a relationship?
  • Identifying relationships: binary or ternary? Aggregation?
• Constraints in the ER Model:
  • A lot of data semantics can (and should) be captured.
  • But some constraints cannot be captured in ER diagrams.
    • Constraints on individual attributes of an entity, e.g., Employee entities must have age > 24

Page 26: Introduction to Database Design Entity Relationship Model

Entity vs. Attribute

• Remember: attribute values are atomic (cannot be broken down further).
• Should address be an attribute of Employees or an entity (connected to Employees by a relationship)?
• Depends upon the use of address information, and the semantics of the data:
  • If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued).
  • If address is to be shared by many employees, address should be an entity.
  • If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic).

Page 27: Introduction to Database Design Entity Relationship Model

Entity vs. Attribute (Contd.)

• Works_In2 does not allow an employee to work in a department for two or more periods.
• Similar to the problem of wanting to record several addresses for an employee: we want to record several values of the descriptive attributes for each instance of this relationship.

[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by the relationship set Works_In2, which has descriptive attributes from and to]

[Figure: Employees (ssn, name, lot) and Departments (did, dname, budget) connected by the relationship set Works_In3, which also involves the entity set Duration (from, to)]
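The limitation of Works_In2 follows from the rule that a relationship is identified by its participating entities. A small sketch, using made-up ssn, did, and date values, makes the difference between the two designs visible:

```python
# Works_In2: a relationship is identified by (employee, department) alone,
# so from/to are just values attached to that single relationship.
works_in2 = {}
works_in2[("123-22-3666", 51)] = {"from": "2019-01", "to": "2020-06"}
works_in2[("123-22-3666", 51)] = {"from": "2022-03", "to": "2023-09"}  # replaces the first period

# Works_In3: Duration is an entity set, so the (from, to) entity is part of
# what identifies each relationship and both periods can coexist.
works_in3 = set()
works_in3.add(("123-22-3666", 51, ("2019-01", "2020-06")))
works_in3.add(("123-22-3666", 51, ("2022-03", "2023-09")))
```

The dictionary ends up with one entry, while the set keeps both periods for the same employee and department.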

Page 28: Introduction to Database Design Entity Relationship Model

Entity vs. Relationship

• The first ER diagram is OK if a manager gets a separate discretionary budget for each dept.
• What if a manager gets a discretionary budget that covers all managed depts?
  • Redundancy of dbudget, which is stored for each dept managed by the manager.
  • Misleading: suggests dbudget is tied to the managed dept.

[Figure: first design, with Employees (ssn, name, lot) and Departments (did, dname, budget) connected by the relationship set Manages2, which has attributes since and dbudget]

[Figure: second design, with Employees (ssn, name, lot), Departments (did, dname, budget), the entity set Mgr_Appts, the relationship set Manages3, and the attributes since, apptnum, and dbudget]

Page 29: Introduction to Database Design Entity Relationship Model

Binary vs. Ternary Relationships

[Figure 1: ternary relationship set Covers among Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age)]

[Figure 2: binary relationship sets Purchaser (Employees and Policies) and Beneficiary (Dependents and Policies)]

• Consider Figure 1: what does it depict?
• Additional constraints:
  • A policy cannot be owned jointly by two employees.
  • Every policy must be owned by some employee.
  • Dependents is a weak entity set, uniquely identified by policyid.

Page 30: Introduction to Database Design Entity Relationship Model

Binary vs. Ternary

• Constraint 1: add a key constraint on Policies with respect to Covers.
  • Side effect: a policy can cover only one dependent.
• Constraint 2: total participation constraint on Policies.
  • OK if each policy covers at least one dependent.
• Constraint 3: introduce an identifying relationship set.

Page 31: Introduction to Database Design Entity Relationship Model

Better Solution

[Figure: the Covers design and the Purchaser/Beneficiary design over Employees (ssn, name, lot), Policies (policyid, cost), and Dependents (pname, age)]

Page 32: Introduction to Database Design Entity Relationship Model

Are you awake?

ER Group Exercise

Page 33: Introduction to Database Design Entity Relationship Model

Class (ISA) Hierarchies

• As in C++ or Java, attributes are inherited.
• If we declare A ISA B, every A entity is also considered to be a B entity.
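The analogy with class inheritance can be shown in a few lines of Python. In this sketch Teller ISA Employee; the Employee attributes follow the ssn, name, lot labels used elsewhere in the deck, while the station_number attribute and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    ssn: str
    name: str
    lot: int


@dataclass
class Teller(Employee):          # Teller ISA Employee
    station_number: int          # attribute specific to the subclass (hypothetical)


t = Teller(ssn="123-22-3666", name="Hayes", lot=7, station_number=4)

inherits_attributes = (t.ssn, t.name, t.lot)   # attributes declared on Employee
is_also_employee = isinstance(t, Employee)     # every Teller entity is an Employee entity
```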

Page 34: Introduction to Database Design Entity Relationship Model

ISA Hierarchy Constraints

• Overlap constraints: can Joe be both an employee and a customer? (Allowed/Disallowed)
• Covering constraints: does every employee entity also have to be an officer or teller or secretary entity? (Yes/No)
• Reasons for using ISA:
  • To add attributes specific to a subclass
  • To identify entities that participate in a relationship

Page 35: Introduction to Database Design Entity Relationship Model

Aggregation

• Used when we have to model a relationship involving (entity sets and) a relationship set.
  • Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships.
• Aggregation vs. ternary relationship:
  • Monitors is a distinct relationship, with a descriptive attribute.
  • Also, can say that each sponsorship is monitored by at most one employee.

[Figure: Projects (pid, started_on, pbudget) and Departments (did, dname, budget) are related by Sponsors (since); the Sponsors relationship set is aggregated and related to Employees (ssn, name, lot) by Monitors (until)]
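One possible relational reading of this aggregation is sketched below with sqlite3: Monitors refers to a Sponsors row through its (pid, did) key, as if the sponsorship were an entity. Using (pid, did) as the primary key of monitors is an assumed way to express "at most one employee per sponsorship", and the column types and sample rows are invented. The Projects, Departments, and Employees tables are omitted to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE sponsors (
        pid   INTEGER,
        did   INTEGER,
        since TEXT,
        PRIMARY KEY (pid, did)
    );
    CREATE TABLE monitors (
        pid   INTEGER,
        did   INTEGER,
        ssn   TEXT NOT NULL,
        until TEXT,
        PRIMARY KEY (pid, did),   -- at most one monitoring employee per sponsorship
        FOREIGN KEY (pid, did) REFERENCES sponsors(pid, did)
    );
    INSERT INTO sponsors VALUES (1, 10, '2024-01-01');
    INSERT INTO monitors VALUES (1, 10, '123-22-3666', '2024-12-31');
""")


def rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# A second monitor for the same sponsorship violates the key constraint.
second_monitor_rejected = rejected(
    "INSERT INTO monitors VALUES (1, 10, '999-99-9999', '2025-06-30')")
# Monitoring a sponsorship that does not exist violates the foreign key.
unknown_sponsorship_rejected = rejected(
    "INSERT INTO monitors VALUES (2, 20, '123-22-3666', '2025-06-30')")
```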

Page 36: Introduction to Database Design Entity Relationship Model

Case Study (from Text Book)

• See Handout
• Addition to earlier exercise.

Page 37: Introduction to Database Design Entity Relationship Model

Summary of Conceptual Design

• Conceptual design follows requirements analysis.
  • Yields a high-level description of the data to be stored.
• The ER model is popular for conceptual design.
  • Constructs are expressive, close to the way people think about their applications.
• Basic constructs: entities, relationships, and attributes (of entities and relationships).
• Some additional constructs: weak entities, ISA hierarchies, and aggregation.
• Note: there are many variations on the ER model.

Page 38: Introduction to Database Design Entity Relationship Model

Summary of ER (Contd.)

• Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies. Some foreign key constraints are also implicit in the definition of a relationship set.
  • Some constraints (notably, functional dependencies) cannot be expressed in the ER model.
  • Constraints play an important role in determining the best database design for an enterprise.

Page 39: Introduction to Database Design Entity Relationship Model

Summary of ER (Contd.)

• ER design is subjective. There are often many ways to model a given scenario! Analyzing alternatives can be tricky, especially for a large enterprise. Common choices include:
  • Entity vs. attribute, entity vs. relationship, binary or n-ary relationship, whether or not to use ISA hierarchies, and whether or not to use aggregation.
• Ensuring good database design: the resulting relational schema should be analyzed and refined further. FD information and normalization techniques are especially useful.

Page 40: Introduction to Database Design Entity Relationship Model

Summary of Symbols Used in E-R Notation

Page 41: Introduction to Database Design Entity Relationship Model

Summary of Symbols (Cont.)

Page 42: Introduction to Database Design Entity Relationship Model

Alternative E-R Notations

Page 43: Introduction to Database Design Entity Relationship Model

UML

• UML: Unified Modeling Language
• UML has many components to graphically model different aspects of an entire software system.
• UML Class Diagrams correspond to E-R Diagrams, but with several differences.

Page 44: Introduction to Database Design Entity Relationship Model

Summary of UML Class Diagram Notation

Page 45: Introduction to Database Design Entity Relationship Model

UML Class Diagrams (Contd.)

• Entity sets are shown as boxes, and attributes are shown within the box, rather than as separate ellipses as in E-R diagrams.
• Binary relationship sets are represented in UML by just drawing a line connecting the entity sets. The relationship set name is written adjacent to the line.
• The role played by an entity set in a relationship set may also be specified by writing the role name on the line, adjacent to the entity set.
• The relationship set name may alternatively be written in a box, along with attributes of the relationship set, and the box is connected, using a dotted line, to the line depicting the relationship set.
• Non-binary relationships cannot be directly represented in UML; they have to be converted to binary relationships.

Page 46: Introduction to Database Design Entity Relationship Model

UML Class Diagram Notation (Cont.)

*Note reversal of position in cardinality constraint depiction

Page 47: Introduction to Database Design Entity Relationship Model

UML Class Diagrams (Contd.)

• Cardinality constraints are specified in the form l..h, where l denotes the minimum and h the maximum number of relationships an entity can participate in.
• Beware: the positioning of the constraints is exactly the reverse of the positioning of constraints in E-R diagrams.
• The constraint 0..* on the E2 side and 0..1 on the E1 side means that each E2 entity can participate in at most one relationship, whereas each E1 entity can participate in many relationships; in other words, the relationship is many-to-one from E2 to E1.
• Single values, such as 1 or *, may be written on edges. The single value 1 on an edge is treated as equivalent to 1..1, while * is equivalent to 0..*.
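The l..h shorthand rules can be captured in a tiny parser. This is an illustrative sketch only: the function names parse_multiplicity and within are made up, and None is used here to stand for the unbounded maximum *.

```python
def parse_multiplicity(text):
    """Turn 'l..h', a single number, or '*' into a (minimum, maximum) pair.

    A maximum of None means unbounded (*).
    """
    text = text.strip()
    if text == "*":
        return (0, None)            # * is equivalent to 0..*
    if ".." not in text:
        n = int(text)
        return (n, n)               # a single value n is equivalent to n..n
    low, high = text.split("..")
    return (int(low), None if high == "*" else int(high))


def within(count, multiplicity):
    """True if an entity taking part in `count` relationships satisfies the constraint."""
    low, high = parse_multiplicity(multiplicity)
    return count >= low and (high is None or count <= high)
```

For example, within(2, "0..1") is False because an entity constrained to 0..1 may take part in at most one relationship, while within(5, "0..*") is True.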