Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Introduction to
Database Design
Dr. Kanda Runapongsa
Dept of Computer Engineering
Khon Kaen University
2
Overview
What are the steps in designing
a database?
Why is the ER model used to
create an initial design?
What are the main concepts in
the ER model?
What are guidelines for using
the ER model effectively?
3
Database Design Process
The process can be divided into six steps
The ER model is most relevant to the first three
Requirement Analysis
Conceptual database design
Logical database design
Schema refinement
Physical database design
Application and security design
4
Requirement Analysis
Understand what data is to be stored
Know what applications must be built
on top of the database
Understand what operations are most
frequent and subject to performance
requirements
Summary: Find out what the users
want from the database
5
Conceptual Database Design
Create a simple description of the
data that closely matches how users
and developers think of the data
This step uses the ER model which is
one of several high-level or semantic,
data models used in the database
design
The initial design must be translated
into a data model supported by a
commercial system
6
Logical Database Design
Choose a DBMS to implement the
database design
Convert the conceptual database
design into a database schema in the
data model of the chosen DBMS
Here, we consider only relation
DBMSs
The task is to convert an ER schema
into a relational database schema
7
Schema Refinement
Analyze the collection of relations in
the relational database schema to
identify potential problems and to
refine it
Schema refinement can be guided by
some elegant and powerful theory
For example, we can apply the theory of
normalizing relations to reduce the
redundancy of data stored in the
database
8
Physical Database Design
Consider typical expected workloads
that the database must support and
further refine the database design to
ensure that it meets desired
performance criteria
For example, we may build indexes
on some tables and cluster some
tables into the same location on the
disk
9
Application and Security Design
Any software project that involves a DBMS must consider aspects of the application that go beyond the database itself
We must describe the role of each entity in every process as part of a complete workflow for that task
For each role, we must identify the parts of the database that must be accessible and the parts of the database that must not be accessible
10
Conceptual DesignWhat are the entities and relationships in
the enterprise?
What information about these entities and relationships should we store in the database?
What are the integrity constraint or business rules that hold?
A database schema in the ER model can be represented by using ER diagrams
We can map an ER diagram into a relational schema
11
ER Model Basics: Entity (1/2)
Entity: a real-world object distinguishable from other objects
An entity is described (in DB) using a set of attributes
Entity set: a collection of similar entities
All entities in an entity set have the same set of attributes; this is what we mean by similar
Each entity set has a key
Each attribute has a domain
12
ER Model Basics: Entity (2/2)
A key is a minimal set of attributes
whose values uniquely identify an
entity in the set
There could be more than one
candidate key
If so, we designate one of them as the
primary key
Which one should we design?
13
Example: Entity
ER diagram of an
entity “Employees”
Entity set “Employees”
ssn namelot
111 A 12
222 B 24
Entity “111 A 12” is similar to Entity “222 B 24” since both of them are described using the same set of attributes
Employees
ssnname
lot
The key of this entity
is “ssn” which is
underlined
14
ER Model Basics: Relationship (1/2)
A relationship is an association
among two or more entities, e.g.,
Khun Mana works in the department
of Computer Engineering
Relationship Set: Collection of similar
relationships
15
ER Model Basics: Relationship (2/2)
An n-ary relationship set R relates n entities sets E1…En; each relationship in R involves entities E1, …, En
Same entity set could participate in different relationship sets
Employees works in which department
Employees supervise which other employees
Same entity different “roles” in same set
Employees supervise subordinates
Employees are supervised by supervisors
16
Example1: Relationships
A relationship between “Employees” and “Departments”
“since” is a descriptive attribute of this relationship
A relationship must be uniquely identified by the participating entities
lot
dname
budgetdid
sincename
Works_In DepartmentsEmployees
ssn
17
Example2: Relationships
Same entity set could participate
in different relationship sets, or in
different “roles” in same set.
Reports_To
lot
name
Employees
super-
visor
ssn
Works_In DepartmentsEmployees
Manages
super-visee
18
Key Constraints (1/2)Consider Works_In: An employee can
work in many departments; a department can have many employees‟
In contrast, each dept has at most one manager, according to the key constraintin Manages
Many kinds of relationships
One-to-one
One-to-many
Many-to-one
Many-to-many
19
Key Constraints (2/2)
1-to-1 1-to Many Many-to-1
dname
budgetdid
since
lot
name
ssn
ManagesEmployees Departments
Many-to-Many
20
Participation Constraints
Does every department have a manager?
If so, this is a participation constraint: the participation of Departments in Manager is said to be total (vs. partial)
Each employee works in at least one department and that each department has at least one employee the participation of both Employees and Departments in Works_In is total
A participation that is not total is said to be partial
21
Example: Participation Constraints
Employees and Departments have total
participation for relationship “Works_In”
For relationship “Manages”, Departments have
total participation but Employees have partial
participation
lot
name dname
budgetdid
sincename dname
budgetdid
since
Manages
since
DepartmentsEmployees
ssn
Works_In
22
Weak Entities
A weak entity can be identified
uniquely only by considering the
primary key of another (owner) entity
Owner entity set and weak entity set
must participate in a one-to-many
relationship set (one owner, many weak
entities)
Weak entity set must have total
participation in this identifying
relationship set
23
Example: Weak Entities
“Employees” is an owner entity set
and “Dependents” is a weak entity set
“Dependents” have total participation
in the relationship
lot
name
agepname
DependentsEmployees
ssn
Policy
cost
24
ISA („is a‟) Hierarchies (1/3)
In superclass-subclass in C++,
attributes in superclass are inherited
to subclass
If we declare A ISA B, every A entity
is also considered to be a B entity
Hourly_Emps ISA Employees
Overlap constraints: Can Chai be an
Hourly_Emps as well as a
Contract_Emps entity?
(Allowed/disallowed)
25
ISA („is a‟) Hierarchies (2/3)
Covering constraints: Does every
Employees entity also have to be an
Hourly_Emps or a Contact_Emps
entity? (Yes/no)
Reasons for using ISA:
To add descriptive attributes specific to
a subclass
To identify entities that participate in a
relationship
26
ISA („is a‟) Hierarchies (3/3)
Contract_Emps
name
ssn
Employees
lot
hourly_wages
ISA
Hourly_Emps
contractid
hours_worked
27
Aggregation
Used when we have to model a relationship involving (entity sets and) a relationship set
Aggregation allows us to treat a relationship set as an entity set for purposes of participation in (other) relationships
Aggregation vs. ternary relationship
Monitors is a distinct relationship, with a descriptive attribute
Also, can say that each sponsorship is monitored by at most one employee
28
Example: Aggregation
budgetdidpid
started_on
pbudget
dname
until
DepartmentsProjects Sponsors
Employees
Monitors
lotname
ssn
since
29
Conceptual Design Using the ER Model
Design choices:
Should a concept be modeled as an entity or an attribute?
Should a concept be modeled as an entity or a relationship?
Identifying relationships: Binary or ternary? Aggregation?
Constraints in the ER model:
A lot of data semantics can (and should) be captured
30
Entity vs. Attribute (1/2)Should address be an attribute of
Employees or an entity (connected to Employees by a relationship)?
Depends upon the use we want to make of address information, and the semantics of the data
If we have several addresses per employee, address must be an entity (since attributes cannot be set-valued)
If the structure (city, street, etc.) is important, e.g., we want to retrieve employees in a given city, address must be modeled as an entity (since attribute values are atomic)
31
Works_In4 does not allow an employee to work in a department for two or more periods.
Similar to the problem of wanting to record several addresses for an employee: We want to record several values of the descriptive attributes for each instance of this relationship. Accomplished by introducing new entity set, Duration.
name
Employees
ssn lot
Works_In4
from to
dname
budgetdid
Departments
dname
budgetdidname
Departments
ssn lot
Employees Works_In4
Durationfrom to
Entity vs. Attribute (2/2)
32
First ER diagram OK if a manager gets a separate discretionary
budget for each dept.
What if a manager gets a discretionary budget that covers all managed depts? Redundancy: dbudget
stored for each dept managed by manager.
Misleading: Suggests dbudget associated with department-mgr combination.
Manages2
name dname
budgetdid
Employees Departments
ssn lot
dbudgetsince
dname
budgetdid
DepartmentsManages2
Employees
namessn lot
since
Managers dbudget
ISA
This fixes the
problem!
Entity vs. Relationships
33
agepname
DependentsCovers
name
Employees
ssn lot
Policies
policyid cost
Bad design
Beneficiary
agepname
Dependents
policyid cost
Policies
Purchaser
name
Employees
ssn lot
Better design
Binary vs. Ternary Relationships (1/3)
34
Binary vs. Ternary Relationships (2/3)
If each policy is owned by just 1 employee, and each dependent is tied to the covering policy, the first diagram is inaccurate
What are the additional constraints in the 2nd diagram?
Each policy is owned by at most one employee
Each dependent is covered by at most one policy
35
Binary vs. Ternary Relationships (3/3)
An example in the other direction: a ternary relation Contracts relates entity sets Parts, Departments, and Suppliers, and has descriptive attribute qty.
No combination of binary relationships is an adequate substitute:
S „can-supply‟ P, D „needs‟ P, and D „deals-with‟ S does not imply that D has agreed to buy P from S
How do we record qty?
36
Summary of ER (1/4)
Conceptual design follows requirement analysis
Yields a high-level description of data to be stored
ER model popular for conceptual design
Constructs are expressive, close to the way people think about their applications
Basic constructs: entities, relationships, and attributes (of entities and relationships)
37
Summary of ER (2/4)
Some additional constructs: weak entities, ISA hierarchies, and aggregation
Several kinds of integrity constraints can be expressed in the ER model: key constraints, participation constraints, and overlap/covering constraints for ISA hierarchies
Some constraints (notably, functional dependencies) cannot be expressed in the ER model
38
Summary of ER (3/4)
ER design is subjective. There are
often many ways to model a given
scenario!
Analyzing alternatives can be tricky,
especially for a large enterprise.
Common choices include:
Entity vs. attribute, entity vs.
relationship, binary or n-ary relationship,
whether or not to use ISA hierarchies,
and whether or not to use aggregation
39
Summary of ER (4/4)
Ensuring good database design:
resulting relational schema should be
analyzed and refined further
Functional dependency information and
normalization techniques are especially
useful
40
Reference
Database Management System 3rd,
R. Ramakrishnan and J. Gehrke