1
How to build ER diagrams
• Will use an example of a bank database from Elmasri as an example
• Banks have branches
• Customers have accounts.– May be joint Accounts– Possibly multiple accounts.
• Eg: savings account, checking account
• Customers may also take out loans
2
How to build ER diagrams
• Identify entities– Including weak entities
• Identify attributes
• Identify relationships– Some attributes might become relationships
• Determine structure of relationships– Partial/total– cardinality
3
How to build ER diagrams
• Identify attributes of relationships– Might move attributes of entities
• Figure out keys
• Details of attributes– Simple/composite– Single valued/multi-valued
• Role names for recursive relationships
4
Elmasri FIGURE 7.21: ER DIAGRAM FOR A BANK DATABASE
© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition
5
ER to Relational: Regular entities• ER to Relational:
– Given an ER diagram : high level view of database
– Build a Relational Schema
• For regular entity E, create a relation R. Attributes ?• Will include the attributes of E.
– Eg: DEPT
• Primary key in R ?• Pick one of the keys of E as the primary key for R. • If the chosen key of E is composite ?• Then R will also have the same composite key.
6
Regular entities
CREATE TABLE DEPT( DNAME VARCHAR(10) NOT NULL,
DNUMBER INTEGER NOT NULL,PRIMARY KEY (DNUMBER),UNIQUE (DNAME));
• What are other regular entities ?• Create tables EMPLOYEE. PROJECT.
– Primary keys ?
• EMPLOYEE : SSN • PROJECT : PNUMBER
Composite attributes
• Composite attribute:– Eg: Name: LName, MI, FName
• Have as different columns in tableCREATE TABLE EMP
( FNAME VARCHAR(15)MINIT CHAR,LNAME VARCHAR(15) SSN CHAR(9));
8
Weak Entity
• Weak entity WK with owner entity OW. How?
• Create a relation R and include all attributes of WK as attributes of R. PK of R?
• The PK of R is the combination of the PK of OW and the partial key of the weak entity WK.
• In addition, include as a foreign key of R the primary key attribute of OW. If OW deleted?
• When OW is deleted, WK must also be deleted.
9
Weak EntityCREATE TABLE DEPENDENT (
ESSN CHAR(9) NOT NULL,
DNAME VARCHAR(15),
SEX CHAR,
PRIMARY KEY (DNAME, ESSN),
FOREIGN KEY (ESSN) REFERENCES EMPLOYEE (SSN),
ON DELETE CASCADE)
10
Many-many relationship
• many-many relationship R between entities E and F.– Eg: WORKSON
• Can R be represented in the table for E?• No: one E can be connected to many F.
– For same reason, can’t be in table for F. How to do?
• Create a new table to represent R.– What will be the PK of this table?
• Composite PK: the PK from each table– Eg: SSN, PNUMBER
11
Many-many relationship• Attributes of relationship type R?
• Will become columns of table R– Eg: HOURS in WORKSON tableCREATE TABLE WorksOn (
ESSN CHAR(9),
PNO INTEGER,
HOURS DECIMAL,
PRIMARY KEY (ESSN, PNO),
FOREIGN KEY (ESSN)
REFERENCES EMPLOYEE (SSN),
FOREIGN KEY (PNO)
REFERENCES PROJECT(PNUMBER)
12
One-many relationship• Eg: WORKSFOR• 1:N relationship R, between E (1) and F (N).
– Eg: 1 for DEPT, N for EMP
• Options?• Create a new table or • Put R info in the table for F: advantage is we won’t
have to look up 2 tables– PK will stay the same.
– Foreign Key to E
– Include attributes of R in this table
• Store DNUMBER or DNAME in EMP table?• Store PK from other table
13
One-many, recursive relationshipCREATE TABLE EMP
( ESSN CHAR(9), BDATE DATE, DNO INTEGER, SUPERSSN CHAR(9), PRIMARY KEY (ESSN), FOREIGN KEY (DNO) REFERENCES DEPT (DNUMBER), FOREIGN KEY (SUPERSSN) REFERENCES EMPLOYEE (SSN),);
• Recursive relationship: supervision: is having supervisor in supervisee table better than having supervisee in supervisor table?
• Yes: one-many
• Foreign Key to PK in same table
14
1-1 relationship• 1:1 relationship R, between E and F
– Eg: MANAGES ,1 for EMP, 1 for DEPT
• Create a new table or
• Put R info in E table or in F table– Could put in EMP or DEPT. Which is better?
• DEPT: every DEPT has a Manager, but not every EMP is a manager. Difference?
• Won’t need NULLS if in DEPT table
• Generally put in side of total participation
• Attribute of R (eg: MgrStartDate) also here
15
1-1 relationshipCREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,DNUMBER INTEGER NOT NULL,MGRSSN CHAR(9),MGRSTARTDATE CHAR(9),PRIMARY KEY (DNUMBER),UNIQUE (DNAME),FOREIGN KEY (MGRSSN) REFERENCES EMP (SSN) );
• Is it possible that even for a 1-1 relationship we might want a separate table?
• Yes, if only a few instances of relationship and not total on either side. Avoids NULLs.
• Disadvantage of separate table: lots of JOINS.
16
Multi-valued attributes
• Eg: DEPT_LOCATIONS. How to do?• Can it go in DEPT table?• No: for one department can have multiple values• Needs a separate table. Primary key?• PK: DNUMBER, DLOCATION. Foreign key?• FK: DNUMBER• Multi-valued attributes need separate table
– Composite PK
17
FIGURE 3.7Result of mapping the COMPANY ER schema into a relational schema.
18
Participation Constraints
• Can we capture total participation in relational schema?– Eg: every DEPT has to have a manager
– Eg: every EMP has to work in a DEPT
– Can this be done in relational schema ?
• No : can’t capture this in relational schema . In SQL ?• By specifying NOT NULL
CREATE TABLE EMP( ESSN CHAR(9),
BDATE DATE,DNO INTEGER NOT NULL);
19
Ternary relationships: Elmasri Eg• Elmasri FIGURE 7.17 (a) SUPPLY relationship
• How to do?
• Create a new table. Primary keys?
• PK: composite: the PK of all the 3 entities.
• FK: to all 3 entities
20
Elmasri FIGURE 9.4: Mapping the n-ary relationship type SUPPLY
21
Correspondence between ER and relational models Table 9.1
22
Build relational schema for the bank database from Figure 7.21
© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition
23
Informal Design Guidelines for Databases
• Relational database design– Create "good" relation schema– Criteria for "good" relation schema?
• First we discuss informal guidelines.
• Then discuss formal concepts of functional dependencies and normal forms– 1NF (First Normal Form)– 2NF (Second Normal Form)– BCNF (Boyce-Codd Normal Form)
24
Informal Guidelines for Relations
• Each tuple in a relation should represent one entity or relationship instance.
• Attributes of different entities (EMPLOYEE, DEPARTMENT, PROJECT) should not be mixed in the same relation
• Only foreign keys used to refer to other entities
• Basic design principle: Design a schema such that– What each table stands for should be easy to explain
– Semantics of attributes should be easy to interpret/explain.
How to design a database
• Build the ER diagram.– Figure out what data is being stored– The connections between different data
• Translate ER diagram to relational schema – keeping informal guidelines in mind
• Run through normalization checks
26
Eg of bad relational design• Eg: Students pay same amount for same course
– Tuition across different courses varies
• Suppose store information in a single table:– Which students take which courses.– What is the tuition for a course
Howmuch ( studentid, course# , tuition)
Stid c# tuition
123 622 1100
234 622 1100
345 640 1200
• Problems with this design ?
27
Eg of bad relational design
Redundancy: tuition repeated for students taking the same course. Problem with redundancy ?
Wasted storage. Because of the duplication, can get inconsistency:
If we make changes to tuition 622’s tuition 1150 in one place 1100 in another.
28
Modification Anomalies• Modification anomalies – unexpected and undesirable
behavior when making changes• Update Anomaly: if 622’s tuition changed to 1140 ?• Have to make this change in many places.
– Same information is in multiple rows
• Deletion Anomaly: if 345 leaves ?• We lose tuition info for 640.
– Lose info we would like to keep
• Insertion Anomaly: Can insert 632’s fee is 1175 ?• No : because no students registered for 632
– Can’t store info we would like to store
• How to fix ?
29
Relational Design/Modification Anomalies
• Problem: too much info in one table– Has to be split. – How to split ?
Whotakes( stid,course#), Cost( course#,tuition)
• Does this solve all the problems:– Redundancy– Update anomaly– Insertion anomaly– Deletion anomaly
30
Non-loss decomposition• Problem: how do we know when a table can be
split into two tables without loss of information? • Non-loss decomposition: no info loss if table
split up– what does this mean in specific terms
• If we break table up, should be able to recover original table with join– without spurious tuples
• new rows which were not there in original table.
• Lossy decomposition : we lose info when we split table. Eg: spurious tuples when we do join
31
Non-loss/lossy decomposition eg(S#,Status,City) broken up into
(S#,Status) and (S#, City).
• Is this a lossy or non-loss decompositionCan recover original table with join. How do we know this ?
•S# is primary key
•Non-loss but no advantage of doing this. Disadvantage ?
•Need a join if we wanted to answer:– Which cities have S with status 20
32
Non-loss/lossy decomposition eg• (S#,Status,City) broken up into :
(S#,Status) and (Status, City).
S Status City
S1 10 SF
S2 20 NY
S3 20 LA
• Will the join (over status) of the two new tables give back the original table?
S Status
S1 10
S2 20
S3 20
Status City
10 SF
20 NY
20 LA No: join will also
have spurious tuplesS2 20 LAS3 20 NY
33
Normalization• Want to design/modify tables to control redundancy
– To avoid modification anomalies and wasted storage. – This process is called Normalization
• Look at a given table structure and see if it can be improved.
• Is there potential for anomalies because too much info in one table ?
• How to break up ? What do we want ?• Non-loss decomposition
• Any downside to normalization?
34
Normalization• By splitting up tables, may lead to loss of
efficiency. Why ?
• Because more joins needed
• Normalization is a guide– Don’t always have to follow– But have to understand
• Works through stages, called Normal Forms
– each is stronger than the previous
– i.e. the higher the normal form, the stronger the guarantee of less redundancy.