©Chisholm Institute
Database Design
What is a Database?
A collection of data that is organised in a predictable, structured way.
Any organised collection of data in one place can be considered a database.
Examples: filing cabinet, library, floppy disk
What is Data?
The heart of the DBMS. There are two kinds:
Data: the collection of information that is stored in the database.
Metadata: information about the database, also known as a data dictionary.
Relational Data Model
A relational database is perceived as a collection of tables.
Each table consists of a series of rows & columns.
Tables (or relations) are related to each other by sharing a common characteristic (e.g. a customer or product table).
A table yields complete physical data independence.
Features of the relational data model
Logical and physical aspects are separated.
Simple to understand. Easy to use.
Powerful nonprocedural (what, not how) language to access data.
Uniform access to all data.
Rigorous database design principles.
Access paths by matching data values, not by following fixed links.
Terminology: Relation
A relation is a 2-dimensional table of values with these properties:
No duplicate rows
Rows can be in any order
Columns are uniquely named by Attributes
Each cell contains only one value

Employee  Job        Manager
Jack      Secretary  Jill
Jill      Executive  Bozo
Bozo      Director
Lulu      Clerk      Jill

The special value NULL implies that there is no corresponding value for that cell. This may mean the value does not apply or that it is unavailable. Entire rows of NULLs are not allowed.
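Two of these properties can be checked mechanically. The following is a minimal Python sketch, assuming each row is held as a tuple and NULL is represented by None; the function name is_relation is made up for illustration and is not part of the slides.

```python
# Hypothetical in-memory version of the EMPLOYEE table; None stands for NULL.
employee = [
    ("Jack", "Secretary", "Jill"),
    ("Jill", "Executive", "Bozo"),
    ("Bozo", "Director", None),
    ("Lulu", "Clerk", "Jill"),
]


def is_relation(rows):
    """Check two relation properties: no duplicate rows and no all-NULL rows."""
    no_duplicates = len(set(rows)) == len(rows)
    no_null_rows = all(any(cell is not None for cell in row) for row in rows)
    return no_duplicates and no_null_rows


print(is_relation(employee))                  # True
print(is_relation(employee + [employee[0]]))  # False: Jack's row is repeated
```

A single NULL cell (Bozo's manager) is accepted; only a row made up entirely of NULLs is rejected.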
Terminology: Tuple and Attribute
Tuple
• Commonly referred to as a row in a relation. E.g.:
Jack  Clerk  Jill
Attribute
• A name given to a column in a relation. Each column must have a unique attribute. These are often referred to as fields. E.g.:
Employee  Job  Manager
Terminology: Domain
A domain is a pool of atomic values from which the cells of a given column take their values. Each attribute has a domain. Attributes may share domains.

Domain       Example values
Person Name  Tom, Mary, Bozo, Kali, ...
Job Name     Typist, Manager, Clerk, ...

Attribute  Domain
Employee   Person Name
Job        Job Name
Manager    Person Name

For Manager we again use the same domain as for Employee.
An attribute value (a value in a column labelled by the attribute) must be from the corresponding domain or may be NULL ( ).
Terminology: Relation Schema
A Relation Schema is a named set of attributes. It refers only to the structure of a relation. It is derived from the traditional set notation displayed below:
EMPLOYEE = { Employee, Job, Manager }
This is usually written in a modified version for database purposes:
EMPLOYEE( Employee, Job, Manager )
referring to the table

EMPLOYEE
Employee  Job  Manager
Terminology: Integrity Constraint and Domain Constraint
An Integrity Constraint is a condition that prescribes what values are allowable in a relation. This permits the restriction of the type of value that can be placed in a particular cell, e.g. only numbers for telephone numbers.
The Domain Constraint is a condition on the allowable values for an attribute,
e.g. Salary < $60,000. This restricts the salary to be under a set value.

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
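As an illustration, the domain constraint Salary < $60,000 can be tested against the sample rows. This is only a sketch in Python: the tuple layout, the constant SALARY_LIMIT and the function salary_violations are assumptions made for the example.

```python
# Hypothetical rows of EMPLOYEE as (Employee, Job, Manager, Salary); None stands for NULL.
employee = [
    ("Jack", "Secretary", "Jill", 25000),
    ("Jill", "Executive", "Bozo", 40000),
    ("Bozo", "Director", None, 50000),
    ("Lulu", "Clerk", "Jill", 30000),
]

SALARY_LIMIT = 60000  # from the rule Salary < $60,000


def salary_violations(rows, limit=SALARY_LIMIT):
    """Return the rows whose Salary breaks the domain constraint Salary < limit."""
    return [row for row in rows if not row[3] < limit]


print(salary_violations(employee))  # []
```

An empty list means every salary in the table is allowed by the constraint.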
Terminology: Key Constraint
A Key Constraint is a condition that no value of an attribute or set of attributes be repeated in a relation,
e.g. Employee (the attribute) has only unique values in EMPLOYEE (the relation). The following relation violates this constraint:

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000

Jack appears twice. This violates the Key Constraint.
Terminology: Key Constraint
An attribute (or set of attributes) to which a key constraint applies is called a key (or candidate key). Every relation schema must have a key.

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000

Key: Employee.
Another possible key: the combination of Job and Manager is also unique.
If a key constraint applies to a set of attributes, it is called a composite or Concatenated Key. Otherwise it is a simple key.
Simple Key: Employee. Composite Key: Job + Manager.
Terminology: Key Constraint
A key cannot have a NULL ( ) value.
For example, if we change the table so that the employee Bozo does not have a manager, then Job+Manager cannot be a key.

Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000
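The two rules for a key (no repeated values, no NULLs) can be tried out on the sample data. The sketch below is in Python and assumes rows are dictionaries with None for NULL; can_be_key is a hypothetical helper name. Note that it only tests the rows it is given, whereas a real key constraint must hold for every state the table may ever be in.

```python
# Hypothetical rows as dictionaries; None stands for NULL.
employee = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Bozo", "Salary": 25000},
    {"Employee": "Kim", "Job": "Secretary", "Manager": "Jill", "Salary": 25000},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo", "Salary": 40000},
    {"Employee": "Bozo", "Job": "Director", "Manager": None, "Salary": 50000},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill", "Salary": 30000},
]


def can_be_key(rows, attributes):
    """True if the attribute values are never NULL and never repeated."""
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in attributes)
        if None in values or values in seen:
            return False
        seen.add(values)
    return True


print(can_be_key(employee, ["Employee"]))        # True
print(can_be_key(employee, ["Job", "Manager"]))  # False: Bozo has no manager
print(can_be_key(employee, ["Job"]))             # False: Secretary is repeated
```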
Terminology: Key Constraint
A primary key is a special preassigned key that can always be used to uniquely identify tuples. We have to choose a Primary Key for every Relation. We must consider all of the Candidate Keys and choose between them.
That Employee is the primary key for EMPLOYEE is usually written as:
EMPLOYEE( Employee, Job, Manager, Salary ), with the primary key attribute Employee underlined.

Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000

Here we have chosen the Simple Key Employee over the concatenated option of both Job and Manager.
A Database is more than multiple tables: you must be able to “relate” them.

Cus-code  Cus-Name  Area-Code  Phone     Agent-Code
10010     Ramus     615        844-2573  502
10011     Dunne     713        894-1238  501
10012     Smith     615        894-2205  502
10013     Olowaski  615        894-2180  502
10014     Orlando   615        222-1672  501
10015     O’Brian   713        442-3381  503
10016     Brown     615        297-1226  502
10017     Williams  615        290-2556  503
10018     Farris    713        382-7185  501
10019     Smith     615        297-3809  503

Agent-Code  Agent-Name  Agent-AreaCode  Agent-Phone
501         Alby        713             226-1249
502         Hahn        615             882-1244
503         Okon        615             123-5589

The link is through the Agent-Code.
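To make the link concrete, here is a small Python sketch that relates customers to agents by matching Agent-Code values. It uses three of the customer rows; the variable names and the function customers_with_agents are assumptions for illustration only.

```python
# A few rows from the two tables (hypothetical in-memory form).
customers = [
    (10010, "Ramus", 502),     # (Cus-code, Cus-Name, Agent-Code)
    (10011, "Dunne", 501),
    (10017, "Williams", 503),
]
agents = {501: "Alby", 502: "Hahn", 503: "Okon"}  # Agent-Code -> Agent-Name


def customers_with_agents(customers, agents):
    """Relate each customer to its agent by matching Agent-Code values."""
    return [(name, agents[agent_code]) for _, name, agent_code in customers]


print(customers_with_agents(customers, agents))
# [('Ramus', 'Hahn'), ('Dunne', 'Alby'), ('Williams', 'Okon')]
```

The agent's name is stored once, in the agent table, and is found by value matching rather than by a fixed pointer from the customer row.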
Terminology: Relational Database
A Relational Database is just a set of Relations. For example:

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director            50,000
Lulu      Clerk      Jill     30,000

JOB
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000

Which Attribute do you think relates these two tables together?
Terminology: Relational Database Schema
A Relational Database Schema is a set of Relation Schemas, together with a set of Integrity Constraints.
For example, the Relations that you have been looking at, with the headings

EMPLOYEE
Employee  Job  Manager  Salary

JOB
Job  Salary

are usually written as
EMPLOYEE(Employee, Job, Manager)
JOB(Job, Salary)
Notice how the Primary Keys, Employee in EMPLOYEE and Job in JOB, are underlined.
Terminology: Referential Integrity Constraint
This constraint says that all the values in one column should also appear in another column. Look at the tables below. Every entry in the Job column of the EMPLOYEE table must appear in the Job column of the JOB table.

EMPLOYEE (Job is the FK)
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Jill      Executive  Bozo
Bozo      Director
Lulu      Clerk      Jill

JOB (Job is the PK)
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000

EMPLOYEE(Job) FK → JOB(Job) PK
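A referential integrity check can be sketched in a few lines of Python. The example assumes tuples for rows and None for NULL, and it assumes that a NULL foreign key value is allowed (a common convention, not stated on the slide); dangling_values is a made-up helper name.

```python
# Hypothetical in-memory tables; None stands for NULL.
employee = [
    ("Jack", "Secretary", "Bozo"),
    ("Kim", "Secretary", "Jill"),
    ("Jill", "Executive", "Bozo"),
    ("Bozo", "Director", None),
    ("Lulu", "Clerk", "Jill"),
]
job = [("Secretary", 25000), ("Executive", 40000), ("Director", 50000), ("Clerk", 30000)]


def dangling_values(fk_values, pk_values):
    """Return FK values (ignoring NULLs) that have no matching PK value."""
    pk = set(pk_values)
    return sorted({v for v in fk_values if v is not None and v not in pk})


job_fk = [row[1] for row in employee]  # EMPLOYEE(Job), the foreign key
job_pk = [row[0] for row in job]       # JOB(Job), the primary key
print(dangling_values(job_fk, job_pk))      # []
print(dangling_values(job_fk, job_pk[2:]))  # ['Executive', 'Secretary']
```

The second call drops Secretary and Executive from JOB, so the employees holding those jobs no longer have a matching primary key value.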
Referential Integrity Constraint
Why does the following relational database violate the referential integrity constraints?

EMPLOYEE
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Bozo      Director
Lulu      Clerk      Jill

JOB
Job       Salary
Director  50,000
Clerk     30,000

In other words, why can’t Employee(Job) be a Foreign Key to Job(Job), or Employee(Manager) be a Foreign Key to Employee(Employee)?
Why Use Relational Databases
Their major advantage is that they minimise the need to store the same data in a number of places.
Storing the same data in several places is referred to as data redundancy.
Example of Data Redundancy (1)
Example of Data Redundancy (2)
The names and addresses of all students are being maintained in three places.
If Owen Money moves house, his address needs to be updated in three separate places.
Consider what might happen if he forgot to let library administration know.
Example of Data Redundancy (3)
Example of Data Redundancy (4)
Data redundancy results in:
wastage of storage space by recording duplicate information
difficulty in updating information
inaccurate, out-of-date data being maintained
Other Advantages of Relational Databases
Flexibility: relationships (links) are not implicitly defined by the data
Data structures are easily modified
Data can be added, deleted, modified or queried easily
Summary of Some Common Relational Terms
Entity - an object (person, place or thing) that we wish to store data about
Relationship - an association between two entities
Relation - a table of data
Tuple - a row of data in a table
Attribute - a column of data in a table
Primary Key - an attribute (or group of attributes) that uniquely identifies individual records in a table
Foreign Key - an attribute appearing within a table that is a primary key in another table
Network Diagrams
Terminology: Network Diagram
Referential Integrity constraints can easily be represented by arrows FK → PK. The arrow points from the Foreign Key to the matching Primary Key.
EMPLOYEE(Employee, Job, Manager) JOB(Job, Salary)
A relational database schema with referential integrity constraints can also be represented by a network diagram. A Referential Integrity Constraint is notated as an arrow labelled by the foreign key. You must always write the label of the Foreign Key on the arrow. Sometimes the same attribute has different titles in different tables.

Network Diagram
EMPLOYEE --Job--> JOB
EMPLOYEE --Manager--> EMPLOYEE
Notice here, the label is Manager and not Employee.
Personnel Database: Consider the following Tables

ASSIGNMENT
E_NUMBER  P_NUMBER
1001      26713
1002      26713
1003      23760
1003      26511
1004      26511
1004      28765
1005      23760

SKILL
AREA
Stock Market
Taxation
Investments
Management

PRIOR_JOB
E_NUMBER  PRIOR_TITLE
1001      Junior consultant
1001      Research analyst
1002      Junior consultant
1002      Research analyst
1003      Junior consultant
1004      Summer intern

EXPERTISE
E_NUMBER  SKILL
1001      Stock market
1001      Investments
1002      Stock market
1003      Stock market
1003      Investments
1004      Taxation
1005      Management

PROJECT
NAME                  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
New billing system    23760     Yates    1000         10000
Common stock issue    28765     Baker    3000         4000
Resolve bad debts     26713     Kanter   2000         1500
New office lease      26511     Yates    5000         5000
Revise documentation  34054     Kanter   100          3000
Entertain new client  87108     Yates    5000         2000
New TV commercial     85005     Baker    10000        8000

EMPLOYEE
NAME    E_NUMBER  DEPARTMENT
Kanter  1111      Finance
Yates   1112      Accounting
Adams   1001      Finance
Baker   1002      Finance
Clarke  1003      Accounting
Dexter  1004      Finance
Early   1005      Accounting

TITLE
E_NUMBER  CURRENT_TITLE
1001      Senior consultant
1002      Senior consultant
1003      Senior consultant
1004      Junior consultant
1005      Junior consultant
Personnel Database Schema
Not FK, we will look at this later
What are the connecting Foreign Keys to Primary Keys?
ASSIGNMENT (E_NUMBER, P_NUMBER)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
SKILL (AREA)
PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
TITLE (E_NUMBER, CURRENT_TITLE)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT)
Personnel Database Network Diagram
Once you have produced your Schema and identified the Primary and Foreign Keys you can create the Network Diagram. The Network Diagram shows each of the tables with their links. Each of the Tables (Relations) is represented by a rectangle. They are then connected by arrows that show the FKs pointing to the PKs. The arrow head points towards the PK, while the FK name written on the arrow is the name of the attribute in the table that has the FK in it.
[Network diagram of the relations PROJECT, EMPLOYEE, SKILL, TITLE, PRIOR_JOB, EXPERTISE and ASSIGNMENT]
Summary: Questions
What is a Relational Database?
What is a relation?
What are Constraints?
What is a Schema?
What is a Network Diagram and why is it used?
Summary: Answers
A relational database is based on the relational data model. It is one or more Relations (Tables) that are related to each other.
A relation is a table composed of rows (tuples) and columns, satisfying 5 properties:
• No duplicate rows
• Rows can be in any order
• Columns are uniquely named by Attributes
• Each cell contains only one value
• No null rows
Constraints are central to the correct modelling of business information. Here we have seen them limit the set-up of your tables: Referential Integrity Constraint.
The Network Diagram is used to navigate complex database structures. It is a compact way to show the relationships between Relations (Tables).
Activities
Consider the following relational database schemas.
Suppliers(suppId, name, street, city, state)
Part(partId, partName, weight, length, composition)
Products(prodId, prodName, department)
Supplies(partId, suppId)
Uses(partId, prodId)
Make reasonable assumptions about the meaning of attributes and relations, identify the primary and foreign keys and draw a network diagram showing the relations and foreign keys.
Answer
[Network diagram of the relations Supplies, Supplier, Part, Uses and Product]
Show the foreign keys on the network diagrams

Orders
Ordnum  ordDate  custNumb
12489   2/9/91   124

Customer
custNumb  custName  Address    Balance  credLim  Slsnumber
124       Adams     48 oak st  418.68   500      3

SalesRep
Slsnumber  Name  address  totCom  commRate
3          Mary  12 Way   2150    .05

Part
Part  Desc  onHand  IT  wehsNumb  unitPrice
AX12  Iron  1.4     HW  3         17.95

OrLine
ordNum  Part  ordNum  quotePrice
Answer
[Network diagram of the relations SalesRep, Customer, Orders, OrLine and Part; the arrow labels that survive are SlsNumber, Part and CustNumb]
Activities
What problems may arise from this table?
What data redundancies are there?
What changes would you make? (Hint: make another table.)
What if I wanted to search by surname?
Activities
What is wrong with this table?
Functional Dependence FDD
Functional Dependency Diagrams
A FUNCTIONAL DEPENDENCY DIAGRAM is a way of representing the structure of information needed to support a business or organization.
It can easily be converted into a design for a relational database to support the operations of the business.
Data Analysis and Database Design Using Functional Dependency Diagrams
1. The steps of Data Analysis in FDD are
1.1 Look for Data Elements
1.2 Look for Functional Dependencies
1.3 Represent Functional Dependencies in a diagram
1.4 Eliminate Redundant Functional Dependencies
2. Data Design, after we have our final version of the FDD
2.1 Apply the Synthesis Algorithm
Starting points for drawing functional dependency diagrams
To start the process of constructing our FDD we do the following:
We must understand the data
We examine forms, reports, data entry and output screens, etc.
We examine sample data
We consider Enterprise (business) rules
We examine narrative descriptions and conduct interviews
We apply our experience and practice, and that of others
Enterprise Rules
What are Enterprise / Business Rules?
An enterprise rule (in the context of data analysis) is a statement made by the enterprise (organisation, company, officer in charge etc.) which constrains data in some way.
Functional dependencies are the most important type of constraint on data and are often expressed in the form of enterprise rules,
e.g.
No two employees may have the same employee number.
An order is made by only one customer.
An employee can belong to only one department at a time.
Drawing FDDs - Data Elements
We often refer to Data Elements during the FDD process.
A data element is an elementary piece of recorded information.
Every data element has a unique name.
A data element is either a
Label, e.g. PersonName, Address, BuildingCode, or
Measurement, e.g. Height, Age, Date
A data element must take values that can be written down.
Functional Dependency Diagrams
[Flow diagram. Given the problem and sample data, identify the attributes and functional dependencies. Using the Method of Decomposition: Universal Relation (0NF tables) → eliminate repeating groups → 1NF → eliminate part-key dependencies → 2NF Relation → eliminate non-key dependencies → 3NF Relation. OR, here is the same process using the FDD approach: Functional Dependency Diagram → Method of Synthesis. Either way, now we have the Database Design.]
Data Element Examples
Here are some examples:
PersonName has values Jeff, Jill, Gio, Enid
Address has values 1 John St, 25 Rocky Road
Height has values 171cm, 195cm
Age has values 21, 52, 93, 2
Date has values 20th May 1947, 2nd March 1997
JobName has values Manager, Secretary, Clerk
Manager might not be a data element, but ManagerName could be. Manager could instead be a value of another data element, e.g. JobName.
Drawing FDDs - Data Elements
Start drawing the Functional Dependency Diagram by representing the Data Elements. A Data Element is represented by its name placed in a box:
[ Data Element ]
Every data element must have a unique name in the functional dependency diagram.
A data element cannot be composed of other data elements, i.e. it cannot be broken down into smaller components.
A Data Element is also known as an ATTRIBUTE, because it generally describes a property of some thing which we will later call an ENTITY.
Drawing FDDs - Using Elements
A Functional Dependency is a relationship between Attributes.
It is shown as an arrow, e.g. A → B
It means that for every value of A, there is only one value for B. It reads “A determines B”.
A is called the determinant attribute.
B is called the dependent attribute.
Data Element Examples
Here are some examples of finding the Data Elements on a typical form.
Surname . . . . . . . . . . . . . . . . .
on a form gives rise to the element Surname.
CREDIT CARD  Bankcard  Mastercard  Visa  Other
on a form gives rise to the element CreditCardType.
Functional Dependency Examples
Students and their family names
“Each student (identified by student number) has only one family name”

Students  FamilyName
1         Smith
2         Jones
3         Smith
4         Andrews

Considering the rule stated above, we should be able to draw an FDD for this. What are the elements of interest?
FDDs Answer
Data elements of interest are Student# and FamilyName.

Students  FamilyName
1         Smith
2         Jones
3         Smith
4         Andrews

Student# → FamilyName
Student# determines FamilyName (or FamilyName depends on Student#).
Each student has exactly one family name, but the name could be the name of many students.
So FamilyName does not determine Student#, e.g. “Smith” is the name of students 1 and 3.
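The student example can be checked against the sample data with a short Python sketch. The function determines and the tuple layout are assumptions made for illustration. Keep in mind that sample data can only disprove a functional dependency; whether one really holds is decided by the enterprise rules.

```python
# Sample rows (Student#, FamilyName) from the student table.
students = [(1, "Smith"), (2, "Jones"), (3, "Smith"), (4, "Andrews")]


def determines(rows, a_index, b_index):
    """True if every value in column a_index is paired with only one value in column b_index."""
    seen = {}
    for row in rows:
        a, b = row[a_index], row[b_index]
        if seen.setdefault(a, b) != b:
            return False
    return True


print(determines(students, 0, 1))  # True:  Student# -> FamilyName
print(determines(students, 1, 0))  # False: Smith belongs to students 1 and 3
```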
FDDs Examples
Employees and the departments they work for.
Enterprise Rule: “Each employee works in only one department”

Department Name  Accounting
Employee Number  11, 2, 31

Department Name  Sales
Employee Number  45, 27

In this example the tables represent some interesting data of the business. We see that Employees with the ID numbers 11, 2 and 31 all work in the Accounting Dept and that Employees with the ID numbers 45 and 27 work in the Sales Dept.
Do you think that you could draw an FDD to represent this? Have a go and then check your answers.
FDD Answers
Employees and the departments they work for.
Data elements of interest are Employee# and DeptName.
Employee# → DeptName
So we could make the following table:

Employee#  DeptName
11         Acc
2          Acc
45         Sales
31         Acc
27         Sales
FDDs Examples
The quantity of parts held in a warehouse and their suppliers
“Parts are uniquely identified by part numbers”
“Suppliers are uniquely identified by Supplier Names”
“A part is supplied by only one supplier”
“A part is held in only one quantity”

Parts  Suppliers Name          QOH
1      Wang Electronics        23
2      Cumberland Enterprises  80
3      Wang Electronics        4
4      Roscoe Pty. Ltd         58

Part# determines SupplierName and Part# determines QOH:
Part# → SupplierName
Part# → QOH
Should QOH be a determinant? No, common sense tells us that it is not a reliable choice. We could have had repeating values.
FDDs Examples
Students and their subjects enrolled.
“Each student is given a unique student number”
“A subject is uniquely identified by its name”
“A student may choose several subjects”
Data elements of interest are Student# and SubjectName.

Student  SubjectName
1        History
1        Geography
1        Mathematics
2        English
3        Mathematics
3        English
4        French
4        Geography

There is no functional dependency here.
Student# does not determine SubjectName, nor does SubjectName determine Student#.
FDDs Examples
Results obtained by each student for each subject.
“Each student is given a unique student number”
“A subject is uniquely identified by its name”
“A student may choose several subjects”
“A student is allocated a result for each subject”
“Each student has only one name.”
Data elements are Student#, StudentName, SubjectName and Grade.
FDDs Examples
Results obtained by each student for each subject.

Student  Student Name  Subject Name  Grade
1        Smith         History       A
1        Smith         Geography     B
1        Smith         Mathematics   A
2        Jones         History       C
2        Jones         English       C
3        Smith         English       A
3        Smith         Mathematics   A
4        Andrews       English       D
4        Andrews       French        C
4        Andrews       Geography     C

Try to construct an FDD for this table, considering the given Business Rules and the Data Elements.
FDDs Examples
Results obtained by each student for each subject.
We can see that there is one and only one student name for each student number, even though there might be more than one student with the same name. So:
Student# → StudentName
But the subject grade for any student cannot be determined by the subject name or the student# by itself. A student can have many grades depending on the subject. How can we cater for this?
FDDs Answer
Results obtained by each student for each subject.
We need to combine the two Elements to say that there is one and only one grade for a student doing a particular subject. Here then is the complete diagram:
Student# → StudentName
(Student#, SubjectName) → Grade
The combination of Student# and SubjectName is called the Composite Determinant.
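The need for a composite determinant can be seen by testing the results data in Python. This is a sketch only: the helper names determines and smallest_determinants are invented for the example, and the search is limited to Student# and SubjectName because those are the two elements the enterprise rules point to.

```python
from itertools import combinations

COLUMNS = ("Student#", "StudentName", "SubjectName", "Grade")
results = [
    (1, "Smith", "History", "A"),
    (1, "Smith", "Geography", "B"),
    (1, "Smith", "Mathematics", "A"),
    (2, "Jones", "History", "C"),
    (2, "Jones", "English", "C"),
    (3, "Smith", "English", "A"),
    (3, "Smith", "Mathematics", "A"),
    (4, "Andrews", "English", "D"),
    (4, "Andrews", "French", "C"),
    (4, "Andrews", "Geography", "C"),
]


def determines(rows, lhs, rhs):
    """True if each combination of lhs column values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[COLUMNS.index(c)] for c in lhs)
        value = row[COLUMNS.index(rhs)]
        if seen.setdefault(key, value) != value:
            return False
    return True


def smallest_determinants(rows, candidates, rhs):
    """Try single attributes first, then pairs; return the first size that works."""
    for size in range(1, len(candidates) + 1):
        found = [lhs for lhs in combinations(candidates, size) if determines(rows, lhs, rhs)]
        if found:
            return found
    return []


print(smallest_determinants(results, ("Student#", "SubjectName"), "Grade"))
# [('Student#', 'SubjectName')]
```

Neither Student# nor SubjectName determines Grade on its own, so the search only succeeds when it reaches the pair.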
FDDs Examples
Customer Orders

Order  Part#  CustomerName  Address
454    12     David Smith   1 John St, Hawthorn
454    23     David Smith   1 John St, Hawthorn
455    32     Emily Jones   45 Grattan St, Parkville
455    49     Emily Jones   45 Grattan St, Parkville
455    54     Emily Jones   45 Grattan St, Parkville
456    12     Mary Ho       44 Park St, Hawthorn
456    54     Mary Ho       44 Park St, Hawthorn

Validating functional dependencies
Using simple data to populate the table, check that there is only one value of the dependent for each value of the determinant.
FDDs Examples
“An order is uniquely identified by its number”
“Customers are uniquely identified by their names”
“A customer has only one address”
“An order belongs to only one customer”
“A part may be ordered only once on each order”

Order  Parts Ordered  CustomerName  Address
454    23, 12         David Smith   1 John St, Hawthorn
455    54, 49, 32     Emily Jones   45 Grattan St, Parkville
456    54, 12         Mary Ho       44 Park St, Hawthorn

Order → CustomerName
CustomerName → Address
Part# (no arrow in or out)
FDDs Examples
Employees and their tax file numbers
“Each employee has a unique employee number”
“Each employee has a unique tax file number”

Employee  TaxFile#
1         1024-5321
2         3456-3294
3         8246-7106
4         8861-6750
5         1234-4765

Employee# determines Taxfile#: Employee# → Taxfile#
Taxfile# determines Employee#: Taxfile# → Employee#
Employee# and Taxfile# are alternative keys.

Obtain Tutorial 1 from your tutor.
Functional Dependency Diagrams
Database Design
Let’s look at the process of converting the FDD into a schema. We have a 12-step process to do so, which has an iterative component to it (a loop).
The 12 steps are outlined in the next series of slides.
Functional Dependency Diagram Preparation
1. Represent each data element as a box.
2. Represent each functional dependency by an arrow.
3. Eliminate augmented dependencies.
4. Eliminate transitive dependencies.
5. Eliminate pseudo-transitive dependencies.
By this stage, intersecting attributes should have been eliminated.
Deriving 3NF Schema: Synthesis Algorithm
6. Pick any (unmarked) arrow in the diagram.
7. Follow it back to its source, and write down the name of the source.
8. Follow all arrows from the source data item, and write down the names of their destinations.
For example, if S has arrows to A, B and C, we write down S, A, B, C.
S is now the key of a 3NF relation (S, A, B, C).
Synthesis Algorithm: Deriving 3NF Schema
9. Mark all the arrows just processed.
10. If there are any unmarked arrows in the diagram, go back to step 6.
11. Finally, determine the Universal Key. Any attribute which is not determined by any other attribute (i.e. has no arrow going into it) is part of the Universal Key.
12. If the universal key is not already contained in any of the above relations, make it into a relation. The universal key is the key of the new relation.
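Steps 6 to 12 can be sketched as a small Python function. This is one possible reading of the steps, not a definitive implementation: it assumes steps 1 to 5 have already been carried out, that each arrow is written as a (source, destination) pair, and that a composite determinant is a tuple with more than one name. The attribute names S, A, B, C, U, D and V and the function name synthesise are made up for the example.

```python
def synthesise(attributes, dependencies):
    """Steps 6-12: group arrows by source, then add the universal key if needed.

    dependencies is a list of (source, destination) arrows, where source is a
    tuple of attribute names (more than one name means a composite determinant).
    Returns a list of (key, attributes) pairs, one per relation.
    """
    relations = []
    unmarked = list(dependencies)
    while unmarked:                                   # step 10: repeat while arrows remain
        source, _ = unmarked[0]                       # steps 6-7: pick an arrow, find its source
        destinations = [d for s, d in unmarked if s == source]     # step 8
        relations.append((source, source + tuple(destinations)))
        unmarked = [(s, d) for s, d in unmarked if s != source]    # step 9: mark them
    determined = {d for _, d in dependencies}
    universal_key = tuple(a for a in attributes if a not in determined)   # step 11
    if not any(set(universal_key) <= set(attrs) for _, attrs in relations):
        relations.append((universal_key, universal_key))                  # step 12
    return relations


# Hypothetical diagram: S -> A, S -> B, S -> C, (S, U) -> D, and V with no arrows.
attributes = ["S", "A", "B", "C", "U", "D", "V"]
dependencies = [(("S",), "A"), (("S",), "B"), (("S",), "C"), (("S", "U"), "D")]
relations = synthesise(attributes, dependencies)
for key, attrs in relations:
    print("key:", key, "attributes:", attrs)
```

For this diagram the function produces three relations: (S, A, B, C) with key S, (S, U, D) with key (S, U), and a universal key relation (S, U, V), because V is not determined by anything and no earlier relation contains S, U and V together.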
A Fully Worked Example
We will now work from a given set of forms to produce an FDD, then use the 12 steps to produce the Schema. The forms that follow show the time spent by a particular employee on a particular project. They contain details of the employee along with details of the project. In addition they also state the hours that the employee has spent on any one project to date. This is important to the FDD. Notice also that the employee can have many previous titles and have a number of skills. This also has to be dealt with in the FDD, and then later after we have used the synthesis technique to create the Schema. Have a good look at the forms on the next 2 slides and try to develop the FDD yourself.
Personnel Database Forms 1

EMPLOYEE
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Adams  1001      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst   Investments

PROJECTS
NAME               TIME_SPENT  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
Resolve bad debts  35          26713     Kanter   2000         1500

We say that this table is in “zero normal form” (0NF). This is because the cells have multiple values, e.g. Prior titles and Skills. The next slide shows forms that demonstrate that an employee can work on many projects.
Personnel Database Forms 2

EMPLOYEE
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Baker  1002      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst

PROJECTS
NAME           TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
Res bad debts  18          26713     Kanter       2000         1500

EMPLOYEE
NAME    E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Clarke  1003      Accounting  8th Floor  Senior consultant  Junior consultant  Stock market
                                                                               Investments

PROJECTS
NAME                TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
New billing system  26          23760     Yates        1000         10000
New office lease    10          26511     Yates        5000         5000
Personnel Database FD Diagram
From the forms given we can produce the following FDD:
P_NUMBER → PROJECT_NAME
P_NUMBER → MANAGER_NUM
P_NUMBER → ACTUAL_COST
P_NUMBER → EXPECTED_COST
E_NUMBER → EMPLOYEE_NAME
E_NUMBER → CURRENT_TITLE
E_NUMBER → DEPARTMENT_NAME
DEPARTMENT_NAME → LOCATION
(P_NUMBER, E_NUMBER) → TIME_SPENT
PRIOR_TITLE and SKILL have no arrows pointing into them.
Personnel Database FD Diagram - Synthesis
Let us just consider the section of the FDD that looks at the project number as the determinant:
P_NUMBER → PROJECT_NAME, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST
By using the synthesis method we can choose an arrow, trace it back to the source, and gather together all of the attributes that the source points to. Try this and see if you can create the schema for this table.
Personnel Database FD Diagram - Synthesis
Again, if we choose another arrow that has not been chosen before and follow it back to the determinant, we find that DEPARTMENT_NAME is a determinant. Gathering all of the attributes that it points to, we only have the LOCATION attribute. Hence this is a simple table consisting of DEPARTMENT_NAME as the Primary Key and LOCATION as the only other attribute.
DEPARTMENT_NAME → LOCATION
So the table DEPT(DEPARTMENT_NAME, LOCATION) is created.
Personnel Database FD Diagram - Synthesis
[FD diagram: E_NUMBER -> EMPLOYEE_NAME, CURRENT_TITLE, DEPARTMENT_NAME]
Likewise for the section of the FDD based around E_NUMBER, which creates the following table for the employee's details.
EMPLOYEE (EMPLOYEE_NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)
Personnel Database FD Diagram - Synthesis
Here we have a slightly more complicated one. The time spent on the project is dependent on both the project number and the employee number, as it is the time spent by a particular employee on a particular project. This is demonstrated by the boxing of both of these attributes together, pointing to TIME_SPENT.
[FD diagram: (E_NUMBER, P_NUMBER) -> TIME_SPENT]
Try to create the Assignment table for this part of the FDD. When you think you have it, have a look at ours and see if you are right.
Personnel Database FD Diagram - Synthesis
[FD diagram: (E_NUMBER, P_NUMBER) -> TIME_SPENT]
The main difference here is that when choosing the arrow to follow back to the determinant we find that we have two. This is OK; we just have to make sure that in the table both of them form the primary key. We have a composite primary key consisting of P_NUMBER and E_NUMBER. When we then gather up all of the attributes that they point to together, we get TIME_SPENT. Hence the table is written as
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
See the composite primary key.
Personnel Database FD Diagram - Universal Key
Now, the last part of the synthesis is often forgotten. We must collect up all of the attributes that do not have arrows pointing into them and place them in one table called the Universal Key. Every attribute collected then becomes part of the composite primary key. In this case we have the attributes listed below. Notice how SKILL is there, as it sits by itself. Nothing is its determinant.
[FD diagram: boxed attributes E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL]
UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)
Foreign Keys
In the Synthesis Algorithm, a foreign key will arise from any attribute that is:
A. both a determinant and part of another determinant, OR
B. both a determinant and a dependent.
A. [FD diagram: (E_NUMBER, P_NUMBER) -> TIME_SPENT]
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
B. [FD diagram: E_NUMBER -> DEPARTMENT_NAME -> LOCATION]
EMPLOYEE (E_NUMBER, DEPARTMENT_NAME)
DEPT (DEPARTMENT_NAME, LOCATION)
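Rules A and B can be checked mechanically once the FDs are written down. The sketch below is a rough illustration only: the function name foreign_keys is hypothetical, the FD list is our reading of the personnel FDD, and for simplicity only single-attribute determinants are tested.

```python
# Functional dependencies as (determinant, dependent); determinants are tuples.
FDS = [
    (("P_NUMBER",), "PROJECT_NAME"),
    (("E_NUMBER",), "EMPLOYEE_NAME"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("DEPARTMENT_NAME",), "LOCATION"),
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
]


def foreign_keys(fds):
    """Map each foreign-key attribute to the rule(s), A and/or B, that apply."""
    determinants = {det for det, _ in fds}
    dependents = {dep for _, dep in fds}
    found = {}
    for det in determinants:
        if len(det) != 1:
            continue  # this sketch only tests single-attribute determinants
        attr = det[0]
        rules = []
        # Rule A: a determinant that is also part of another determinant.
        if any(attr in other for other in determinants if other != det):
            rules.append("A")
        # Rule B: a determinant that is also a dependent.
        if attr in dependents:
            rules.append("B")
        if rules:
            found[attr] = rules
    return found


print(foreign_keys(FDS))
```

E_NUMBER and P_NUMBER are reported under rule A (they appear inside the composite determinant of ASSIGNMENT), and DEPARTMENT_NAME under rule B (it is determined by E_NUMBER and determines LOCATION).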
ISA = Is A
In the case of the manager we say that the manager number is contained within the employee number: every MANAGER_NUM value is an E_NUMBER value.
[Diagram: MANAGER_NUM ISA E_NUMBER]
This gives rise to a new foreign key.
[Network diagram: EMPLOYEE and PROJECT linked by MANAGER_NUM]
Personnel Database Schema Generated by Synthesis
PROJECT (NAME, P_NUMBER, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST)
    MANAGER_NUM: this foreign key is a result of MANAGER ISA E_NUMBER
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)
DEPT (DEPARTMENT, LOCATION)
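To see what the ISA foreign key means in practice, here is a minimal sketch using Python's built-in sqlite3 module. Only the columns needed for the point are created, the integer column types are our own choice, and the manager values 1003 and 9999 are hypothetical (the forms list the manager by name, not by number).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE EMPLOYEE (
    E_NUMBER INTEGER PRIMARY KEY,
    NAME     TEXT
);
CREATE TABLE PROJECT (
    P_NUMBER    INTEGER PRIMARY KEY,
    NAME        TEXT,
    -- MANAGER ISA E_NUMBER: every manager value must be an employee number
    MANAGER_NUM INTEGER REFERENCES EMPLOYEE (E_NUMBER)
);
""")

conn.execute("INSERT INTO EMPLOYEE VALUES (1003, 'Clarke')")
# Accepted: 1003 is an existing employee number (hypothetical assignment).
conn.execute("INSERT INTO PROJECT VALUES (23760, 'New billing system', 1003)")

try:
    # 9999 is not an employee number, so the foreign key should reject it.
    conn.execute("INSERT INTO PROJECT VALUES (26511, 'New office lease', 9999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

project_count = conn.execute("SELECT COUNT(*) FROM PROJECT").fetchone()[0]
print("unknown manager rejected:", rejected, "projects stored:", project_count)
```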
Personnel Database Network Diagram Generated by Synthesis
[Network diagram: DEPT, EMPLOYEE, PROJECT, ASSIGNMENT and UK, linked by DEPARTMENT_NAME, MANAGER_NUM, E_NUMBER, P_NUMBER and E_NUMBER + P_NUMBER]
A Fully Worked Example
We now have to take care of the multi-valued areas such as skills and prior titles. Our FDD synthesis takes care of everything up to that point: it converts the FDD to what we call “Third Normal Form”. We know that an individual can have many skills and many prior titles. They can also work on many projects. Knowing the employee number will not tell us one and only one value of the skills that they have. We show this on the extended FDD with a double arrow notation, shown here, where E_NUMBER is a determinant for many values of SKILL. Consequently the representation shown on the next slide can be constructed, giving rise to the splitting of the UK to form three more relations.
[Extended FD diagram: E_NUMBER ->> SKILL]
Personnel Database Multivalued Dependency - Decomposition
[Extended FD diagram: multivalued dependencies (MVDs) from E_NUMBER to P_NUMBER, PRIOR_TITLE and SKILL]
Employees are associated with Projects, Titles and Skills independently. There is no direct relationship between Projects, Titles and Skills.
ASSIGN (E_NUMBER, P_NUMBER)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
Hence we have the three new relations ASSIGN, PRIOR_JOB and EXPERTISE.
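Why is it safe to split the UK this way? Because projects, prior titles and skills are independent of each other, the UK table has to store every combination, and joining the three smaller relations on E_NUMBER gives the same rows back. A small Python sketch of this, with purely hypothetical sample values:

```python
from itertools import product

# Hypothetical facts for one employee; the values are illustrative only.
facts = {
    1003: {
        "projects": [23760, 26511],
        "prior_titles": ["Junior consultant", "Trainee"],
        "skills": ["Stock market", "Investments"],
    }
}

# UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL): every combination must be stored.
uk = {
    (e, p, t, s)
    for e, f in facts.items()
    for p, t, s in product(f["projects"], f["prior_titles"], f["skills"])
}

# Decompose on the MVDs by projecting onto each pair of columns.
assign = {(e, p) for e, p, _, _ in uk}
prior_job = {(e, t) for e, _, t, _ in uk}
expertise = {(e, s) for e, _, _, s in uk}

# Natural join of the three relations on E_NUMBER.
rejoined = {
    (e, p, t, s)
    for e, p in assign
    for e2, t in prior_job if e2 == e
    for e3, s in expertise if e3 == e
}

print("UK rows:", len(uk))
print("decomposed rows:", len(assign) + len(prior_job) + len(expertise))
print("join gives back the UK:", rejoined == uk)
```

The join reproduces the UK exactly, while the decomposed relations hold fewer rows and no repeated combinations.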
Personnel Database FD Diagram with MVDs and Inclusion
[FD diagram over the attributes P_NUMBER, PROJECT_NAME, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST, TIME_SPENT, E_NUMBER, EMPLOYEE_NAME, CURRENT_TITLE, PRIOR_TITLE, SKILL, DEPARTMENT_NAME and LOCATION, with ISA and MVD annotations]
Final Personnel Database Schema
PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
Decomposed from UK
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)
DEPT (DEPARTMENT, LOCATION)
Final Personnel Database Network Diagram
[Network diagram: DEPT, EMPLOYEE, PROJECT, ASSIGNMENT, PRIOR_JOB and EXPERTISE, linked by DEPARTMENT_NAME, MANAGER_NUM, E_NUMBER and P_NUMBER]
Personnel Database FD Diagram - Synthesis
[FD diagram: P_NUMBER -> PROJECT_NAME, MANAGER, ACTUAL_COST, EXPECTED_COST]
Choosing any of the arrows and following it back leads you to the project number (P_NUMBER). This is then the primary key. If you then gather all of the attributes that P_NUMBER points to and place them in the brackets, you get the table PROJECT with P_NUMBER as the primary key.
PROJECT (PROJECT_NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)
Role Splitting In Functional Dependency Diagrams
In a Functional Dependency Diagram any group of attributes can be related in only one way. For example, a pair of attributes can be related by an FD or not. Sometimes data can be related in more than one way.
For example, a department can have an employee as its head or as a member.
The member relationship is represented in the FDD:
E_NUMBER -> DEPARTMENT_NAME
But the head relationship is represented in the FDD:
DEPARTMENT_NAME -> E_NUMBER
Role Splitting In Functional Dependency Diagrams
We can choose to split the E_NUMBER attribute into E_NUMBER and HOD.
But the foreign key constraint that a Head of Department is an Employee is lost on the FDD.
[FDD: E_NUMBER, DEPARTMENT_NAME and HOD, with HOD ISA E_NUMBER. Synthesis gives a network diagram in which EMPLOYEE and DEPT are linked by DEPARTMENT_NAME and HOD]
Role Splitting In FDDs
Alternatively, we can choose to split the DEPARTMENT_NAME attribute into EMPLOYING_DEPT and HEADED_DEPT.
But the foreign key constraint that an Employing Department must be a Headed Department is again lost on the FDD.
[FDD: E_NUMBER, EMPLOYING_DEPT and HEADED_DEPT, with an ISA link between EMPLOYING_DEPT and HEADED_DEPT. Synthesis gives a network diagram in which EMPLOYEE and DEPT are linked by EMPLOYING_DEPT and E_NUMBER]
Role Splitting Example
Consider this example. We have the Employee with many Skills and Prior Titles, as before, but we also have equipment that belongs to a particular employee, such as a computer and a fax. An employee can have many different pieces of equipment. It is worthwhile recognizing them on the diagram and then decomposing them into smaller relations as part of the schema.
Suppose each item of equipment (identified by SERIAL#) belongs to an employee.
[FD diagram over the attributes SERIAL#, DESCRIPTION, E_NUMBER, EMPLOYEE_NAME, CURRENT_TITLE, PRIOR_TITLE, SKILL, HOD, DEPARTMENT_NAME and LOCATION, with UK, ISA and MVD annotations]
• MVDs not necessarily embodied in the UK.
• Better to decompose on MVDs first.
• MVDs partition attributes into independent sets.
Obtain Tutorial 2 from your tutor.
ENTITY RELATIONSHIP ANALYSIS
In this area of the course we concentrate on another modelling technique called Entity Relationship Modelling (ERM or ER).
The first stage of this process will look at the following:
ER Data Model and Notation
Strong Entities
Discovering Entities, Attributes
Identifying Entities
Discovering Relationships
Critique of FD Analysis
We originally concentrated on the modelling technique called Functional Dependency Diagrams. They have limitations as follows:
Disadvantages of FDD
Does not represent real world objects, but only data;
Cannot represent MVDs or specialization;
Cannot represent multiple relationships without artificial splitting of attributes;
Entities fragmented during analysis.
Conceptual Data Analysis
By using the ER technique we have the following advantages:
Data Analysis from the User's Point of View
Models the Real World
Independent of Technology
Able to be validated in user terms
Entity Relationship Data Model Features
The real value of using this type of modelling is that it considers the design in the context of the environment where it comes from. We have these Entities that have their own identifying attributes: real things and real people. They can be observed in the environment. ERM has the following features:
Populations of Real World objects represented by Entities
Objects have Natural Identity
Entities have Attributes which have values
Entities related by Relationships
Constraints
Subtypes
Occurrences versus Entities
[Diagram: two entity occurrences (also called entity instances or objects), 56 Jack Ackov and 28 Jill Hill]
Let’s consider these two instances. Here we have both Jack and Jill, numbered 56 and 28 respectively. By themselves they exist as people in their environment. In this case we consider them to be two customers. If we wish to model them, and all of the possible customers that we have, we need to create an Entity Class for all possibilities.
Occurrences versus Entities
Entity Occurrences (also called Entity Instances or Objects): these are the tuples of the table below.
Entity Classes (also called Entity Types or Entity Sets): this will convert to the schema below, with Customer# being the Primary Key.
CUSTOMER (Customer#, CustName)
Customer#  CustName
56         Jack Ackov
28         Jill Hill
Here we have Jack and Jill placing orders for particular items of stock. They appear to order different amounts of each. For instance, Jack orders 3 bikes. Each item being ordered also has a Stock#, Price and Description. These are individual instances of the process, so we need to be able to represent any possibility of this in our model. See how we do this on the next page.
[Diagram: customers 56 Jack Ackov and 28 Jill Hill linked to the items Bike, Cup of Tea and Pussy Cat, with order quantities 3, 12, 4 and 1]
[Diagram: the same occurrences grouped into the entity CUSTOMER (Customer#, CustName), the relationship ORDERS (Quantity) and the entity ITEM (Stock#, Desc, Price)]
Occurrences to Entities to Schemas
CUSTOMER (Customer#, CustName)
Customer#  CustName
56         Jack Ackov
28         Jill Hill

ORDERS (Customer#, Stock#, Quantity)
Customer#  Stock#  Quantity
56         23      3
56         156     12
28         156     4
28         234     1

ITEM (Stock#, Price, Desc)
Stock#  Price  Desc
23      50     Bike
156     1      Cup of Tea
234     25     Pussy Cat
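To make the step from occurrences to schemas concrete, the three relations can be created and populated with Python's built-in sqlite3 module. This is a sketch only: the column names (customer_no, stock_no, descr) are our own substitutes for Customer#, Stock# and Desc, and the pairing of prices with items is our reading of the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (customer_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE ITEM (stock_no INTEGER PRIMARY KEY, price INTEGER, descr TEXT);
CREATE TABLE ORDERS (
    customer_no INTEGER REFERENCES CUSTOMER (customer_no),
    stock_no    INTEGER REFERENCES ITEM (stock_no),
    quantity    INTEGER,
    PRIMARY KEY (customer_no, stock_no)
);
""")

# Each tuple below is one entity occurrence from the slide.
conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)",
                 [(56, "Jack Ackov"), (28, "Jill Hill")])
conn.executemany("INSERT INTO ITEM VALUES (?, ?, ?)",
                 [(23, 50, "Bike"), (156, 1, "Cup of Tea"), (234, 25, "Pussy Cat")])
conn.executemany("INSERT INTO ORDERS VALUES (?, ?, ?)",
                 [(56, 23, 3), (56, 156, 12), (28, 156, 4), (28, 234, 1)])

# Join the relationship table back to both entities to read the orders.
rows = conn.execute("""
    SELECT c.cust_name, i.descr, o.quantity
    FROM ORDERS o
    JOIN CUSTOMER c ON c.customer_no = o.customer_no
    JOIN ITEM i ON i.stock_no = o.stock_no
    ORDER BY c.customer_no, i.stock_no
""").fetchall()
for row in rows:
    print(row)
```

The ORDERS table carries the keys of both entities plus the Quantity, which is exactly the information held on the relationship in the diagram.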
ENTITIES
Entities are classes of objects about which we wish to store information. Examples are:
People: Employees, Customers, Students, ...
Places: Offices, Cities, Routes, Warehouses, ...
Things: Equipment, Products, Vehicles, Parts, ...
Organizations: Suppliers, Teams, Agencies, Depts, ...
Concepts: Projects, Orders, Complaints, Accounts, ...
Events: Meetings, Appointments.
[Side annotation on the slide: STRONG, WEAK]
STRONG ENTITIES
An entity is Existence Independent if an instance can exist in isolation.
For example, CUSTOMER is existence independent of ORDER, but ORDER is existence dependent on CUSTOMER. The ORDER is by a particular customer for one or many particular items.
An entity is identified if each instance can be uniquely distinguished by its attributes (or relationships).
For example, CUSTOMER is identified by Customer#, PERSON is identified by Name+Address+DoB, ORDER is identified by Customer#+Date+Time.
An entity is STRONG if it can be identified by its (own) immediate attributes. Otherwise it is weak.
For example, CUSTOMER and PERSON are strong entities, but ORDER is weak because it requires an attribute of another entity to identify it. ORDER would be strong if it had an Order#.
Existence independent entities are always strong.
The Method: How to Develop the ERM
Step1. Search for Strong Entities and Attributes.
Step2. Attach attributes and identify strong entities.
Step3. Search for relationships.
Step4. Determine constraints.
Step5. Attach remaining attributes to entities and relationships.
Step6. Expand multivalued attributes, and relationship attributes. Represent attributed relationships and/or multivalued attributes in a Functional Dependency Diagram.
Step7. Identify weak entities.
Step8. Iterate steps 4, 5, 6, 7, 8 until no further expansion is possible.
Step9. Look for generalization and specialization; Analyze Cycles; Convert domain-sharing attributes to entities.
The Method
[Flow diagram. Inputs: Narrative & Forms. Steps: 1 Search for strong entities and attributes; 2 Identify strong entities; 3 Search for relationships; 4 & 5 Determine constraints and attach attributes; 6 Expand attributed relationships and/or multivalued attributes; 6' Represent attributed relationships and/or multivalued attributes as Functional Dependencies; 7 Identify weak entities. Intermediate results: Entities, Attributes, Strong entities, Relationships, Weak Entities, Identified weak entities. Outputs: Entity-Relationship Diagram and Functional Dependency Diagrams.]
Step1: Search for Strong Entities and Attributes
1 Entities
  relevant nouns
  many instances
  have properties (attributes or relationships)
  identifiable by properties
2 Strong Entities
  independent existence
  identifiable by own single-valued attributes
3 Attributes
  printable names, measurements
  domain of values
  no properties
  dependent existence
A worked example finding strong Entities
Here we have a scenario. Try firstly to identify all of the strong entities, followed by all of the attributes. Can you also identify a weak entity? Are there any attributes that you have missed?
Narrative
A customer is identified by a customer#. A customer has a name and an address. A customer may order quantities of many items. An item may be ordered by many customers. An item is identified by a stock#. An item has a description and a price. A stock item may have many colours. Any item ordered by a customer on the same day is part of the same order.
Worked Example Continued
Let us place one kind of marker around the nouns. These lead us to what we will consider to be the strong entities. If we then place a second kind of marker around the items that we think would be the attributes, we can see if any of the identified entities are strong. You will notice that the item has a description, price, colour and stock#, and a customer has a customer number, name, and address. These are existence independent entities, and hence they must be strong.
[The narrative is repeated on the slide with the nouns and attributes marked.]
Worked Example Continued
We have our entities and the attributes displayed before us. Customer and Item are strong entities as they are existence independent. What about Order?
Order cannot be identified completely by any of its own attributes. It is dependent on the attributes of the other two entities to be identified. An order is made up of a customer ordering an item. We need the customer# and the item's stock# to identify the order.
[Conceptual schema: entities CUSTOMER, ITEM and ORDER, with the unattached attributes Customer#, Customer Name, Address, Stock#, Description, Price, Colour, Date and Quantity]
Step2. Identify Strong Entities.
We now attach the attributes that belong to each of the strong entities. Notice that there are some left that belong to neither Customer nor Item. We will look at these later.
[Conceptual schema: CUSTOMER with Customer#, CustName and Address; ITEM with Stock#, Desc, Price and Colour; Date and Qty remain unattached]
Both Customer and Item have what we call a Natural Identity.
Another Example of the Difference Between Weak and Strong Entities
Here is another example of a common occurrence that demonstrates the difference between a strong entity and a weak entity.
A strong entity is identified by its own attributes.
Bidders make purchases of goods at the auction.
BIDDER and GOOD have independent existence, hence are strong, but PURCHASE requires attributes of BIDDER and GOOD. The Purchase is then identified by the Bidder's name and the Good's description. These are two attributes that belong to the Bidder and the Good respectively.
Additional Rules for Entities
For an Entity to exist we have the following additional rules:
There must be more than one instance of an entity.
  The company provides superannuation for its workers.
  Here there is only one instance of COMPANY, so it is not a valid entity. We do not model anything that only has one instance.
Each instance of an entity must be potentially distinguishable by its properties.
  Members send five dollars to the association.
  A dollar does not normally have distinguishing attributes.
Step3. Search for Relationships.
We can now identify Relationships that have the following properties:
Relationships
  Have associated entities
  Are relevant
    must be worth recording
  Can be "structural" verbs in the narrative
    persistent, rather than transient relationships
  Can be "abstract" nouns in the narrative
    nonmaterial connections, eg. Enrolment
  Can be verbalizable in the narrative
    eg. Student EnrolledIn Unit
  Have 2 (binary) or more associated entities
    (3 is ternary, up to n-ary for n associated entities)
Relationships:
A relationship must be relevant. It should indicate a structural, persistent (extending over time) association between entities.
  Students enrol in units selected from the handbook.
A relationship should not usually indicate a procedural event (one that occurs momentarily, then is forgotten).
  Students read about units selected from the handbook.
Relationships and the Worked Example.
We can now deal with the order. The order is a relationship between the Customer and the Item. It is for a set Quantity on a given Date.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) ORDERS ITEM (Stock#, Desc, Price, Colour); Date and Qty shown with ORDERS]
Entity Relationship Analysis 2
We will now concentrate on the following areas of good ERM:
Cardinality and Participation Constraints
Expanding to Weak Entities
Identifying Weak Entities
Derived Attributes and Relationships
Ternary Relationships
These are Steps 4, 5 & 6 from the Original Diagram
[Flow diagram, excerpt of The Method. Inputs: Strong entities, Relationships, Unattached Attributes. Steps: 4 & 5 Determine constraints and attach attributes; 6 Expand attributed relationships, domain sharing & multivalued attributes; 7 Identify weak entities. Labels: Weak Entities, Unidentified weak entities, Identified weak entities. Output: Entity-Relationship Diagram.]
Step4. Determine constraints: Cardinality (how many participate).
To complete this we “fix a single instance at one end and ask how many (one or many) are involved at the other end”.
Look at the relationship where the Customer Orders an Item. Consider a single Customer. Can they order many Items at the one time? Yes, we have seen this. So we position a crow's foot (<) at the point where the line touches the entity Item. We then ask if an Item can be ordered by many Customers. Yes. So again we place a crow's foot at the Customer end.
[ER diagram: CUSTOMER and ITEM joined by ORDERS, with a crow's foot at each end]
From left to right: a Customer can order many Items.
From right to left: an Item can be ordered by many Customers.
Step4. Determine constraints: Cardinality.
Again, to complete this task we “fix a single instance at one end and ask how many (one or many) are involved at the other end”.
All of the Customers live in a City. A Customer can only live in one City (unless they are politicians). In this case we must place a single straight line (|) at the intersection of the relationship line and the entity City. However, a City can have many Customers. We show this by placing a crow's foot (>) at the end near the Customer.
[ER diagram: CUSTOMER LIVES IN CITY, with a crow's foot at the CUSTOMER end and a single bar at the CITY end]
Step4. The Resulting ER with the Cardinality Constraints in Place
[ER diagram: CUSTOMER ORDERS ITEM; CUSTOMER LIVES IN CITY; ITEM has the multi-valued attribute {Colour}]
Many CUSTOMERs can ORDER an ITEM.
Many ITEMs can be ORDERed by a CUSTOMER.
Many CUSTOMERs can LIVE IN a CITY.
A CUSTOMER can LIVE IN only one CITY.
An ITEM can have many Colours.
Step4. Determine constraints: Participation.
Again, we “fix a single instance at one end and ask if any must (might or must) be involved at the other end”.
We ask “Does the Customer have to order an Item?” Well, some would say that if they do not, they are not Customers! But we know that we must be able to recognise our Customers even though at present they do not have an order with us. So, in this case they do not have to place an order. This is then not mandatory, and we show it by placing the O beside the cardinality constraint. An Item does not have to be on an order either, so it also gets the O notation.
[ER diagram: CUSTOMER ORDERS ITEM, with optional (O) participation marked at both ends]
Step4. Determine constraints: Participation.
This is also the case for the Customer living in the City. Does the Customer have to live in the City? In this case Yes, as we class all areas as being within a City. Hence we place the “|” symbol beside the cardinality constraint next to the entity City. The next one is difficult. Does a City have to have a Customer living in it? You might think No here, but are you prepared to record all of the cities in the world just to make sure? Common sense tells us that we have to make this mandatory, so we only keep a record of the cities where our Customers live.
[ER diagram: CUSTOMER LIVES IN CITY, with mandatory participation marked at both ends]
Step4. The Resulting ER with the Participation Constraints in Place
[ER diagram: CUSTOMER ORDERS ITEM; CUSTOMER LIVES IN CITY]
An ITEM might be ordered by a CUSTOMER.
A CUSTOMER might order an ITEM.
A CITY must have a CUSTOMER LIVing IN it.
A CUSTOMER must LIVE IN a CITY.
Step4. Determine constraints: Validation by Population.
An important method of evaluating the proposed model is to populate it with instances that demonstrate that the constraints that you have identified will work.
[ER diagram: CUSTOMER (Cust#), ITEM (Stock#, {Colour}) and CITY (CityName), with the relationships ORDERS and LIVES IN]
Step4. Tables Created to Validate
[ER diagram: CUSTOMER (Cust#), ITEM (Stock#, {Colour}) and CITY (CityName), with the relationships ORDERS and LIVES IN]

ORDERS
Cust#  Stock#
12     77
23     77
12     88
13     99

LIVES IN
CityName  Cust#
Ayr       12
Ayr       23
Tully     13

{Colour}
Colour  Stock#
Pink    77
Blue    77
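Validation by population can also be done mechanically: load a few instances and test each constraint against them. The sketch below checks the two LIVES IN constraints (a Customer must live in exactly one City; a City must have at least one Customer). The function name check_lives_in is hypothetical, and the pairing of city names with customer numbers is our reading of the slide, so treat the data as an assumption.

```python
from collections import defaultdict

# Sample population (assumed pairing of values): (CityName, Cust#)
lives_in = [("Ayr", 12), ("Ayr", 23), ("Tully", 13)]
cities = {"Ayr", "Tully"}
customers = {12, 23, 13}


def check_lives_in(customers, cities, lives_in):
    """Return a list of violations of the LIVES IN constraints."""
    cities_of = defaultdict(set)
    residents_of = defaultdict(set)
    for city, cust in lives_in:
        cities_of[cust].add(city)
        residents_of[city].add(cust)

    problems = []
    # A CUSTOMER must LIVE IN a CITY, and in only one CITY.
    for cust in sorted(customers):
        if len(cities_of[cust]) != 1:
            problems.append(f"customer {cust} lives in {len(cities_of[cust])} cities")
    # A CITY must have a CUSTOMER living in it.
    for city in sorted(cities):
        if not residents_of[city]:
            problems.append(f"city {city} has no customer")
    return problems


print(check_lives_in(customers, cities, lives_in))
```

An empty list means the population satisfies both constraints. Adding a customer with no city, or a city with no customers, produces a message for each violation.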
Step5. Attach remaining attributes to entities and relationships.
In the previous lectures we looked at a worked problem with a Customer ordering an Item. Here we were able to identify Entities from the narration. Next we also listed the attributes which helped us identify the Strong Entities. We noticed that there were some attributes, Qty and Date, left that could not be attached to any of the strong entities. They, in fact, belong to the Relationship that associates the two Entities.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) ORDERS ITEM (Stock#, Desc, Price, Colour); Date and Qty attached to ORDERS]
Step5. Attach remaining attributes to entities and relationships.
The quantity attribute cannot be attached to the Customer, as the Customer will order different quantities of various items at any time. Nor can it be attached to the Item. It must therefore be attached to the relationship between them, being the order. This is also the situation for the Date that the order was placed.
Step5. Attach remaining attributes to entities and relationships.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) ORDERS ITEM (Stock#, Desc, Price, {Colour}); Date and Qty attached to ORDERS]
Step6. Expand multi-valued attributes, domain sharing attributes and binary relationship attributes.
Once we have identified the Strong Entities and Relationships, and attached all Attributes to either the Strong Entities or the Relationships, we are required to expand the diagram as much as possible to permit us to complete the process. This requires us to move in two directions. We must first look at all of the binary relationships to see what the cardinality constraints are between them. If they are “many-to-many” they must be carefully considered and expanded where appropriate.
We then must look at what we call Multi-valued Attributes and Domain Sharing Attributes. The process is shown on the following diagram.
Step6
[Flow diagram. From the Entity-Relationship Diagram: many-to-many relationships with attributes go to "Expand relationships with attributes", giving Associative Entities; multi-valued attributes and domain sharing attributes go to "Expand multi-valued and domain sharing attributes", giving Characteristic Entities. Both results are Dependent Entities.]
Step6
In the worked example we have a many-to-many relationship with two attributes. When we have a many-to-many relationship with attached attributes we are required to create an Associative Entity that bridges the two entities.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) ORDERS ITEM (Stock#, Desc, Price, {Colour}); Date and Qty attached to ORDERS]
Step6
Between Customer and Item we create the Weak (Associative) Entity called Order. We have to redo the constraints. A customer can place many orders or none. An order can come from only one customer, and must be from a customer. An order is for many items and must be for at least one item, and an item can be on many orders but does not have to appear on an order. These have all been placed in the diagram shown below in their correct positions.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ORDER is marked as the Associative Entity]
Step6
We have also noticed that an item can come in many colours. This is a multi-valued attribute. We can show this in our extended diagram by having a relationship between the Item and the Colour, where colour is the only attribute of the entity. In this case we are also saying that the colour of the item is optional (ie. natural if not requested) and that the only colours to be recorded are those that are used.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour). ORDER is marked as the Associative Entity and COLOUR as the Characteristic Entity]
Step6. Expand domain sharing attributes.
Managers supervise Workers. All employees are residents of a City. Employees who live in different cities from their managers get a special allowance.
[ER diagram, before: MANAGER (City) SUPERVISES WORKER (City), with Allowance. After: the Characteristic Entity CITY (CityName) is related to both MANAGER and WORKER by OF; SUPERVISES and Allowance are retained]
Step7. Identify weak entities.
Clarify the notion of instance.
Weak entities are often ambiguous and difficult to agree on.
Attributes may be part of a key for a weak entity, but at least one (one-must) relationship for identification is required. So when we convert this into a table it will require one of the PKs from the strong entities as part of its own composite PK.
Validation, not design.
The purpose of identification is not to allocate a primary key, but to validate the concept. We have to be able to justify the concept of the relationship in the real world.
Never invent keys. I know that it is tempting, but you must reflect the business as it is.
Step7. Identify weak entities.
[Conceptual schema: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour)]
An ORDER is uniquely identified by the CUSTOMER and the Date.
Step7. Identify weak entities.
[Conceptual schema as on the previous slide]
Here we still have the relationship between Order and Item that is many-to-many with attributes. We must expand this.
Step8. Iterate until no further expansion is possible.
An intersection entity is one that is identified only by its relationships.
We introduce the weak entity ORDERLINE, which is for one item. It is fully dependent on the attributes of Order and Item to be identified.
[Conceptual schema: ORDER (Date) MADE BY CUSTOMER (Customer#, CustName, Address); ORDER HAS ORDERLINE (Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour)]
An ORDERLINE is identified by an ITEM on an ORDER.
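One way to see how the identification rules carry over to tables is to build the expanded schema with Python's built-in sqlite3 module. This is only a sketch under our own assumptions: the table name CUST_ORDER (ORDER is a reserved word in SQL), the column names, the column types and the sample date are all hypothetical. No keys are invented; each weak entity borrows the keys of the entities that identify it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE CUSTOMER (customer_no INTEGER PRIMARY KEY, cust_name TEXT, address TEXT);
CREATE TABLE ITEM (stock_no INTEGER PRIMARY KEY, descr TEXT, price INTEGER);
-- Characteristic entity: identified by the ITEM plus the colour itself.
CREATE TABLE COLOUR (
    stock_no INTEGER REFERENCES ITEM (stock_no),
    colour   TEXT,
    PRIMARY KEY (stock_no, colour)
);
-- Weak entity: an ORDER is identified by the CUSTOMER and the Date.
CREATE TABLE CUST_ORDER (
    customer_no INTEGER REFERENCES CUSTOMER (customer_no),
    order_date  TEXT,
    PRIMARY KEY (customer_no, order_date)
);
-- Intersection entity: an ORDERLINE is identified by an ITEM on an ORDER.
CREATE TABLE ORDERLINE (
    customer_no INTEGER,
    order_date  TEXT,
    stock_no    INTEGER REFERENCES ITEM (stock_no),
    qty         INTEGER,
    PRIMARY KEY (customer_no, order_date, stock_no),
    FOREIGN KEY (customer_no, order_date)
        REFERENCES CUST_ORDER (customer_no, order_date)
);
""")

conn.execute("INSERT INTO CUSTOMER VALUES (56, 'Jack Ackov', NULL)")
conn.execute("INSERT INTO ITEM VALUES (23, 'Bike', 50)")
conn.execute("INSERT INTO CUST_ORDER VALUES (56, '2000-01-01')")
conn.execute("INSERT INTO ORDERLINE VALUES (56, '2000-01-01', 23, 3)")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# The same item cannot appear twice on the same order.
duplicate_rejected = rejected("INSERT INTO ORDERLINE VALUES (56, '2000-01-01', 23, 5)")
# An order line cannot exist without its order.
orphan_rejected = rejected("INSERT INTO ORDERLINE VALUES (56, '2000-02-02', 23, 1)")
print(duplicate_rejected, orphan_rejected)
```

The composite primary keys mirror the identification statements on the slides, and the foreign keys express the existence dependence of the weak entities.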
Example 2
Ted’s Computer Courses is a company that offers a number of computer courses to client companies. A client may request several courses at one time.
A course has a course code, a description and a list of resources required.
Every course has a number of suitable instructors who are qualified to deliver it. An instructor has a name, address, and a telephone.
Example 2 cont.
A client company can request that a course begin on any date nominated. This course can be offered repeatedly on many dates.
The cost of the course is negotiated for each offering.
Ted’s Company requires the details of all the attendees.
When a course is offered an instructor is assigned from the list of instructors for that course.
f h i E h i h ti d four hour sessions. Each session has a time and a place, again negotiated with the client.
Develop and E-R Diagram for the database application.
The next slide show you the forms filled out h i ff d
146
when a course is offered.
©Chisholm Institute
Course Specifications Form [form image]
Course Offering Form [form image]
Course Attendance Form [form image]
Solution 2 - 1
First get the major entities:
Client
Course
Resources
Instructors
Attendees
Session
Solution 2 - 2
Let's look at the client.
Solution 2 - 3
Let's look at the Course.
Solution 2 - 4
Look at the relationship between the client and the course.
Solution 2 - 5
The course entity has resources, which is a multi-valued attribute and is dealt with as follows.
Solution 2 - 6
The next step is the Cardinality Constraints of the three entities.
Solution 2 - 7
Let's include the Instructor entity.
Solution 2 - 8
For simplicity we are only going to add the attributes that are primary keys. A client can order a course on a set date.
Where does the date belong?
Solution 2 - 9
Let's look at the attendees entity. The attendees attend a course requested by a client. This is a binary relationship.
Solution 2 - 10
This binary relationship creates a weak entity called Course Offering.
Solution 2 - 11
There is one more entity that needs to be added, Sessions.
Solution 2 - 12
So far it looks like this.
Solution 2 - 13
Now let's look at the many-to-many relationships that are still there.
Solution 2 - 14
Creating weak entities.
[ER diagram labels: ResourcesUsed (Qty), Resources, Course (CourseId), CourseOffering, Attendances, Attendees, teachingStaff, Consultant/Instructor (name)]
Solution 2 - 15
Now let's do the Network Diagram.
Solution 2 - 16
The database schema.