Relational Databases Entity Relationship Model Database Design A2 Teacher Up skilling LECTURE 1


Relational Databases

Entity Relationship Model

Database Design

A2 Teacher Up skilling

LECTURE 1

Introduction

• Lectures + Practical

• Wednesday 5.30pm-8.30pm

• ECS1/02/014

• Minor Lab

• Staff

• Narelle Allen [email protected]

• Neil Anderson [email protected]

• Angela Allen [email protected]

• Philip Hanna [email protected]

• Craig Dooley [email protected]

www.computingatqueens.co.uk

• General information / contacts

• Teaching Resources:

• Week by week overview

• Weekly slides & exercises

• Comment & Discussion

• Feedback for us

• Problems & successes you’ve had

• Supporting each other / learning from each other

• Teaching Community

• Sharing materials / resources you have developed

• Portal Support: Craig Dooley [[email protected]]

What’s to come today?

• Relational Databases

• Database Schema

• Keys

• Database Design

• Entity Relationship Model

• Normalisation

Goals: Design, Implement & Use

• Design a database

• Using database design methodologies and database theories to

create the best structure for a relational database

• Implement and maintain a database using a data definition language (DDL); we will use SQL for data definition

• Defining the structure of a database

• Making changes to the structure

• Access a database system (access and make changes to the contents) using a data manipulation language (DML), commonly called a query language (we will use SQL for data manipulation) via C#

• Populating the database

• Updating

• Querying
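The split between data definition and data manipulation can be sketched in a few lines. This is only an illustration: it assumes Python's built-in sqlite3 module with an in-memory SQLite database in place of the MySQL and C# setup used in the practicals, and the table is a cut-down version of the customer relation.

```python
import sqlite3

# Assumption: SQLite (in memory) stands in for the course DBMS.
conn = sqlite3.connect(":memory:")

# DDL: define the structure, then make a change to the structure.
conn.execute("CREATE TABLE customer (customer_name TEXT PRIMARY KEY, customer_city TEXT)")
conn.execute("ALTER TABLE customer ADD COLUMN customer_street TEXT")

# DML: populate, update and query the contents.
conn.execute("INSERT INTO customer (customer_name, customer_city) VALUES ('Jones', 'Harrison')")
conn.execute("UPDATE customer SET customer_street = 'Main' WHERE customer_name = 'Jones'")
rows = conn.execute(
    "SELECT customer_name, customer_street, customer_city FROM customer"
).fetchall()
print(rows)
```

The CREATE and ALTER statements touch only the structure; INSERT, UPDATE and SELECT touch only the contents.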

Why learn about databases?

• Incredibly prevalent

• Websites, telecommunications systems, banking systems, video

games, health records, censuses, search engines, just about any

other software system or electronic device that maintains some

amount of persistent information.

• Properties making them exceptionally useful and convenient

• Persistence, reliability, efficiency, scalability, concurrency

control, safety, data abstractions, high-level query languages

What is a Database?

• A database is a store of structured data which can be accessed

via a high-level query language.

• We use a DBMS (Database Management System) to create,

maintain and access/query a database containing information about a

particular application

• Collection of structured data

• Set of programs/commands to create and maintain the database

structure for storing data and to access the data

• An environment that is both convenient and efficient to use

• An example of a DBMS is MySQL.

Structure of Relational Database

• A relational database basically consists of a number of tables,

each of which is called a relation instance, or simply relation.

• Each table contains a number of rows and a number of columns.

• Each row is called a tuple.

• Each column is called an attribute.

customer_name customer_street customer_city

Jones Main Harrison

Smith North Rye

Curry North Rye

Lindsay Park Pittsfield

Columns are attributes; rows are tuples.

Customer Relation

Account Relation

Depositor Relation

• Notice how the Depositor relation links customer names (customer_name attribute from the Customer relation) to account numbers (account_number attribute from the Account relation). This shows that a certain customer made a deposit towards a certain account.
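A rough sketch of how the Depositor relation is used to follow the link from a customer to an account. It assumes Python's sqlite3 module rather than the course's MySQL setup; the account numbers, the balance column and the balances are made up for illustration, since the Account relation's contents are not listed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_name TEXT PRIMARY KEY, customer_street TEXT, customer_city TEXT);
CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER);
CREATE TABLE depositor (customer_name TEXT, account_number TEXT);
INSERT INTO customer VALUES ('Jones', 'Main', 'Harrison'), ('Smith', 'North', 'Rye');
-- Hypothetical account data, invented for this sketch.
INSERT INTO account VALUES ('A-1', 500), ('A-2', 700);
INSERT INTO depositor VALUES ('Jones', 'A-1'), ('Smith', 'A-2');
""")

# Follow the links: customer -> depositor -> account.
query = """
SELECT c.customer_name, c.customer_city, a.account_number, a.balance
FROM depositor d
JOIN customer c ON c.customer_name = d.customer_name
JOIN account a ON a.account_number = d.account_number
ORDER BY c.customer_name
"""
linked = conn.execute(query).fetchall()
print(linked)
```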

Database Schema

• The schema of a table/relation is the structure of the table/relation, called a relation schema.

• The structure of a relational database is specified by a database

schema, which contains a set of relation schemas.

• Each relation schema consists of a relation name and a number of

attributes.

• Each attribute has a particular attribute type (similar to data types

in programming), that is, a domain of values.

• Relation schema describes the structure and semantics of a relation.

Example:

Customer(customer_name, customer_street, customer_city)
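As a small illustration, the Customer schema can be declared and then read back from the DBMS. The sketch assumes Python's sqlite3 module, and the TEXT attribute types are an assumption, as the slide does not state the domains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relation name plus attributes, each with a type (assumed to be TEXT here).
conn.execute(
    "CREATE TABLE customer (customer_name TEXT, customer_street TEXT, customer_city TEXT)"
)

# PRAGMA table_info returns one row per attribute:
# (cid, name, type, notnull, dflt_value, pk)
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
print(schema)
```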

Attribute Values

• The set of allowed values for each attribute is called the domain of

the attribute

• Attribute values are (normally) required to be atomic; that is,

indivisible

• E.g. the value of an attribute can be an account number,

but cannot be a set of account numbers

• The special value null is a member of every domain

Keys – Primary Key

• Each tuple in a relation/table needs to be uniquely identified, e.g., a

tuple may represent a student record, a module, or an employee

record

• Simply put, a key is an attribute (or set of attributes) whose value uniquely identifies each tuple in every possible relation r(R), e.g., K = {customer_name}.

• Primary key: an attribute or set of attributes that is chosen as the

principal means of identifying tuples within a relation

• Should choose an attribute whose value never, or very rarely,

changes, e.g., national insurance number or customer_id

For instance, email address is unique, but may change

• We normally underline the primary key

For instance, instructor(ID, name, dept_name, salary), where ID is the underlined primary key
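A minimal sketch of what declaring a primary key buys us: the DBMS refuses a second tuple with the same key value. It assumes Python's sqlite3 module; the department names and the second row are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000)")

try:
    # Same ID as an existing tuple, so the primary key constraint is violated.
    conn.execute("INSERT INTO instructor VALUES ('22222', 'Someone Else', 'History', 60000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
print(duplicate_rejected, count)
```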

Keys – Foreign Key

• An attribute of a relation schema is called a foreign key if it corresponds to the primary key of another relation schema.

E.g. the customer_name and account_number attributes in depositor are foreign keys that reference the primary keys of customer and account respectively.

• Only values occurring in the primary key attribute of the referenced

relation may occur in the foreign key attribute of the referencing

relation. This is known as Referential Integrity.

Referential Integrity

• Referential Integrity is a set of constraints imposed by a Relational

Database Management System that prevents users from having

inconsistent data.

• In our example, the Depositor relation has 2 foreign keys (customer_name and account_number) that reference the primary keys of the Customer and Account relations respectively.

• Through referential integrity, we cannot add a row to the Depositor

relation that contains an account number that does not exist in the

Account relation. We also cannot add a customer name to the

Depositor relation if that customer does not exist in the Customer

relation.
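The two rejected inserts described above can be tried out directly. This is a sketch under assumptions: it uses Python's sqlite3 module, where foreign keys are only enforced after PRAGMA foreign_keys = ON, and the account number A-1 and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
CREATE TABLE customer (customer_name TEXT PRIMARY KEY);
CREATE TABLE account (account_number TEXT PRIMARY KEY);
CREATE TABLE depositor (
    customer_name TEXT REFERENCES customer(customer_name),
    account_number TEXT REFERENCES account(account_number)
);
INSERT INTO customer VALUES ('Jones');
INSERT INTO account VALUES ('A-1');
INSERT INTO depositor VALUES ('Jones', 'A-1');
""")

def try_insert(customer_name, account_number):
    """Return True if the depositor row was accepted, False if it was rejected."""
    try:
        conn.execute("INSERT INTO depositor VALUES (?, ?)", (customer_name, account_number))
        return True
    except sqlite3.IntegrityError:
        return False

unknown_account = try_insert("Jones", "A-999")   # no such account
unknown_customer = try_insert("Nobody", "A-1")   # no such customer
print(unknown_account, unknown_customer)
```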

Referential Integrity (cont)

• Furthermore, referential integrity may also specify that when you

delete a primary key record from a certain table, any foreign key

records linked to that primary key from a different table are also

deleted.

• In our case, if you delete a Customer record from the Customer relation, then all of the records in the Depositor relation that reference that Customer are also deleted. This is known as a cascade delete.
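A cascade delete has to be requested when the foreign key is declared. The following sketch assumes Python's sqlite3 module and made-up account numbers; it shows the Depositor rows for Jones disappearing when Jones is deleted from Customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to act on foreign keys
conn.executescript("""
CREATE TABLE customer (customer_name TEXT PRIMARY KEY);
CREATE TABLE depositor (
    customer_name TEXT REFERENCES customer(customer_name) ON DELETE CASCADE,
    account_number TEXT
);
INSERT INTO customer VALUES ('Jones'), ('Smith');
INSERT INTO depositor VALUES ('Jones', 'A-1'), ('Jones', 'A-2'), ('Smith', 'A-3');
""")

# Deleting the referenced customer also deletes the depositor rows that point at it.
conn.execute("DELETE FROM customer WHERE customer_name = 'Jones'")
remaining = conn.execute("SELECT customer_name, account_number FROM depositor").fetchall()
print(remaining)
```

Without ON DELETE CASCADE the same DELETE would be rejected, because the Depositor rows would be left pointing at a customer that no longer exists.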

Schema Diagram

• A Schema diagram shows the connections between each of the

relation schemas.

Entity Relationship Model

• Given an application problem, we need to create a data model to

capture the data and relationships between data specified in the

given problem

• We already know how to use a relational model

• But how do we get it in the first place?

• Create an Entity-Relationship model by designing an E-R diagram – a

graphical representation of entities and relationships between entities

• We can then convert the E-R diagram into a relational model by applying the rules of normalization.

Entity Relationship Modelling

• In terms of an E-R model, a database can be modeled as:

• a collection of entities,

• relationships among entities.

• An entity is an object that exists and is distinguishable from other

objects.

• Example: specific person (e.g., John, Mary), company (e.g., IBM,

Microsoft), event (e.g., car accidents, traffic jams)

• Entities have attributes that describe them

• Example: people have names and addresses

• An entity set is a set of entities of the same type that share the

same properties.

• Example: set of all persons, companies, trees, holidays

Entity Sets (Instructor, Student)

ID name salary

76766 Crick 72000

45565 Katz 75000

10101 Srinivasan 65000

98345 Kim 80000

76543 Singh 80000

22222 Einstein 95000

….. …….. …..

instructor

ID name Tot_cred

98988 Tanaka 120

12345 Shankar 32

10128 Zhang 102

76543 Brown 58

76653 Aoi 60

23121 Chavez 110

44553 Peltier 56

….. ……. …

student

What does this set represent?

instructor_ID student_ID

76766 98988

45565 12345

45565 10128

10101 76543

98345 76653

76543 23121

22222 44553

….. …..

advisor

This table represents a relationship set, which contains a set of advisor relationships between instructors and students.

Relationship Set (Advisor)

[Figure: the instructor and student tables shown earlier, side by side, with a line drawn for each pair in the advisor table.]

Each of the links represents an advisor relationship between an instructor and a student.

Note that instructor Katz is an advisor to 2 students.
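To see the links as names rather than IDs, the advisor relationship set can be joined to the two entity sets. The sketch below assumes Python's sqlite3 module and loads only the rows listed in the tables above (the rows elided with dots are left out).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary INTEGER);
CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT, tot_cred INTEGER);
CREATE TABLE advisor (instructor_ID TEXT REFERENCES instructor(ID),
                      student_ID TEXT REFERENCES student(ID));
INSERT INTO instructor VALUES ('76766','Crick',72000), ('45565','Katz',75000),
    ('10101','Srinivasan',65000), ('98345','Kim',80000),
    ('76543','Singh',80000), ('22222','Einstein',95000);
INSERT INTO student VALUES ('98988','Tanaka',120), ('12345','Shankar',32),
    ('10128','Zhang',102), ('76543','Brown',58), ('76653','Aoi',60),
    ('23121','Chavez',110), ('44553','Peltier',56);
INSERT INTO advisor VALUES ('76766','98988'), ('45565','12345'), ('45565','10128'),
    ('10101','76543'), ('98345','76653'), ('76543','23121'), ('22222','44553');
""")

# Each advisor row becomes one (instructor name, student name) pair.
pairs = conn.execute("""
SELECT i.name, s.name
FROM advisor a
JOIN instructor i ON i.ID = a.instructor_ID
JOIN student s ON s.ID = a.student_ID
ORDER BY i.name, s.name
""").fetchall()

katz_students = [student for instructor, student in pairs if instructor == "Katz"]
print(katz_students)
```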

Mapping Cardinality Constraints

• Express the number of entities in one entity set to which another

entity in another entity set can be associated via a relationship set.

• The mapping cardinality must be one of the following types:

• One to one

• One to many

• Many to one

• Many to many
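The four types can be told apart by counting how many partners each entity has. The helper below, mapping_cardinality, is a hypothetical function written for this note. It reports the tightest type that a given set of pairs fits; the real constraint is a design decision about what the data is allowed to contain, not something the current rows can prove.

```python
def mapping_cardinality(pairs):
    """Classify (left, right) pairs as the tightest mapping type they fit."""
    rights_per_left = {}
    lefts_per_right = {}
    for left, right in pairs:
        rights_per_left.setdefault(left, set()).add(right)
        lefts_per_right.setdefault(right, set()).add(left)

    # Does any left entity have several partners? Any right entity?
    left_has_many = any(len(rights) > 1 for rights in rights_per_left.values())
    right_has_many = any(len(lefts) > 1 for lefts in lefts_per_right.values())

    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# The advisor pairs from the lecture: (instructor_ID, student_ID).
advisor = [
    ("76766", "98988"), ("45565", "12345"), ("45565", "10128"),
    ("10101", "76543"), ("98345", "76653"), ("76543", "23121"),
    ("22222", "44553"),
]
print(mapping_cardinality(advisor))
```

For the advisor rows this gives one-to-many, because instructor 45565 (Katz) has two students while every student has a single advisor.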

One-to-One Mapping

• One to one mapping means that

• one entity on the left can be associated with at most one entity

on the right and

• one entity on the right can be associated with at most one entity

on the left

• For example, each instructor has at most one advisee and each student has at most one advisor

One-to-One Mapping Example

[Figure: the instructor and student tables, with lines linking each instructor to at most one student and each student to at most one instructor.]

One-to-Many Mapping

• One to many mapping means that

• one entity on the left can be associated with many entities

(possibly 0) on the right and

• one entity on the right can be associated with at most one entity

on the left

• For example, each instructor has several advisees (possibly 0)

and each student has at most one advisor
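One way to make a DBMS enforce the one-to-many rule is a uniqueness constraint on the "one" side's partner column. This is a sketch, assuming Python's sqlite3 module and a stripped-down advisor table; it is not the only way to implement the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# UNIQUE on student_ID: each student may appear once, so has at most one advisor.
conn.execute("CREATE TABLE advisor (instructor_ID TEXT, student_ID TEXT UNIQUE)")

conn.execute("INSERT INTO advisor VALUES ('45565', '12345')")
conn.execute("INSERT INTO advisor VALUES ('45565', '10128')")  # same instructor again: allowed

try:
    conn.execute("INSERT INTO advisor VALUES ('10101', '12345')")  # second advisor for 12345
    second_advisor_allowed = True
except sqlite3.IntegrityError:
    second_advisor_allowed = False

print(second_advisor_allowed)
```

Declaring instructor_ID as UNIQUE as well would tighten this to a one-to-one mapping.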

One-to-Many Mapping Example

[Figure: the instructor and student tables, with lines linking an instructor to several students while each student is linked to at most one instructor.]

Many-to-One Mapping

• Many to one mapping means that

• One entity on the left can be associated with at most one entity

on the right and

• one entity on the right can be associated with many entities

(possibly 0) on the left

• For example, each instructor has at most one advisee and each

student has several advisors (possibly 0)

Many-to-One Mapping Example

[Figure: the instructor and student tables, with lines linking a student to several instructors while each instructor is linked to at most one student.]

Many-to-Many Mapping

• Many to many mapping means that

• one entity on the left can be associated with many entities

(possibly 0) on the right and

• one entity on the right can be associated with many entities

(possibly 0) on the left

• For example, each instructor has several advisees (possibly 0)

and each student has several advisors (possibly 0)
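For a many-to-many mapping neither column can be unique on its own; only the pair is. A common way to express that, sketched here with Python's sqlite3 module, is a composite primary key on the linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE advisor (
    instructor_ID TEXT,
    student_ID TEXT,
    PRIMARY KEY (instructor_ID, student_ID)
)
""")

conn.execute("INSERT INTO advisor VALUES ('45565', '12345')")
conn.execute("INSERT INTO advisor VALUES ('45565', '10128')")  # instructor with two students
conn.execute("INSERT INTO advisor VALUES ('10101', '12345')")  # student with two advisors

try:
    conn.execute("INSERT INTO advisor VALUES ('45565', '12345')")  # exact pair repeated
    duplicate_pair_allowed = True
except sqlite3.IntegrityError:
    duplicate_pair_allowed = False

row_count = conn.execute("SELECT COUNT(*) FROM advisor").fetchone()[0]
print(duplicate_pair_allowed, row_count)
```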

Many-to-Many Mapping Example

[Figure: the instructor and student tables, with lines linking instructors to several students and students to several instructors.]

ER Diagrams

• Rectangles represent entity sets.

• Diamonds represent relationship sets.

• Attributes listed inside entity rectangle

• Underline indicates primary key attributes

• Lines link entity sets to relationship sets

Representing Cardinality Constraints

• We express cardinality constraints by drawing either a directed line (→), signifying “one”, or an undirected line (—), signifying “many”, between the relationship set and the entity set.

• One-to-one relationship:

• A student is associated with at most one instructor via the

relationship advisor

• A student is associated with at most one department via

stud_dept

One-to-One Relationship

• one-to-one relationship between an instructor and a student

• an instructor is associated with at most one student via advisor

• and a student is associated with at most one instructor via

advisor

One-to-Many Relationship

• one-to-many relationship between an instructor and a student

• an instructor is associated with several (possibly 0) students via

advisor

• a student is associated with at most one instructor via advisor

Many-to-One Relationship

• In a many-to-one relationship between an instructor and a student,

• an instructor is associated with at most one student (possibly 0)

via advisor,

• and a student is associated with several (possibly 0) instructors

via advisor

Many-to-Many Relationship

• An instructor is associated with several (possibly 0) students via

advisor

• A student is associated with several (possibly 0) instructors via

advisor

Normalization

• Normalization is the process of organizing data in a database into an

appropriate design.

• Normalization is important because it imposes a set of rules that, when followed, ensure that our database design is good (minimizes data duplication and redundancy).

• In this course, we will consider 1st, 2nd and 3rd Normal Form.

• The forms are progressive, so in order to be in 2nd Normal Form, the

database must also satisfy the rules for 1st Normal form and so on.

• You should strive to have a database that is in 3rd Normal form.

1st Normal Form

• A database is in 1st Normal form if it satisfies the following conditions:

• Does not contain any repeating groups

• All attributes are atomic (i.e. indivisible units)

• Suppose we had a Student relation that was not in 1st Normal form

because not all attributes are atomic. Let’s assume that the Student

attribute is the primary key.

1st Normal Form (cont.)

• We would separate this data into multiple rows so that all attributes are atomic. Now we have a Student table in 1st Normal Form.
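The separation into rows can be sketched in plain Python. The rows below are hypothetical, invented for this note, and assume a Student table with Student, Age and Subject columns in which Subject holds several values in one field.

```python
# Hypothetical unnormalised rows: (Student, Age, Subject), Subject is not atomic.
unnormalised = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]

# One output row per (student, single subject), so every value is atomic.
first_normal_form = [
    (student, age, subject.strip())
    for student, age, subjects in unnormalised
    for subject in subjects.split(",")
]

for row in first_normal_form:
    print(row)
```

Adam now appears in two rows, which is why Student on its own can no longer identify a row after this step.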

2nd Normal Form

• A database is in 2nd Normal form if it satisfies the following conditions:

• It is in 1st Normal form.

• No non-key attribute depends on only part of the primary key (no partial dependencies). This matters when the primary key is made up of more than one column (attribute).

• In our Student relation, the Age attribute depends only on the Student attribute and not on Subject, which is a partial dependency. Therefore we extract Student and Subject to a new table, where they form a composite primary key (Student, Subject), and leave Student and Age in the Student table.

2nd Normal Form (cont.)

• Now we have 2 relations, the Student relation and the Subject

relation.
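The split into two relations can be checked with a few lines of Python. The rows are hypothetical sample data in 1st Normal Form, assuming Student, Age and Subject columns; the final step joins the two relations back together to confirm nothing was lost.

```python
# Hypothetical 1NF rows: (Student, Age, Subject).
first_normal_form = [
    ("Adam", 15, "Biology"),
    ("Adam", 15, "Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]

# Student relation: Age depends on Student alone, so it is stored once per student.
student = sorted({(name, age) for name, age, _ in first_normal_form})

# Subject relation: composite key (Student, Subject).
subject = sorted({(name, subj) for name, _, subj in first_normal_form})

# Joining on Student rebuilds the original rows.
rejoined = sorted(
    (name, age, subj)
    for name, age in student
    for subject_name, subj in subject
    if name == subject_name
)

print(student)
print(subject)
```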

3rd Normal Form

• A database is in 3rd Normal form if it satisfies the following conditions:

• It is in 2nd Normal form.

• All non-primary fields depend only on the primary key, not on another non-primary field.

• Suppose we had a Student detail table with Student_id as the primary key.

• In this table, the street, city and state attributes depend upon the Zip

attribute, which is not the primary key. Therefore this fails the

condition that all non-primary fields are dependent on the primary

key.

3rd Normal Form (cont.)

• To apply 3rd Normal Form to this table, we move the attributes that are

not dependent upon the primary key to a new table, along with the

attribute that they are actually dependent upon.

• In our example, we move the street, city and state attributes to a new

table, with the Zip attribute as the primary key. We will call this new

table the Address Table.
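A sketch of the resulting two tables, assuming Python's sqlite3 module. The name column and all of the data values are invented for illustration. Because each address is stored once per Zip, a single UPDATE corrects it for every student who shares that Zip.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (zip TEXT PRIMARY KEY, street TEXT, city TEXT, state TEXT);
CREATE TABLE student_detail (
    student_id INTEGER PRIMARY KEY,
    name TEXT,
    zip TEXT REFERENCES address(zip)
);
-- Made-up sample data.
INSERT INTO address VALUES ('10001', 'Main', 'Harrison', 'NY');
INSERT INTO student_detail VALUES (1, 'Adam', '10001'), (2, 'Alex', '10001');
""")

# The street is held in one place, so one UPDATE changes it for both students.
conn.execute("UPDATE address SET street = 'North' WHERE zip = '10001'")

streets = conn.execute("""
SELECT s.name, a.street
FROM student_detail s
JOIN address a ON a.zip = s.zip
ORDER BY s.name
""").fetchall()
print(streets)
```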

What’s to come next time

• Week 2

• Querying a database

• SQL (Structured Query Language)