oliver-turner
Introduction
• Lectures + Practical
• Wednesday 5.30pm-8.30pm
• ECS1/02/014
• Minor Lab
• Staff
• Narelle Allen [email protected]
• Neil Anderson [email protected]
• Angela Allen [email protected]
• Philip Hanna [email protected]
• Craig Dooley [email protected]
www.computingatqueens.co.uk
• General information / contacts
• Teaching Resources:
• Week by week overview
• Weekly slides & exercises
• Comment & Discussion
• Feedback for us
• Problems & successes you’ve had
• Supporting each other / learning from each other
• Teaching Community
• Sharing materials / resources you have developed
• Portal Support: Craig Dooley [[email protected]]
What’s to come today?
• Relational Databases
• Database Schema
• Keys
• Database Design
• Entity Relationship Model
• Normalisation
Goals: Design, Implement & Use
• Design a database
• Using database design methodologies and database theories to
create the best structure for a relational database
• Implementing and maintaining a database using a data definition
language (DDL) – we will use SQL for data definition
• Defining the structure of a database
• Making changes to the structure
• Access a database system (read and change its contents)
using a data manipulation language (DML), commonly called a query
language (we will use SQL for data manipulation) via C#
• Populating the database
• Updating
• Querying
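The split between data definition and data manipulation can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module and an in-memory SQLite database as a stand-in for the DBMS and the C# client used in the course, and the customer table and its values are borrowed from the later slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the course DBMS.
conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database, then change that structure.
conn.execute(
    "CREATE TABLE customer (customer_name TEXT PRIMARY KEY, customer_street TEXT)"
)
conn.execute("ALTER TABLE customer ADD COLUMN customer_city TEXT")

# DML: populate, update and query the contents.
conn.execute("INSERT INTO customer VALUES ('Jones', 'Main', 'Harrison')")
conn.execute(
    "UPDATE customer SET customer_street = 'North' WHERE customer_name = 'Jones'"
)
rows = conn.execute(
    "SELECT customer_name, customer_street, customer_city FROM customer"
).fetchall()
print(rows)
```

The CREATE and ALTER statements never touch the stored data, while INSERT, UPDATE and SELECT never change the table structure.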
Why learn about databases?
• Incredibly prevalent
• Websites, telecommunications systems, banking systems, video
games, health records, censuses, search engines, just about any
other software system or electronic device that maintains some
amount of persistent information.
• Properties making them exceptionally useful and convenient
• Persistence, reliability, efficiency, scalability, concurrency
control, safety, data abstractions, high-level query languages
What is a Database?
• A database is a store of structured data which can be accessed
via a high-level query language.
• We use a DBMS (Database Management System) to create,
maintain and access/query a database containing information about a
particular application
• Collection of structured data
• Set of programs/commands to create and maintain the database
structure for storing data and to access the data
• An environment that is both convenient and efficient to use
• An example of a DBMS is MySQL.
Structure of Relational Database
• A relational database basically consists of a number of tables,
each of which is called a relation instance, or simply relation.
• Each table contains a number of rows and a number of columns.
• Each row is called a tuple.
• Each column is called an attribute.
customer_name customer_street customer_city
Jones Main Harrison
Smith North Rye
Curry North Rye
Lindsay Park Pittsfield
customer
The columns are the attributes; the rows are the tuples.
Depositor Relation
• Notice how the Depositor relation links customer names (the
customer_name attribute from the Customer relation) to account
numbers (the account_number attribute from the Account relation).
Each row shows that a certain customer made a deposit towards a
certain account.
Database Schema
• The schema of a table/relation is the structure of the
table/relation, called a relation schema.
• The structure of a relational database is specified by a database
schema, which contains a set of relation schemas.
• Each relation schema consists of a relation name and a number of
attributes.
• Each attribute has a particular attribute type (similar to data types
in programming), that is, a domain of values.
• Relation schema describes the structure and semantics of a relation.
Example:
Customer(customer_name, customer_street, customer_city)
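As a rough sketch, the Customer relation schema above could be declared and then read back from the database. The use of SQLite through Python's sqlite3 module and the TEXT attribute types are assumptions made for this example; the slides do not specify attribute types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation name plus attributes; the TEXT types are assumed for illustration.
conn.execute(
    """CREATE TABLE customer (
           customer_name   TEXT,
           customer_street TEXT,
           customer_city   TEXT
       )"""
)

# PRAGMA table_info returns one row per attribute:
# (cid, name, type, notnull, dflt_value, pk)
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
print(schema)
```

The list that comes back is the relation schema: each attribute name paired with its declared type.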
Attribute Values
• The set of allowed values for each attribute is called the domain of
the attribute
• Attribute values are (normally) required to be atomic; that is,
indivisible
• E.g. the value of an attribute can be an account number,
but cannot be a set of account numbers
• The special value null is a member of every domain
Keys – Primary Key
• Each tuple in a relation/table needs to be uniquely identified, e.g., a
tuple may represent a student record, a module, or an employee
record
• Simply put, a key is an attribute (or set of attributes) whose value
uniquely identifies each tuple in every possible relation r(R), e.g.,
K = {customer_name}.
• Primary key: an attribute or set of attributes that is chosen as the
principal means of identifying tuples within a relation
• Should choose an attribute whose value never, or very rarely,
changes, e.g., national insurance number or customer_id
For instance, email address is unique, but may change
• We normally underline the primary key
For instance, instructor(ID, name, dept_name, salary), where ID is the underlined primary key
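A minimal sketch of what a primary key buys us, assuming SQLite via Python's sqlite3 module as the DBMS and assuming ID is the primary key of instructor. The dept_name values are left as null because the slides do not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, salary INTEGER)"
)

# dept_name is not given on the slides, so it is stored as NULL here.
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', NULL, 95000)")

try:
    # Same ID as Einstein: the DBMS refuses a second tuple with this key.
    conn.execute("INSERT INTO instructor VALUES ('22222', 'Kim', NULL, 80000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
print(duplicate_rejected, count)
```

The second insert fails, so every ID still identifies exactly one tuple.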
Keys – Foreign Key
• An attribute of a relation schema is called a foreign key if it
corresponds to the primary key of another relation schema.
E.g. customer_name and account_number attributes in depositor are
foreign keys that are the primary keys of customer and account
respectively.
• Only values occurring in the primary key attribute of the referenced
relation may occur in the foreign key attribute of the referencing
relation. This is known as Referential Integrity.
Referential Integrity
• Referential Integrity is a set of constraints imposed by a Relational
Database Management System that prevents users from having
inconsistent data.
• In our example, the Depositor relation has 2 foreign keys
(customer_name and account_number) that reference the primary
keys of the Customer and Account relations respectively.
• Through referential integrity, we cannot add a row to the Depositor
relation that contains an account number that does not exist in the
Account relation. We also cannot add a customer name to the
Depositor relation if that customer does not exist in the Customer
relation.
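The behaviour described above can be tried out with a small sketch. It assumes SQLite via Python's sqlite3 module (where foreign key checks must be switched on with a PRAGMA), cuts the three relations down to their key attributes, and uses made-up account numbers; try_insert is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked

conn.executescript(
    """
    CREATE TABLE customer (customer_name TEXT PRIMARY KEY);
    CREATE TABLE account  (account_number TEXT PRIMARY KEY);
    CREATE TABLE depositor (
        customer_name  TEXT REFERENCES customer(customer_name),
        account_number TEXT REFERENCES account(account_number),
        PRIMARY KEY (customer_name, account_number)
    );
    INSERT INTO customer VALUES ('Jones');
    INSERT INTO account  VALUES ('A-101');  -- hypothetical account number
    """
)


def try_insert(name, number):
    """Return True if the depositor row was accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO depositor VALUES (?, ?)", (name, number))
        return True
    except sqlite3.IntegrityError:
        return False


ok = try_insert("Jones", "A-101")            # both referenced rows exist
bad_account = try_insert("Jones", "A-999")   # no such account
bad_customer = try_insert("Smith", "A-101")  # no such customer
print(ok, bad_account, bad_customer)
```

Only the first row is stored; the other two are rejected because they reference keys that do not exist.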
Referential Integrity (cont)
• Furthermore, referential integrity may also specify that when you
delete a primary key record from a certain table, any foreign key
records linked to that primary key from a different table are also
deleted.
• In our case, if you delete a Customer record from the Customer
relation, then all of the records in the Depositor relation that
reference that Customer are also deleted. This is known as a cascade
delete.
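A cascade delete might look like the following sketch. It assumes SQLite via Python's sqlite3 module, where the behaviour is requested with ON DELETE CASCADE on the foreign key; the account numbers are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce FKs

conn.executescript(
    """
    CREATE TABLE customer (customer_name TEXT PRIMARY KEY);
    CREATE TABLE depositor (
        customer_name  TEXT REFERENCES customer(customer_name) ON DELETE CASCADE,
        account_number TEXT,
        PRIMARY KEY (customer_name, account_number)
    );
    INSERT INTO customer  VALUES ('Jones');
    INSERT INTO customer  VALUES ('Smith');
    -- hypothetical account numbers
    INSERT INTO depositor VALUES ('Jones', 'A-101');
    INSERT INTO depositor VALUES ('Jones', 'A-102');
    INSERT INTO depositor VALUES ('Smith', 'A-201');
    """
)

# Deleting the customer also removes every depositor row that references it.
conn.execute("DELETE FROM customer WHERE customer_name = 'Jones'")

remaining = conn.execute(
    "SELECT customer_name, account_number FROM depositor ORDER BY customer_name"
).fetchall()
print(remaining)
```

Without the ON DELETE CASCADE clause, the same DELETE would be rejected instead, because depositor rows would be left pointing at a missing customer.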
Entity Relationship Model
• Given an application problem, we need to create a data model to
capture the data and relationships between data specified in the
given problem
• We already know how to use a relational model
• But how do we get it in the first place?
• Create an Entity-Relationship model by designing an E-R diagram – a
graphical representation of entities and relationships between entities
• We can then convert the E-R diagram into a relational model by
following the rules of normalization.
Entity Relationship Modelling
• In terms of an E-R model, a database can be modeled as:
• a collection of entities,
• relationships among entities.
• An entity is an object that exists and is distinguishable from other
objects.
• Example: specific person (e.g., John, Mary), company (e.g., IBM,
Microsoft), event (e.g., car accidents, traffic jams)
• Entities have attributes that uniquely characterize them
• Example: people have names and addresses
• An entity set is a set of entities of the same type that share the
same properties.
• Example: set of all persons, companies, trees, holidays
Entity Sets (Instructor, Student)
ID name salary
76766 Crick 72000
45565 Katz 75000
10101 Srinivasan 65000
98345 Kim 80000
76543 Singh 80000
22222 Einstein 95000
….. …….. …..
instructor
ID name Tot_cred
98988 Tanaka 120
12345 Shankar 32
10128 Zhang 102
76543 Brown 58
76653 Aoi 60
23121 Chavez 110
44553 Peltier 56
….. ……. …
student
What does this set represent?
instructor_ID student_ID
76766 98988
45565 12345
45565 10128
10101 76543
98345 76653
76543 23121
22222 44553
….. …..
advisor
This table represents a relationship set, which contains a set of advisor relationships between instructors and students.
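To see the relationship set at work, the sketch below loads a few of the rows shown above and follows the advisor pairs from an instructor to their students. The use of SQLite through Python's sqlite3 module and the column types are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, salary INTEGER);
    CREATE TABLE student    (ID TEXT PRIMARY KEY, name TEXT, tot_cred INTEGER);
    CREATE TABLE advisor (
        instructor_ID TEXT REFERENCES instructor(ID),
        student_ID    TEXT REFERENCES student(ID)
    );
    """
)

# A subset of the rows from the slides.
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [("76766", "Crick", 72000), ("45565", "Katz", 75000), ("10101", "Srinivasan", 65000)],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("98988", "Tanaka", 120), ("12345", "Shankar", 32),
     ("10128", "Zhang", 102), ("76543", "Brown", 58)],
)
conn.executemany(
    "INSERT INTO advisor VALUES (?, ?)",
    [("76766", "98988"), ("45565", "12345"), ("45565", "10128"), ("10101", "76543")],
)

# Join through the relationship set to list the students Katz advises.
advisees = [
    row[0]
    for row in conn.execute(
        """SELECT s.name
           FROM advisor a
           JOIN instructor i ON i.ID = a.instructor_ID
           JOIN student s    ON s.ID = a.student_ID
           WHERE i.name = 'Katz'
           ORDER BY s.name"""
    )
]
print(advisees)
```

Each advisor row is one relationship: it holds nothing but the keys of the two entities it connects.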
Relationship Set (Advisor)
[Figure: the instructor and student tables, with a link drawn from each instructor to each student they advise.]
Each of the links represents an advisor relationship between an instructor and a student.
Note that instructor Katz is an advisor to 2 students.
Mapping Cardinality Constraints
• Express the number of entities in one entity set to which another
entity in another entity set can be associated via a relationship set.
• The mapping cardinality must be one of the following types:
• One to one
• One to many
• Many to one
• Many to many
One-to-One Mapping
• One to one mapping means that
• one entity on the left can be associated with at most one entity
on the right and
• one entity on the right can be associated with at most one entity
on the left
• For example, each instructor has at most one advisee and each
student has at most one advisor
One-to-One Mapping Example
[Figure: the instructor and student tables joined by one-to-one advisor links.]
One-to-Many Mapping
• One to many mapping means that
• one entity on the left can be associated with many entities
(possibly 0) on the right and
• one entity on the right can be associated with at most one entity
on the left
• For example, each instructor has several advisees (possibly 0)
and each student has at most one advisor
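One possible way to enforce this one-to-many constraint in a table is to make student_ID unique in advisor, so that a student can appear in at most one advisor row. The sketch assumes SQLite via Python's sqlite3 module; add_advisor is a hypothetical helper, and the IDs are taken from the tables on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# UNIQUE on student_ID: each student appears in at most one advisor row,
# while an instructor_ID may repeat freely.
conn.execute(
    "CREATE TABLE advisor ("
    " instructor_ID TEXT NOT NULL,"
    " student_ID    TEXT NOT NULL UNIQUE)"
)


def add_advisor(instructor_id, student_id):
    """Return True if the pair was stored, False if it broke the constraint."""
    try:
        conn.execute("INSERT INTO advisor VALUES (?, ?)", (instructor_id, student_id))
        return True
    except sqlite3.IntegrityError:
        return False


first = add_advisor("45565", "12345")     # Katz advises Shankar
second = add_advisor("45565", "10128")    # Katz also advises Zhang
conflict = add_advisor("10101", "12345")  # Shankar already has an advisor
print(first, second, conflict)
```

Adding UNIQUE to instructor_ID as well would turn the same table into a one-to-one mapping.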
One-to-Many Mapping Example
[Figure: the instructor and student tables joined by one-to-many advisor links.]
Many-to-One Mapping
• Many to one mapping means that
• One entity on the left can be associated with at most one entity
on the right and
• one entity on the right can be associated with many entities
(possibly 0) on the left
• For example, each instructor has at most one advisee and each
student has several advisors (possibly 0)
Many-to-One Mapping Example
[Figure: the instructor and student tables joined by many-to-one advisor links.]
Many-to-Many Mapping
• Many to many mapping means that
• one entity on the left can be associated with many entities
(possibly 0) on the right and
• one entity on the right can be associated with many entities
(possibly 0) on the left
• For example, each instructor has several advisees (possibly 0)
and each student has several advisors (possibly 0)
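For a many-to-many mapping, a common approach is a separate table whose primary key is the pair of IDs, so any combination may occur but never twice. This is a sketch under the assumption that SQLite via Python's sqlite3 module is the DBMS; add_advisor is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The composite primary key only forbids storing the same pair twice.
conn.execute(
    "CREATE TABLE advisor ("
    " instructor_ID TEXT NOT NULL,"
    " student_ID    TEXT NOT NULL,"
    " PRIMARY KEY (instructor_ID, student_ID))"
)


def add_advisor(instructor_id, student_id):
    """Return True if the pair was stored, False if it was a duplicate."""
    try:
        conn.execute("INSERT INTO advisor VALUES (?, ?)", (instructor_id, student_id))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    add_advisor("45565", "12345"),  # Katz advises Shankar
    add_advisor("45565", "10128"),  # one instructor, several students
    add_advisor("10101", "12345"),  # one student, several instructors
    add_advisor("45565", "12345"),  # exact duplicate of the first pair
]
print(results)
```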
Many-to-Many Mapping Example
[Figure: the instructor and student tables joined by many-to-many advisor links.]
ER Diagrams
• Rectangles represent entity sets.
• Diamonds represent relationship sets.
• Attributes listed inside entity rectangle
• Underline indicates primary key attributes
• Lines link entity sets to relationship sets
Representing Cardinality Constraints
• We express cardinality constraints by drawing either a directed line
(→), signifying “one”, or an undirected line (—), signifying
“many”, between the relationship set and the entity set.
• One-to-one relationship:
• A student is associated with at most one instructor via the
relationship advisor
• A student is associated with at most one department via
stud_dept
One-to-One Relationship
• one-to-one relationship between an instructor and a student
• an instructor is associated with at most one student via advisor
• and a student is associated with at most one instructor via
advisor
One-to-Many Relationship
• one-to-many relationship between an instructor and a student
• an instructor is associated with several (possibly 0) students via
advisor
• a student is associated with at most one instructor via advisor
Many-to-One Relationship
• In a many-to-one relationship between an instructor and a student,
• an instructor is associated with at most one student (possibly 0)
via advisor,
• and a student is associated with several (possibly 0) instructors
via advisor
Many-to-Many Relationship
• An instructor is associated with several (possibly 0) students via
advisor
• A student is associated with several (possibly 0) instructors via
advisor
Normalization
• Normalization is the process of organizing data in a database into an
appropriate design.
• Normalization is important because it imposes a set of rules that,
when followed, ensure that our database design is good (it minimizes
data duplication and redundancy).
• In this course, we will consider 1st, 2nd and 3rd Normal Form.
• The forms are progressive, so in order to be in 2nd Normal Form, the
database must also satisfy the rules for 1st Normal form and so on.
• You should strive to have a database that is in 3rd Normal form.
1st Normal Form
• A database is in 1st Normal form if it satisfies the following conditions:
• Does not contain any repeating groups
• All attributes are atomic (i.e. indivisible units)
• Suppose we had a Student relation that was not in 1st Normal form
because not all attributes are atomic. Let’s assume that the Student
attribute is the primary key.
1st Normal Form (cont.)
• We would separate this data into multiple rows so that all attributes
are atomic. Now we have a Student table in 1st Normal Form.
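The row-splitting step can be sketched in plain Python. The table from the slide is not reproduced here, so the rows below are invented for illustration; only the attribute names Student, Age and Subject come from the slides, and to_first_normal_form is a hypothetical helper.

```python
# Hypothetical un-normalised rows: the Subject field holds several values.
unnormalised = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
]


def to_first_normal_form(rows):
    """Emit one row per (student, subject) so every field is atomic."""
    return [
        (student, age, subject.strip())
        for student, age, subjects in rows
        for subject in subjects.split(",")
    ]


first_nf = to_first_normal_form(unnormalised)
print(first_nf)
```

Notice that Adam's age is now stored twice, once per subject. That repetition is what the next normal form addresses.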
2nd Normal Form
• A database is in 2nd Normal form if it satisfies the following conditions:
• It is in 1st Normal form.
• There are no partial dependencies: no non-key column (attribute)
depends on only part of a composite primary key.
• In our Student relation, once it is in 1st Normal Form, a row is only
identified by the combination (Student, Subject), yet the Age attribute
depends upon only the Student attribute. This is a partial dependency.
Therefore we will extract the Student and Subject attributes to a new
table, where they form a composite primary key (Student, Subject). The
original table keeps Student and Age, with Student as its primary key.
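The decomposition can be sketched in plain Python, again with invented rows (the slide's table is not reproduced here). It also checks that joining the two new tables on Student gives back the original rows, so no information is lost.

```python
# Hypothetical 1NF rows: (Student, Age, Subject), key is (Student, Subject).
first_nf = [
    ("Adam", 15, "Biology"),
    ("Adam", 15, "Maths"),
    ("Alex", 14, "Maths"),
]

# Age depends only on Student, so it moves to a table keyed by Student.
student_table = sorted({(student, age) for student, age, _ in first_nf})

# Subject stays with Student under the composite key (Student, Subject).
subject_table = sorted({(student, subject) for student, _, subject in first_nf})

# Joining the two tables on Student reconstructs the original rows.
rejoined = sorted(
    (student, age, subject)
    for student, age in student_table
    for other, subject in subject_table
    if student == other
)

print(student_table)
print(subject_table)
```

Each student's age is now stored exactly once, however many subjects they take.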
3rd Normal Form
• A database is in 3rd Normal form if it satisfies the following conditions:
• It is in 2nd Normal form.
• All non-primary fields are dependent on the primary key itself,
not on another non-primary field.
• Suppose we had a Student detail table with Student_id as the
primary key.
• In this table, the street, city and state attributes depend upon the Zip
attribute, which is not the primary key. Therefore this fails the
condition that all non-primary fields are dependent on the primary
key.
3rd Normal Form (cont.)
• To apply 3rd Normal Form to this table, we move the attributes that are
not dependent upon the primary key to a new table, along with the
attribute that they are actually dependent upon.
• In our example, we move the street, city and state attributes to a new
table, with the Zip attribute as the primary key. We will call this new
table the Address Table.
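The resulting pair of tables could be declared roughly as follows. This sketch assumes SQLite via Python's sqlite3 module; the table names student_detail and address, the column types and all of the data values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- street, city and state depend on zip, so they live in their own table.
    CREATE TABLE address (
        zip    TEXT PRIMARY KEY,
        street TEXT,
        city   TEXT,
        state  TEXT
    );
    -- The student table keeps only the zip, as a foreign key.
    CREATE TABLE student_detail (
        student_id TEXT PRIMARY KEY,
        zip        TEXT REFERENCES address(zip)
    );
    -- Hypothetical data: two students sharing one address record.
    INSERT INTO address VALUES ('00000', 'Example Street', 'Exampletown', 'EX');
    INSERT INTO student_detail VALUES ('S1', '00000');
    INSERT INTO student_detail VALUES ('S2', '00000');
    """
)

address_rows = conn.execute("SELECT COUNT(*) FROM address").fetchone()[0]
joined = conn.execute(
    """SELECT d.student_id, a.city
       FROM student_detail d
       JOIN address a ON a.zip = d.zip
       ORDER BY d.student_id"""
).fetchall()
print(address_rows, joined)
```

The address is stored once and shared, so correcting a city name means changing a single row.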