Concepts and Terminology
Introduction to Database
Concepts and Terminology
A database is an organised collection of logically related data. A
database may be of any size and complexity.
Concept of Data
Data are raw facts that can be recorded and stored on computer
media.
Information is data that have been processed into a form that is
meaningful to end-users.
Data Processing
Data processing involves calculating, comparing, sorting, classifying
and converting data into information.
Concept of Logical Data: Entity and Attribute (1/3)
An entity is a person, place, device, event, or concept.
An entity class (or entity type) is a collection of entities with common
properties.
An entity instance is a single occurrence of an entity class.
An attribute is a property of an entity class.
2.1 D. Concept of Logical Data: Entity and Attribute (2/3)
Fig.2.4 Examples of entity classes and entity instances
2.1 D. Concept of Logical Data: Entity and Attribute (3/3)
Fig.2.5 An entity class with two entity instances
2.2 Structure of Data in a Database
In a database, data are logically organised into characters, fields,
records, tables and databases.
2.2 A. Hierarchical Structure of Data (1/3)
The most basic logical data element is the character, which consists of a
single letter, digit or special symbol.
A field represents an attribute of a certain entity. It consists of a group
of characters.
A record is a set of related fields.
Fig.2.6 A student record
2.2 A. Hierarchical Structure of Data (2/3)
A table (or file) is a group of related records, representing an entity
class. It is made up of rows and columns.
Fig.2.7 A student table. (The underlined item is the primary key.)
2.2 A. Hierarchical Structure of Data (3/3)
A database is an organised collection of logically related tables.
In a relational database, tables are also called relations.
A relationship is the link between two relations.
Fig.2.8 Some entities and relationships in a school database
2.2 B. Keys (1/4)
Keys are used to organise, access and maintain a database. There are
three types of keys:
• primary keys
• foreign keys
• candidate keys
2.2 B. Keys (2/4)
A primary key is a field or combination of fields that uniquely and
minimally identifies a particular record in a table. The key value must be
unique and non-empty.
Fig.2.9 Examples of tables with a primary key
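The "unique and non-empty" rule can be tried out with a minimal sketch using Python's built-in sqlite3 module. The student table, its columns and the sample values are hypothetical and are not taken from Fig.2.9. NOT NULL is written explicitly because SQLite does not, by default, reject empty values in a text primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: student_id is the primary key.
conn.execute(
    "CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S001', 'Chan Tai Man')")


def try_insert(student_id, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert("S001", "Another student")  # key value already used
empty_ok = try_insert(None, "Student without a key")  # empty key value
new_ok = try_insert("S002", "Lee Siu Ming")           # new, non-empty key value
```

Under these assumptions, only the last insert is accepted; the first two are rejected by the primary key.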
2.2 B. Keys (3/4)
A foreign key is a field in one table that matches a primary key value in
another table.
Tables are linked by relationships which are set up by designing tables
with foreign keys and primary keys.
Fig.2.10 Examples of foreign keys
2.2 B. Keys (4/4)
A candidate key is any field that could serve as a primary key.
Fig.2.11 Examples of candidate keys
Non-key Field
Any field that is neither a primary key nor a candidate key is called a non-key field. Therefore, Name, Address and Phone_Num are non-key fields.
2.3 Common Data Types (1/2)
The data types most commonly used in databases are:
• character / string
• number
• date/time
2.3 B. Numbers (2/3)
Table 2.2 Types of number used in a standard database
2.3 C. Date-and-Time
Date-and-time values are stored internally as real numbers and displayed
according to specified formats.
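One way to see this idea in practice is SQLite's Julian day number, a real number counting days. This is only one possible internal representation, chosen here as an illustration; other database systems use different reference dates. The sample dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Convert two date-and-time values into real numbers (Julian day numbers).
noon = conn.execute(
    "SELECT julianday('2005-01-01 12:00:00')"
).fetchone()[0]
next_noon = conn.execute(
    "SELECT julianday('2005-01-02 12:00:00')"
).fetchone()[0]

# Display the stored number again according to a chosen format.
shown = conn.execute(
    "SELECT strftime('%d/%m/%Y', ?)", (noon,)
).fetchone()[0]
```

Because the values are numbers, the difference next_noon - noon is simply the number of days between the two moments, and the same number can be displayed in whichever format is specified.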
Relational Database
Part A. Introduction to Database
Copyright 2005 Radian Publishing Co.
4.1 A. File-Processing in School
The file-processing approach often focuses on the data processing needs
of individual departments, instead of evaluating the overall needs of the
organisation.
Fig.4.1 Two file-processing systems used in the same school
4.1 B. Disadvantages of File-Processing System
Disadvantages of File-Processing System:
1. Program-data Dependency
2. Duplication of Data (Data Redundancy)
3. Limited Data Sharing
4. Files are often incompatible with one another
5. Excessive Programming Effort
4.2 The Database Approach
Database technology, or more precisely relational database technology,
addresses the problems associated with the traditional file-processing
approach.
Fig.4.2 A single database is used by users in the same school
4.2 A. Data Model (1/2)
The first step in converting to a database approach is to develop a list
of high-level entities that support the operation of the school.
Fig.4.3 Entities in the school
4.2 A. Data Model (2/2)
The second step is to develop a data model. A data model is a detailed
specification of the overall structure of data, showing how the entities
are related. An entity-relationship (E-R) diagram is a common tool for
drawing up a data model.
Fig.4.3 E-R diagram
4.2 B. Relational Database (1/6)
The characteristics of a relational database are as follows:
1. Data structure
Data are organised in the form of rows and columns, i.e. tables, each
representing an entity class.
2. Data manipulation
SQL commands are used to manipulate data stored in the tables.
3. Data integrity
Constraints are included to ensure that there is no loss of data integrity.
4.2 B. Relational Database (2/6)
The characteristics of tables in a relational database are as follows:
1. Every table must have a primary key, which is unique and non-empty
2. Attribute values are taken from a well-defined domain
3. A foreign key must match with a primary key in another table
4. Each table of a database must have a unique name
5. Each field of a table must have a unique name
4.2 B. Relational Database (3/6)
The characteristics of tables in a relational database (continued):
6. Multi-valued attributes are not allowed
7. Each row is unique
8. The sequence of fields is insignificant.
9. The sequence of records is insignificant.
4.2 B. Relational Database (4/6)
Fig.4.4 Tables designed for the school
4.2 B. Relational Database (5/6)
Fig.4.5 Sample of the database (a) Relationships between tables
4.2 B. Relational Database (6/6)
Fig.4.5 Sample of the database (b) Tables in the database
4.2 C. Integrity Constraints (2/5)
A domain is a set of values that may be assigned to an attribute.
A domain constraint defines the following:
• data type
• size (or length)
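A domain constraint of this kind might be sketched as follows with Python's built-in sqlite3 module. The table, the class_name column and the limit of two characters are hypothetical. SQLite does not enforce a declared length by itself, so the size is expressed here as a CHECK constraint; other database systems may enforce a declared size directly.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical domain for class_name: text of at most 2 characters.
conn.execute(
    """CREATE TABLE student (
           student_id TEXT PRIMARY KEY NOT NULL,
           class_name TEXT CHECK (length(class_name) <= 2)
       )"""
)

_next_id = itertools.count(1)  # generates a fresh key for every attempt


def accepts(class_name):
    """Return True if class_name lies inside the domain, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?)",
            (f"S{next(_next_id):03d}", class_name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


short_ok = accepts("6A")    # within the allowed size
long_ok = accepts("6ABC")   # exceeds the allowed size
```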
4.2 C. Integrity Constraints (3/5)
Entity integrity constraints require that every table has a primary key,
which is unique and non-empty.
4.2 C. Integrity Constraints (4/5)
Referential integrity constraints require that, if there is a foreign key in
one table, each foreign key value either matches a primary key value in
another table or is null.
Fig.4.8 Specifying foreign keys in a SQL statement
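A minimal sketch of how such a constraint might be declared and enforced, using Python's built-in sqlite3 module. The club and student tables, their columns and the sample rows are hypothetical and are not taken from Fig.4.8. In SQLite, foreign key checking has to be switched on with a PRAGMA statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute(
    "CREATE TABLE club (club_id TEXT PRIMARY KEY NOT NULL, club_name TEXT)"
)
conn.execute(
    """CREATE TABLE student (
           student_id TEXT PRIMARY KEY NOT NULL,
           name TEXT,
           club_id TEXT REFERENCES club(club_id)  -- foreign key, may be null
       )"""
)
conn.execute("INSERT INTO club VALUES ('C01', 'Chess Club')")


def insert_student(student_id, name, club_id):
    """Return True if the row satisfies referential integrity, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?)", (student_id, name, club_id)
        )
        return True
    except sqlite3.IntegrityError:
        return False


matching_ok = insert_student("S001", "Chan", "C01")  # matches a primary key value
null_ok = insert_student("S002", "Lee", None)        # null foreign key value
dangling_ok = insert_student("S003", "Ho", "C99")    # no such club
```

The first two rows are accepted and the third is rejected, which corresponds to the two permitted cases in the definition.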
4.2 D. Advantages of Database Approach (1/2)
Advantages of Database Approach:
1. Program-Data Independence
2. Reduced Data Duplication
3. Improved Data Sharing
4. Improved Data Accessibility
4.2 D. Advantages of Database Approach (2/2)
Example for Advantage 4. Improved Data Accessibility
Fig.4.9 Retrieving student records of class 6A
Fig.4.10 Retrieving student records without club enrollment
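The two retrievals named in the captions could look roughly like this, using Python's built-in sqlite3 module. The table layout and the sample rows are hypothetical and are not the ones shown in the figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (
        student_id TEXT PRIMARY KEY NOT NULL,
        name TEXT,
        class_name TEXT
    );
    CREATE TABLE enrolment (
        student_id TEXT NOT NULL REFERENCES student(student_id),
        club_id TEXT NOT NULL,
        PRIMARY KEY (student_id, club_id)
    );
    INSERT INTO student VALUES
        ('S001', 'Chan', '6A'), ('S002', 'Lee', '6A'), ('S003', 'Ho', '6B');
    INSERT INTO enrolment VALUES ('S001', 'C01'), ('S003', 'C01');
    """
)

# Student records of class 6A.
class_6a = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE class_name = '6A' ORDER BY student_id"
)]

# Student records without club enrolment: keep the students for which
# the left join finds no matching enrolment row.
no_club = [row[0] for row in conn.execute(
    """SELECT s.name
       FROM student AS s
       LEFT JOIN enrolment AS e ON e.student_id = s.student_id
       WHERE e.student_id IS NULL
       ORDER BY s.student_id"""
)]
```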
E-R Diagrams
Part B. Database Design
Chapter 6 E-R Diagrams
An entity-relationship (E-R) diagram is a graphical representation of
data for an organisation at the conceptual level.
Fig.6.1 Symbols used in E-R diagrams
6.1 Relationships
The requirements for an entity are:
1. one or more attributes
2. many possible distinct instances.
A relationship is an association between two entities based on a key
attribute.
Do not confuse relationship with relation: A relation is a table. A relationship links up two tables.
6.1 A. Identifying Entities and Relationships (2/2)
Fig.6.2 Some daily-life examples of entities and relationships
6.1 B. Relationship Cardinality (2/7)
Fig.6.3 Symbols for maximum cardinality
6.1 B. Relationship Cardinality (3/7)
Fig.6.4 Examples of the three basic cardinalities
6.1 B. Relationship Cardinality (4/7)
Fig.6.4 Examples of the three basic cardinalities
6.1 B. Relationship Cardinality (5/7)
Fig.6.4 Examples of the three basic cardinalities
6.1 B. Relationship Cardinality (6/7)
There are four possible cardinalities:
• Mandatory one & Mandatory many
An instance is mandatory if there exists at least one instance in the
relationship.
• Optional one & Optional many
An instance is optional if the instance may or may not exist.
6.1 B. Relationship Cardinality (7/7)
Fig.6.5 All possible cardinalities
6.1 C. Sample Relationships (1/3)
Fig.6.6 Examples of relationships
6.1 C. Sample Relationships (2/3)
6.1 C. Sample Relationships (3/3)
Fig.6.6 Examples of relationships
6.1 D. Degree of Relationship
The degree of a relationship is the number of entity classes
participating in that relationship.
Binary relationships (degree 2) involve two entities and are the most
common. Unary relationships (degree 1) involve one entity.
Fig.6.10 Unary relationship
6.1 E. Multiple Relationships
In some situations, there is more than one relationship between two
entities.
Fig.6.12 Multiple relationships
6.3 Relationships with Attributes (2/3)
Fig.6.16 A relationship with an attribute
Fig.6.17 A relationship converted into an entity
6.3 Relationships with Attributes (3/3)
Fig.6.18 Equivalent E-R diagram
6.4 Resolving E-R Diagram for Relational DB
Resolution is a process of converting an E-R diagram into a form that
makes it possible to transform into a relational database.
We shall discuss the resolutions of three types of relationships:
• binary M:N relationship
• multi-valued attribute
• unary M:N relationship
6.4 A. Resolving Binary M:N Relationship (1/2)
An M:N relationship must be converted into multiple 1:M relationships.
The technique is to create a new entity to replace the original M:N
relationship.
Fig.6.21 Many-to-many relationship
6.4 A. Resolving Binary M:N Relationship (2/2)
Fig.6.24 Resolved E-R diagram
6.4 B. Resolving Multi-valued Attribute (1/2)
Multi-valued fields are not allowed in a relational database.
The technique is to create a new entity to represent the multi-valued
attribute.
Fig.6.27 Entity with a multi-valued attribute
6.4 B. Resolving Multi-valued Attribute (2/2)
Fig.6.28 Resolved relationship
6.4 C. Resolving Unary M:N Relationship (1/2)
A unary M:N relationship must be resolved by creating a new entity.
Fig.6.32 A unary M:N relationship
6.4 C. Resolving Unary M:N Relationship (2/2)
Fig.6.33 Resolved relationship
Database Schema
Part B. Database Design
7.1 Database Schema and Notations (1/2)
A database schema describes the structures of tables and the
relationships among these tables in a logical database.
Fig.7.1 Text description of a schema.
7.1 Database Schema and Notations (2/2)
Fig.7.1 Graphical representation of a schema.
7.2 Transforming E-R Diagrams into Schemas
An entity is mapped into a table;
An attribute is mapped into a field;
Key attributes become the primary keys.
7.2 A. Transforming Entities with Composite Attributes (2/2)
Fig.7.2 Transforming a simple entity
7.2 B. Transforming Entities with Multi-valued Attribute (1/2)
The rule to transform a multi-valued attribute is to create a new table to
represent each multi-valued attribute and add a foreign key to each
new table to link to the primary key of the original table.
Fig.7.5 Multi-valued attribute before resolution
7.2 B. Transforming Entities with Multi-valued Attribute (2/2)
Fig.7.7 Schema transformed from an entity with a multi-valued attribute
Fig.7.6 Multi-valued attribute after resolution
7.2 C. Transforming Dependent Entities (1/2)
Weak entities are dependent entities that cannot exist without another
(identifying) entity.
The rule to transform a weak entity is to add a foreign key to the weak
entity to link to the primary key of the identifying table. The relationship
is always mandatory on the one-side.
7.2 C. Transforming Dependent Entities (2/2)
Fig.7.11 Weak entity – CHILDREN
Fig.7.12 Transformed schema
7.2 D. Transforming 1:1 Binary Relationship (1/2)
The rule to transform a 1:1 binary relationship is to add a foreign key
to the table on the optional side to link to the primary key of the table on
the mandatory side.
7.2 D. Transforming 1:1 Binary Relationship (2/2)
Fig.7.15 One-to-one relationship between EMPLOYEE and DEPARTMENT
Fig.7.16 Schema for 1:1 relationship
7.2 E. Transforming 1:M Binary Relationship (1/2)
The rule to transform a 1:M binary relationship is to add a foreign key
to the table on the many-side to link to the primary key of the table on
the one-side. This rule applies to all possible cardinalities.
7.2 E. Transforming 1:M Binary Relationship (2/2)
Fig.7.20 Schema for 1:M relationship
Fig.7.19 One-to-many relationship
7.2 F. Transforming Multiple Relationships (1/2)
The rule to transform multiple relationships is to transform individual
relationships.
7.2 F. Transforming Multiple Relationships (2/2)
Fig.7.23 Multiple relationships between two entities
Fig.7.24 A schema with two relationships
7.2 G. Transforming M:N Binary Relationship (1/2)
The rule to transform an M:N binary relationship is to resolve the M:N
relationship into two or more 1:M relationships. A new table is created
with two foreign keys added to the new table to link to the primary keys
of the original tables. This rule applies to all possible cardinalities.
Fig.7.27 M:N relationship
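The rule can be sketched with Python's built-in sqlite3 module. The STUDENT and CLUB tables, the name of the new table (enrolment) and the sample rows are hypothetical and are not taken from Fig.7.27. The new table carries two foreign keys, and each original table has a 1:M relationship with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT);
    CREATE TABLE club (club_id TEXT PRIMARY KEY NOT NULL, club_name TEXT);
    -- New table replacing the M:N relationship: one row per (student, club) pair.
    CREATE TABLE enrolment (
        student_id TEXT NOT NULL REFERENCES student(student_id),
        club_id TEXT NOT NULL REFERENCES club(club_id),
        PRIMARY KEY (student_id, club_id)
    );
    INSERT INTO student VALUES ('S001', 'Chan'), ('S002', 'Lee');
    INSERT INTO club VALUES ('C01', 'Chess'), ('C02', 'Drama');
    INSERT INTO enrolment VALUES ('S001', 'C01'), ('S001', 'C02'), ('S002', 'C01');
    """
)

# One student may join many clubs ...
clubs_of_s001 = [row[0] for row in conn.execute(
    """SELECT club_name FROM enrolment JOIN club USING (club_id)
       WHERE student_id = 'S001' ORDER BY club_name"""
)]

# ... and one club may have many students.
members_of_c01 = [row[0] for row in conn.execute(
    """SELECT name FROM enrolment JOIN student USING (student_id)
       WHERE club_id = 'C01' ORDER BY name"""
)]
```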
7.2 G. Transforming M:N Binary Relationship (2/2)
Fig.7.28 Resolved relationship
Fig.7.29 Schema for the resolved relationship
7.2 H. Transforming 1:M Unary Relationship (1/2)
The rule to transform a 1:M unary relationship is to add a foreign key to
the table to link to the primary key of the same table (other instances or the
same instance). This rule applies to all possible cardinalities.
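A minimal sketch of a foreign key that links a table to itself, using Python's built-in sqlite3 module. The employee table and the "manages" relationship are hypothetical examples and are not necessarily the ones used in the figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee (
        emp_id TEXT PRIMARY KEY NOT NULL,
        name TEXT,
        manager_id TEXT REFERENCES employee(emp_id)  -- foreign key to the same table
    );
    INSERT INTO employee VALUES ('E1', 'Wong', NULL);
    INSERT INTO employee VALUES ('E2', 'Ho', 'E1');
    INSERT INTO employee VALUES ('E3', 'Lam', 'E1');
    """
)

# Join the table with itself to list the employees managed by Wong.
reports_of_wong = [row[0] for row in conn.execute(
    """SELECT e.name
       FROM employee AS e
       JOIN employee AS m ON e.manager_id = m.emp_id
       WHERE m.name = 'Wong'
       ORDER BY e.name"""
)]
```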
7.2 H. Transforming 1:M Unary Relationship (2/2)
Fig.7.34 A one-to-many unary relationship
Fig.7.35 Schema for the 1:M relationship
7.2 I. Transforming M:N Unary Relationship (1/2)
The rule to transform an M:N unary relationship is to create a new table
and add two foreign keys to the new table, with each key linking to the
primary key of the given table. This rule applies to all possible
cardinalities.
Fig.7.38 Rail fare
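Taking the rail fare idea as a starting point, the rule might be sketched as follows with Python's built-in sqlite3 module. The station and fare tables, their columns and the fare amounts are hypothetical and are not the contents of Fig.7.38. Both foreign keys in the new table link to the primary key of the same station table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE station (
        station_id TEXT PRIMARY KEY NOT NULL,
        station_name TEXT
    );
    -- New table: each row relates one station to another station.
    CREATE TABLE fare (
        from_station TEXT NOT NULL REFERENCES station(station_id),
        to_station TEXT NOT NULL REFERENCES station(station_id),
        amount REAL,
        PRIMARY KEY (from_station, to_station)
    );
    INSERT INTO station VALUES ('A', 'Station A'), ('B', 'Station B'), ('C', 'Station C');
    INSERT INTO fare VALUES ('A', 'B', 4.5), ('A', 'C', 7.0), ('B', 'C', 3.5);
    """
)

# Look up the fare for one pair of stations.
fare_a_to_b = conn.execute(
    "SELECT amount FROM fare WHERE from_station = 'A' AND to_station = 'B'"
).fetchone()[0]

# Station A is related to many other stations through the fare table.
destinations_from_a = [row[0] for row in conn.execute(
    "SELECT to_station FROM fare WHERE from_station = 'A' ORDER BY to_station"
)]
```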
7.2 I. Transforming M:N Unary Relationship (2/2)
Fig.7.39 An M:N unary relationship
Fig.7.40 Schema for the M:N unary relationship
Chapter 8 Normalisation
Part B. Database Design
Chapter 8 Normalisation
A well-structured table is a table without data redundancy and allows
users to insert, modify or delete the rows in a table without errors or
inconsistencies.
8.1 Consequences of Data Redundancies (1/2)
The major problem of a poorly designed database is data redundancy
that leads to anomalies.
Fig.8.1 A poorly designed table
8.1 Consequences of Data Redundancies (2/2)
An anomaly is an error or inconsistency that occurs when users attempt to update a table that contains redundant data.
There are three types of anomalies:
• Insertion anomaly. Adding a new instance of an entity requires data of other entities.
• Deletion anomaly. Deleting a record with an intention to remove an instance of an entity may cause loss of data of some other entities.
• Modification anomaly. Modifying a certain instance may require updating multiple records. If the updating is not thorough, the data will be inconsistent.
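Two of these anomalies can be reproduced in a few lines of Python. The table below is a hypothetical, deliberately poor design (it is not the table of Fig.8.1) in which the class teacher is repeated for every student of a class.

```python
# Each row mixes data about a student with data about a class.
rows = [
    {"student": "S001", "class": "6A", "class_teacher": "Mr Chan"},
    {"student": "S002", "class": "6A", "class_teacher": "Mr Chan"},
    {"student": "S003", "class": "6B", "class_teacher": "Ms Lee"},
]

# Modification anomaly: the teacher of 6A changes, but only one of the
# two 6A rows is updated, so the table now holds two different answers.
rows[0]["class_teacher"] = "Ms Wong"
teachers_of_6a = {r["class_teacher"] for r in rows if r["class"] == "6A"}

# Deletion anomaly: removing student S003 also removes the only record
# of class 6B and its teacher.
remaining = [r for r in rows if r["student"] != "S003"]
classes_still_known = {r["class"] for r in remaining}
```

An insertion anomaly follows from the same design: a new class could not be recorded until at least one student is assigned to it.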
8.2 Purposes of Normalisation (1/2)
Normalisation is the process of improving a logical database to make
the database simple, flexible and free of data redundancy.
Partial dependency means that one or more non-key attributes depend
on part of the primary key.
Transitive dependency occurs when a non-key attribute depends on
another non-key attribute.
8.2 Purposes of Normalisation (2/2)
Table 8.1 Three stages of normalisation
8.2 A. First Normal Form (1/3)
A table is in first normal form (1NF) if it does not contain multi-valued
attributes.
The rule to convert a table to 1NF is to create a table to represent each
multi-valued attribute and add a foreign key to each new table to link to
the primary key of the original table.
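The conversion rule can be sketched in plain Python. The customer records below are hypothetical; only the idea of a multi-valued contact name attribute is borrowed from the example in this chapter.

```python
# Not in 1NF: contact_names holds several values in one field.
customers = [
    {"customer_id": "C1", "company": "ABC Ltd", "contact_names": ["Amy", "Ben"]},
    {"customer_id": "C2", "company": "XYZ Ltd", "contact_names": ["Carl"]},
]

# Original table without the multi-valued attribute.
customer_table = [
    {"customer_id": c["customer_id"], "company": c["company"]}
    for c in customers
]

# New table: one row per contact name, with customer_id as the foreign
# key linking back to the primary key of the original table.
contact_table = [
    {"customer_id": c["customer_id"], "contact_name": name}
    for c in customers
    for name in c["contact_names"]
]
```

After the split, every field in both tables holds a single value.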
8.2 A. First Normal Form (2/3)
Fig.8.2 A table with multi-valued attribute – Contact Name
8.2 A. First Normal Form (3/3)
Fig.8.3 Tables in 1NF with multi-valued attribute removed
8.2 B. Second Normal Form (1/4)
A table is in second normal form (2NF) if it is in 1NF and every non-key
attribute depends on the entire primary key.
The rule to convert a table to 2NF is to decompose the table into
smaller tables so that every non-key attribute depends on the entire
primary key.
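A small Python sketch of such a decomposition. The enrolment table, its composite primary key (student_id, club_id) and the sample rows are hypothetical; student_name depends on student_id alone, which is a partial dependency. The helper project is also only an illustration.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows, preserving order."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Primary key: (student_id, club_id). student_name depends on student_id only.
enrolment = [
    {"student_id": "S001", "club_id": "C01", "student_name": "Chan", "join_date": "2005-09-01"},
    {"student_id": "S001", "club_id": "C02", "student_name": "Chan", "join_date": "2005-09-08"},
    {"student_id": "S002", "club_id": "C01", "student_name": "Lee", "join_date": "2005-09-01"},
]

# The partially dependent attribute moves to a table keyed by student_id.
student_table = project(enrolment, ["student_id", "student_name"])

# What remains depends on the entire primary key.
enrolment_table = project(enrolment, ["student_id", "club_id", "join_date"])
```

The student name is now stored once per student instead of once per enrolment.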
8.2 B. Second Normal Form (2/4)
Fig.8.4 Table with partial dependency
8.2 B. Second Normal Form (3/4)
Fig.8.5 Tables in 2NF with partial dependency removed (1/2)
8.2 B. Second Normal Form (4/4)
Fig.8.5 Tables in 2NF with partial dependency removed (2/2)
8.2 C. Third Normal Form (1/3)
A table is in third normal form (3NF) if it is in 2NF and no transitive
dependencies exist, i.e. there is no dependency between non-key
fields.
The rule to convert a table to 3NF is to decompose the table into
smaller tables so that there is no dependency between non-key fields.
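A small Python sketch of removing a transitive dependency. The student table and its rows are hypothetical: the class teacher depends on the class, which is itself a non-key field, rather than directly on the primary key student_id.

```python
# Columns: (student_id, name, class_name, class_teacher).
# class_teacher depends on class_name, a non-key field.
students = [
    ("S001", "Chan", "6A", "Mr Ho"),
    ("S002", "Lee", "6A", "Mr Ho"),
    ("S003", "Wong", "6B", "Ms Lam"),
]

# The student table keeps class_name as a foreign key only.
student_table = [(sid, name, cls) for sid, name, cls, _teacher in students]

# The dependent field moves to its own table keyed by class_name.
class_table = sorted({(cls, teacher) for _sid, _name, cls, teacher in students})
```

Each class teacher is now recorded exactly once, so changing the teacher of a class means updating a single row.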
8.2 C. Third Normal Form (2/3)
Fig.8.7 Table with transitive dependency
8.2 C. Third Normal Form (3/3)
Fig.8.8 Tables in 3NF without transitive dependency