CS 377
Database Systems
Relational Data Model
Li Xiong
Department of Mathematics and Computer Science
Emory University
Outline
• Relational Model Concepts
• Relational Model Constraints
• Relational Database and operations
Relational Model
• First formal database model
• Introduced by Codd in "A Relational Model of Data for Large Shared Data Banks," Communications of the ACM, June 1970.
• First commercial implementations available in the early 1980s
• Based on the concept of a mathematical relation, with a theoretical basis in set theory and first-order predicate logic
• Other models: hierarchical model, network model
INFORMAL DEFINITIONS
• RELATION: A table of values
  • A relation may be thought of as a set of rows.
  • A relation may alternately be thought of as a set of columns.
  • Each row represents a fact that corresponds to a real-world entity or relationship.
• Table name and column names help interpret the meaning of the values
FORMAL DEFINITIONS – Relation Schema
• The table is called a relation, a row is a tuple, and a column header is an attribute
• Relation schema R(A1, A2, ..., An)
  • Made up of a relation name R and a list of attributes A1, A2, ..., An
  • The degree (or arity) of a relation is the number of attributes n of its relation schema
  • E.g. STUDENT(Name, SSN, HomePhone, Address, OfficePhone, Age, GPA)
• Each attribute Ai has a domain dom(Ai) that defines the possible values of the attribute by a data type or a format
  • E.g. the domain of SSN is the set of 9-digit numbers written in the format ddd-dd-dddd, where each d is a decimal digit.
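The schema and domain definitions above can be sketched in Python. This is only an illustration: the encoding of a schema as a (name, attribute list) pair and the helper names `degree` and `in_ssn_domain` are assumptions, not part of the slides.

```python
import re

# Assumed encoding: a relation schema is a name plus an ordered attribute list.
STUDENT = ("STUDENT",
           ["Name", "SSN", "HomePhone", "Address", "OfficePhone", "Age", "GPA"])

def degree(schema):
    """Degree (arity) of a relation = number of attributes in its schema."""
    _name, attributes = schema
    return len(attributes)

# dom(SSN) given as a format: ddd-dd-dddd, where each d is a decimal digit.
SSN_FORMAT = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{4}")

def in_ssn_domain(value):
    """True if value is a string that matches the SSN format exactly."""
    return isinstance(value, str) and SSN_FORMAT.fullmatch(value) is not None

print(degree(STUDENT))               # 7
print(in_ssn_domain("305-61-2435"))  # True
print(in_ssn_domain("305612435"))    # False
```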
FORMAL DEFINITIONS – Relation
• A relation (or relation state) r of the relation schema R(A1, A2, ..., An), also denoted r(R), is a set of tuples r = {t1, t2, ..., tm}
• A tuple t is an ordered list of n values t = <v1, v2, ..., vn>, where each value vi, 1 ≤ i ≤ n, is an element of dom(Ai) or a special NULL value
  • E.g. <"Benjamin Bayer", 305-61-2435, 373-1616, "2918 Bluebonnet Lane", null, 19, 3.21> is a tuple belonging to the STUDENT relation.
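As a rough sketch of the tuple definition, the check below accepts a tuple only if it has n values and each value is either NULL (written as `None` here) or a member of the attribute's domain. The three-attribute schema and the domain predicates are simplified assumptions chosen for the example.

```python
# Assumed, simplified domains for three STUDENT attributes (not from the slides).
ATTRIBUTES = ("Name", "Age", "GPA")
DOMAINS = {
    "Name": lambda v: isinstance(v, str),
    "Age": lambda v: isinstance(v, int) and 0 < v < 150,
    "GPA": lambda v: isinstance(v, float) and 0.0 <= v <= 4.0,
}

def is_valid_tuple(t):
    """t = <v1, ..., vn> is valid if each vi is NULL (None) or lies in dom(Ai)."""
    if len(t) != len(ATTRIBUTES):
        return False
    return all(v is None or DOMAINS[a](v) for a, v in zip(ATTRIBUTES, t))

print(is_valid_tuple(("Benjamin Bayer", 19, 3.21)))    # True
print(is_valid_tuple(("Benjamin Bayer", None, 3.21)))  # True: NULL is allowed
print(is_valid_tuple(("Benjamin Bayer", "19", 3.21)))  # False: "19" is not in dom(Age)
```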
Mathematical Definitions
• A relation r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2), ..., dom(An), that is, a subset of the Cartesian product of the domains that define R:
  r(R) ⊆ (dom(A1) × dom(A2) × ... × dom(An))
• The Cartesian product is the direct product of the sets of values of all domains: dom(A1) × dom(A2) × ... × dom(An)
• The total number of tuples in the Cartesian product is:
  |dom(A1)| × |dom(A2)| × ... × |dom(An)|
• The current relation state reflects only the valid tuples that represent a particular state of the real world
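A small worked example may help here. The two tiny domains below are made up so that the Cartesian product can be enumerated in full; the relation state r is then just one subset of it.

```python
from itertools import product

# Made-up domains, small enough to enumerate.
dom_major = {"CS", "Math"}
dom_year = {1, 2, 3}

# dom(Major) × dom(Year): every possible combination of values.
cartesian = set(product(dom_major, dom_year))

# The number of tuples equals |dom(Major)| × |dom(Year)|.
print(len(cartesian))                   # 6
print(len(dom_major) * len(dom_year))   # 6

# A relation state is a subset of the Cartesian product.
r = {("CS", 1), ("Math", 3)}
print(r <= cartesian)                   # True
```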
CHARACTERISTICS OF RELATIONS
• Ordering of tuples in a relation r(R)
  • A relation is a set of tuples, so the tuples are not ordered
• Ordering of attributes
  • In our definition, the attributes in R(A1, A2, ..., An) and the values in t = <v1, v2, ..., vn> are ordered lists
  • Alternative definition: a tuple is considered as a set of (<attribute>, <value>) pairs, where each pair gives the value of the mapping from an attribute Ai to a value vi from dom(Ai)
• Values in a tuple
  • All values are considered atomic (flat relational model with first normal form assumption) – what about multi-valued attributes and composite attributes?
  • A special null value is used to represent values that are unknown or inapplicable to certain tuples.
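The ordering properties above can be illustrated with Python sets and tuples. The sample values are invented; modelling the alternative definition as a `frozenset` of (attribute, value) pairs is one possible encoding, not the only one.

```python
# Ordering of tuples: a relation is a set, so listing order does not matter.
r1 = {("Ann", 19), ("Bob", 20)}
r2 = {("Bob", 20), ("Ann", 19)}
print(r1 == r2)   # True

# Ordering of attributes: with tuples as ordered lists, value order matters.
print(("Ann", 19) == (19, "Ann"))   # False

# Alternative definition: a tuple as a set of (attribute, value) pairs,
# so the order in which the pairs are written is irrelevant.
t1 = frozenset({("Name", "Ann"), ("Age", 19)})
t2 = frozenset({("Age", 19), ("Name", "Ann")})
print(t1 == t2)   # True
```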
DEFINITION SUMMARY
Informal Terms        Formal Terms
Table                 Relation
Column                Attribute
Row                   Tuple
Values in a column    Domain
Table Definition      Relation Schema
Relational Model Notation
• Relation schema R of degree n: R(A1, A2, ..., An)
• Relation names: Q, R, S
• Relations: q, r, s
• Tuples: t, u, v
• Tuple t in a relation r(R): t = <v1, v2, ..., vn>, where vi is the value corresponding to attribute Ai
• Component values of tuples:
  • t[Ai] and t.Ai refer to the value vi in t for attribute Ai
  • t[Au, Aw, ..., Az] and t.(Au, Aw, ..., Az) refer to the subtuple of values <vu, vw, ..., vz> from t corresponding to the attributes specified in the list
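The component notation maps naturally onto two small helpers. The names `value` and `subtuple` are hypothetical, and SSN and phone values are stored as strings here for convenience.

```python
ATTRIBUTES = ("Name", "SSN", "HomePhone", "Address", "OfficePhone", "Age", "GPA")
t = ("Benjamin Bayer", "305-61-2435", "373-1616",
     "2918 Bluebonnet Lane", None, 19, 3.21)

def value(t, attribute):
    """t[Ai]: the single value of t for attribute Ai."""
    return t[ATTRIBUTES.index(attribute)]

def subtuple(t, *attributes):
    """t[Au, ..., Az]: the values of t for the listed attributes, in list order."""
    return tuple(value(t, a) for a in attributes)

print(value(t, "Age"))             # 19
print(subtuple(t, "SSN", "GPA"))   # ('305-61-2435', 3.21)
```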
Outline
• Relational Model Concepts
• Relational Model Constraints
• Relational Database and operations
Relational Model Constraints
• Constraints
  • Restrictions on the actual values in a database state
• Inherent model-based constraints or implicit constraints
  • Inherent in the data model
  • E.g. no duplicate tuples
• Schema-based constraints or explicit constraints
  • Can be directly expressed in schemas of the data model
• Application-based or semantic constraints or business rules
  • Cannot be directly expressed in schemas; expressed and enforced by application programs
  • E.g. the maximum number of hours per week that an employee may work on all of his or her projects combined is 56
Schema-based constraints
• Domain constraints
• Key constraints
• Entity integrity constraints
• Referential integrity constraints
Domain Constraints
• The value of each attribute A must be an atomic value from the domain dom(A)
• Typical data types associated with domains:
  • Numeric data types for integers and real numbers
  • Characters
  • Booleans
  • Fixed-length strings
  • Variable-length strings
  • Date, time, timestamp
  • Money
  • Other special data types
Key Constraints
• No two tuples can have the same combination of values for all their attributes.
• Superkey: a set of attributes SK of R
  • No two distinct tuples in any state r of R can have the same value for SK
• Key: a set of attributes K of R
  • K is a superkey of R
  • Removing any attribute A from K leaves a set of attributes K' that is no longer a superkey of R
Key Constraints and Constraints on NULL Values (cont’d.)
• Key satisfies two properties:
  • Two distinct tuples in any state of the relation cannot have identical values for (all) attributes in the key
  • Minimal superkey
    • Cannot remove any attribute and still have the uniqueness constraint in the above condition hold
Key Constraints and Constraints on NULL Values (cont’d.)
• Candidate key
  • A relation schema may have more than one key
• Primary key of the relation
  • Designated among the candidate keys
  • Shown by underlining its attributes in the schema
• Other candidate keys are designated as unique keys
Key Constraints
• Superkey of R: a set of attributes SK of R such that no two tuples in any valid relation instance r(R) have the same value for SK.
  • For any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK].
  • E.g. {License_number}, {License_number, Make}, {Engine_serial_number, Make}
• Key of R: a "minimal" superkey; that is, a superkey K such that removal of any attribute from K results in a set of attributes that is not a superkey.
  • Key1 = {License_number}, Key2 = {Engine_serial_number}
  • Is {Engine_serial_number, Make} a key?
• If a relation has several keys, each is called a candidate key, and one is chosen arbitrarily to be the primary key. The primary key attributes are underlined.
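The uniqueness and minimality conditions can be tested against a single relation state, as sketched below. Note the limitation: a superkey must hold in every valid state, so one state can refute a candidate but never prove it. The CAR rows are made up; only the attribute names come from the slide, and `unique_on` and `is_minimal` are hypothetical helper names.

```python
# Made-up CAR state; attribute names follow the slide, the rows are illustrative.
CAR = [
    {"License_number": "AAA-111", "Engine_serial_number": "E1", "Make": "Ford"},
    {"License_number": "BBB-222", "Engine_serial_number": "E2", "Make": "Ford"},
    {"License_number": "CCC-333", "Engine_serial_number": "E3", "Make": "Toyota"},
]

def unique_on(state, attrs):
    """True if no two tuples of this state agree on all attributes in attrs."""
    values = [tuple(row[a] for a in sorted(attrs)) for row in state]
    return len(values) == len(set(values))

def is_minimal(state, attrs):
    """True if attrs is unique in this state and dropping any one attribute breaks that."""
    attrs = set(attrs)
    if not unique_on(state, attrs):
        return False
    return all(not unique_on(state, attrs - {a}) for a in attrs)

print(unique_on(CAR, {"License_number"}))                 # True
print(unique_on(CAR, {"Make"}))                           # False: two Fords
print(unique_on(CAR, {"Engine_serial_number", "Make"}))   # True
print(is_minimal(CAR, {"Engine_serial_number", "Make"}))  # False: Make can be dropped
```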
Entity Integrity
• Entity integrity: the primary key attributes PK of each relation schema R cannot have null values in any tuple of r(R).
  t[PK] ≠ null for any tuple t in r(R)
• Primary key values are used to identify the individual tuples.
• Note: other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key.
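A minimal sketch of an entity integrity check, assuming tuples are stored as dictionaries and NULL is represented by `None`. The function name and the second sample row are invented for illustration.

```python
def entity_integrity_violations(state, primary_key):
    """Return the tuples of the state in which some primary key attribute is NULL."""
    return [row for row in state if any(row[a] is None for a in primary_key)]

# Illustrative STUDENT state; the second row is made up and has a NULL SSN.
STUDENT = [
    {"SSN": "305-61-2435", "Name": "Benjamin Bayer"},
    {"SSN": None, "Name": "Sample Student"},
]

bad = entity_integrity_violations(STUDENT, primary_key=("SSN",))
print([row["Name"] for row in bad])   # ['Sample Student']
```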
Referential Integrity
• Referential integrity: a tuple in one relation that refers to another relation must refer to an existing tuple in that relation.
• Formally, a set of attributes FK in relation schema R1 is a foreign key of R1 that references relation R2 if:
  • The attributes in FK have the same domains as the primary key attributes PK of R2
  • The value of FK in a tuple t1 of the current state of R1 is either:
    (1) a value of PK for some tuple t2 in the current state of R2: t1[FK] = t2[PK], or
    (2) null
• R1 is the referencing relation and R2 is the referenced relation.
• A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
• A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1.FK to R2.
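The two allowed cases for a foreign key value can be checked as sketched below. The EMPLOYEE and DEPARTMENT relations and their rows are assumptions for the example, and treating only an all-null foreign key as case (2) is a design choice of this sketch.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return tuples t1 of R1 whose FK is non-null and equals no t2[PK] in R2."""
    pk_values = {tuple(t2[a] for a in pk) for t2 in referenced}
    dangling = []
    for t1 in referencing:
        fk_value = tuple(t1[a] for a in fk)
        if all(v is None for v in fk_value):
            continue  # case (2): a null foreign key is allowed
        if fk_value not in pk_values:
            dangling.append(t1)  # neither case (1) nor case (2)
    return dangling

# Made-up states: EMPLOYEE.Dno is assumed to reference DEPARTMENT.Dnumber.
DEPARTMENT = [{"Dnumber": 4}, {"Dnumber": 5}]
EMPLOYEE = [
    {"Ssn": "111", "Dno": 5},     # references an existing department
    {"Ssn": "222", "Dno": None},  # null foreign key
    {"Ssn": "333", "Dno": 9},     # no department 9 exists
]

bad = dangling_references(EMPLOYEE, ("Dno",), DEPARTMENT, ("Dnumber",))
print([e["Ssn"] for e in bad])    # ['333']
```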
Outline
• Relational Model Concepts
• Relational Model Constraints
• Relational Database and operations
Relational Databases and Relational Database Schemas
• Relational database schema S
  • Set of relation schemas S = {R1, R2, ..., Rm}
  • Set of integrity constraints IC
• Relational database state
  • Set of relation states DB = {r1, r2, ..., rm}
  • Each ri is a state of Ri such that the ri relation states satisfy the integrity constraints specified in IC
• Invalid state
  • Does not obey all the integrity constraints
• Valid state
  • Satisfies all the constraints in the defined set of integrity constraints IC
Operations in a Relational Database
• Basic operations that change the states of relations in the database:
  • Insert
  • Delete
  • Update (or Modify)
The Insert Operation
• Provides a list of attribute values for a new tuple t to be inserted into a relation R
• Can violate any of the four types of constraints (domain, key, entity integrity, referential integrity)
• If an insertion violates a constraint, the default option is to reject it
The Delete Operation
• Can violate only referential integrity
• A violation occurs if the tuple being deleted is referenced by foreign keys from other tuples, e.g. when deleting a tuple from department
  • Restrict: reject the deletion
  • Cascade: propagate the deletion by deleting the tuples that reference the deleted tuple
  • Set null or set default: modify the referencing attribute values that cause the violation
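The three delete options can be compared side by side in a small sketch. The relations, attribute names (Dnumber, Dno) and the function `delete_department` are hypothetical; a real DBMS applies these rules internally, typically through options declared on the foreign key.

```python
def delete_department(departments, employees, dnumber, on_delete="restrict"):
    """Delete one department and return the new (departments, employees) states."""
    if on_delete not in ("restrict", "cascade", "set null"):
        raise ValueError(f"unknown option: {on_delete}")
    referenced = any(e["Dno"] == dnumber for e in employees)
    if referenced and on_delete == "restrict":
        # Restrict: reject the deletion.
        raise ValueError("deletion rejected: department is still referenced")
    if on_delete == "cascade":
        # Cascade: also delete the tuples that reference the deleted tuple.
        employees = [e for e in employees if e["Dno"] != dnumber]
    elif on_delete == "set null":
        # Set null: keep the referencing tuples but null out their foreign key.
        employees = [{**e, "Dno": None} if e["Dno"] == dnumber else e
                     for e in employees]
    departments = [d for d in departments if d["Dnumber"] != dnumber]
    return departments, employees

# Made-up states for illustration.
DEPARTMENT = [{"Dnumber": 4}, {"Dnumber": 5}]
EMPLOYEE = [{"Ssn": "111", "Dno": 5}, {"Ssn": "222", "Dno": 4}]

print(delete_department(DEPARTMENT, EMPLOYEE, 5, "cascade"))
# ([{'Dnumber': 4}], [{'Ssn': '222', 'Dno': 4}])
print(delete_department(DEPARTMENT, EMPLOYEE, 5, "set null"))
# ([{'Dnumber': 4}], [{'Ssn': '111', 'Dno': None}, {'Ssn': '222', 'Dno': 4}])
```

With the default "restrict" option, the same call raises an error because employee 111 still references department 5.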
The Update Operation
• Necessary to specify a condition on the attributes of the relation
  • Selects the tuple (or tuples) to be modified
• If the attributes to be updated are part of neither a primary key nor a foreign key
  • Usually causes no problems
• Updating a primary/foreign key
  • Similar issues as with Insert/Delete
In-Class Exercise
Consider the following relations for a database that keeps
track of student enrollment in courses and the books adopted
for each course:
STUDENT(SSN, Name, Major, Bdate)
COURSE(Course#, Cname, Dept)
ENROLL(SSN, Course#, Quarter, Grade)
BOOK_ADOPTION(Course#, Quarter, Book_ISBN)
TEXT(Book_ISBN, Book_Title, Publisher, Author)
Draw a relational schema diagram specifying the foreign
keys for this schema.