CS 377
Database Systems
Relational Algebra and Calculus
Li Xiong
Department of Mathematics and Computer Science
Emory University
ER Diagram of Company Database
Relational Algebra and Relational Calculus
• The previous lecture on the relational model presented its structures and constraints.
• Relational Algebra
  ◦ Formal foundation for relational model operations
  ◦ Basis for implementing and optimizing queries in an RDBMS
  ◦ Basis for practical query languages such as SQL
• Relational Calculus
  ◦ Formal declarative language for relational queries
Outline
• Relational Algebra
  ◦ Unary Relational Operations
  ◦ Relational Algebra Operations From Set Theory
  ◦ Binary Relational Operations
  ◦ Additional Relational Operations
• Relational Calculus
  ◦ Tuple Relational Calculus
  ◦ Domain Relational Calculus
• Coming up
  ◦ SQL
Relational Algebra
• Relational algebra is a mathematical language with a basic set of operations for manipulating relations.
• A relational algebra operation operates on one or more relations and produces a new relation, which can be further manipulated using operations of the same algebra.
• A relational algebra expression is a sequence of relational algebra operations.
Relational Algebra Operations
Unary Relational Operations - Select
• SELECT Operation: selects the subset of tuples from a relation that satisfy a selection condition.
• Example: to select the EMPLOYEE tuples whose department number is 4, or those whose salary is greater than $30,000, the following notation is used:
  σ DNO = 4 (EMPLOYEE)
  σ SALARY > 30,000 (EMPLOYEE)
• Notation: σ <selection condition> (R)
• The selection condition is a Boolean expression containing clauses of the form:
  <attribute name> <comparison op> <constant value>
  <attribute name> <comparison op> <attribute name>
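As an illustration only, the two selections above can be sketched in Python by treating a relation as a list of dictionaries. The `select` helper and the sample rows are assumptions made for this sketch; only the attribute names come from the slides.

```python
# Hypothetical sample data: attribute names follow the slides, values are invented.
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Smith", "DNO": 4, "SALARY": 25000},
    {"SSN": "222", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"SSN": "333", "LNAME": "Zelaya", "DNO": 4, "SALARY": 31000},
]

def select(condition, relation):
    """sigma_<condition>(relation): keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

# sigma DNO = 4 (EMPLOYEE)
dept4 = select(lambda t: t["DNO"] == 4, EMPLOYEE)
# sigma SALARY > 30,000 (EMPLOYEE)
high_paid = select(lambda t: t["SALARY"] > 30000, EMPLOYEE)

print([t["SSN"] for t in dept4])      # ['111', '333']
print([t["SSN"] for t in high_paid])  # ['222', '333']
```

The condition is passed as a function, so both clause forms (attribute against constant, attribute against attribute) fit the same helper.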
SELECT Operation Properties
• The SELECT operation σ <selection condition> (R) produces a relation S that has the same schema as R.
• The SELECT operation σ is commutative; i.e.,
  σ <condition1> (σ <condition2> (R)) = σ <condition2> (σ <condition1> (R))
• A cascade of SELECT operations may be applied in any order; i.e.,
  σ <condition1> (σ <condition2> (σ <condition3> (R)))
  = σ <condition2> (σ <condition3> (σ <condition1> (R)))
• A cascade of SELECT operations may be replaced by a single selection with a conjunction of all the conditions; i.e.,
  σ <condition1> (σ <condition2> (σ <condition3> (R)))
  = σ <condition1> AND <condition2> AND <condition3> (R)
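These properties can be checked on a small example. The sketch below assumes a relation stored as a list of dictionaries; the relation `R` and the conditions `c1`, `c2`, `c3` are invented for illustration.

```python
def select(condition, relation):
    """sigma_<condition>(relation): keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

# Hypothetical relation R(A, B) with 16 tuples.
R = [{"A": a, "B": b} for a in range(4) for b in range(4)]

c1 = lambda t: t["A"] > 0
c2 = lambda t: t["B"] < 3
c3 = lambda t: t["A"] != t["B"]

# Cascade in one order, cascade in another order, and a single conjunction.
nested = select(c1, select(c2, select(c3, R)))
reordered = select(c2, select(c3, select(c1, R)))
conjunction = select(lambda t: c1(t) and c2(t) and c3(t), R)

print(nested == reordered == conjunction)  # True
print(len(conjunction))                    # 7
```

A query optimizer relies on this equivalence when it reorders or merges selections.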
Unary Relational Operations - Project
• PROJECT Operation: selects certain columns from the table and discards the other columns.
• Example: to list each employee’s first and last name and salary:
  π LNAME, FNAME, SALARY (EMPLOYEE)
• Notation: π <attribute list> (R)
• Duplicate Elimination: the PROJECT operation removes any duplicate tuples, so its result is a set of tuples.
PROJECT Operation Properties
• The number of tuples in π <list> (R) is always less than or equal to the number of tuples in R.
• If the list of attributes includes a key of R, then the number of tuples in π <list> (R) is equal to the number of tuples in R.
• π <list1> (π <list2> (R)) = π <list1> (R), as long as <list2> contains the attributes in <list1>.
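A minimal sketch of PROJECT with duplicate elimination, assuming relations are lists of dictionaries. The `project` helper and the sample rows are hypothetical; they are chosen so that each of the three properties above shows up.

```python
def project(attrs, relation):
    """pi_<attrs>(relation): keep only attrs and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:          # duplicate elimination
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# Hypothetical sample data; SSN is assumed to be the key.
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Smith", "DNO": 4, "SALARY": 30000},
    {"SSN": "222", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"SSN": "333", "LNAME": "Smith", "DNO": 4, "SALARY": 30000},
]

no_key = project(["DNO", "SALARY"], EMPLOYEE)   # duplicates collapse
with_key = project(["SSN", "DNO"], EMPLOYEE)    # key keeps every tuple distinct
nested = project(["DNO"], project(["DNO", "SALARY"], EMPLOYEE))
direct = project(["DNO"], EMPLOYEE)

print(len(no_key), len(with_key))  # 2 3
print(nested == direct)            # True
```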
Sequences of Operations and the RENAME Operation
• In-line expression:
• Sequence of operations:
• Rename attributes in intermediate results
• RENAME operation
• Examples
  ◦ ρ DEPT5_EMPS (σ DNO = 5 (EMPLOYEE))
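The difference between an in-line expression and a sequence of named intermediate results can be sketched as follows. The query (first name and salary of department 5 employees), the helpers, the new attribute name `FIRST_NAME`, and the sample rows are all assumptions made for this illustration, not taken from the slides.

```python
def select(condition, relation):
    """sigma: keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

def project(attrs, relation):
    """pi: keep only attrs and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

def rename(mapping, relation):
    """rho: rename attributes according to mapping {old_name: new_name}."""
    return [{mapping.get(a, a): v for a, v in t.items()} for t in relation]

# Hypothetical sample data.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "DNO": 4, "SALARY": 25000},
    {"FNAME": "Franklin", "LNAME": "Wong", "DNO": 5, "SALARY": 40000},
]

# In-line expression: operations nested in a single expression.
inline = project(["FNAME", "SALARY"], select(lambda t: t["DNO"] == 5, EMPLOYEE))

# Sequence of operations: each intermediate result gets a name.
DEPT5_EMPS = select(lambda t: t["DNO"] == 5, EMPLOYEE)
RESULT = project(["FNAME", "SALARY"], DEPT5_EMPS)

# Renaming an attribute of the intermediate result.
RENAMED = rename({"FNAME": "FIRST_NAME"}, RESULT)

print(inline == RESULT)  # True
print(RENAMED[0])        # {'FIRST_NAME': 'John', 'SALARY': 30000}
```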
Relational Algebra Operations From Set Theory
• UNION Operation: denoted by R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S.
  ◦ Duplicate tuples are eliminated.
• INTERSECTION Operation: denoted by R ∩ S, is a relation that includes all tuples that are in both R and S.
• SET DIFFERENCE (or MINUS) Operation: denoted by R - S, is a relation that includes all tuples that are in R but not in S.
• Type Compatibility: the two operands must be “type compatible”.
  ◦ The operands R(A1, A2, ..., An) and S(B1, B2, ..., Bn) must have the same number of attributes, and the domains of corresponding attributes must be compatible; that is, dom(Ai) = dom(Bi) for i = 1, 2, ..., n.
  ◦ The resulting relation for R ∪ S, R ∩ S, or R - S has the same attribute names as R (by convention).
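Python's built-in `set` type gives a direct sketch of these three operations when each tuple is stored as a Python tuple. The STUDENT and INSTRUCTOR relations, their schemas, and the `type_compatible` helper below are hypothetical, and Python types stand in for attribute domains.

```python
def type_compatible(schema_r, schema_s):
    """Same number of attributes and equal domains, position by position."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

# Hypothetical schemas: (attribute name, domain). Names differ, domains match.
STUDENT_SCHEMA = [("FN", str), ("LN", str)]
INSTRUCTOR_SCHEMA = [("FNAME", str), ("LNAME", str)]

STUDENT = {("Susan", "Yao"), ("Ramesh", "Shah"), ("John", "Smith")}
INSTRUCTOR = {("John", "Smith"), ("Ricardo", "Browne")}

compatible = type_compatible(STUDENT_SCHEMA, INSTRUCTOR_SCHEMA)

union = STUDENT | INSTRUCTOR         # duplicates are eliminated by the set
intersection = STUDENT & INSTRUCTOR
difference = STUDENT - INSTRUCTOR

# By convention the result takes the attribute names of the first operand.
result_schema = STUDENT_SCHEMA

print(compatible, len(union))  # True 4
print(intersection)            # {('John', 'Smith')}
print(sorted(difference))      # [('Ramesh', 'Shah'), ('Susan', 'Yao')]
```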
Relational Algebra Operations From Set Theory - Properties
• Both union and intersection are commutative operations; that is,
  R ∪ S = S ∪ R, and R ∩ S = S ∩ R
• Both union and intersection can be treated as n-ary operations applicable to any number of relations, since both are associative; that is,
  R ∪ (S ∪ T) = (R ∪ S) ∪ T, and (R ∩ S) ∩ T = R ∩ (S ∩ T)
• The minus operation is not commutative; that is, in general,
  R - S ≠ S - R
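A quick check of these properties on three small, invented single-attribute relations, again using Python sets as a stand-in for relations:

```python
# Hypothetical relations over one integer attribute.
R = {(1,), (2,), (3,)}
S = {(2,), (3,), (4,)}
T = {(3,), (5,)}

commutative = (R | S == S | R) and (R & S == S & R)
associative = (R | (S | T) == (R | S) | T) and ((R & S) & T == R & (S & T))
r_minus_s = R - S
s_minus_r = S - R

print(commutative, associative)  # True True
print(r_minus_s, s_minus_r)      # {(1,)} {(4,)}
```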
Relational Algebra Operations From Set Theory – Cartesian Product
• CARTESIAN PRODUCT (or cross product) Operation: combines tuples from two relations. In general, the result of R(A1, A2, ..., An) x S(B1, B2, ..., Bm) is a relation Q of degree n + m with attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order. The resulting relation Q has one tuple for each combination of tuples, one from R and one from S.
  ◦ Hence, if R has nR tuples (denoted as |R| = nR) and S has nS tuples, then R x S has nR * nS tuples.
• The two operands do NOT have to be “type compatible”.
• Example:
  FEMALE_EMPS ← σ SEX=’F’ (EMPLOYEE)
  EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS)
  EMP_DEPENDENTS ← EMPNAMES x DEPENDENT
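The last step of the example can be sketched with `itertools.product`. The rows below are invented, and the `cartesian` helper assumes the two relations have no attribute names in common (otherwise a rename would be needed first).

```python
from itertools import product

def cartesian(r, s):
    """R x S: one combined tuple for every pair (tuple of R, tuple of S).

    Assumes the attribute names of r and s are disjoint.
    """
    return [{**tr, **ts} for tr, ts in product(r, s)]

# Hypothetical sample data.
EMPNAMES = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987"},
]
DEPENDENT = [
    {"ESSN": "987", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "333", "DEPENDENT_NAME": "Alice"},
    {"ESSN": "333", "DEPENDENT_NAME": "Joy"},
]

EMP_DEPENDENTS = cartesian(EMPNAMES, DEPENDENT)

print(len(EMP_DEPENDENTS))     # 6, i.e. 2 * 3 tuples
print(len(EMP_DEPENDENTS[0]))  # 5, i.e. degree 3 + 2
```

Most of the six combinations pair an employee with someone else's dependent, which is why a selection usually follows the product.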
Binary Relational Operations - Join
• JOIN Operation: a Cartesian product followed by a select.
• Notation: R ⋈ <join condition> S
  where R and S can be any relations that result from general relational algebra expressions.
  ACTUAL_DEPENDENTS ← EMPNAMES ⋈ SSN=ESSN DEPENDENT
• Retrieve the department and the manager’s information:
  DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
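Taking the definition literally, a join can be sketched as a selection applied to a Cartesian product. The helpers and the sample rows are assumptions for illustration; a real RDBMS would normally use a more efficient join algorithm rather than materializing the full product.

```python
from itertools import product

def cartesian(r, s):
    """R x S, assuming disjoint attribute names."""
    return [{**tr, **ts} for tr, ts in product(r, s)]

def select(condition, relation):
    """sigma: keep the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

def join(r, s, condition):
    """R join_<condition> S, defined as sigma_<condition>(R x S)."""
    return select(condition, cartesian(r, s))

# Hypothetical sample data.
EMPNAMES = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "999"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "987"},
]
DEPENDENT = [
    {"ESSN": "987", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "333", "DEPENDENT_NAME": "Alice"},
]

ACTUAL_DEPENDENTS = join(EMPNAMES, DEPENDENT, lambda t: t["SSN"] == t["ESSN"])

print([(t["FNAME"], t["DEPENDENT_NAME"]) for t in ACTUAL_DEPENDENTS])
# [('Jennifer', 'Abner')]
```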
Variations of Join
• EQUIJOIN Operation
  ◦ Involves join conditions with equality comparisons only.
  ◦ The result always has one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple.
• NATURAL JOIN Operation (denoted by *)
  ◦ Gets rid of the second (superfluous) attribute in an EQUIJOIN condition.
  ◦ Requires that the two join attributes, or each pair of corresponding join attributes, have the same name in both relations.
  ◦ If this is not the case, a renaming operation is applied first.
  Proj_Dept ← Project * Department
  Dept_Locs ← Department * Dept_Locations
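A sketch of natural join over lists of dictionaries: tuples are matched on every attribute name the two relations share, and merging the dictionaries keeps a single copy of each shared attribute. The attribute names (`DNUM`, `DNUMBER`, and so on) and the rows are assumptions; the slides do not list the schemas.

```python
def natural_join(r, s):
    """R * S: equijoin on all commonly named attributes, keeping one copy of each."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [
        {**tr, **ts}
        for tr in r
        for ts in s
        if all(tr[a] == ts[a] for a in common)
    ]

def rename(mapping, relation):
    """rho: rename attributes according to mapping {old_name: new_name}."""
    return [{mapping.get(a, a): v for a, v in t.items()} for t in relation]

# Hypothetical sample data.
DEPARTMENT = [
    {"DNUMBER": 5, "DNAME": "Research"},
    {"DNUMBER": 4, "DNAME": "Administration"},
]
DEPT_LOCATIONS = [
    {"DNUMBER": 5, "DLOCATION": "Houston"},
    {"DNUMBER": 5, "DLOCATION": "Bellaire"},
    {"DNUMBER": 4, "DLOCATION": "Stafford"},
]
PROJECT = [{"PNAME": "ProductX", "DNUM": 5}]

# The join attribute already has the same name in both relations.
DEPT_LOCS = natural_join(DEPARTMENT, DEPT_LOCATIONS)

# Here the names differ, so a rename is applied first.
PROJ_DEPT = natural_join(rename({"DNUM": "DNUMBER"}, PROJECT), DEPARTMENT)

print(len(DEPT_LOCS))  # 3
print(PROJ_DEPT)       # [{'PNAME': 'ProductX', 'DNUMBER': 5, 'DNAME': 'Research'}]
```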
Additional Relational Operations – Outer Join
• In NATURAL JOIN, tuples without a matching (or related) tuple are eliminated from the join result. Tuples with null in the join attributes are also eliminated.
• Outer joins can be used when we want to keep all the tuples in R or S, regardless of whether or not they have matching tuples in the other relation.
• The left outer join operation keeps every tuple in the first or left relation R in R ⟕ S; if no matching tuple is found in S, then the attributes of S in the join result are filled or “padded” with null values.
• The right outer join keeps every tuple in the second or right relation S in the result of R ⟖ S.
• A third operation, full outer join, denoted by ⟗, keeps all tuples in both the left and the right relations.
Outer Join Example
EMPLOYEE left outer join (SSN=Mgr_SSN) DEPARTMENT
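The left outer join in this example can be sketched as follows, using `None` for null. The `left_outer_join` helper, its explicit `s_attrs` parameter (the attribute names of the right relation, needed for padding), and the rows are assumptions made for this illustration.

```python
def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad the attributes of s with None when nothing matches."""
    result = []
    for tr in r:
        matches = [ts for ts in s if condition(tr, ts)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result

# Hypothetical sample data.
EMPLOYEE = [
    {"SSN": "888", "LNAME": "Borg"},
    {"SSN": "333", "LNAME": "Wong"},
    {"SSN": "123", "LNAME": "Smith"},   # manages no department
]
DEPARTMENT = [
    {"DNAME": "Headquarters", "MGR_SSN": "888"},
    {"DNAME": "Research", "MGR_SSN": "333"},
]

result = left_outer_join(
    EMPLOYEE,
    DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGR_SSN"],
    ["DNAME", "MGR_SSN"],
)

print([(t["LNAME"], t["DNAME"]) for t in result])
# [('Borg', 'Headquarters'), ('Wong', 'Research'), ('Smith', None)]
```

An inner join on the same condition would drop the Smith tuple entirely.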
Binary Relational Operations - Division
• DIVISION Operation: R(Z) ÷ S(X), where X ⊆ Z.
• Example: retrieve the SSN of employees who work on all the projects that ‘John Smith’ is working on.
• Let Y = Z - X (and hence Z = X ∪ Y). The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR[Y] = t, and with tR[X] = tS for every tuple tS in S.
• For a tuple t to appear in the result T of the DIVISION, the values in t must appear in R in combination with every tuple in S.
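A minimal sketch of DIVISION for the example above, simplified so that X and Y are single attributes: R is a set of (ESSN, PNO) pairs and S is a set of project numbers. The data and names are invented. The second computation uses the standard rewriting of division in terms of projection, product, and difference, T = π Y (R) - π Y ((π Y (R) x S) - R), which is not stated on the slide.

```python
def divide(r, s):
    """R(Y, X) / S(X), with r a set of (y, x) pairs and s a set of x values."""
    candidates = {y for y, _ in r}                       # pi_Y(R)
    return {y for y in candidates if all((y, x) in r for x in s)}

# Hypothetical data: which employee (ESSN) works on which project (PNO).
WORKS_ON = {
    ("123", 1), ("123", 2),
    ("453", 1), ("453", 2), ("453", 3),
    ("666", 3),
}
SMITH_PNOS = {1, 2}   # projects that 'John Smith' works on (assumed)

result = divide(WORKS_ON, SMITH_PNOS)

# Same result via projection, product, and difference.
ys = {y for y, _ in WORKS_ON}
missing = {(y, x) for y in ys for x in SMITH_PNOS} - WORKS_ON
alternative = ys - {y for y, _ in missing}

print(sorted(result))         # ['123', '453']
print(result == alternative)  # True
```

Employee 666 is excluded because no (666, 1) or (666, 2) pair appears in WORKS_ON.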