Relational Algebra


Relational database

• Relational database systems are expected to provide a query language that lets users query database instances.

• There are two kinds of formal query languages: relational algebra and relational calculus.

Elementary Algebra

• You did algebra in high school

27y² + 8y - 3

Operands

Operators

What is Relational Algebra?

• An algebra whose operands are relations or variables that represent relations.

• Operators are designed to do the most common things that we need to do with relations in a database.

• The result is an algebra that can be used as a query language for relations.

Relational Algebra

• Operands: tables

• Operators:

choose only the rows you want

choose only the columns you want

combine tables

and a few other things

Example

• Movies(mID, title, director, year, length)

• Artists(aID, aName, nationality)

• Roles(mID, aID, character)

• Foreign key constraints:

–Roles[mID] ⊆ Movies[mID]

–Roles[aID] ⊆ Artists[aID]

Select: choose rows

• Notation: σc(R)

R is a table

Condition c is a boolean expression

It can use comparison operators and boolean operators

• The result is a relation

with the same schema as the operand

but with only the tuples that satisfy the condition


Examples

• Write queries to find:

All British actors

All movies from the 1970s
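One possible way to work through these two exercises is to model a relation as a list of dictionaries and selection as a filter. This is only a sketch: the sample rows and the helper name `select` are made up for illustration, and only the Artists and Movies schemas come from the example above. In RA notation the queries would read σnationality='British'(Artists) and σyear≥1970 ∧ year≤1979(Movies).

```python
# Hypothetical sample rows; only the attribute names come from the schemas above.
artists = [
    {"aID": 1, "aName": "Artist One", "nationality": "British"},
    {"aID": 2, "aName": "Artist Two", "nationality": "American"},
]
movies = [
    {"mID": 10, "title": "Movie A", "director": "Director X", "year": 1977, "length": 121},
    {"mID": 11, "title": "Movie B", "director": "Director Y", "year": 1982, "length": 117},
]


def select(condition, relation):
    """sigma_c(R): keep only the tuples of R that satisfy the condition."""
    return [row for row in relation if condition(row)]


# All British actors.
british = select(lambda r: r["nationality"] == "British", artists)

# All movies from the 1970s (comparison operators combined with a boolean AND).
seventies = select(lambda r: r["year"] >= 1970 and r["year"] <= 1979, movies)

print(british)
print(seventies)
```

Note that the result rows keep every attribute of the operand: selection changes which tuples appear, not the schema.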

Project : choose columns

• Notation: π L (R)

R is a table.

L is a subset (not necessarily a proper subset) of

the attributes of R.

• The result is a relation

with all the tuples from R

but with only the attributes in L, and in that order

Project

About project

• Why is it called “project”?

• What is the value of πdirector (Movies)?

• Exercise: Write an RA expression to find the names of all directors of movies from the 1970s

• Now, suppose you want the names of all characters in movies from the 1970s. We need to be able to combine tables.

Project

• Wherever a project operation might “introduce”

duplicates, only one copy of each is kept.

Example

People

Name      Age
Karim     20
Ruth      18
Minh      20
Sofia     19
Jennifer  19
Sasha     20

πAge(People)

Age
20
18
19
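The duplicate elimination in this example can be sketched in Python as follows. The rows are the People table from the slide; the helper name `project` and the list-of-dictionaries representation are assumptions made for illustration.

```python
# The People relation from the example above.
people = [
    {"Name": "Karim", "Age": 20},
    {"Name": "Ruth", "Age": 18},
    {"Name": "Minh", "Age": 20},
    {"Name": "Sofia", "Age": 19},
    {"Name": "Jennifer", "Age": 19},
    {"Name": "Sasha", "Age": 20},
]


def project(attrs, relation):
    """pi_L(R): keep only the attributes in L, in that order, dropping duplicates."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:  # a relation is a set, so each tuple appears once
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


ages = project(["Age"], people)
print(ages)
```

Six input tuples produce three output tuples, because the ages 20 and 19 each occur more than once.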

Cartesian Product

• Notation: R1 × R2

• The result is a relation with every combination of a tuple from R1 concatenated to a tuple from R2

• Its schema is every attribute of R1 followed by every attribute of R2, in order

Cartesian Product

• How many tuples are in R1 × R2?

• Example: Movies × Roles

• If an attribute occurs in both relations, it occurs twice in the result (prefixed by relation name)
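A small sketch of the Cartesian product, assuming relations are lists of dictionaries. The rows are hypothetical and Movies is trimmed to two attributes to keep the output short; the function name `cartesian_product` is likewise an illustrative choice.

```python
# Hypothetical rows; Movies is trimmed to (mID, title) for brevity.
movies = [
    {"mID": 10, "title": "Movie A"},
    {"mID": 11, "title": "Movie B"},
]
roles = [
    {"mID": 10, "aID": 1, "character": "Hero"},
    {"mID": 10, "aID": 2, "character": "Villain"},
    {"mID": 11, "aID": 1, "character": "Pilot"},
]


def cartesian_product(name1, rel1, name2, rel2):
    """R1 x R2: every tuple of R1 concatenated with every tuple of R2.

    Attributes that occur in both relations are prefixed by the relation name.
    """
    attrs1 = list(rel1[0]) if rel1 else []
    attrs2 = list(rel2[0]) if rel2 else []
    shared = set(attrs1) & set(attrs2)
    result = []
    for t1 in rel1:
        for t2 in rel2:
            row = {}
            for a in attrs1:
                row[f"{name1}.{a}" if a in shared else a] = t1[a]
            for a in attrs2:
                row[f"{name2}.{a}" if a in shared else a] = t2[a]
            result.append(row)
    return result


pairs = cartesian_product("Movies", movies, "Roles", roles)
print(len(pairs))
print(list(pairs[0]))
```

The number of result tuples is the product of the two input sizes (here 2 × 3), and mID shows up twice, once per relation.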

Cartesian Product

Example

Combining Cross-product with selection

Example

• Query: Dr. Monk wonders whether he has to teach a multi-cultural group of students.

Multi-cultural class


ClassInfo = πName, Country, GPA(σStudent.name=AlgoList.name(AlgoList × Student))

Cartesian product can be inconvenient

• It can introduce nonsense tuples.

• You can get rid of them with selects.

• But this pattern is so common that an operation was defined to make it easier: join.

Shortcut: Theta Join

Subtype of theta-join: Equijoin

Special case of equijoin: Natural Join

• Notation: R ⋈ S

• The result is defined by

taking the Cartesian product

selecting to ensure equality on attributes that are in both relations (as determined by name)

projecting to remove duplicate attributes.
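The three steps of this definition can be sketched in a few lines of Python. This is an illustration under assumptions: relations are lists of dictionaries, the rows are hypothetical, and `natural_join` is a made-up helper name. Merging the two dictionaries plays the role of the final projection, since a shared attribute is stored only once.

```python
# Hypothetical rows; Movies is trimmed to (mID, title) for brevity.
movies = [
    {"mID": 10, "title": "Movie A"},
    {"mID": 11, "title": "Movie B"},
]
roles = [
    {"mID": 10, "aID": 1, "character": "Hero"},
    {"mID": 10, "aID": 2, "character": "Villain"},
]
artists = [{"aName": "Artist One"}, {"aName": "Artist Two"}]


def natural_join(r, s):
    """R natural-join S for relations given as lists of dicts."""
    if not r or not s:
        return []
    shared = [a for a in r[0] if a in s[0]]  # attributes in both, matched by name
    result = []
    for t1 in r:                    # Cartesian product ...
        for t2 in s:
            # ... selection: equal values on every shared attribute ...
            if all(t1[a] == t2[a] for a in shared):
                # ... projection: the merge keeps one copy of each shared attribute.
                result.append({**t1, **t2})
    return result


joined = natural_join(movies, roles)
print(joined)

# With no attributes in common, the selection is vacuous and the
# result is the full Cartesian product.
print(len(natural_join(movies, artists)))
```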


Natural Join

• The following examples show what natural join does when the tables have:

no attributes in common

one attribute in common

a different attribute in common

(Note that we change the attribute names for relation follows to set up these scenarios.)

Properties of Natural Join

• Commutative: R ⋈ S = S ⋈ R

(although attribute order may vary; this will matter later when we use set operations)

• Associative: R ⋈ (S ⋈ T) = (R ⋈ S) ⋈ T

• So when writing n-ary joins, brackets are irrelevant.

We can just write: R1 ⋈ R2 ⋈ . . . ⋈ Rn
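These two properties can be checked on a small example. The sketch below uses hypothetical relations R(a, b), S(b, c) and T(c, d); the helpers `natural_join` and `as_set` are illustrative names. Results are compared as sets of attribute/value pairs so that attribute order is ignored, which matches the caveat above.

```python
def natural_join(r, s):
    """R natural-join S for relations given as lists of dicts."""
    if not r or not s:
        return []
    shared = [a for a in r[0] if a in s[0]]
    return [
        {**t1, **t2}
        for t1 in r
        for t2 in s
        if all(t1[a] == t2[a] for a in shared)
    ]


def as_set(relation):
    """Order-insensitive view of a relation: a set of attribute/value sets."""
    return {frozenset(row.items()) for row in relation}


# Hypothetical relations R(a, b), S(b, c), T(c, d).
R = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
S = [{"b": 2, "c": 5}, {"b": 2, "c": 6}, {"b": 9, "c": 7}]
T = [{"c": 5, "d": 8}, {"c": 6, "d": 0}]

commutative = as_set(natural_join(R, S)) == as_set(natural_join(S, R))
associative = as_set(natural_join(R, natural_join(S, T))) == as_set(
    natural_join(natural_join(R, S), T)
)
print(commutative, associative)
```

A check on one example is not a proof, but it shows what the two equations claim.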

Assignment Operator

• Notation:

R = Expression

• Alternate notation:

R(A1, ..., An) = Expression

Lets you name all the attributes of the new relation.

Sometimes you don't want the names they would get from Expression.

R must be a temporary variable, not one of the relations in the schema; i.e., you are not updating the content of a relation.

Assignment

• Example:

• CSCoffering = σdept=‘csc’(Offering)

• TookCSC(sid, grade) = πsid, grade(CSCoffering ⋈ Took)

• PassedCSC(sid) = πsid(σgrade>50(TookCSC))
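The same three assignments can be mirrored with ordinary Python variables. The slides do not give the schemas of Offering and Took, so this sketch assumes Offering(oid, dept, cnum) and Took(sid, oid, grade) with made-up rows; the helper functions are also illustrative.

```python
def select(condition, relation):
    """sigma: keep the tuples that satisfy the condition."""
    return [row for row in relation if condition(row)]


def project(attrs, relation):
    """pi: keep only the listed attributes, without duplicates."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


def natural_join(r, s):
    """Natural join on all attribute names the two relations share."""
    if not r or not s:
        return []
    shared = [a for a in r[0] if a in s[0]]
    return [{**t1, **t2} for t1 in r for t2 in s
            if all(t1[a] == t2[a] for a in shared)]


# ASSUMED schemas and rows (not given in the slides).
offering = [
    {"oid": 1, "dept": "csc", "cnum": 343},
    {"oid": 2, "dept": "mat", "cnum": 137},
]
took = [
    {"sid": 100, "oid": 1, "grade": 82},
    {"sid": 101, "oid": 1, "grade": 45},
    {"sid": 102, "oid": 2, "grade": 90},
]

# Each assignment names an intermediate result; offering and took are not modified.
csc_offering = select(lambda r: r["dept"] == "csc", offering)
took_csc = project(["sid", "grade"], natural_join(csc_offering, took))
passed_csc = project(["sid"], select(lambda r: r["grade"] > 50, took_csc))
print(passed_csc)
```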

Assignment

• Assignment helps us break a problem down

• It also allows us to change the names of relations [and attributes].

• There is another way to rename things ...

Renaming Operator

Example of Renaming

Summary Of Operators

Set Operations on Relations

• R ∪ S, the union of R and S, is the set of tuples that are in R or S or both.

• R - S, the difference of R and S, is the set of tuples that are in R but not in S.

• Note that R - S is different from S - R.

• R ∩ S, the intersection of R and S, is the set of tuples that are in both R and S.

Condition for set operators

• Set operators can operate only on two union-compatible relations.

• Two relations are union-compatible if they have the same number of attributes and corresponding attributes are drawn from the same domain.
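Python's built-in sets give a convenient way to try out the three set operators. In this sketch a relation is a set of tuples, the student names are hypothetical, and the Python type of each value is used as a rough stand-in for its domain, which is a simplification.

```python
def union_compatible(r, s):
    """Same number of attributes, and matching value types position by position.

    Python types are only an approximation of attribute domains.
    """
    signatures = {tuple(type(v) for v in t) for t in r | s}
    return len(signatures) <= 1


# Hypothetical one-attribute relations holding student names.
database_course = {("Ana",), ("Ben",), ("Chen",)}
algorithms_course = {("Ben",), ("Dana",)}

assert union_compatible(database_course, algorithms_course)

union = database_course | algorithms_course          # R ∪ S
only_database = database_course - algorithms_course  # R - S
only_algorithms = algorithms_course - database_course  # S - R, a different result
both = database_course & algorithms_course           # R ∩ S

print(sorted(union))
print(sorted(only_database))
print(sorted(only_algorithms))
print(sorted(both))

# A relation with two attributes is not union-compatible with a one-attribute one.
print(union_compatible(database_course, {("Ana", 3.7)}))
```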

Union

Union example

• Query: list names of all people in the department


Difference

Difference Example

• Query: Who is registered in the Database course but

not in the Algorithms?

Intersection

• Query: Which courses are taught at both Universities?

Extended operators of Relational Algebra can be derived from core operators

Outer Join

Motivation

• Suppose we join R ⋈ S.

• A tuple of R which doesn't join with any tuple of S is said to be dangling.

• Similarly for a tuple of S.

• Problem: We lose dangling tuples.


Outer join

• Preserves dangling tuples by padding them with a

special NULL symbol in the result.

Types of outer join

Left Outer Join

Example
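A left outer join can be sketched as a natural join that keeps dangling tuples of the left operand. The code below is an illustration with hypothetical rows; Python's `None` stands in for the NULL symbol, and the function name and its `s_attrs` parameter are assumptions.

```python
def left_outer_join(r, s, s_attrs):
    """Natural join of R and S that preserves dangling tuples of R.

    s_attrs lists the attributes of S, so padding works even when S is empty.
    """
    result = []
    for t1 in r:
        matches = [
            t2 for t2 in s
            if all(t1[a] == t2[a] for a in t1 if a in t2)
        ]
        if matches:
            result.extend({**t1, **t2} for t2 in matches)
        else:
            padded = dict(t1)
            for a in s_attrs:
                padded.setdefault(a, None)  # pad the missing S attributes with NULL
            result.append(padded)
    return result


# Hypothetical rows: the second artist has no role, so it is a dangling tuple.
artists = [
    {"aID": 1, "aName": "Artist One"},
    {"aID": 2, "aName": "Artist Two"},
]
roles = [{"mID": 10, "aID": 1, "character": "Hero"}]

rows = left_outer_join(artists, roles, ["mID", "aID", "character"])
for row in rows:
    print(row)
```

A plain natural join would return only the first artist; the outer join keeps the second one with NULLs in the Roles columns.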

RA is procedural

• An RA query itself suggests a procedure for constructing the result (i.e., how one could implement the query).

• We say that it is “procedural.”

• Some examples of Relational Algebra

Understand the database!

You need to understand the tables in a database and how they relate to each other before you can work with it.

Schema

• There are 5 tables:

1. BUYER(BUYERID, BUYERPHONE, BUYER_RATING)

2. POPART(PO#, PART#, PARTCOST, PARTQTY)

3. PART(PART#, PARTNAME, PARTTYPE, PARTSIZE)

4. SUPPLIER(SUPPLIER#, SUPPLIERNAME, SUPPLIERCITY, TIMEZONE)

5. SUPPLY(PART#, SUPPLIER#, BUYERID)

What is the primary key of the BUYER table?

• BUYER(BUYERID, BUYERPHONE, BUYER_RATING)

– BUYERID - identified by a single underline.

• This means that across all the records in the table,

BUYERID is unique.

Q. What is the primary key of POPART (pronounced: P-O-Part)?

• POPART(PO#, PART#, PARTCOST, PARTQTY)

• PO#, PART# - a concatenated or composite key

• Select and Project

What are the names of parts whose size is 'small'?

• We are given the criterion ‘small’, which applies to PARTSIZE in the PART table.

• We want to know the names, which are also in the PART table.

• First do a select to eliminate the rows that are not ‘small’,

• and then a project to show just the names.

• select part where partsize = ‘small’ giving temp

• project temp over (partname) giving answer

Notes: Temp is a convenient name for the output of the select since it is not yet the final answer. Since the output of a statement is a new table, the new table can be used in subsequent statements so we can do the project on the temp table.
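The two statements can be mirrored in Python, with `temp` holding the intermediate table. The rows are the three PART records used in this example; the list-of-dictionaries representation is an assumption made for illustration.

```python
# The PART table from this example.
part = [
    {"PART#": "S11", "PARTNAME": "Smiley Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "F11", "PARTNAME": "Frowning Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "L12", "PARTNAME": "Left-handed Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "large"},
]

# select part where partsize = 'small' giving temp
temp = [row for row in part if row["PARTSIZE"] == "small"]

# project temp over (partname) giving answer
answer = [{"PARTNAME": row["PARTNAME"]} for row in temp]

print(temp)
print(answer)
```

Because `temp` is itself a table, the project step can be applied to it exactly as it would be to a stored table.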

Temp

PART#  PARTNAME         PARTTYPE  PARTSIZE
S11    Smiley Widget    STUFFED   small
F11    Frowning Widget  STUFFED   small

Answer

PARTNAME
Smiley Widget
Frowning Widget


• Select, Project and Join

What are the names of parts purchased on po# 1234?

• We know the po#, which is in POPART, and we want to know partname, which is in PART. Thus, we must join the two tables.

• POPART(PO#, PART#, PARTCOST, PARTQTY)

• PART(PART#, PARTNAME, PARTTYPE, PARTSIZE)

• Two tables can be joined if and only if they are join compatible. Two tables are join compatible if they have a common field.

• In this case, both tables have a part# field so they are join compatible.

• (Note that it is the contents of the fields that must be the same, not necessarily the names of the fields. It is very convenient if the names of the fields are the same when the contents are the same.)

• When two tables are joined, each record from table 1 is compared to each record from table 2.

• When a match is found, a record is written to a new table which contains all fields from both table 1 and table 2.

• When PART and POPART are joined, the 3 records in PART are compared with all 3 records in POPART for a total of 9 comparisons.

• The number of comparisons is the product of the table sizes. For example, a 50,000 record table joined with a 1,000,000 record table requires 50 billion comparisons.

• If the join criterion (specifying the common field) is left out, every record matches every record and the new table will have 50 billion records.
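The comparison count can be made concrete with a nested loop over the PART and POPART records of this example. Only the key columns are kept here, which is a simplification for the sketch, and the variable names are illustrative.

```python
# Key columns of the PART and POPART tables from this example.
part = [{"PART#": "S11"}, {"PART#": "F11"}, {"PART#": "L12"}]
popart = [
    {"PO#": 1233, "PART#": "L12"},
    {"PO#": 1234, "PART#": "L12"},
    {"PO#": 1234, "PART#": "F11"},
]

comparisons = 0
matches = 0
for p in part:            # each record from table 1 ...
    for po in popart:     # ... is compared to each record from table 2
        comparisons += 1
        if p["PART#"] == po["PART#"]:
            matches += 1

print(comparisons, matches)

# The same product rule applied to the larger table sizes mentioned above.
large_join_comparisons = 50_000 * 1_000_000
print(f"{large_join_comparisons:,}")
```

The count depends only on the two table sizes, not on how many records actually match.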

Part

PART#  PARTNAME            PARTTYPE  PARTSIZE
S11    Smiley Widget       STUFFED   small
F11    Frowning Widget     STUFFED   small
L12    Left-handed Widget  STUFFED   large

POPART

PO#   PART#  PARTCOST  PARTQTY
1233  L12    2.00      24
1234  L12    2.00      30
1234  F11    3.00      128

The join of PART and POPART generates three matches so the resulting table is:

Newtable

PART#  PARTNAME            PARTTYPE  PARTSIZE  PO#   PART#  PARTCOST  PARTQTY
F11    Frowning Widget     STUFFED   small     1234  F11    3.00      128
L12    Left-handed Widget  STUFFED   large     1233  L12    2.00      24
L12    Left-handed Widget  STUFFED   large     1234  L12    2.00      30
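To finish the query about po# 1234, the joined table still needs a select and a project. The sketch below uses the PART and POPART rows from this example; the `join` helper is an illustrative name, and because it merges dictionaries the shared PART# field is stored once rather than twice as in Newtable.

```python
part = [
    {"PART#": "S11", "PARTNAME": "Smiley Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "F11", "PARTNAME": "Frowning Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "L12", "PARTNAME": "Left-handed Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "large"},
]
popart = [
    {"PO#": 1233, "PART#": "L12", "PARTCOST": 2.00, "PARTQTY": 24},
    {"PO#": 1234, "PART#": "L12", "PARTCOST": 2.00, "PARTQTY": 30},
    {"PO#": 1234, "PART#": "F11", "PARTCOST": 3.00, "PARTQTY": 128},
]


def join(left, right, field):
    """Join two tables over a common field (the field is kept once)."""
    return [{**l, **r} for l in left for r in right if l[field] == r[field]]


newtable = join(part, popart, "PART#")                    # three matching rows
temp = [row for row in newtable if row["PO#"] == 1234]    # select po# = 1234
answer = [{"PARTNAME": row["PARTNAME"]} for row in temp]  # project over partname

print(len(newtable))
print(answer)
```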

What are the names of parts supplied by suppliers in Hamilton?

• We know SUPPLIERCITY is in SUPPLIER and want to know PARTNAME in PART. These two tables cannot be directly joined because they have no common field. However, we can get to PART by going through SUPPLY, so we will involve that table in a join as well.

• BUYER(BUYERID, BUYERPHONE, BUYER_RATING)

• POPART(PO#, PART#, PARTCOST, PARTQTY)

• PART(PART#, PARTNAME, PARTTYPE, PARTSIZE)

• SUPPLIER(SUPPLIER#, SUPPLIERNAME, SUPPLIERCITY, TIMEZONE)

• SUPPLY(PART#, SUPPLIER#, BUYERID)

• select supplier where suppliercity = ‘Hamilton’ giving temp1

• join temp1 and supply over supplier# giving temp2

• join temp2 and part over part# giving temp3

• project temp3 over (partname) giving answer
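The four statements translate step by step into Python. The PART rows come from the earlier example, but the slides show no data for SUPPLIER or SUPPLY, so those rows (supplier names, cities, buyer IDs) are invented purely for illustration, as is the `join` helper.

```python
def join(left, right, field):
    """Join two tables over a common field (the field is kept once)."""
    return [{**l, **r} for l in left for r in right if l[field] == r[field]]


# PART rows from the earlier example.
part = [
    {"PART#": "S11", "PARTNAME": "Smiley Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "F11", "PARTNAME": "Frowning Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "small"},
    {"PART#": "L12", "PARTNAME": "Left-handed Widget", "PARTTYPE": "STUFFED", "PARTSIZE": "large"},
]
# HYPOTHETICAL rows: the slides give only the schemas of these two tables.
supplier = [
    {"SUPPLIER#": "A1", "SUPPLIERNAME": "Supplier One", "SUPPLIERCITY": "Hamilton", "TIMEZONE": "EST"},
    {"SUPPLIER#": "B2", "SUPPLIERNAME": "Supplier Two", "SUPPLIERCITY": "Toronto", "TIMEZONE": "EST"},
]
supply = [
    {"PART#": "S11", "SUPPLIER#": "A1", "BUYERID": "Z01"},
    {"PART#": "L12", "SUPPLIER#": "B2", "BUYERID": "Z02"},
    {"PART#": "F11", "SUPPLIER#": "A1", "BUYERID": "Z02"},
]

# select supplier where suppliercity = 'Hamilton' giving temp1
temp1 = [row for row in supplier if row["SUPPLIERCITY"] == "Hamilton"]
# join temp1 and supply over supplier# giving temp2
temp2 = join(temp1, supply, "SUPPLIER#")
# join temp2 and part over part# giving temp3
temp3 = join(temp2, part, "PART#")
# project temp3 over (partname) giving answer
answer = [{"PARTNAME": row["PARTNAME"]} for row in temp3]

print(answer)
```

Selecting before joining keeps the intermediate tables small: only the Hamilton supplier's rows are carried into the two joins.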

Thanks to Marina Barsky and Diane Horton for the material.