46
CS2300: File Structures and Introduction to Database Systems Lecture 9: Relational Model & Relational Algebra Doug McGeehan 1

CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

  • Upload
    others

  • View
    6

  • Download
    1

Embed Size (px)

Citation preview

Page 1: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

CS2300: File Structures and

Introduction to Database Systems

Lecture 9: Relational Model &

Relational Algebra

Doug McGeehan

1

Page 2: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

222

Brief Review

• Relational model conceptsInformal Terms Formal Terms

Table Relation

Column Attribute

Row Tuple

All possible values in a column Domain

Table Definition Schema of Relation

Populated Table Extension/State

Page 3: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

333

Brief Review

• Relational model concepts

• Keys and Superkeys

– e.g. Vehicle(VIN, Reg#, State, MPG, Odometer)

– Keys: {VIN} and {Reg#, State}

– Superkeys: {VIN, Odometer}

or {Reg#, State, MPG}

• They both include a key; one attribute can be removed

– Candidate keys: {VIN}, {Reg#, State}

– Primary key: {VIN}

Page 4: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

444

Brief Review

• Relational model concepts

• Keys and Superkeys

• Relational model constraints

– Domain / NOT NULL constraints (on attributes)

– Key constraints (on a single relation)

– Entity integrity constraint (on a single relation)

– Referential integrity constraint (on two relations)

Page 5: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Referential Integrity

A referential integrity constraint can be

displayed in a relational database schema as

a directed arc from R1.FK to R2.

SID Name Address SID Course Grade

R1 R2

5

Page 6: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Valid and Invalid State

• Valid state: A database state satisfies all

integrity constraints.

• Invalid state: A database state that does not

obey some integrity constraint(s).

SID Name Address

101 Alice Rolla

SID Course Grade

111 CS238 A

101 CS304 A

R1 R2Invalid state

6

Page 7: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Valid and Invalid State

• Valid state: A database state satisfies all

integrity constraints.

• Invalid state: A database state that does not

obey some integrity constraint(s).

SID Name Address

101 Alice Rolla

SID Course Grade

101 CS238 A

101 CS304 A

R1 R2Valid state

7

Page 8: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

8

Displaying a relational database

schema and its constraints

• Relation schema: displayed as a row of attribute names

• Name of the relation: above the attribute names

• Primary key attribute(s): underlined

• Foreign key (referential integrity) constraint:– A directed arc (arrow) from the foreign key attributes to the

referenced relation

– For clarity, can also point to the primary key of the referenced relation.

Page 9: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

How to identify a foreign key?

• Start from a Candidate Key

• Check if it is referenced by other

relations or even in the same relation

9

SID Name Address SID Course Grade

R1 R2

Page 10: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

10

Referencing and

referenced relations

can be the same

Page 11: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

11

ExerciseConsider the following relations for a database that keeps track of student

enrollment in courses and the books adopted for each course:

STUDENT(SSN, Name, Major, Bdate)

COURSE(Course#, Cname, Dept)

ENROLL(SSN, Course#, Quarter, Grade)

BOOK_ADOPTION(Course#, Quarter, Book_ISBN)

TEXT(Book_ISBN, Book_Title, Publisher, Author)

Draw a relational schema diagram specifying the foreign keys for

this schema.

11

Page 12: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Question

• Car(State, Reg#, SerialNo, Make, Model, Year)

– Car has two candidate keys: {state, reg#}, {SerialNo}

• Accident(SerialNo, date)

Is SerialNo a foreign key in Car or Accident relation?

Answer: SerialNo is a foreign key in Accident relation

because it is a candidate key in Car relation.

12

Page 13: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Operations of the Relational Model

• Operations can be categorized into

– Retrieval

• Query a database

– Updates

• Change the database

• Basic update operations for changing the database:

– INSERT a new tuple in a relation

– DELETE an existing tuple from a relation

– MODIFY an attribute of an existing tuple

13

Page 14: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Update Operations on Relations

• Integrity constraints should not be

violated by the update operations.

• Updates may propagate to cause

other updates automatically.

– This may be necessary to maintain

integrity constraints.

14

Page 15: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example Relations

coursesidtid

os20220

calculus45010

db 20220

db24020 depttnametid

mathcohen10

cslevy20

yearsnamesid

3white240

3jones202

1adams450

S

T

R

15

Page 16: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example Relations

coursesidtid

os20220

calculus45010

db 20220

db24020 depttnametid

mathcohen10

cslevy20

yearsnamesid

3white240

3jones200

1adams450

S

T

R

16

Page 17: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example Relations

coursesidtid

os20020

calculus45010

db 20020

db24020 depttnametid

mathcohen10

cslevy20

yearsnamesid

3white240

3jones200

1adams450

S

T

R

17

Page 18: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Possible Violations for Insert Operation

• INSERT may violate any of the constraints:

– Domain constraint / NOT NULL constraint:

• A new attribute in new tuple is not in attribute’s domain

– Key constraint:

• Key attribute in new tuple already exists in its relation

– Entity integrity:

• Primary key value is NULL in the new tuple

– Referential integrity:

• Foreign key in new tuple references non-existent primary key

18

Page 19: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Possible Violations for Delete Operation

• DELETE may violate only referential integrity:

– If the primary key value of the tuple being deleted is

referenced from other tuples in the database

– Can be remedied by several actions

• Reject the deletion

• Propagate the change to the referencing tuples

• Set the foreign keys of the referencing tuples to NULL

when the referencing attributes are not part of the

primary key

– One of the above options must be specified during

database design for each foreign key constraint

19

Page 20: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Possible Violations for Modify Operation

• MODIFY may violate any of the constraints when

– Updating the primary key (PK)

• Similar to a DELETE followed by an INSERT

• Need to specify similar remedies to DELETE

– Updating a foreign key (FK)

• May violate referential integrity

– Updating an ordinary attribute (neither PK nor FK):

• Can only violate domain constraints

or NOT NULL constraint

20

Page 21: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

How to Deal With Violations?

• In case of integrity violation caused by any operation,

several actions can be taken:

– Reject the operation that causes the violation

– Correct the violation by triggering additional

updates

– Perform the operation but inform the user of the

violation

– Execute a user-specified error-correction routine

21

Page 22: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Relational Algebra

Chapter 8

22

Page 23: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

What is an “Algebra”?

• Mathematical system consisting of:

– Operands : variables or values from which

new values can be constructed.

– Operators : symbols denoting procedures

that construct new values from given

values.

23

Page 24: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

What is Relational Algebra?

• The relational model is an abstract

(mathematical) modeling of a table

• Relational algebra (RA): an algebra whose

operands are relations or variables that

represent relations.

– A (mathematical) query language for relations

– Query language = languages for writing questions

about the data

24

Page 25: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Why do we need to understand RA?

• Real queries are written in SQL, but are

translated by the query processor into

relational algebra

• Why?

– SQL is declarative, RA provides operations for

execution

– Optimization is easier in RA, since we can take

advantage of (provable) expression equivalences

25

Page 26: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

What you should know from this course

1. How to write queries in relational algebra

2. How to calculate the result of a relational

algebra expression over a set of relations

3. How to determine whether two relational

algebra expressions are equivalent

26

Page 27: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Relational Algebra

• Relational algebra has a collection of operators

on relations

• Operators may be unary or binary

• The output of an operator is a relation

– Another way to say this is that the algebra is “closed”

– Therefore, operators can be composed one on

another

27

Page 28: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Basic Operators

• Relational Algebra has 5 basic operators:

– Project (π)

– Select ()

– Union (U)

– Set difference (-)

– Cartesian product (X)

• Other operators can be defined using the basic ones:

Intersection, Join, Division …

• A useful syntactic operator: Rename

28

Page 29: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example Relations

coursesidtid

os20220

calculus45010

db 20220

db24020 depttnametid

mathcohen10

cslevy20

yearsnamesid

3white240

3jones202

1adams450

S = Students

T = Teachers

R = Studies

S

T

R

29

Page 30: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Project Operation

• The Project operation is unary (i.e., it is

applied to a single relation)

• Denoted as: A1,…,An (R)

– A1,…,An are attributes

• Returns a new relation of only A1,…,An

from the original relation

• Output relation does not have a name

30

Page 31: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example: Find the teacher’s ID (tid) for each course

coursetid

os20

calculus10

db20

coursesidtid

os20220

calculus45010

db20220

db24020

R

Less tuples in result. Why?

tid,course R

31

Page 32: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Project Operation

• Duplicate elimination

– The Project operation removes any duplicate tuples,

so the output is a valid relation

• When computing a projection on a relation with

n tuples:

– What is the minimum cardinality of the result?

– What is the maximum cardinality?

32

Page 33: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Project Operation

• list1 (list2 R) = list1 R

• Note:

– list2 must contain attributes in list1

– i.e. Intersection must not be empty

– Otherwise, the left-hand side is incorrect.

33

Page 34: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Select Operation

• Unary operator, written as C(R)

– C is a Boolean condition over each single

tuple in R

– e.g. name = ‘Doug’(Instructors)

• Returns the tuples that satisfy C

34

Page 35: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example

• Return the courses taught by teacher number 20

35

coursesidtid

os20220

db20220

db24020

coursesidtid

os20220

calculus45010

db20220

db24020

Rtid = 20 R

Page 36: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

What types of Conditions can be used?

• The condition is made up of comparisons that

are connected using logical operators (and, or)

• Comparisons are between 2 attributes or

between an attribute and a constant

– attribute1 op attribute2 (e.g. balance<credit_limit)

– attribute1 op constant (e.g. tid=20)

Important! Conditions are evaluated a

single tuple at a time. 36

Page 37: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Types of Comparisons

• We can use any of the operators:

• When comparing an attribute with a

string, the string is written in single

quotes

= ,,,,,

)('' Tcohentname=37

Page 38: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Select Operation

• Examples

– tid = 20 and course = `os’ R

38

coursesidtid

os20220

calculus45010

db20220

db24020

R

What are the results

of the above queries?

Page 39: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Select Operation

• Examples

– tid = 20 and course = `os’ R

– tid = 10 and course = `db’ R

39

coursesidtid

os20220

calculus45010

db20220

db24020

R

What are the results

of the above queries?

Page 40: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Select Operation

• Examples

– tid = 20 and course = `os’ R

– tid = 10 and course = `db’ R

– tid = 20 or sid=202 or sid=450 R

40

coursesidtid

os20220

calculus45010

db20220

db24020

R

What are the results

of the above queries?

Page 41: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

The Select Operation

• c1 (c2 (R)) = c2 (c1 (R))

= c1 and c2 (R)

• When computing a selection on a relation

with n tuples,

– Minimum cardinality of result?

– Maximum cardinality of result?

41

Page 42: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Combining Selection and Projection

• What does this compute?

42

coursesidtid

os20220

calculus45010

db20220

db24020

Can we change

the order of

these two

operations?

course(tid = 20 R)

42

Page 43: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example

• How would you find the names of the

third year students?

43

S yearsnamesid

3white240

3jones202

1adams450

sname

white

Jones

?

43

Page 44: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Example

• Find id of students named jones who are in

their first year or third year of studies

44

))(( )31('' Syearyearjonessnamesid ===

S yearsnamesid

3white240

3jones202

1adams450

sid

202

44

Page 45: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

Once Again

45

• What is in the result of the query when

we have this instance of S?

S yearsnamesid

3white240

3jones202

1jones450

sid

202

45045

))(( )31('' Syearyearjonessnamesid ===

Page 46: CS304: Database Systemsdjmvfb/courses/cs2300/static...Displaying a relational database schema and its constraints • Relation schema: displayed as a row of attribute names • Name

What’s Next

• More relational algebra operators …

• Project Phase 1 report due Monday

11:59pm

46