67
Prof. Kedem’s changes, if any, Prof. Kedem’s changes, if any, are marked in green, they are are marked in green, they are not copyrighted by the authors, not copyrighted by the authors, and the authors are not and the authors are not responsible for them. responsible for them. Prof. Shasha’s are marked in Prof. Shasha’s are marked in blue. blue.

Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

  • View
    217

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

Prof. Kedem’s changes, if any, are Prof. Kedem’s changes, if any, are marked in green, they are not marked in green, they are not

copyrighted by the authors, and the copyrighted by the authors, and the authors are not responsible for them.authors are not responsible for them.

Prof. Shasha’s are marked in blue.Prof. Shasha’s are marked in blue.

Page 2: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.2Database System Concepts

!!Chapter 3: Relational ModelChapter 3: Relational Model Structure of Relational Databases

Relational Algebra

Brief Introduction toTuple Relational Calculus and Domain Calculus

Page 3: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.3Database System Concepts

Example of a RelationExample of a Relation

Page 4: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.4Database System Concepts

Basic StructureBasic Structure Formally, given sets D1, D2, …. Dn a relation r is a subset of

D1 x D2 x … x Dn The x represents cross-product (all possible n-tuples). Thus a relation is a set of n-tuples (a1, a2, …, an) where ai Di

Example: if

customer-name = {Jones, Smith, Curry, Lindsay}customer-street = {Main, North, Park}customer-city = {Harrison, Rye, Pittsfield}

Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over customer-name x customer-street x customer-city

Page 5: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.5Database System Concepts

Attribute TypesAttribute Types Each attribute of a relation has a name

The set of allowed values for each attribute is called the domain of the attribute

Attribute values are (normally) required to be atomic, that is, indivisible

• E.g. multivalued attribute values are not atomic

• E.g. composite attribute values are not atomic

The special value null is a member of every domain (not in pure relational algebra, and therefore we will not talk about it in this chapter)

Page 6: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.6Database System Concepts

Relation SchemaRelation Schema A1, A2, …, An are attributes

R = (A1, A2, …, An ) is a relation schema

E.g. Customer-schema = (customer-name, customer-street, customer-city)

r(R) is a relation on the relation schema R

E.g. customer (Customer-schema)

Page 7: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.7Database System Concepts

Relation InstanceRelation Instance The current values (relation instance) of a relation are

specified by a table

An element t of r is a tuple, represented by a row in a table

JonesSmithCurry

Lindsay

customer-name

MainNorthNorthPark

customer-street

HarrisonRyeRye

Pittsfield

customer-city

customer

attributes

tuples

Page 8: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.8Database System Concepts

RelationsRelations Order of tuples is irrelevant (tuples may be stored in an arbitrary order)

Tuple may appear several times (textbook is wrong)

E.g. account relation with unordered tuples

Page 9: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.9Database System Concepts

Relations in Relational AlgebraRelations in Relational Algebra Relational algebra deals with relations. Relations are sets of tuples

(rows) drawn from suitable domains. The order of rows and whether a row appears once or many times

does not matter The order of columns matters, but as our columns will generally be

labeled, we will be able to reconstruct the order even if the columns are permuted.

The following two relations are equal (in pure relational algebra though not in SQL databases)• A B

1 102 20

• B A20 210 120 220 2

©Zvi M. Kedem

Page 10: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.10Database System Concepts

DatabaseDatabase A database consists of multiple relations Information about an enterprise is broken up into parts, with

each relation storing one part of the information

E.g.: account : stores information about accounts depositor : stores information about which customer owns which account customer : stores information about customers

Storing all information as a single relation such as bank(account-number, balance, customer-name, ..)results in• repetition of information (e.g. two customers own an account)

• the need for null values (e.g. represent a customer without an account)

Normalization theory (Chapter 7) deals with how to design relational schemas. Or use entity-relationship diagrams.

Page 11: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.11Database System Concepts

The The customer customer RelationRelation

Page 12: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.12Database System Concepts

The The depositor depositor RelationRelation

Page 13: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.13Database System Concepts

Query LanguagesQuery Languages Language in which user requests information from the database.

Categories of languages

• procedural

• non-procedural

“Pure” languages:

• Relational Algebra

• Tuple Relational Calculus

• Domain Relational Calculus

Pure languages form underlying basis of query languages that people use.

Page 14: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.14Database System Concepts

Relational AlgebraRelational Algebra Procedural language

Six basic operators (not all independent)

• select

• project

• union

• set difference

• Cartesian product

• rename

The operators take two or more relations as inputs and give a new relation as a result.

Page 15: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.15Database System Concepts

Select Operation – ExampleSelect Operation – Example(choice of tuples based on conditions)(choice of tuples based on conditions)

• Relation r A B C D

1

5

12

23

7

7

3

10

A=B ^ D > 5 (r)A B C D

1

23

7

10

Page 16: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.16Database System Concepts

Select OperationSelect Operation

Notation: p(r) p is called the selection predicate Defined as:

p(r) = {t | t r and p(t)}

Where p is a formula in propositional calculus consisting of terms connected by : (and), (or), (not)Each term is one of:

<attribute> op <attribute> or <constant>

where op is one of: =, , >, . <. Example of selection:

branch-name=“Perryridge”(account)

Page 17: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.17Database System Concepts

Project Operation – ExampleProject Operation – Example(choice of attributes by listing)(choice of attributes by listing)

Relation r: A B C

10

20

30

40

1

1

1

2

A C

1

1

1

2

=

A C

1

1

2

A,C (r)

Page 18: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.18Database System Concepts

!!Project OperationProject Operation Notation:

A1, A2, …, Ak (r)

where A1, A2 are attribute names and r is a relation name. The result is defined as the relation of k columns, not necessarily

distinct Whether you remove duplicate rows from result or not, it is

up to you since relations are sets, and could have contained duplicate rows themselves

E.g. To eliminate the branch-name attribute of account account-number, balance (account)

Comment: better to use lower case instead of upper case , just as we use and not

Page 19: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.19Database System Concepts

Union Operation – ExampleUnion Operation – Example

Relations r, s:

r s:

A B

1

2

1

A B

2

3

rs

A B

1

2

1

3

Page 20: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.20Database System Concepts

Union OperationUnion Operation Notation: r s

Defined as:

r s = {t | t r or t s}

For r s to be valid.

1. r, s must have the same arity (same number of attributes)

2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s)

E.g. to find all customers with either an account or a loan customer-name (depositor) customer-name (borrower)

Page 21: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.21Database System Concepts

Set Difference Operation – ExampleSet Difference Operation – Example

Relations r, s:

r – s:

A B

1

2

1

A B

2

3

rs

A B

1

1

Page 22: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.22Database System Concepts

Set Difference OperationSet Difference Operation Notation r – s

Defined as:

r – s = {t | t r and t s}

Set differences must be taken between compatible relations.

• r and s must have the same arity

• attribute domains of r and s must be compatible

Page 23: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.23Database System Concepts

Cartesian-Product Operation-ExampleCartesian-Product Operation-Example

Relations r, s:

r x s:

A B

1

2

A B

11112222

C D

1010201010102010

E

aabbaabb

C D

10102010

E

aabbr

s

Page 24: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.24Database System Concepts

Cartesian-Product OperationCartesian-Product Operation Notation r x s

Defined as:

r x s = {t q | t r and q s}

Assume that attributes of r(R) and s(S) are disjoint. (That is, R S = ).

If attributes of r(R) and s(S) are not disjoint, then renaming must be used, in an obvious way.

• If c is a column in both R and S then it would be referred to as R.c in R and perhaps left as c in S.

Page 25: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.25Database System Concepts

Composition of OperationsComposition of Operations Can build expressions using multiple operations

Example: A=C(r x s)

r x s

A=C(r x s)

A B

11112222

C D

1019201010102010

E

aabbaabb

A B C D E

122

102020

aab

Page 26: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.26Database System Concepts

!!Rename OperationRename Operation Allows us to name, and therefore to refer to, the results of

relational-algebra expressions. Allows us to refer to a relation by more than one name.

Example:

x (E)

returns the expression E under the name X

If a relational-algebra expression E has arity n, then

x (A1, A2, …, An) (E)

returns the result of expression E under the name X, and with the

attributes renamed to A1, A2, …., An. It is better to think of it as another copy of the relation and not as an alternative new name for the original relation

Page 27: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.27Database System Concepts

Alternative notation for renamingAlternative notation for renaming We introduce it by means of an example:

Let E have two columns, A, B. We want to rename them to C and D

AC, B D(E)

So this projecting all the columns of E but using the arrow symbol to rename them

©Zvi M. Kedem

Page 28: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.28Database System Concepts

A Small ExampleA Small Example You may find this example useful to look at some queries

The example consists of 3 relations:

• PERSONAL(PERSON,SEX,AGE).

• This relation, whose key is PERSON, gives SEX and AGE for each PERSON.

• BIRTH(PARENT,CHILD).

• This relation, whose key is the pair PARENT,CHILD, gives information about who is a parent of whom. (Both mother and father would be generally listed).

• MARRIAGE(HUSBAND,WIFE,DATE).

• This relation whose key is the pair HUSBAND,WIFE, gives information about the DATE that the marriage took place.

©Zvi M. Kedem

Page 29: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.29Database System Concepts

A Sample InstanceA Sample Instance PERSONAL:

PERSN SEX AGEAlbert M 20Dennis M 40Evelyn F 20John M 60Mary F 40Robert M 60Susan F 40

BIRTH:PARNT CHILDDennis AlbertJohn MaryMary AlbertRobert EvelynSusan EvelynSusan Richard

MARRIAGE:HSBND WIFE DATEDennis Mary 4/3/68Robert Susan 1/1/70

©Zvi M. Kedem

Page 30: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.30Database System Concepts

A QueryA Query Produce the relation ANSWER(PERSON) consisting of all

women who are less or equal than 32 years old.

Note that all the information required can be obtained from looking at a single relation, PERSONAL.

ANSWER:= PERSON AGE 32 SEX='F' ( PERSONAL) .

PERSONEvelyn

©Zvi M. Kedem

Page 31: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.31Database System Concepts

A QueryA Query Produce a relation ANSWER(PARENT, DAUGHTER ) with the obvious

meaning.

Here, even though the answer comes only from the single relation BIRTH, we still have to check in the relation PERSONAL what the SEX of the CHILD is. To do that, we create the Cartesian product of the two relations: PERSONAL and BIRTH. This gives us “long tuples,'' consisting of a tuple in PERSONAL and a tuple in BIRTH.

For our purpose, the two tuples matched if PERSON in PERSONAL is CHILD in BIRTH and the SEX of the PERSON is F.

ANSWER:= PARENT, CHILD DAUGHTER CHILD = PERSON SEX = 'F’ (PERSONAL BIRTH ) .

PARNT DAUGHTERJohn MarySusan EvelynRobert Evelyn

©Zvi M. Kedem

Page 32: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.32Database System Concepts

A QueryA Query Produce a relation ANSWER(FATHER, DAUGHTER) with the obvious

meaning. Here we have to simultaneously look at two copies of the relation

PERSONAL, as we have to determine both the SEX of the PARENT and the SEX of the CHILD.

We need to have two distinct copies of PERSONAL. So we first do:

• PERSONAL1:=PERSONAL Then, the answer is:

ANSWER:= PARENT FATHER, CHILD DAUGHTER (PERSONAL BIRTH PERSONAL1)

where is the formula:PARENT = PERSONAL.PERSON CHILD = PERSONAL1.PERSON PERSONAL.SEX = 'M' PERSONAL1.SEX = 'F'

FATHR DAUGHTERJohn MaryRobert Evelyn

©Zvi M. Kedem

Page 33: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.33Database System Concepts

A QueryA Query Produce a relation:

ANSWER(FATHER_IN_LAW,SON_IN_LAW).

This is an exercise for you. Hint: you need to compute the Cartesian product of “many” relations, if you start with the ones originally given. Or you can use one or more of the relations you have computed already

FIL SILJohn Dennis

©Zvi M. Kedem

Page 34: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.34Database System Concepts

A QueryA Query Produce a relation:

ANSWER(FATHER_IN_LAW,CHILD_IN_LAW).

Solution: this is the union of

(FATHER_IN_LAW,SON_IN_LAW) and (FATHER_IN_LAW,DAUGHTER_IN_LA

©Zvi M. Kedem

Page 35: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.35Database System Concepts

A QueryA Query We use the symbol “” to denote renaming of attributes We use the symbol “” to denote assignment of relational

expression to a relationYou may find the above notation more convenient, I do

Produce a relation: ANSWER(GRANDPARENT,GRANDCHILD). We need to have two distinct copies of BIRTH because we need

to correlated BIRTH with itself. So we first do:• BIRTH1 BIRTH

ANSWER:= BIRTH.PARENT GRANDPARENT, BIRTH1.CHILD GRANDCHILD

BIRTHCHILD = BIRTH1.PARENT (BIRTH BIRTH1 )

GP GCJohn Albert

©Zvi M. Kedem

Page 36: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.36Database System Concepts

Relational Algebra Is Not UniversalRelational Algebra Is Not Universal

Can compute: ANSWER(GREATGRANDPARENT, GREATGRANDCHILD)

But, it is impossible in relational algebra to compute the relation ANSWER(ANCESTOR, DESCENDANT).

Why? The proof is a reasonably simple, but uses cumbersome

induction. The general idea is:

• Any relational algebra query is limited in how many relations or copies of relations it can refer to

• Computing arbitrary (ancestor,descendant) pairs cannot be done, if the query is limited in advance in such way—limited number of relations and copies of relations

©Zvi M. Kedem

Page 37: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.37Database System Concepts

!Transitive closure!Transitive closure Looking at this another way, we cannot do transitive closure of a

graph

A directed graph, is just a relation consisting of pairs of nodes with a schema, say, Arc(From,To): set of all arcs.

Cannot create in relational algebra a query returning Reachable(From,To): set of all paths.

Not a big deal. Either using a stored procedure language or an external programming language can form loops.To compute ancestor of some set X given a parent relation, simply do breadth first search.

©Zvi M. Kedem

Page 38: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.38Database System Concepts

Transitive Closure Through LoopsTransitive Closure Through Loops insert temp1”prince william of wales") Delete from ancestor

WHILE EXISTS(SELECT * FROM Temp1) BEGIN INSERT Ancestor SELECT * FROM Temp1;

INSERT Temp2 SELECT * FROM Temp1;

DELETE Temp1 FROM Temp1;

INSERT Temp1 SELECT Parental.parent FROM Parental, Temp2 WHERE Parental.child = Temp2.parent;

DELETE Temp2 FROM Temp2;

END

©Zvi M. Kedem

Page 39: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.39Database System Concepts

Transitive Closure Through LoopsTransitive Closure Through Loops Result in ancestor:

prince william of walescharlesdianaearl of spencerelizabeth IIphilip of edinburghprince andrew of greece

Reference: http://www.infoplease.com/spot/royal3.html

©Zvi M. Kedem

Page 40: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.40Database System Concepts

KeysKeys Let K R K is a superkey of R if values for K are sufficient to identify a

unique tuple of each possible relation r(R) by “possible r” we mean a relation r that could exist in the enterprise we are modeling.Example: {customer-name, customer-street} and {customer-name} are both superkeys of Customer, if no two customers can possibly have the same name.

In other words: any two tuples that are equal on K are equal (identical on all fields)

K is a candidate key if K is minimalExample: {customer-name} is a candidate key for Customer, since it is a superkey {assuming no two customers can possibly have the same name), and no subset of it is a superkey.

Page 41: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.41Database System Concepts

!!Banking Example Banking Example (marked keys)(marked keys)

branch (branch-name, branch-city, assets)

customer (customer-name, customer-street, customer-only)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)

Page 42: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.42Database System Concepts

Example QueriesExample Queries

Find all loans of over $1200

amount > 1200 (loan)

Find the loan number for each loan of an amount greater than $1200

loan-number (amount > 1200 (loan))

Page 43: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.43Database System Concepts

Example QueriesExample Queries

Find the names of all customers who have a loan, an account, or both, from the bank

customer-name (borrower) customer-name (depositor)

Find the names of all customers who have a loan and an account at bank.

customer-name (borrower) customer-name (depositor)

Page 44: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.44Database System Concepts

Example QueriesExample Queries Find the names of all customers who have a loan at the Perryridge branch.

customer-name (branch-name=“Perryridge”

(borrower.loan-number = loan.loan-number(borrower x loan)))

Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank. customer-name

(branch-name = “Perryridge”

(borrower.loan-number = loan.loan-number(borrower x loan)))

– customer-name(depositor)

Can combine “cascading” selection conditions, as shown above:

A(B) = A B

Page 45: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.45Database System Concepts

!!Example QueriesExample Queries Find the names of all customers who have a loan at the Perryridge branch.

Query 1

customer-name(branch-name = “Perryridge”

(borrower.loan-number = loan.loan-number(borrower x loan)))

Query 2

customer-name(loan.loan-number = borrower.loan-number(

(branch-name = “Perryridge”(loan)) x

borrower) )

If you hand-optimize queries, you may improve the performance substantially.

Page 46: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.46Database System Concepts

Formal DefinitionFormal Definition A basic expression in the relational algebra consists of either

one of the following:• A relation in the database

• A constant relation

Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions:

• E1 E2

• E1 - E2

• E1 x E2

p (E1), P is a predicate on attributes in E1

s(E1), S is a list consisting of some of the attributes in E1

x (E1), x is the new name for the result of E1

Page 47: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.47Database System Concepts

Additional OperationsAdditional Operations

We define additional operations that do not add any power to the

relational algebra, but that simplify common queries.

Set intersection

Natural join

Division (We will ignore because Dennis has never seen this used.)

Assignment

Page 48: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.48Database System Concepts

Set-Intersection OperationSet-Intersection Operation Notation: r s

Defined as:

r s ={ t | t r and t s }

Assume:

• r, s have the same arity

• attributes of r and s are union compatible

Note: r s = r - (r - s)

Page 49: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.49Database System Concepts

Set-Intersection Operation - ExampleSet-Intersection Operation - Example Relation r, s:

r s

A B

121

A B

23

r s

A B

2

Page 50: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.50Database System Concepts

Natural-Join OperationNatural-Join Operation Notation: r s Let r and s be relations on schemas R and S respectively.The result is a

relation on schema R S which is obtained by considering each pair of tuples tr from r and ts from s.

If tr and ts have the same value on each of the attributes in R S, a tuple t is added to the result, where

• t has the same value as tr on r

• t has the same value as ts on s

Example:

R = (A, B, C, D)

S = (E, B, D) Result schema = (A, B, C, D, E) r s is defined as:

r.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r x s))

Page 51: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.51Database System Concepts

Natural Join Operation – ExampleNatural Join Operation – Example

Relations r, s:

A B

12412

C D

aabab

B

13123

D

aaabb

E

r

A B

11112

C D

aaaab

E

s

r s

Page 52: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.52Database System Concepts

Division OperationDivision Operation(Prof. Shasha has never seen this used in practice so we will skip this in 2006.)(Prof. Shasha has never seen this used in practice so we will skip this in 2006.)

Suited to queries that include the phrase “for all”. (not quite right, its translation to standard operations teaches you how to handle “for all” in more general way)

Let r and s be relations on schemas R and S respectively where

• R = (A1, …, Am, B1, …, Bn)

• S = (B1, …, Bn)

The result of r s is a relation on schema

R – S = (A1, …, Am)

r s = { t | t R-S(r) u s ( tu r ) }

r s

Page 53: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.53Database System Concepts

Division Operation – ExampleDivision Operation – Example

Relations r, s:

r s: A

B

1

2

A B

12311134612

r

s

Page 54: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.54Database System Concepts

Another Division ExampleAnother Division Example

A B

aaaaaaaa

C D

aabababb

E

11113111

Relations r, s:

r s:

D

ab

E

11

A B

aa

C

r

s

Page 55: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.55Database System Concepts

Division Operation (Cont.)Division Operation (Cont.)

Property • Let q – r s

• Then q is the largest relation satisfying q x s r

Definition in terms of the basic algebra operationLet r(R) and s(S) be relations, and let S R

r s = R-S (r) –R-S ( (R-S (r) x s) – R-S,S(r))

IGNORE THIS, see my explanation following

To see why R-S,S(r) simply reorders attributes of r

R-S(R-S (r) x s) – R-S,S(r)) gives those tuples t in

R-S (r) such that for some tuple u s, tu r.

Page 56: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.56Database System Concepts

What does division meanWhat does division mean Assume for simplicity that attributes of S are the “last” attributes

of R, as indeed done by the authors see the slide with the title “Division Operation”, therefore, R = R-S,S(r)

Let R be (customer-name,branch-name), listing for each customer all the branches in which that customer has a deposit

Let S be (branch-name), listing all branches of interest.

To shorten the notation, we will write: R(A,B) and S(B).

We want to answer the following query:Find all customers listed in R who have deposits in all the branches in S (call them “good,” other in R will be “bad.”)

In other words, find all A in R, s.t. for every B in S, the tuple(A,B) is in R

©Zvi M. Kedem

Page 57: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.57Database System Concepts

What does division meanWhat does division mean We want to understand what this is

r s = R-S (r) –R-S ( (R-S (r) x s) – r)

And using our specific (but essentially very general) example:r s = A (r) –A ( (A (r) x s) – r)

t = A (r) This is the set of all customers who have a deposit

u = t x s This simply lists all possible pairs of customer of interest with branch of interest, it does not provide any “real information.”

v = u – r This lists for each customer (who has a deposit) the branches of interest in which s/he does not have a deposit

w = A (v) This lists all the bad customers

x = t – w This lists all the good customers; and answers the query

©Zvi M. Kedem

Page 58: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.58Database System Concepts

Assignment OperationAssignment Operation The assignment operation () provides a convenient way to

express complex queries, write query as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as a result of the query.

Assignment must always be made to a temporary relation variable.

Example: Write r s as

temp1 R-S (r)

temp2 R-S ((temp1 x s) – R-S,S (r))

result = temp1 – temp2

• The result to the right of the is assigned to the relation variable on

the left of the .

• May use variable in subsequent expressions.

Page 59: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.59Database System Concepts

Example QueriesExample Queries Find all customers who have an account from at least the

“Downtown” and the Uptown” branches.

• Query 1

CN(BN=“Downtown”(depositor account))

CN(BN=“Uptown”(depositor account))

where CN denotes customer-name and BN denotes

branch-name.

• Query 2

customer-name, branch-name (depositor account)

temp(branch-name) ({(“Downtown”), (“Uptown”)})

Page 60: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.60Database System Concepts

Find all customers who have an account at all branches located in Austin.

customer-name, branch-name (depositor account)

branch-name (branch-city = “Austin” (branch))

Example QueriesExample Queries

Page 61: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.61Database System Concepts

Tuple Relational CalculusTuple Relational Calculus A nonprocedural query language, where each query is of the form

{t | P (t) }

It is the set of all tuples t such that predicate P is true for t

t is a tuple variable, t[A] denotes the value of tuple t on attribute A

t r denotes that tuple t is in relation r

P is a formula similar to that of the predicate calculus

Page 62: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.62Database System Concepts

Predicate Calculus FormulaPredicate Calculus Formula

1. Set of attributes and constants

2. Set of comparison operators: (e.g., , , , , , )

3. Set of connectives: and (), or (v)‚ not ()

4. Implication (): x y, if x if true, then y is true

x y x v y

5. Set of quantifiers: t r (Q(t)) ”there exists” a tuple in t in relation r

such that predicate Q(t) is true

t r (Q(t)) Q is true “for all” tuples t in relation r

Page 63: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.63Database System Concepts

Banking ExampleBanking Example branch (branch-name, branch-city, assets)

customer (customer-name, customer-street, customer-city)

account (account-number, branch-name, balance)

loan (loan-number, branch-name, amount)

depositor (customer-name, account-number)

borrower (customer-name, loan-number)

Page 64: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.64Database System Concepts

Example QueriesExample Queries Find the loan-number, branch-name, and amount for loans of

over $1200

{t | t loan t [amount] 1200}

Find the loan number for each loan of an amount greater than $1200

{t | s loan (t[loan-number] = s[loan-number] s [amount] 1200}

Notice that a relation on schema [customer-name] is implicitly defined by the query

Page 65: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.65Database System Concepts

Safety of ExpressionsSafety of Expressions It is possible to write tuple calculus expressions that generate

infinite relations.

For example, {t | t r} results in an infinite relation if the domain of any attribute of relation r is infinite

To guard against the problem, we restrict the set of allowable expressions to safe expressions.

For completeness, (we will not dwell on this): An expression {t | P(t)} in the tuple relational calculus is safe if every component of t appears in one of the relations, tuples, or constants that appear in P

Page 66: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.66Database System Concepts

Domain Relational CalculusDomain Relational Calculus A nonprocedural query language equivalent in power to the tuple

relational calculus (remember that tuples are made up of elements from domains)

Each query is an expression of the form:

{ x1, x2, …, xn | P(x1, x2, …, xn)}

• x1, x2, …, xn represent domain variables

• P represents a formula similar to that of the predicate calculus

Page 67: Prof. Kedem’s changes, if any, are marked in green, they are not copyrighted by the authors, and the authors are not responsible for them. Prof. Shasha’s

©Silberschatz, Korth and Sudarshan3.67Database System Concepts

Example QueriesExample Queries Find the branch-name, loan-number, and amount for loans of over

$1200

{ l, b, a | l, b, a loan a > 1200}

Find the names of all customers who have a loan of over $1200

{ c | l, b, a ( c, l borrower l, b, a loan a > 1200)}

Find the names of all customers who have a loan from the Perryridge branch and the loan amount:

{ c, a | l ( c, l borrower b( l, b, a loan

b = “Perryridge”))}

or { c, a | l ( c, l borrower l, “Perryridge”, a loan)}