Database System Concepts

Relational Data Model

Relational Databases Model Concepts
Relational Constraints and Schemas
Update Operations and Constraint Violations
Basic Relational Algebra Concepts
Additional Relational Operations
Examples of Queries
Views

Source: George Mason …aobaidi/spring-02/relationalalgebra.pdf


Relational Model Concepts
• Represents the data model as a collection of relations, each of which is a table
• Each row represents a collection of related values
• A row is called a tuple; a column header is called an attribute

Basic Structure
Formally, given sets D1, D2, …, Dn, a relation r is a subset of D1 x D2 x … x Dn.
Thus a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di.

Example: if
customer-name = {Jones, Smith, Curry, Lindsay}
customer-street = {Main, North, Park}
customer-city = {Harrison, Rye, Pittsfield}

then r = {(Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)}

is a relation over customer-name x customer-street x customer-city
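
The definition can be checked directly on the example. The following is a minimal Python sketch, assuming each domain is modeled as a set of strings and each tuple as a Python tuple. The variable names are illustrative only.

```python
from itertools import product

# The three domains from the example.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The Cartesian product holds every possible 3-tuple: 4 * 3 * 3 = 36 of them.
all_tuples = set(product(customer_name, customer_street, customer_city))

# A relation is any subset of that product.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

# Subset test: every tuple of r must come from the product of the domains.
is_relation = r <= all_tuples
```

Here `is_relation` is true because every component of every tuple in r is drawn from the matching domain.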

Attribute Types
Each attribute of a relation has a name.
The set of allowed values for each attribute is called the domain of the attribute.
Attribute values are (normally) required to be atomic, that is, indivisible.
• E.g. multivalued attribute values are not atomic
• E.g. composite attribute values are not atomic
The special value null is a member of every domain.
The null value causes complications in the definition of many operations.
• We shall ignore the effect of null values in our main presentation and consider their effect later

Relation Schema
A1, A2, …, An are attributes.
R = (A1, A2, …, An) is a relation schema.
E.g. Customer-schema = (customer-name, customer-street, customer-city)
r(R) is a relation on the relation schema R, defined by a set of n-tuples r = {t1, t2, …, tm}.
Each n-tuple t is an ordered list of values <v1, v2, …, vn>, where each value vi, 1 ≤ i ≤ n, is an element of dom(Ai).

Relation Instance
The current values (current relation state) of a relation are specified by a table.
An element t of r is a tuple, represented by a row in a table.

The customer relation (columns are attributes, rows are tuples):

customer-name   customer-street   customer-city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield

Relations are Unordered
The order of tuples is irrelevant (tuples may be stored in an arbitrary order), since a relation represents facts at the logical or abstract level.
E.g. account relation with unordered tuples

Database
A database consists of multiple relations.
Information about an enterprise is broken up into parts, with each relation storing one part of the information. E.g.:
• account: stores information about accounts
• depositor: stores information about which customer owns which account
• customer: stores information about customers
Storing all information as a single relation such as bank(account-number, balance, customer-name, ..) results in
• repetition of information (e.g. two customers own an account)
• the need for null values (e.g. to represent a customer without an account)
Normalization theory deals with how to design relational schemas.

[Figure: The customer Relation]

[Figure: The depositor Relation]

[Figure: E-R Diagram for the Banking Enterprise]

Keys
Let K ⊆ R.
K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R). By “possible r” we mean a relation r that could exist in the enterprise we are modeling.
Example: {customer-name, customer-street} and {customer-name} are both superkeys of Customer, if no two customers can possibly have the same name.
K is a candidate key if K is a minimal superkey.
Example: {customer-name} is a candidate key for Customer, since it is a superkey (assuming no two customers can possibly have the same name), and no proper subset of it is a superkey.
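
A key is a property of every possible relation, so code can only test it against one given instance. The sketch below does exactly that, which can refute a proposed key but never prove one. The function names are hypothetical, and the rows are the customer tuples used earlier.

```python
from itertools import combinations

def is_superkey_of_instance(rows, attrs):
    """True if no two rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key_of_instance(rows, attrs):
    """A superkey none of whose proper subsets is also a superkey."""
    if not is_superkey_of_instance(rows, attrs):
        return False
    return not any(
        is_superkey_of_instance(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )

schema = ("customer-name", "customer-street", "customer-city")
customer = [dict(zip(schema, t)) for t in [
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
]]

name_only = ("customer-name",)
name_and_street = ("customer-name", "customer-street")
```

On this instance both attribute sets pass the superkey test, but only `name_only` passes the candidate-key test, because `name_and_street` contains a smaller superkey.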

Determining Keys from E-R Sets
Strong entity set. The primary key of the entity set becomes the primary key of the relation.
Weak entity set. The primary key of the relation consists of the union of the primary key of the strong entity set and the discriminator of the weak entity set.
Relationship set. The union of the primary keys of the related entity sets becomes a superkey of the relation.
• For binary many-to-one relationship sets, the primary key of the “many” entity set becomes the relation’s primary key.
• For one-to-one relationship sets, the relation’s primary key can be that of either entity set.
• For many-to-many relationship sets, the union of the primary keys becomes the relation’s primary key.

Referential Integrity (Foreign Key)
Specified between two relations to maintain consistency among their tuples.
A set of attributes FK in relation R1 is a foreign key of R1 that references relation R2 if it satisfies both of the following:
• The attributes in FK have the same domains as the primary key attributes PK of R2.
• A value of FK in a tuple t1 of R1 either is null or occurs as the value of PK in some tuple t2 of R2, that is, t1[FK] = t2[PK].
R1 is the referencing relation and R2 is the referenced relation.
Diagrammatically, a referential integrity constraint is displayed by drawing a directed arc from each FK to the relation it references, with the arrowhead pointing to the PK of the referenced relation.
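
The foreign-key condition can be sketched as a check over two relation instances. The example assumes a single-attribute key and made-up depositor and account rows. The function name is hypothetical.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return the referencing tuples whose non-null FK matches no PK value."""
    pk_values = {t[pk] for t in referenced}
    return [
        t for t in referencing
        if t[fk] is not None and t[fk] not in pk_values
    ]

# Illustrative data only: depositor.account-number references account.account-number.
account = [
    {"account-number": "A-102", "balance": 400},
    {"account-number": "A-201", "balance": 900},
]
depositor = [
    {"customer-name": "Jones", "account-number": "A-102"},
    {"customer-name": "Smith", "account-number": "A-999"},  # no such account
    {"customer-name": "Curry", "account-number": None},     # null is allowed
]

violations = dangling_references(depositor, "account-number", account, "account-number")
```

Only the tuple for Smith is reported, since its foreign-key value is non-null and matches no primary-key value in account.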

[Figure: Schema Diagram for the Banking Enterprise]

Update Operations and Constraint Violations
The content of the database may be modified using the following operations:
• Insertion: insert new tuple(s) into a relation
• Deletion: delete tuple(s) from a relation
• Updating: change the value of attribute(s) in existing tuples

Examples
1. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, null, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, null, 4> into EMPLOYEE. This insertion violates the entity integrity constraint (null for the primary key SSN), so it is rejected.

2. Insert <‘Alicia’, ‘J’, ‘Zelaya’, ‘999887777’, ‘1960-04-05’, ‘6357 Windy Lane, Katy, TX’, F, 28000, ‘987654321’, 4> into EMPLOYEE. This insertion violates the key constraint because another tuple with the same SSN value already exists in the EMPLOYEE relation, so it is rejected.

Update Operations and Constraint Violations (Cont.)

3. Insert <‘Cecilia’, ‘F’, ‘Kolonsky’, ‘677678989’, ‘1960-04-05’, ‘6357 Windswept, Katy, TX’, F, 28000, ‘987654321’, 7> into EMPLOYEE. This insertion violates the referential integrity constraint specified on DNO because no DEPARTMENT tuple exists with DNUMBER = 7.

4. Delete the EMPLOYEE tuple with SSN = ‘999887777’. This deletion is not acceptable, because tuples in WORKS_ON refer to this tuple. Hence, if the tuple is deleted, referential integrity violations will result.

5. Update the SSN of the EMPLOYEE tuple with SSN = ‘999887777’ to ‘987654321’. Unacceptable, because it violates primary key and referential integrity constraints.
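
The three insertion checks can be sketched as one validation routine. The example assumes EMPLOYEE tuples reduced to SSN and DNO. The existing employee and the department numbers 1, 4 and 5 are assumed sample data, and `check_insert` is a hypothetical helper.

```python
def check_insert(new, employees, departments):
    """Return the name of the violated constraint, or None if the insert is valid."""
    if new["SSN"] is None:
        return "entity integrity"        # primary key must not be null
    if any(e["SSN"] == new["SSN"] for e in employees):
        return "key"                     # primary key value must be unique
    dnumbers = {d["DNUMBER"] for d in departments}
    if new["DNO"] is not None and new["DNO"] not in dnumbers:
        return "referential integrity"   # DNO must match an existing DNUMBER
    return None

# Assumed current state of the database.
employees = [{"SSN": "999887777", "DNO": 4}]
departments = [{"DNUMBER": 1}, {"DNUMBER": 4}, {"DNUMBER": 5}]

# The three insertions discussed in the examples, reduced to SSN and DNO.
first = check_insert({"SSN": None, "DNO": 4}, employees, departments)
second = check_insert({"SSN": "999887777", "DNO": 4}, employees, departments)
third = check_insert({"SSN": "677678989", "DNO": 7}, employees, departments)
```

The checks are ordered so that each insertion reports the constraint named in the examples. A real system would normally report every violated constraint.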

Relational Algebra
Enables users to specify basic retrieval requests.
Six basic operators:
• select
• project
• rename
• union
• set difference
• Cartesian product
Intersection is also widely used, but it is not basic: it can be derived as r ∩ s = r – (r – s).
A sequence of relational algebra operations forms a relational algebra expression.

Select Operation – Example
Provides the subset of tuples from a relation that satisfy a selection criterion.
Notation: σ<selection condition>(r)
The selection condition is made up of clauses of the form
<A> <comparison op> <A>   or   <A> <comparison op> <constant>
where <A> is an attribute name. Clauses can be connected by the Boolean operators AND, OR and NOT.
The comparison op is one of {=, <, ≤, >, ≥, ≠}.
|σC(R)| ≤ |R| for any condition C.
The σ operation is commutative:
σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R))
A cascade of selections can be combined into a single selection:
σ<cond1>(σ<cond2>(…(σ<condn>(R))…)) = σ<cond1> AND <cond2> AND … AND <condn>(R)

Relation r:

A   B   C    D
α   α   1    7
α   β   5    7
β   β   12   3
β   β   23   10

σA=B AND D > 5 (r):

A   B   C    D
α   α   1    7
β   β   23   10
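
The selection example and its two algebraic properties can be reproduced with a short Python sketch. Relations are modeled here as lists of dicts, and `select` is an illustrative helper, not part of any library.

```python
# Relation r from the example, one dict per tuple.
r = [
    {"A": "α", "B": "α", "C": 1, "D": 7},
    {"A": "α", "B": "β", "C": 5, "D": 7},
    {"A": "β", "B": "β", "C": 12, "D": 3},
    {"A": "β", "B": "β", "C": 23, "D": 10},
]

def select(relation, condition):
    """Keep the tuples of relation for which condition holds."""
    return [t for t in relation if condition(t)]

def a_equals_b(t):
    return t["A"] == t["B"]

def d_greater_than_5(t):
    return t["D"] > 5

# One selection with the conjunction A = B AND D > 5.
combined = select(r, lambda t: a_equals_b(t) and d_greater_than_5(t))

# The same result as a cascade of two selections, in either order.
cascade_1 = select(select(r, d_greater_than_5), a_equals_b)
cascade_2 = select(select(r, a_equals_b), d_greater_than_5)
```

All three results contain the tuples with C = 1 and C = 23, which illustrates both commutativity and the cascade rule.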

Project Operation – Example
Notation: ∏A1, A2, …, Ak (r)
where A1, A2, …, Ak are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by erasing the columns that are not listed.
Duplicate rows are removed from the result, since relations are sets.

Relation r:

A   B    C
α   10   1
α   20   1
β   30   1
β   40   2

∏A,C (r), before and after duplicate removal:

A   C        A   C
α   1        α   1
α   1    =   β   1
β   1        β   2
β   2
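
Projection with duplicate elimination can be sketched in Python, assuming tuples are stored positionally next to a schema tuple. The `project` helper is illustrative only.

```python
# Relation r from the example, with its schema listed separately.
schema = ("A", "B", "C")
r = [
    ("α", 10, 1),
    ("α", 20, 1),
    ("β", 30, 1),
    ("β", 40, 2),
]

def project(relation, schema, attrs):
    """Keep only the listed attributes; a set removes duplicate rows."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in positions) for t in relation}

result = project(r, schema, ("A", "C"))
```

Returning a Python set makes the duplicate elimination automatic: the two rows that both project to (α, 1) collapse into one, leaving three tuples.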

Rename Operation
Allows us to name, and therefore to refer to, the results of expressions.
Allows us to refer to a relation by more than one name (nested operations).

Example:
∏FNAME, LNAME, SALARY(σDNO=5(EMPLOYEE))
Alternatively:
DEP5_EMPS ← σDNO=5(EMPLOYEE)
RESULT ← ∏FNAME, LNAME, SALARY(DEP5_EMPS)

Notation: ρS(B1, B2, …, Bn)(R) or ρS(R) or ρ(B1, B2, …, Bn)(R)
ρ is the rename operator, S is the new relation name and B1, …, Bn are the new attribute names.

Union Operation – Example
Notation: r ∪ s
Defined as: {t | t ∈ r or t ∈ s}
For r ∪ s to be valid:
1. r, s must have the same number of attributes
2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s)
r ∪ s = s ∪ r (commutative)
r ∪ (s ∪ q) = (r ∪ s) ∪ q (associative)

Relation r:        Relation s:

A   B              A   B
α   1              α   2
α   2              β   3
β   1

r ∪ s:

A   B
α   1
α   2
β   1
β   3

Intersection Operation – Example
Notation: r ∩ s
Defined as: {t | t ∈ r and t ∈ s}
For r ∩ s to be valid:
1. r, s must have the same number of attributes
2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s)
r ∩ s = s ∩ r (commutative)
r ∩ (s ∩ q) = (r ∩ s) ∩ q (associative)
r ∩ s = r – (r – s)

Relation r:        Relation s:

A   B              A   B
α   1              α   2
α   2              β   3
β   1

r ∩ s:

A   B
α   2

Set Difference Operation – Example
Notation: r – s
Defined as: {t | t ∈ r and t ∉ s}
Set differences must be taken between compatible relations:
• r and s must have the same number of attributes
• attribute domains of r and s must be compatible
In general, r – s ≠ s – r

Relation r:        Relation s:

A   B              A   B
α   1              α   2
α   2              β   3
β   1

r – s:

A   B
α   1
β   1
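
Union, intersection and set difference on the relations r and s used in these examples map directly onto Python's built-in set operators. The sketch below assumes tuples are modeled as Python tuples. The `compatible` helper checks only the attribute count, not the domains.

```python
# Relations r and s from the examples, as sets of (A, B) tuples.
r = {("α", 1), ("α", 2), ("β", 1)}
s = {("α", 2), ("β", 3)}

def compatible(r, s):
    """Same number of attributes in every tuple (domains are not checked here)."""
    return len({len(t) for t in r | s}) <= 1

union = r | s
intersection = r & s
difference = r - s

# Intersection is derivable from set difference alone.
derived_intersection = r - (r - s)
```

The last line illustrates why intersection adds no expressive power: it produces the same single tuple (α, 2) as the direct intersection.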

Cartesian-Product Operation – Example
Notation: r x s
Defined as: {t q | t ∈ r and q ∈ s}
Also referred to as a cross join.

Relation r:        Relation s:

A   B              C   D    E
α   1              α   10   a
β   2              β   10   a
                   β   20   b
                   γ   10   b

r x s:

A   B   C   D    E
α   1   α   10   a
α   1   β   10   a
α   1   β   20   b
α   1   γ   10   b
β   2   α   10   a
β   2   β   10   a
β   2   β   20   b
β   2   γ   10   b

Composition of Operations
Can build expressions using multiple operations.
Example: σA=C(r x s)

r x s:

A   B   C   D    E
α   1   α   10   a
α   1   β   10   a
α   1   β   20   b
α   1   γ   10   b
β   2   α   10   a
β   2   β   10   a
β   2   β   20   b
β   2   γ   10   b

σA=C(r x s):

A   B   C   D    E
α   1   α   10   a
β   2   β   10   a
β   2   β   20   b
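
The composition σA=C(r x s) can be sketched in Python by building the product and then filtering it. The sketch takes r with schema (A, B) and s with schema (C, D, E), where s holds the tuples (α, 10, a), (β, 10, a), (β, 20, b) and (γ, 10, b).

```python
# r has schema (A, B); s has schema (C, D, E).
r = [("α", 1), ("β", 2)]
s = [("α", 10, "a"), ("β", 10, "a"), ("β", 20, "b"), ("γ", 10, "b")]

# Cartesian product: concatenate every tuple of r with every tuple of s.
r_x_s = [t + q for t in r for q in s]

# Selection A = C: position 0 is A and position 2 is C in the combined tuple.
selected = [t for t in r_x_s if t[0] == t[2]]
```

The product has 2 x 4 = 8 tuples, and the selection keeps the three in which A and C agree: (α, 1, α, 10, a), (β, 2, β, 10, a) and (β, 2, β, 20, b).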

Additional Operations
We define additional operations that do not add any power to the relational algebra, but that simplify common queries.
• Natural join
• Division
• Assignment

Natural-Join Operation
Notation: r ⋈ s
Let r and s be relations on schemas R and S respectively.
The result is a relation on schema R ∪ S which is obtained by considering each pair of tuples tr from r and ts from s. If tr and ts have the same value on each of the attributes in R ∩ S, a tuple t is added to the result, where
• t has the same value as tr on the attributes of r
• t has the same value as ts on the attributes of s

Example:
R = (A, B, C, D)
S = (E, B, D)
Result schema = (A, B, C, D, E)
r ⋈ s is defined as:
∏r.A, r.B, r.C, r.D, s.E (σr.B = s.B ∧ r.D = s.D (r x s))

Natural Join Operation – Example

Relation r:            Relation s:

A   B   C   D          B   D   E
α   1   α   a          1   a   α
β   2   γ   a          3   a   β
γ   4   β   b          1   a   γ
α   1   γ   a          2   b   δ
δ   2   β   b          3   b   ε

r ⋈ s:

A   B   C   D   E
α   1   α   a   α
α   1   α   a   γ
α   1   γ   a   α
α   1   γ   a   γ
δ   2   β   b   δ
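
The natural join of this example can be sketched as a nested loop that keeps a pair of tuples when they agree on every shared attribute. Relations are modeled as lists of dicts, and `natural_join` is an illustrative helper.

```python
def natural_join(r, s):
    """Combine each pair of tuples that agree on all common attributes."""
    result = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})
    return result

# Relations r(A, B, C, D) and s(B, D, E) from the example.
r = [dict(zip("ABCD", t)) for t in [
    ("α", 1, "α", "a"), ("β", 2, "γ", "a"), ("γ", 4, "β", "b"),
    ("α", 1, "γ", "a"), ("δ", 2, "β", "b"),
]]
s = [dict(zip("BDE", t)) for t in [
    (1, "a", "α"), (3, "a", "β"), (1, "a", "γ"), (2, "b", "δ"), (3, "b", "ε"),
]]

joined = natural_join(r, s)
rows = {tuple(t[a] for a in "ABCDE") for t in joined}
```

The common attributes here are B and D, so the sketch yields the five tuples of r ⋈ s. A production system would use hashing or sorting rather than comparing every pair.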

Division Operation
Notation: r ÷ s
Suited to queries that include the phrase “for all”.
Let r and s be relations on schemas R and S respectively, where
R = (A1, …, Am, B1, …, Bn)
S = (B1, …, Bn)
The result of r ÷ s is a relation on schema R – S = (A1, …, Am)

r ÷ s = { t | t ∈ ∏R-S(r) ∧ ∀ u ∈ s ( tu ∈ r ) }

Division Operation – Example

Relation r:        Relation s:

A   B              B
α   1              1
α   2              2
α   3
β   1
γ   1
δ   1
δ   3
δ   4
ε   6
ε   1
β   2

r ÷ s:

A
α
β

Another Division Example

Relation r:                Relation s:

A   B   C   D   E          D   E
α   a   α   a   1          a   1
α   a   γ   a   1          b   1
α   a   γ   b   1
β   a   γ   a   1
β   a   γ   b   3
γ   a   γ   a   1
γ   a   γ   b   1
γ   a   β   b   1

r ÷ s:

A   B   C
α   a   γ
γ   a   γ

Division Operation (Cont.)
Property
Let q = r ÷ s. Then q is the largest relation satisfying q x s ⊆ r.

Definition in terms of the basic algebra operations
Let r(R) and s(S) be relations, and let S ⊆ R.

r ÷ s = ∏R-S (r) – ∏R-S ((∏R-S (r) x s) – ∏R-S,S(r))

To see why:
∏R-S,S(r) simply reorders the attributes of r.
∏R-S ((∏R-S (r) x s) – ∏R-S,S(r)) gives those tuples t in ∏R-S (r) such that for some tuple u ∈ s, tu ∉ r.
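
Both formulations of division can be compared on the first division example. The sketch assumes a positional encoding in which the first k components of each tuple of r are the R – S attributes and the rest are the S attributes. The function names are hypothetical.

```python
def divide(r, s, k):
    """Direct definition: keep t if t combined with every u in s is in r."""
    candidates = {t[:k] for t in r}
    return {t for t in candidates if all(t + u in r for u in s)}

def divide_basic(r, s, k):
    """The same result using only projection, product and difference."""
    temp1 = {t[:k] for t in r}                     # projection of r on R-S
    all_pairs = {t + u for t in temp1 for u in s}  # temp1 x s
    temp2 = {t[:k] for t in all_pairs - r}         # candidates missing some u
    return temp1 - temp2

# r(A, B) and s(B) from the first division example.
r = {
    ("α", 1), ("α", 2), ("α", 3), ("β", 1), ("γ", 1), ("δ", 1),
    ("δ", 3), ("δ", 4), ("ε", 6), ("ε", 1), ("β", 2),
}
s = {(1,), (2,)}

quotient = divide(r, s, k=1)
quotient_basic = divide_basic(r, s, k=1)
```

Both functions return the tuples (α) and (β), the only A values paired with both 1 and 2 in r.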

Assignment Operation
The assignment operation (←) provides a convenient way to express complex queries: write the query as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as the result of the query.
Assignment must always be made to a temporary relation variable.
Example: Write r ÷ s as

temp1 ← ∏R-S (r)
temp2 ← ∏R-S ((temp1 x s) – ∏R-S,S (r))
result = temp1 – temp2

The result to the right of the ← is assigned to the relation variable on the left of the ←.
The variable may be used in subsequent expressions.

Generalized Projection
Extends the projection operation by allowing arithmetic functions to be used in the projection list.

∏F1, F2, …, Fn (E)

E is any relational-algebra expression.
Each of F1, F2, …, Fn is an arithmetic expression involving constants and attributes in the schema of E.
Given relation credit-info(customer-name, limit, credit-balance), find how much more each person can spend:

∏customer-name, limit – credit-balance (credit-info)

Aggregate Functions and Operations
An aggregation function takes a collection of values and returns a single value as a result.
• avg: average value
• min: minimum value
• max: maximum value
• sum: sum of values
• count: number of values

Aggregate operation in relational algebra:

G1, G2, …, Gn g F1(A1), F2(A2), …, Fn(An) (E)

E is any relational-algebra expression.
G1, G2, …, Gn is a list of attributes on which to group (can be empty).
Each Fi is an aggregate function.
Each Ai is an attribute name.

Aggregate Operation – Example

Relation r:

A   B   C
α   α   7
α   β   7
β   β   3
β   β   10

g sum(C) (r):

sum-C
27

Aggregate Operation – Example

Relation account grouped by branch-name:

branch-name   account-number   balance
Perryridge    A-102            400
Perryridge    A-201            900
Brighton      A-217            750
Brighton      A-215            750
Redwood       A-222            700

branch-name g sum(balance) (account):

branch-name   balance
Perryridge    1300
Brighton      1500
Redwood       700

Aggregate Functions (Cont.)
The result of aggregation does not have a name.
• Can use the rename operation to give it a name
• For convenience, we permit renaming as part of the aggregate operation

branch-name g sum(balance) as sum-balance (account)
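
The grouped aggregate with renaming can be sketched in Python using the account tuples from the earlier example. The `group_sum` helper and the positional tuple layout are illustrative only.

```python
from collections import defaultdict

# account(branch-name, account-number, balance)
account = [
    ("Perryridge", "A-102", 400),
    ("Perryridge", "A-201", 900),
    ("Brighton", "A-217", 750),
    ("Brighton", "A-215", 750),
    ("Redwood", "A-222", 700),
]

def group_sum(relation, group_pos, value_pos, result_name):
    """Group on one attribute and sum another, naming the result column."""
    totals = defaultdict(int)
    for t in relation:
        totals[t[group_pos]] += t[value_pos]
    return [{"branch-name": g, result_name: total} for g, total in totals.items()]

result = group_sum(account, 0, 2, "sum-balance")
by_branch = {t["branch-name"]: t["sum-balance"] for t in result}
```

Each output tuple carries the grouping value and the named aggregate. The totals are 1300 for Perryridge, 1500 for Brighton and 700 for Redwood.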

Outer Join
An extension of the join operation that avoids loss of information.
Computes the join and then adds to the result the tuples from one relation that do not match any tuple in the other relation.
Uses null values:
• null signifies that the value is unknown or does not exist
• All comparisons involving null are (roughly speaking) false by definition
  We will study the precise meaning of comparisons with nulls later

Outer Join – Example

Relation loan:

loan-number   branch-name   amount
L-170         Downtown      3000
L-230         Redwood       4000
L-260         Perryridge    1700

Relation borrower:

customer-name   loan-number
Jones           L-170
Smith           L-230
Hayes           L-155

Outer Join – Example

Inner Join: loan ⋈ borrower

loan-number   branch-name   amount   customer-name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith

Left Outer Join: loan ⟕ borrower

loan-number   branch-name   amount   customer-name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null

Outer Join – Example

Right Outer Join: loan ⟖ borrower

loan-number   branch-name   amount   customer-name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-155         null          null     Hayes

Full Outer Join: loan ⟗ borrower

loan-number   branch-name   amount   customer-name
L-170         Downtown      3000     Jones
L-230         Redwood       4000     Smith
L-260         Perryridge    1700     null
L-155         null          null     Hayes
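
The outer joins of loan and borrower can be sketched in Python, using None to stand in for null. The helper below assumes a join on a single shared attribute and a non-empty right-hand relation. It is an illustration, not a general implementation.

```python
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]

def left_outer_join(left, right, key):
    """Join on one shared attribute; pad unmatched left tuples with None."""
    right_only = [a for a in right[0] if a != key]  # assumes right is non-empty
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            result.append({**l, **dict.fromkeys(right_only)})
    return result

left = left_outer_join(loan, borrower, "loan-number")
# A right outer join is a left outer join with the operands swapped.
right = left_outer_join(borrower, loan, "loan-number")
# Full outer join: the left outer join plus the right-side tuples it missed.
seen = {t["loan-number"] for t in left}
full = left + [t for t in right if t["loan-number"] not in seen]
```

The left result pads L-260 with a null customer-name, the right result pads L-155 with null branch-name and amount, and the full result contains all four loan numbers.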

Null Values
It is possible for tuples to have a null value, denoted by null, for some of their attributes.
null signifies an unknown value or that a value does not exist.
The result of any arithmetic expression involving null is null.
Aggregate functions simply ignore null values.
• This is an arbitrary decision. Could have returned null as the result instead.
• We follow the semantics of SQL in its handling of null values.
For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same.
• Alternative: assume each null is different from each other
• Both are arbitrary decisions, so we simply follow SQL

Null Values
Comparisons with null values return the special truth value unknown.
• If false was used instead of unknown, then not (A < 5) would not be equivalent to A >= 5
Three-valued logic using the truth value unknown:
• OR: (unknown or true) = true, (unknown or false) = unknown, (unknown or unknown) = unknown
• AND: (true and unknown) = unknown, (false and unknown) = false, (unknown and unknown) = unknown
• NOT: (not unknown) = unknown
In SQL, “P is unknown” evaluates to true if predicate P evaluates to unknown.
The result of a select predicate is treated as false if it evaluates to unknown.
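
The three-valued truth tables above can be sketched in Python, with None standing in for unknown. The function names are illustrative, and the sample rows are made up.

```python
def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else (not a)

def less_than(x, y):
    """A comparison involving null yields unknown."""
    return None if x is None or y is None else x < y

# Selection keeps a tuple only when the predicate is true, not unknown.
rows = [{"A": 3}, {"A": None}, {"A": 7}]
below_5 = [t["A"] for t in rows if less_than(t["A"], 5) is True]
not_below_5 = [t["A"] for t in rows if not3(less_than(t["A"], 5)) is True]
```

The tuple with a null A appears in neither result, which is the behavior described for select predicates that evaluate to unknown.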

Deletion Examples
Delete all account records in the Perryridge branch.

account ← account – σbranch-name = “Perryridge” (account)

Delete all loan records with amount in the range of 0 to 50.

loan ← loan – σamount ≥ 0 and amount ≤ 50 (loan)

Delete all accounts at branches located in Needham.

r1 ← σbranch-city = “Needham” (account ⋈ branch)
r2 ← ∏branch-name, account-number, balance (r1)
r3 ← ∏customer-name, account-number (r2 ⋈ depositor)
account ← account – r2
depositor ← depositor – r3

Insertion Examples
Insert information in the database specifying that Smith has $1200 in account A-973 at the Perryridge branch.

account ← account ∪ {(“Perryridge”, A-973, 1200)}
depositor ← depositor ∪ {(“Smith”, A-973)}

Provide as a gift for all loan customers in the Perryridge branch a $200 savings account. Let the loan number serve as the account number for the new savings account.

r1 ← (σbranch-name = “Perryridge” (borrower ⋈ loan))
account ← account ∪ ∏branch-name, loan-number, 200 (r1)
depositor ← depositor ∪ ∏customer-name, loan-number (r1)

Update Examples
Make interest payments by increasing all balances by 5 percent.

account ← ∏AN, BN, BAL * 1.05 (account)

where AN, BN and BAL stand for account-number, branch-name and balance, respectively.

Pay all accounts with balances over $10,000 6 percent interest and pay all others 5 percent.

account ← ∏AN, BN, BAL * 1.06 (σBAL > 10000 (account)) ∪ ∏AN, BN, BAL * 1.05 (σBAL ≤ 10000 (account))
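
The two-rate interest update can be sketched in Python as two selections on the old balances, each followed by a generalized projection, then a union. The account tuples are made-up sample data, and `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal

# Hypothetical account tuples in (AN, BN, BAL) order.
account = {
    ("A-101", "Downtown", Decimal("20000")),
    ("A-102", "Perryridge", Decimal("400")),
    ("A-103", "Brighton", Decimal("10000")),
}

def pay_interest(relation):
    """6 percent on balances over 10000, 5 percent on all others."""
    high = {t for t in relation if t[2] > 10000}    # selection BAL > 10000
    low = {t for t in relation if t[2] <= 10000}    # selection BAL <= 10000
    return (
        {(an, bn, bal * Decimal("1.06")) for an, bn, bal in high}
        | {(an, bn, bal * Decimal("1.05")) for an, bn, bal in low}
    )

account = pay_interest(account)
balances = {an: bal for an, bn, bal in account}
```

Both selections read the old relation, so each account receives exactly one rate. Applying the two updates one after the other instead could pay interest twice on an account whose balance crosses 10000.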

Views
In some cases, it is not desirable for all users to see the entire logical model (i.e., all the actual relations stored in the database).
Consider a person who needs to know a customer’s loan number but has no need to see the loan amount. This person should see a relation described, in the relational algebra, by

∏customer-name, loan-number (borrower ⋈ loan)

Any relation that is not part of the conceptual model but is made visible to a user as a “virtual relation” is called a view.