Relational Query Languages
Walid G. Aref
Query Languages For The Relational Model
• Two categories of query languages for the relational model:
1. Imperative (Procedural):
  • Specifies the steps to be taken to evaluate a user’s query
  • Relational Algebra: The mathematical foundation for all relational query engines
2. Declarative (Non-procedural):
  • Specifies the query that the user needs answered, not how to answer it
  • The DBMS is responsible for query compilation and optimization for efficient evaluation
  • Needs a query optimizer to re-order the operations while guaranteeing a correct answer to the query (the answer does not change during optimization)
  • Relational Calculus comes in two flavors:
    • Tuple Relational Calculus: The mathematical foundation for SQL
    • Domain Relational Calculus: The mathematical foundation for QBE (Query By Example)
• Relational Query Languages
  • Procedural
    • Relational Algebra
  • Declarative
    • Domain Relational Calculus → Query By Example (QBE)
    • Tuple Relational Calculus → SQL
Relational Algebra
• Based on Set Theory
• Building-block operators to access relations and to retrieve and manipulate tuples in these relations
• Each operator takes one or more relations as input and produces an output relation → Composition
• Operators can be nested in any order to perform more complex operations
Relational Algebra
• Basic operations:
  • Select (σ): Select a subset of the tuples from the input relation.
  • Projection (π): Eliminate unwanted columns from the input relation.
  • Cross-product (×): Produce all possible tuple pairs from the two input relations.
  • Set-difference (−): Produce the tuples that belong to input relation 1 but do not belong to input relation 2.
  • Union (∪): Produce an output table that contains the tuples from the two input relations.
  • Renaming (ρ): Change the name of an attribute or a table.
• Additional non-basic operations:
  • Can be realized by composing multiple basic operations
  • Relational Algebra is “closed” under the relational operators.
  • Intersection, join, division
Our Example Relational Database Schema
• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled(sid: string, cid: string, grade: string)
• Instructor(iid: string, iname: string, irank: string, isalary: real)
• Teaches(iid: string, cid: string, year: integer, semester: string)
Project (π)
• Retain the listed attributes and eliminate the unlisted ones
• May need to eliminate the resulting duplicate tuples (see Figure)
• Students(sid: string, name: string, login: string, age: integer, gpa: real)
• π_{sid, name}(Students)
• Can also have expressions in the attribute list, e.g., π_{name, age + 1}(Students)

Figure: Students (input)
sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

Figure: π_{sid, name}(Students) (output)
sid   name
0111  Bright, Mary
0222  Star, Adam
0444  Zhang, Rita
0333  Shah, Ragu
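As a rough illustration of projection, the sketch below models a relation as a list of Python dicts, one per tuple. The helper name `project` is hypothetical, and the login column is left out because its values are not legible in the table. Tracking already-seen tuples in a set mirrors the duplicate elimination noted above.

```python
def project(relation, attrs):
    """Keep only the listed attributes and emit each result tuple once."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:  # set semantics: drop duplicate result tuples
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# Students rows from the figure (login omitted; this choice is an assumption).
students = [
    {"sid": "0111", "name": "Bright, Mary", "age": 22, "gpa": 4.0},
    {"sid": "0222", "name": "Star, Adam", "age": 21, "gpa": 2.3},
    {"sid": "0444", "name": "Zhang, Rita", "age": 17, "gpa": 3.3},
    {"sid": "0333", "name": "Shah, Ragu", "age": 19, "gpa": 3.0},
]

sid_name = project(students, ["sid", "name"])
for row in sid_name:
    print(row["sid"], row["name"])
```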
Select (σ)
• Given an input predicate and a table, produce as output the rows that satisfy the predicate
• The schema of the output table is the same as that of the input table
• σ_{age<20}(Students)

Students (input)
sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

σ_{age<20}(Students) (output)
sid   name          login              age  gpa
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0
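Selection can be sketched in the same spirit: a filter that keeps whole rows and leaves the schema untouched. The helper name `select` and the dict-per-tuple representation are assumptions made for illustration; the login column is omitted.

```python
def select(relation, predicate):
    """Return the rows that satisfy the predicate; the schema is unchanged."""
    return [row for row in relation if predicate(row)]


students = [
    {"sid": "0111", "name": "Bright, Mary", "age": 22, "gpa": 4.0},
    {"sid": "0222", "name": "Star, Adam", "age": 21, "gpa": 2.3},
    {"sid": "0444", "name": "Zhang, Rita", "age": 17, "gpa": 3.3},
    {"sid": "0333", "name": "Shah, Ragu", "age": 19, "gpa": 3.0},
]

# sigma_{age<20}(Students)
young = select(students, lambda t: t["age"] < 20)
for row in young:
    print(row)
```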
Cross Product ( × )
• Given two tables, produce all possible combinations of tuple pairs
• Enrolled(sid: string, cid: string, grade: string)
• Courses(cid: string, cname: string, credits: integer)
• Enrolled × Courses
• The schema of the output table is the concatenation of the schemas of the two input tables (here cid appears twice, once from each input)

Enrolled
sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Courses
cid    cname       credits
CS541  DB Systems  3
CS580  Algorithms  3

Enrolled × Courses
sid   cid    grade  cid    cname       credits
0111  CS541  A+     CS541  DB Systems  3
0111  CS541  A+     CS580  Algorithms  3
0222  CS580  B-     CS541  DB Systems  3
0222  CS580  B-     CS580  Algorithms  3
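A minimal sketch of the cross product, assuming relations are lists of dicts. Because both inputs have a cid attribute, the hypothetical helper `cross_product` prefixes each attribute with its relation name so that neither value is lost; that prefixing convention is an assumption of this sketch, not something the slides prescribe.

```python
def cross_product(r, r_name, s, s_name):
    """Pair every tuple of r with every tuple of s, qualifying attribute names."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in t.items()},
            **{f"{s_name}.{k}": v for k, v in u.items()},
        }
        for t in r
        for u in s
    ]


enrolled = [
    {"sid": "0111", "cid": "CS541", "grade": "A+"},
    {"sid": "0222", "cid": "CS580", "grade": "B-"},
]
courses = [
    {"cid": "CS541", "cname": "DB Systems", "credits": 3},
    {"cid": "CS580", "cname": "Algorithms", "credits": 3},
]

pairs = cross_product(enrolled, "Enrolled", courses, "Courses")
print(len(pairs))  # one output tuple per (Enrolled, Courses) pair
```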
Union ( ∪ )
• Given two tables with compatible schemas:
  • Same number of attributes
  • Same data type per matching attribute pair
• Will eliminate duplicates
• WL_Enrolled(sid: string, cid: string, grade: string)
• Calumet_Enrolled(sid: string, cid: string, grade: string)
• WL_Enrolled ∪ Calumet_Enrolled

WL_Enrolled
sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Calumet_Enrolled
sid   cid    grade
0111  CS541  A+
0333  CS503  B

WL_Enrolled ∪ Calumet_Enrolled
sid   cid    grade
0111  CS541  A+
0333  CS503  B
0222  CS580  B-
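One way to picture union is with Python sets of tuples, which eliminate duplicates automatically. The helper `union` below is hypothetical and checks only the number of attributes; checking the data types of matching attributes is left out of this sketch.

```python
def union(r, s):
    """Set union of two relations given as sets of equal-length tuples."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("schemas are not union-compatible (different arity)")
    return r | s


wl_enrolled = {("0111", "CS541", "A+"), ("0222", "CS580", "B-")}
calumet_enrolled = {("0111", "CS541", "A+"), ("0333", "CS503", "B")}

all_enrolled = union(wl_enrolled, calumet_enrolled)
print(sorted(all_enrolled))  # the shared tuple appears only once
```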
Set Difference ( − )
• Given two tables with compatible schemas:
  • Same number of attributes
  • Same data type per matching attribute pair
• Produce the tuples in the first table that do not exist in the second table
• WL_Enrolled(sid: string, cid: string, grade: string)
• Calumet_Enrolled(sid: string, cid: string, grade: string)
• WL_Enrolled − Calumet_Enrolled
• Notice that, in contrast to Union, Set Difference is not commutative.

WL_Enrolled
sid   cid    grade
0111  CS541  A+
0222  CS580  B-

Calumet_Enrolled
sid   cid    grade
0111  CS541  A+
0333  CS503  B

WL_Enrolled − Calumet_Enrolled
sid   cid    grade
0222  CS580  B-
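The lack of commutativity is easy to check with Python sets of tuples. This is only a sketch; the variable names are illustrative, and which table holds which rows is inferred from the result shown on the slide.

```python
wl_enrolled = {("0111", "CS541", "A+"), ("0222", "CS580", "B-")}
calumet_enrolled = {("0111", "CS541", "A+"), ("0333", "CS503", "B")}

# Tuples in the first relation that do not occur in the second.
wl_only = wl_enrolled - calumet_enrolled
calumet_only = calumet_enrolled - wl_enrolled

print(wl_only)
print(calumet_only)
print(wl_only == calumet_only)  # swapping the operands changes the result
```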
Renaming (ρ)
• Allows us to assign a name to the result of a relational algebra expression
• ρ_{X(A1, A2, …, An)}(E)
  • Returns the result of expression E under the name X, with the attributes renamed to A1, A2, …, An
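A small sketch of renaming, assuming a relation is a list of dicts whose key order matches the schema order. The helper `rename` and the new attribute names are hypothetical; the relation name X simply corresponds to the Python variable that receives the result.

```python
def rename(relation, new_attrs):
    """Positionally rename the attributes of every tuple (rho)."""
    renamed = []
    for row in relation:
        if len(row) != len(new_attrs):
            raise ValueError("attribute count mismatch")
        # dicts preserve insertion order, so values line up with new_attrs
        renamed.append(dict(zip(new_attrs, row.values())))
    return renamed


courses = [{"cid": "CS541", "cname": "DB Systems", "credits": 3}]

# rho_{X(course_id, title, units)}(Courses); the new names are made up here.
x = rename(courses, ["course_id", "title", "units"])
print(x)
```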
Composition of Operators
• Find the names of students with GPA 4.0
• π_{name}(σ_{gpa=4.0}(Students))

Students (input)
sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0
0222  Star, Adam    [email protected]  21   2.3
0444  Zhang, Rita   [email protected]  17   3.3
0333  Shah, Ragu    [email protected]  19   3.0

σ_{gpa=4.0}(Students)
sid   name          login              age  gpa
0111  Bright, Mary  [email protected]  22   4.0

π_{name}(σ_{gpa=4.0}(Students))
name
Bright, Mary
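Because every operator returns a relation, operators nest like ordinary function calls. The sketch below assumes hypothetical helpers `select` and `project` over lists of dicts and evaluates the query inside-out: selection first, then projection.

```python
def select(relation, predicate):
    """sigma: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """pi: keep the listed attributes, dropping duplicate result tuples."""
    result = []
    for row in relation:
        out = {a: row[a] for a in attrs}
        if out not in result:
            result.append(out)
    return result


students = [
    {"sid": "0111", "name": "Bright, Mary", "age": 22, "gpa": 4.0},
    {"sid": "0222", "name": "Star, Adam", "age": 21, "gpa": 2.3},
    {"sid": "0444", "name": "Zhang, Rita", "age": 17, "gpa": 3.3},
    {"sid": "0333", "name": "Shah, Ragu", "age": 19, "gpa": 3.0},
]

# pi_{name}(sigma_{gpa=4.0}(Students))
top_names = project(select(students, lambda t: t["gpa"] == 4.0), ["name"])
print(top_names)
```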
Additional Relational Algebra Operators
• Can be realized by compositions of the basic Relational Algebra operators
• Intersect:
  • Given two relations with compatible schemas, find the common tuples in both input relations
  • Can be realized by the basic relational algebra operators:
    • r ∩ s = r − (r − s)
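The identity r ∩ s = r − (r − s) can be checked directly with Python sets. This is a sketch on two small relations of enrollment tuples, with illustrative variable names.

```python
r = {("0111", "CS541", "A+"), ("0222", "CS580", "B-")}
s = {("0111", "CS541", "A+"), ("0333", "CS503", "B")}

# Intersection built from set difference only.
common = r - (r - s)

print(common)
print(common == (r & s))  # agrees with Python's built-in intersection
```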
Join
• Natural Join: r ⋈ s = π_{R ∪ S}(σ_{r.id = s.id}(r × s))
  • R, S = schemas of r and s
  • R ∪ S means that the common attributes are repeated only once
  • The join is based on equality of all the common attributes between the two tables (here, the common attribute is id)
• Equi-Join: r ⋈_{r.a = s.b} s = σ_{r.a = s.b}(r × s)
  • Has an equality predicate
• Theta-Join: r ⋈_{r.a > s.b} s = σ_{r.a > s.b}(r × s)
  • Has a general predicate (θ comparator)
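The join variants can be sketched as a filtered cross product over lists of dicts. The helper names `theta_join` and `natural_join` are hypothetical; `theta_join` assumes the two inputs have disjoint attribute names, and the tiny relations `r` and `s` used for the theta-join are made-up illustration data.

```python
def theta_join(r, s, theta):
    """sigma_theta(r x s); assumes r and s share no attribute names."""
    return [{**t, **u} for t in r for u in s if theta(t, u)]


def natural_join(r, s):
    """Join on equality of all common attributes, keeping each one once."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [
        {**t, **u}
        for t in r
        for u in s
        if all(t[a] == u[a] for a in common)
    ]


enrolled = [
    {"sid": "0111", "cid": "CS541", "grade": "A+"},
    {"sid": "0222", "cid": "CS580", "grade": "B-"},
]
courses = [
    {"cid": "CS541", "cname": "DB Systems", "credits": 3},
    {"cid": "CS580", "cname": "Algorithms", "credits": 3},
]
enrolled_courses = natural_join(enrolled, courses)  # joins on cid

# Theta-join with predicate r.a > s.b on made-up single-column relations.
r = [{"a": 1}, {"a": 5}]
s = [{"b": 3}]
greater = theta_join(r, s, lambda t, u: t["a"] > u["b"])
print(enrolled_courses)
print(greater)
```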
Division ( ÷ )
• r ÷ s
• R ∩ S ≠ ∅ (e.g., R contains foreign keys to S); the formula below assumes that every attribute of S also appears in R
• Find the tuples in r that join with all the tuples in s
• For example, find the students who enrolled in all courses
• r ÷ s = π_{R−S}(r) − π_{R−S}((π_{R−S}(r) × s) − π_{R−S,S}(r))
• π_{R−S,S}(r) reorders the attributes of r
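The division formula can be traced step by step on a two-attribute relation. In this sketch, r is a set of (sid, cid) pairs and s is a set of cid values; the helper `divide` is hypothetical and the enrollment data is made up for illustration (student 0111 takes both courses, student 0222 only one).

```python
def divide(r, s):
    """Return every x such that (x, y) is in r for all y in s."""
    candidates = {x for (x, _) in r}                      # pi_{R-S}(r)
    all_pairs = {(x, y) for x in candidates for y in s}   # pi_{R-S}(r) x s
    missing = all_pairs - r                               # required pairs absent from r
    disqualified = {x for (x, _) in missing}              # pi_{R-S}(missing)
    return candidates - disqualified


# Illustrative data, not taken from the slides.
enrolled = {("0111", "CS541"), ("0111", "CS580"), ("0222", "CS580")}
courses = {"CS541", "CS580"}

took_everything = divide(enrolled, courses)
print(took_everything)
```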
Extended Relational-Algebra Operations
• Aggregate Functions and Operations
• Outer Join
Grouping and Aggregate Functions in Relational Algebra
• _{G1, G2, …, Gn} 𝒢_{F1(A1), F2(A2), …, Fn(An)}(E)
• 𝒢 is the grouping operator
• Takes as input:
  • A relation (or a relational algebra expression that produces a relation), and
  • An optional (possibly empty) list of grouping attributes G1, G2, …, Gn
  • Aggregate functions F1, F2, …, Fn: each Fi takes as input the values of one attribute Ai of the input relation and produces a scalar value as output
• Example aggregate functions:
  • count: count the number of values (or number of input tuples)
  • sum: sum of values
  • avg: average value
  • min: minimum value
  • max: maximum value
• Example: Compute the average gpa of students grouped by age: _{age} 𝒢_{avg(gpa)}(Students)
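Grouping followed by aggregation might be sketched as below, assuming a list-of-dicts relation. The helper `group_avg` is hypothetical, and a fifth student (sid 0555) is added as made-up data so that at least one group contains more than one tuple.

```python
from collections import defaultdict


def group_avg(relation, group_attr, agg_attr):
    """Group rows by one attribute and average another within each group."""
    groups = defaultdict(list)
    for row in relation:
        groups[row[group_attr]].append(row[agg_attr])
    return {g: sum(vals) / len(vals) for g, vals in groups.items()}


students = [
    {"sid": "0111", "age": 22, "gpa": 4.0},
    {"sid": "0222", "age": 21, "gpa": 2.3},
    {"sid": "0444", "age": 17, "gpa": 3.3},
    {"sid": "0333", "age": 19, "gpa": 3.0},
    {"sid": "0555", "age": 19, "gpa": 3.5},  # hypothetical extra row
]

# _{age} G_{avg(gpa)}(Students)
avg_gpa_by_age = group_avg(students, "age", "gpa")
print(avg_gpa_by_age)
```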
Outer Join
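• Compute the join regularly
• Then add the tuples from one relation (or from both) that do not join, padding the missing attributes with null values
• Comes in three flavors:
  • Left outer-join (⟕)
  • Right outer-join (⟖)
  • Full outer-join (⟗)
[Figure: Two tables r and s. r consists of r1 (tuples that join with s) and r2 (tuples that do not); s consists of s2 (tuples that join with r) and s1 (tuples that do not). r ⋈ s pairs r1 with s2. The left outer-join adds r2 padded with nulls, the right outer-join adds s1 padded with nulls, and the full outer-join adds both.]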
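A minimal sketch of the left outer-join on a single key attribute, assuming list-of-dicts relations and using `None` to stand in for null. The helper `left_outer_join` is hypothetical, and it reads the attribute names of s from its first row, so it assumes s is non-empty. The right outer-join is the same procedure with the roles of the two relations swapped.

```python
def left_outer_join(r, s, key):
    """Join r and s on `key`; unmatched r tuples are padded with None."""
    s_attrs = [a for a in (s[0] if s else {}) if a != key]
    result = []
    for t in r:
        matches = [u for u in s if u[key] == t[key]]
        if matches:
            result.extend({**t, **u} for u in matches)
        else:
            # no partner in s: keep t and fill the attributes of s with nulls
            result.append({**t, **{a: None for a in s_attrs}})
    return result


students = [
    {"sid": "0111", "name": "Bright, Mary"},
    {"sid": "0222", "name": "Star, Adam"},
    {"sid": "0444", "name": "Zhang, Rita"},
]
enrolled = [
    {"sid": "0111", "cid": "CS541", "grade": "A+"},
    {"sid": "0222", "cid": "CS580", "grade": "B-"},
]

joined = left_outer_join(students, enrolled, "sid")
for row in joined:
    print(row)
```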
Modifying the Underlying Relations in Relational Algebra
• E is some Relational Algebra expression that returns a table
  • E can be a table containing one constant tuple
• Delete tuples from a relation r: r ← r − E
• Insert tuples into a relation r: r ← r ∪ E
• Update the value of an attribute in the tuples of a relation r: r ← π_{F1, F2, …, Fl}(r)
  • Set Fi = Ai for the attributes whose values you do not want to change
  • For the attribute you want to update, plug in the new value in its place (it may depend on the old value, e.g., old value + Constant, A1 + A2, or Constant2)
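The three modification patterns map onto set operations and a comprehension. This sketch uses a made-up relation of (iid, isalary) pairs; the identifiers and salary figures are purely illustrative.

```python
# Hypothetical instructor relation: (iid, isalary)
r = {("i1", 1000.0), ("i2", 2000.0)}

# Delete: r <- r - E, where E holds the tuples to remove
r = r - {("i2", 2000.0)}

# Insert: r <- r U E, where E is a table with one constant tuple
r = r | {("i3", 1500.0)}

# Update: r <- pi_{iid, isalary + 100}(r); iid is carried over unchanged
r = {(iid, isalary + 100.0) for (iid, isalary) in r}

print(sorted(r))
```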
Tuple Relational Calculus
• Non-Procedural
• Query is of the form: {t | P(t)}
  • where t is a tuple and P is a predicate (a formula in Predicate Calculus)
• t[A] or t.A: The value of attribute A in t
• t ∈ r: Tuple t is in relation r
Predicate Calculus Formula
• Set of attributes and constants
• Set of comparison operators (e.g., <, ≤, =, ≠, >, ≥)
• Set of connectives: and (∧), or (∨), not (¬)
• Implication (⇒): x ⇒ y means that if x is true, then y is true
  • x ⇒ y ≡ ¬x ∨ y
• Set of quantifiers:
  • ∃ t ∈ r (Q(t)) ≡ “there exists” a tuple t in relation r such that predicate Q(t) is true
  • ∀ t ∈ r (Q(t)) ≡ Q is true “for all” tuples t in relation r
Expressing Relational Algebra Operators in Tuple Relational Calculus
• Select σ_{A = “123”}(r): {t | t ∈ r ∧ t.A = “123”}
• Project π_{A,C}(r): {u | ∃ t ∈ r (u.A = t.A ∧ u.C = t.C)}
  • Schema for u implicitly becomes AC
• Cross-Product r × s: {ut | u ∈ r ∧ t ∈ s}
• Union r ∪ s: {t | t ∈ r ∨ t ∈ s}
• Set Difference r − s: {t | t ∈ r ∧ t ∉ s}
• Join r ⋈_{r.A = s.B} s: {ut | u ∈ r ∧ t ∈ s ∧ u.A = t.B}
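Python set comprehensions read much like TRC expressions: the loop variable plays the role of the tuple variable and the `if` clause plays the role of the predicate. The sketch below uses named tuples so that `t.A` works as in the notation above; the relation contents and the schemas R(A, B, C) and S(B, D) are made up for illustration.

```python
from collections import namedtuple

R = namedtuple("R", ["A", "B", "C"])
S = namedtuple("S", ["B", "D"])

r = {R("123", 1, "x"), R("456", 2, "y")}
s = {S("123", "p"), S("789", "q")}

# Select: {t | t in r and t.A = "123"}
selected = {t for t in r if t.A == "123"}

# Project on A, C: {u | exists t in r (u.A = t.A and u.C = t.C)}
projected = {(t.A, t.C) for t in r}

# Join on r.A = s.B: {ut | u in r and t in s and u.A = t.B}
joined = {u + t for u in r for t in s if u.A == t.B}

# Quantifiers correspond to any() and all().
exists_big_b = any(t.B > 1 for t in r)
all_positive_b = all(t.B > 0 for t in r)

print(selected, projected, joined, exists_big_b, all_positive_b)
```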
Safety of Tuple Relational Calculus Expressions
• An expression {t | P(t)} in the TRC is safe if every value in t appears in one of the relations, tuples, or constants that appear in P
• Avoid writing tuple calculus expressions that generate a relation with infinitely many tuples
• For example, {t | ¬(t ∈ r)} defines an infinite relation if the domain of any attribute of relation r is infinite
• Solution: Restrict the set of allowable TRC expressions to safe expressions
• NOTE: This is more than just a syntax condition
  • E.g., {t | t.A = 1 ∨ true} is not safe: it defines an infinite set with attribute values that do not appear in any relation, tuple, or constant in P
Domain Relational Calculus
• Non-procedural (Declarative)
• Same as TRC, but variables range over attribute (domain) values, not tuples
• DRC query form: {<x1, x2, …, xn> | P(x1, x2, …, xn)}
  • x1, x2, …, xn: Domain variables
  • P: Formula same as that in Predicate Calculus
Expressing Relational Algebra Operators in Domain Relational Calculus
• Select σ_{A = “123”}(r): {<x,y,z> | <x,y,z> ∈ r ∧ x = “123”}
• Project π_{A,C}(r): {<x,z> | ∃ y (<x,y,z> ∈ r)}
• Cross-Product r × s: {<a,b,c,d,e,f> | <a,b,c> ∈ r ∧ <d,e,f> ∈ s}
• Union r ∪ s: {<x,y,z> | <x,y,z> ∈ r ∨ <x,y,z> ∈ s}
• Set Difference r − s: {<x,y> | <x,y> ∈ r ∧ <x,y> ∉ s}
• Join r ⋈_{r.A = s.D} s: {<a,b,c,a,e,f> | <a,b,c> ∈ r ∧ <a,e,f> ∈ s}
  • Notice that the join is implicit: the same variable is plugged into multiple locations in the expression
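In the same spirit, DRC expressions resemble comprehensions with tuple unpacking, where each unpacked name acts as a domain variable. Python cannot reuse one variable in two patterns to force equality, so the join below spells out the condition `a == d` explicitly. The relation contents are made up for illustration.

```python
# r has schema (A, B, C); s has schema (D, E, F). Data is illustrative.
r = {("123", 1, "x"), ("456", 2, "y")}
s = {("123", "p", "q"), ("789", "u", "v")}

# Select: {<x,y,z> | <x,y,z> in r and x = "123"}
selected = {(x, y, z) for (x, y, z) in r if x == "123"}

# Project on A, C: {<x,z> | exists y (<x,y,z> in r)}
projected = {(x, z) for (x, y, z) in r}

# Join on r.A = s.D: the shared variable a becomes an explicit equality test
joined = {(a, b, c, d, e, f) for (a, b, c) in r for (d, e, f) in s if a == d}

print(selected)
print(projected)
print(joined)
```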