Upload
abhishek-reddy
View
219
Download
0
Embed Size (px)
Citation preview
8/2/2019 cs464_564_08
1/72
CS 464 /CS 564
INTRODUCTION TO
DATABASE MANAGEMENTLecturer: Lydia Tapia
Lecture Eight: Introduction to SQL
http://www.cs.unm.edu/~tapia/cs464_cs564/
Source: Slides by Jeffrey Ullman
8/2/2019 cs464_564_08
2/72
INTRODUCTION TO SQL
Select-From-Where Statements
Subqueries
Grouping and Aggregation
2
8/2/2019 cs464_564_08
3/72
Why SQL?
SQL is a very-high-level language. Say what to do rather than how to do it. Avoid a lot of data-manipulation details needed in
procedural languages like C++ or Java.
Database management system figures outbest way to execute query.
Called query optimization.
3
8/2/2019 cs464_564_08
4/72
Select-From-Where Statements
SELECT desired attributes
FROM one or more tables
WHERE condition about tuples of
the tables
4
8/2/2019 cs464_564_08
5/72
Our Running ExampleAll our SQL queries will be based on the followingdatabase schema. Underline indicates key attributes.
Candies(name, manf)Stores(name, addr, license)
Consumers(name, addr, phone)
Likes(consumer, candy)
Sells(store, candy, price)
Frequents(consumer, store)
5
8/2/2019 cs464_564_08
6/72
ExampleUsing Candies(name, manf), what candies aremade by Hershey?
SELECT nameFROM Candies
WHERE manf = Hershey;
6
Notice SQL uses single-quotes for strings.SQL is case-insensitive, except inside strings.
8/2/2019 cs464_564_08
7/72
Result of Queryname
Twizzler
KitkatAlmondJoy
. . .
7
The answer is a relation with a single attribute,name, and tuples with the name of each candyby Hershey, such as Twizzler.
8/2/2019 cs464_564_08
8/72
Meaning of Single-Relation Query
Begin with the relation in the FROM clause.Apply the selection indicated by the WHEREclause.
Apply the extended projection indicated by theSELECT clause.
8
8/2/2019 cs464_564_08
9/72
Operational Semantics
To implement this algorithm think of a tuplevariable (tv) ranging over each tuple of the
relation mentioned in FROM.Check if the current tuple satisfies theWHERE clause.
If so, compute the attributes or expressions ofthe SELECT clause using the components ofthis tuple.
9
8/2/2019 cs464_564_08
10/72
Operational Semantics
10
Check if
Hershey
name manf
Twizzler Hersheytv
Include tv.namein the result
8/2/2019 cs464_564_08
11/72
* In SELECT clauses
When there is one relation in the FROM clause, *in the SELECT clause stands forall attributes of
this relation.
Example using Candies(name, manf):SELECT *
FROM Candies
WHERE manf = Hershey;
11
8/2/2019 cs464_564_08
12/72
Result of Query:
name manf
Twizzler Hershey
Kitkat Hershey
AlmondJoy Hershey
. . . . . .
12
Now, the result has each of the attributesof Candies.
8/2/2019 cs464_564_08
13/72
Renaming Attributes
If you want the result to have different attributenames, use AS to rename an
attribute.Example based on Candies(name, manf):
SELECT name AS candy, manf
FROM Candies
WHERE manf = Hershey
13
8/2/2019 cs464_564_08
14/72
Result of Query:
candy manf
Twizzler Hershey
Kitkat HersheyAlmondJoy Hershey
. . . . . .
14
8/2/2019 cs464_564_08
15/72
Expressions in SELECT Clauses
Any expression that makes sense can appear asan element of a SELECT clause.
Example: from Sells(store, candy, price):SELECT store, candy,
price * 79 AS priceInYen
FROM Sells;
15
8/2/2019 cs464_564_08
16/72
Result of Query
store candy priceInYen
7-11 Twizzler 285
Smiths Snickers 342
16
8/2/2019 cs464_564_08
17/72
Another Example: Constant Expressions
From Likes(consumer, candy) :
SELECT consumer,likes Kitkats
AS whoLikesKitkats
FROM Likes
WHERE candy =Kitkat
;
17
8/2/2019 cs464_564_08
18/72
Result of Query
consumer whoLikesKitkats
Sally likes Kitkats
Fred likes Kitkats
18
8/2/2019 cs464_564_08
19/72
Complex Conditions in WHERE Clause
From Sells(store, candy, price), find the price that7-11 charges for Twizzlers:
SELECT price
FROM Sells
WHERE store = 7-11 AND
candy = Twizzler;
19
8/2/2019 cs464_564_08
20/72
Patterns
WHERE clauses can have conditions inwhich a string is compared with a pattern,
to see if it matches.
General form: LIKE or NOT LIKE
Pattern is a quoted string with % = anystring; _ = any character.
20
8/2/2019 cs464_564_08
21/72
Example
From Consumers(name, addr, phone) find theconsumers with exchange 555:
SELECT name
FROM Consumers
WHERE phone LIKE%555-_ _ _ _
;
21
8/2/2019 cs464_564_08
22/72
NULL Values
Tuples in SQL relations can have NULL as avalue for one or more components.
Meaning depends on context. Two commoncases:
Missing value: e.g., we know 7-11 has someaddress, but we dont know what it is.
Inapplicable : e.g., the value of attribute spouse foran unmarried person.
22
8/2/2019 cs464_564_08
23/72
Comparing NULLs to Values
The logic of conditions in SQL is really 3-valuedlogic: TRUE, FALSE, UNKNOWN.
When any value is compared with NULL, thetruth value is UNKNOWN.
But a query only produces a tuple in the answer ifits truth value for the WHERE clause is TRUE
(not FALSE or UNKNOWN).
23
8/2/2019 cs464_564_08
24/72
Three-Valued Logic
To understand how AND, OR, and NOT work in3-valued logic, think of TRUE = 1, FALSE = 0,
and UNKNOWN = .
AND = MIN; OR = MAX, NOT(x) = 1-x.Example:TRUE AND (FALSE OR NOT(UNKNOWN)) =
MIN(1, MAX(0, (1 - ))) =
MIN(1, MAX(0, ) = MIN(1, ) = =
UNKNOWN.
24
8/2/2019 cs464_564_08
25/72
25
8/2/2019 cs464_564_08
26/72
Surprising Example
From the following Sells relation:store candy price
7-11 Twizzler NULL
SELECT store
FROM Sells
WHERE price < 2.00 OR price >= 2.00;
26
UNKNOWN UNKNOWN
UNKNOWN
8/2/2019 cs464_564_08
27/72
Reason: 2-Valued Laws != 3-Valued Laws
Some common laws, like commutativity of AND,hold in 3-valued logic.
But not others, e.g., the law of the excludedmiddle:p OR NOTp = TRUE.
Whenp = UNKNOWN, the left side is MAX( , (1 )) = != 1.
27
8/2/2019 cs464_564_08
28/72
Multirelation Queries
Interesting queries often combine data from morethan one relation.
We can address several relations in one query bylisting them all in the FROM clause.
Distinguish attributes of the same name by.
28
8/2/2019 cs464_564_08
29/72
Example
Using relations Likes(consumer, candy) andFrequents(consumer, store), find the candies likedby at least one person who frequents 7-11.
SELECT candyFROM Likes, Frequents
WHERE store = 7-11 AND
Frequents.consumer =
Likes.consumer;
29
8/2/2019 cs464_564_08
30/72
Formal Semantics
Almost the same as for single-relation queries: Start with the product of all the relations in the
FROM clause.
Apply the selection condition from the WHEREclause.
Project onto the list of attributes and expressions inthe SELECT clause.
30
8/2/2019 cs464_564_08
31/72
Operational Semantics
Imagine one tuple-variable for each relation in theFROM clause.
These tuple-variables visit each combination of tuples,one from each relation.
If the tuple-variables are pointing to tuples thatsatisfy the WHERE clause, send these tuples to
the SELECT clause.
31
8/2/2019 cs464_564_08
32/72
Example
32
consumer store consumer candy
tv1 tv2Sally Twizzler
Sally 7-11
LikesFrequents
to outputcheck theseare equal
check
for 7-11
8/2/2019 cs464_564_08
33/72
Explicit Tuple-Variables
Sometimes, a query needs to use two copiesof the same relation.
Distinguish copies by following the relationname by the name of a tuple-variable, in the
FROM clause.
Its always an option to rename relations thisway, even when not essential.
33
8/2/2019 cs464_564_08
34/72
Example
From Candies(name, manf), find all pairs ofcandies by the same manufacturer.
Do not produce pairs like (Twizzler, Twizzler). Produce pairs in alphabetic order, e.g. (Kitkat,
Twizzler), not (Twizzler, Kitkat).
SELECT c1.name, c2.name
FROM Candies c1, Candies c2WHERE c1.manf = c2.manf AND
c1.name < c2.name;
34
tuplevariables
8/2/2019 cs464_564_08
35/72
Subqueries
A parenthesized SELECT-FROM-WHEREstatement (subquery) can be used as a value
in a number of places, including FROM and
WHERE clauses.Example: in place of a relation in the FROMclause, we can place another query, and then
query its result.
Can use a tuple-variable to name tuples of theresult.
35
8/2/2019 cs464_564_08
36/72
Subqueries That Return One Tuple
If a subquery is guaranteed to produce onetuple, then the subquery can be used as a
value. Usually, the tuple has one component. A run-time error occurs if there is no tuple or more
than one tuple.
36
8/2/2019 cs464_564_08
37/72
Example
From Sells(store, candy, price), find the storesthat sell Kitkats for the same price 7-11
charges for Twizzlers. Two queries would surely work:
Find the price 7-11 charges for Twizzlers. Find the stores that sell Kitkats at that price.
37
8/2/2019 cs464_564_08
38/72
Query + Subquery Solution
SELECT store
FROM Sells
WHERE candy = Kitkat AND
price = (SELECT price
FROM Sells
WHERE store= 7-11
AND candy = Twizzler);
38
The price atwhich 7-11
sells Twizzlers
8/2/2019 cs464_564_08
39/72
The IN Operator
IN is true if and only if thetuple is a member of the relation. NOT IN means the opposite.
IN-expressions can appear in WHERE clauses.The is often a subquery.
39
8/2/2019 cs464_564_08
40/72
Example
From Candies(name, manf) and Likes(consumer,candy), find the name and manufacturer of each
candy that Fred likes.
SELECT *
FROM Candies
WHERE name IN (SELECT candy
FROM LikesWHERE consumer =Fred);
40
The set of
candies Fredlikes
8/2/2019 cs464_564_08
41/72
The Exists Operator
EXISTS( ) is true if and only if the is not empty.
Example: From Candies(name, manf) , findthose candies that are the unique candy bytheir manufacturer.
41
8/2/2019 cs464_564_08
42/72
Example Query with EXISTS
SELECT name
FROM Candies c1
WHERE NOT EXISTS(
SELECT *
FROM Candies
WHERE manf = c1.manf AND
name c1.name);
42
Set ofcandieswith thesamemanf as
c1, butnot thesamecandy
Notice scope rule: manf refersto closest nested FROM witha relation having that attribute.
Notice theSQL notequalsoperator
From Candies(name, manf) ,find those candies that are
the unique candy by their
manufacturer.
8/2/2019 cs464_564_08
43/72
The Operator ANY
x= ANY( ) is a boolean condition trueifxequals at least one tuple in the relation.
Similarly, = can be replaced by any of thecomparison operators.
Example:x> ANY( ) meansxis notthe smallest tuple in the relation.
Note tuples must have one component only.
43
8/2/2019 cs464_564_08
44/72
The Operator ALL
Similarly,x ALL( ) is true if andonly if for every tuple t in the relation,xis not
equal to t. That is,xis not a member of the relation.
The can be replaced by any comparisonoperator.
Example:x>= ALL( ) means thereis no tuple larger thanx in the relation.
44
8/2/2019 cs464_564_08
45/72
Example
From Sells(store, candy, price), find the candiessold for the highest price.
SELECT candy
FROM Sells
WHERE price >= ALL(
SELECT price
FROM Sells);
45
price from the outerSells must not beless than any price.
8/2/2019 cs464_564_08
46/72
Union, Intersection, and Difference
Union, intersection, and difference of relations areexpressed by the following forms, each involving
subqueries:
( subquery ) UNION ( subquery ) ( subquery ) INTERSECT ( subquery ) ( subquery ) EXCEPT ( subquery )
46
8/2/2019 cs464_564_08
47/72
47
8/2/2019 cs464_564_08
48/72
Example
From relations Likes(consumer, candy), Sells(store, candy, price), and Frequents
(consumer, store), find the consumers and
candies such that: The consumer likes the candy, and The consumer frequents at least one
store that sells the candy.
48
8/2/2019 cs464_564_08
49/72
Solution
(SELECT * FROM Likes)
INTERSECT
(SELECT consumer, candyFROM Sells, Frequents
WHERE Frequents.store = Sells.store
);
49
The consumer frequentsa store that sells thecandy.
8/2/2019 cs464_564_08
50/72
Bag Semantics
Although the SELECT-FROM-WHEREstatement uses bag semantics, the default for
union, intersection, and difference is set
semantics.
That is, duplicates are eliminated as the operation isapplied.
50
8/2/2019 cs464_564_08
51/72
Motivation: Efficiency
When doing projection, it is easier to avoideliminating duplicates.
Just work tuple-at-a-time.For intersection or difference, it is mostefficient to sort the relations first.
At that point you may as well eliminate theduplicates anyway.
51
8/2/2019 cs464_564_08
52/72
Controlling Duplicate Elimination
Force the result to be a set by SELECTDISTINCT . . .
Force the result to be a bag (i.e., donteliminate duplicates) by ALL, as in . . .
UNION ALL . . .
52
8/2/2019 cs464_564_08
53/72
Example: DISTINCT
From Sells(store, candy, price), find all thedifferent prices charged for candies:
SELECT DISTINCT price
FROM Sells;Notice that without DISTINCT, each price wouldbe listed as many times as there were store/candy pairs at that price.
53
8/2/2019 cs464_564_08
54/72
Example: ALL
Using relations Frequents(consumer, store) andLikes(consumer, candy):
(SELECT consumer FROM Frequents)EXCEPT ALL
(SELECT consumer FROM Likes);
Lists consumers who frequent more stores thanthey like candies, and does so as many times asthe difference of those counts.
54
8/2/2019 cs464_564_08
55/72
Join Expressions
SQL provides several versions of (bag) joins.These expressions can be stand-alone queries orused in place of relations in a FROM clause.
55
8/2/2019 cs464_564_08
56/72
Products and Natural Joins
Natural join:R NATURAL JOIN S;
Product:R CROSS JOIN S;
Example:Likes NATURAL JOIN Sells;
Relations can be parenthesized subqueries, aswell.
56
8/2/2019 cs464_564_08
57/72
Theta Join
R JOIN S ON Example: using Consumers(name, addr) andFrequents(consumer, store):
Consumers JOIN Frequents ON
name = consumer;
gives us all (c, a, c, s) quadruples such that
consumerclives at address a and frequents stores.
57
8/2/2019 cs464_564_08
58/72
Outerjoins
R OUTER JOIN S is the core of an outerjoinexpression. It is modified by:
1. Optional NATURAL in front of OUTER.2. Optional ON after JOIN.3. Optional LEFT, RIGHT, or FULL before OUTER.
LEFT = pad dangling tuples of R only. RIGHT = pad dangling tuples of S only. FULL = pad both; this choice is the default.
58
8/2/2019 cs464_564_08
59/72
Aggregations
SUM, AVG, COUNT, MIN, and MAX can beapplied to a column in a SELECT clause to
produce that aggregation on the column.
Also, COUNT(*) counts the number of tuples.
59
8/2/2019 cs464_564_08
60/72
Example: Aggregation
From Sells(store, candy, price), find the averageprice of Twizzlers:
SELECT AVG(price)
FROM Sells
WHERE candy = Twizzler;
60
8/2/2019 cs464_564_08
61/72
Eliminating Duplicates in an Aggregation
Use DISTINCT inside an aggregation.Example: find the number of different pricescharged for Twizzlers:
SELECT COUNT(DISTINCT price)
FROM Sells
WHERE candy =Twizzler
;
61
8/2/2019 cs464_564_08
62/72
NULLs Ignored in Aggregation
NULL never contributes to a sum, average, orcount, and can never be the minimum or
maximum of a column.
But if there are no non-NULL values in acolumn, then the result of the aggregation is
NULL.
62
63
8/2/2019 cs464_564_08
63/72
Example: Effect of NULLs
SELECT count(*)
FROM Sells
WHERE candy = Twizzler;
SELECT count(price)
FROM Sells
WHERE candy = Twizzler;
63
The number of storesthat sell Twizzlers.
The number of storesthat sell Twizzlers at aknown price.
64
8/2/2019 cs464_564_08
64/72
Grouping
We may follow a SELECT-FROM-WHEREexpression by GROUP BY and a list of attributes.
The relation that results from the SELECT-FROM-WHERE is grouped according to the values of allthose attributes, and any aggregation is applied
only within each group.
64
65
8/2/2019 cs464_564_08
65/72
Example: Grouping
From Sells(store, candy, price), find the averageprice for each candy:
SELECT candy, AVG(price)
FROM Sells
GROUP BY candy;
65
66
8/2/2019 cs464_564_08
66/72
Example: Grouping
From Sells(store, candy, price) and Frequents(consumer, store), find for each consumer the
average price of Twizzlers at the stores they
frequent:SELECT consumer, AVG(price)
FROM Frequents, Sells
WHERE candy = Twizzler AND
Frequents.store = Sells.store
GROUP BY consumer;
66
Computeconsumer-store-pricefor Twiz.tuples first,then group
by consumer.
67
8/2/2019 cs464_564_08
67/72
Restriction on SELECT Lists With Aggregation
If any aggregation is used, then eachelement of the SELECT list must be either:
1. Aggregated, or2. An attribute on the GROUP BY list.
67
68
8/2/2019 cs464_564_08
68/72
Query Example Correct or Not?
You might think you could find the store thatsells Twizzlers the cheapest by:
SELECT store, MIN(price)FROM Sells
WHERE candy = Twizzler;
But this query is illegal in SQL.
68
69
8/2/2019 cs464_564_08
69/72
HAVING Clauses
HAVING may follow a GROUP BYclause.
If so, the condition applies to each group, andgroups not satisfying the condition are eliminated.
69
70
8/2/2019 cs464_564_08
70/72
Example: HAVING
From Sells(store, candy, price) and Candies(name, manf), find the average prices of those
candies that are either sold in at least three stores
or are manufactured by Nestle.
70
71
8/2/2019 cs464_564_08
71/72
Solution
SELECT candy, AVG(price)
FROM Sells
GROUP BY candyHAVING COUNT(store) >= 3 OR
candy IN (SELECT name
FROM Candies
WHERE manf = Nestle);
71
Candiesmanufactured
by Nestle.
Candy groups with at least3 non-NULL stores and alsocandy groups where themanufacturer is Nestle.
72
8/2/2019 cs464_564_08
72/72
Requirements on HAVING Conditions
These conditions may refer to any relation ortuple-variable in the FROM clause.
They may refer to attributes of those relations,as long as the attribute makes sense within agroup; i.e., it is either:
A grouping attribute, or Aggregated.
72