MANAGING DATA(BASES) USING SQL
(NON-PROCEDURAL SQL, X401.9)
Professional Program: Data Administration and Management
Instructor: Michael Kremer, Ph.D.
Technology & Information Management
Class 7
AGENDA
10. Subqueries
10.1 Overview of Subqueries
10.2 Subqueries as Column Expressions
10.3 Subqueries as Filter Expressions
10.4 Subqueries as Datasource Expressions
10.5 Correlated Subqueries
11. Advanced Subqueries
11.1 Pivot
11.2 Subquery Factoring
11.3 Top-N-Analysis
10. Subqueries
10.1 OVERVIEW OF SUBQUERIES
❑ A subquery is a query inside a main query (also called an inner query, nested SELECT, or sub-SELECT)
❑ Subquery can be placed in different locations of the main query: SELECT clause (column expression), WHERE/HAVING clause (filter expression), FROM clause (datasource expression)
❑ Same syntax with a few exceptions:
❑ Two main types of subqueries, based on processing order:
❑ Simple or nested subquery is processed completely before the outer or main query is executed
❑ Linked or correlated subquery is processed once for each row of the outer query
❑ With more than one level of nesting, the innermost query is processed first → its result is used in the next query, etc.
❑ Subqueries are categorized by the structure of data returned:
❑ Basic example:
❑ Display all employees earning more than the average of all employees
❑ What kind of subquery? A scalar subquery, one and only one value returned!
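A minimal sketch of the basic example, assuming an Oracle HR-style employees table with last_name and salary columns (the table and column names are assumptions, not taken from the slide image):

```sql
-- Assumes an HR-style employees table (last_name, salary).
SELECT last_name, salary
FROM   employees
WHERE  salary > (SELECT AVG(salary)      -- scalar subquery: one row, one column
                 FROM   employees)
ORDER  BY salary DESC;
```

The subquery does not reference the outer query, so it is a simple (nested) subquery and can be evaluated once.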
❑ Row subqueries are used in the WHERE clause of the main query
❑ SQL standard defines a row value constructor (used in WHERE, HAVING, and ON clauses):
❑ In Oracle, only in conjunction with subqueries:
❑ Table subquery is mostly used in the FROM clause:
❑ Scalar subquery returns at most one value; it is the most widely used type:
10.2 SUBQUERIES AS COLUMN EXPRESSIONS
❑ Column reference in the SELECT clause can be replaced by a subquery
❑ Must use a scalar subquery in place of a column reference, since a column reference returns exactly one value per row of the query
❑ Many options in SELECT clause:
❑ To guarantee scalar value:
❑ Aggregate query with no GROUP BY (or WHERE clause to filter)
❑ Regular query with primary key as the filter
❑ Regular query selecting one specific value
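The options above can be sketched as follows. This is only an illustration, assuming HR-style employees and departments tables; department_name and the literal 50 are assumptions chosen for the example:

```sql
-- Assumes HR-style tables: employees(last_name, salary), departments(department_id, department_name).
SELECT e.last_name,
       e.salary,
       (SELECT AVG(salary)
        FROM   employees) AS avg_salary,        -- aggregate without GROUP BY: exactly one row
       (SELECT d.department_name
        FROM   departments d
        WHERE  d.department_id = 50) AS dept_50 -- primary key filter: at most one row
FROM   employees e;
```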
❑ Error message when not using a scalar subquery:
❑ Error message when selecting more than one column in subquery:
❑ Back to the first example, now add the same subquery from the WHERE clause as a column expression:
❑ Correlated subquery as column expression
❑ Link subquery to main query using department_id
❑ Result does not make sense: the department max salary is the same for all departments!
❑ Department_id in the WHERE clause is the same on both sides; the subquery is using the department_id from the subquery FROM clause (departments table)
❑ Must link departments.department_id to employees.department_id
❑ Use table aliases (for shorter table names) in order to fully qualify column references!
❑ Same rules as with table joins!
❑ You can remove department_id from the SELECT clause; it is not necessary to reference it there for the link
❑ All column values are available in memory, even if not selected!
❑ Correlated subquery using table aliases to fully qualify column references
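A sketch of the qualified form, assuming the main query reads from departments and the subquery from employees (this arrangement and the department_name column are assumptions for illustration):

```sql
-- Assumes HR-style tables departments(department_id, department_name)
-- and employees(department_id, salary).
SELECT d.department_name,
       (SELECT MAX(e.salary)
        FROM   employees e
        WHERE  e.department_id = d.department_id) AS max_salary  -- link to the outer row
FROM   departments d;
-- Written without aliases as "department_id = department_id", both sides would resolve
-- to the subquery's own table, so every outer row would get the same value.
```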
❑ Calculate how many times an employee switched jobs:
❑ Note: sorting takes place on the subquery column using the column alias!
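One way this could look, assuming an HR-style job_history table with one row per past job of an employee (the table name and the job_changes alias are assumptions):

```sql
-- Assumes job_history(employee_id, ...) holds one row per completed job.
SELECT e.employee_id,
       e.last_name,
       (SELECT COUNT(*)
        FROM   job_history jh
        WHERE  jh.employee_id = e.employee_id) AS job_changes
FROM   employees e
ORDER  BY job_changes DESC;   -- sort on the subquery column through its alias
```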
❑ Multiple, correlated subqueries
❑ Show the number of locations and departments by country
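A possible shape for this query, assuming HR-style countries, locations, and departments tables linked through country_id and location_id (all three table layouts are assumptions):

```sql
-- Assumes countries(country_id, country_name), locations(location_id, country_id),
-- and departments(department_id, location_id).
SELECT c.country_name,
       (SELECT COUNT(*)
        FROM   locations l
        WHERE  l.country_id = c.country_id) AS location_count,
       (SELECT COUNT(*)
        FROM   departments d
        JOIN   locations l ON l.location_id = d.location_id
        WHERE  l.country_id = c.country_id) AS department_count
FROM   countries c
ORDER  BY c.country_name;
```

Each subquery is correlated to the outer row independently, so the two counts do not multiply each other the way they could in a three-table join with COUNT(*).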
10.3 SUBQUERIES AS FILTER EXPRESSIONS
❑ Most common application of subqueries in WHERE/HAVING:
❑ Condition is made up of column/expr on left side
❑ Relational operator
❑ Values or subquery on the right side
❑ Relational operators are grouped into two categories: single-row operators and multiple-row operators
❑ Important to understand how many rows a subquery may potentially return!
❑ Example using scalar subquery in WHERE clause:
❑ Why scalar? Employee_id is the primary key, so at most one row is returned!
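A sketch of such a filter, assuming an HR-style employees table; the employee number 141 is only an illustrative value:

```sql
-- Assumes employees(employee_id PRIMARY KEY, last_name, salary).
SELECT last_name, salary
FROM   employees
WHERE  salary > (SELECT salary
                 FROM   employees
                 WHERE  employee_id = 141);  -- primary key filter: at most one row
```

A single-row operator such as > is safe here because the subquery cannot return more than one row.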
❑ Rules when using subqueries as filter expressions:
❑ Because of the GROUP BY clause the subquery may return multiple rows → must use a multiple-row operator!
❑ Without GROUP BY → scalar subquery
❑ You can use multiple subqueries, either nested or as multiple subquery conditions connected through Boolean operators
❑ Example shows two nested subqueries:
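The slide's own example is not preserved in the text. As an illustration only, two nested scalar subqueries might look like this, assuming an HR-style employees table and an illustrative employee number 141:

```sql
-- Employees earning more than the average salary of employee 141's department.
SELECT last_name, salary
FROM   employees
WHERE  salary > (SELECT AVG(salary)
                 FROM   employees
                 WHERE  department_id = (SELECT department_id   -- innermost: evaluated first
                                         FROM   employees
                                         WHERE  employee_id = 141));
```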
❑ Subquery can also be used in the HAVING clause
❑ Example: Show all departments where the minimum salary in each department is greater than the minimum salary in department 50:
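A sketch of this HAVING example, assuming an HR-style employees table with department_id and salary columns:

```sql
SELECT department_id, MIN(salary) AS min_salary
FROM   employees
GROUP  BY department_id
HAVING MIN(salary) > (SELECT MIN(salary)        -- scalar: aggregate without GROUP BY
                      FROM   employees
                      WHERE  department_id = 50);
```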
❑ Multiple-row operators:
❑ When using multiple-row operators the subquery may return more than one value, that is, many rows of the same column
❑ Show all employees whose salary exactly matches the minimum salary of any department
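A minimal sketch, assuming an HR-style employees table:

```sql
SELECT last_name, salary, department_id
FROM   employees
WHERE  salary IN (SELECT MIN(salary)       -- one row per department: multiple-row operator needed
                  FROM   employees
                  GROUP  BY department_id);
```

Note that an employee qualifies by matching the minimum of any department, not necessarily the minimum of their own department.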
❑ The IN operator only tests for equality of values
❑ Use the ANY or ALL operators in conjunction with greater-than or less-than operators:
❑ ANY requires only one value from the list to satisfy the condition
❑ ALL requires all values from the list to meet the condition
❑ Previous example can be rewritten using the MAX function:
❑ Probably a better solution → if your query complexity allows it!
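The example itself is not preserved in the text. Assuming it compared salaries against purchasing clerks (job_id 'PU_CLERK' in an HR-style employees table), the two equivalent forms could be sketched as:

```sql
-- Form 1: multiple-row operator, less than at least one purchasing clerk salary.
SELECT last_name, job_id, salary
FROM   employees
WHERE  salary < ANY (SELECT salary
                     FROM   employees
                     WHERE  job_id = 'PU_CLERK');

-- Form 2: scalar subquery, less than the highest purchasing clerk salary.
SELECT last_name, job_id, salary
FROM   employees
WHERE  salary < (SELECT MAX(salary)
                 FROM   employees
                 WHERE  job_id = 'PU_CLERK');
```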
❑ Modify example:
❑ Display all employees whose salary is less than that of all purchasing clerks
❑ Fewer results, because the salary must now be less than all salary values of PU_CLERK
❑ Can be rewritten using the MIN function
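A sketch of both forms, assuming an HR-style employees table in which purchasing clerks carry job_id 'PU_CLERK':

```sql
-- Form 1: less than every purchasing clerk salary.
SELECT last_name, job_id, salary
FROM   employees
WHERE  salary < ALL (SELECT salary
                     FROM   employees
                     WHERE  job_id = 'PU_CLERK');

-- Form 2: less than the lowest purchasing clerk salary.
SELECT last_name, job_id, salary
FROM   employees
WHERE  salary < (SELECT MIN(salary)
                 FROM   employees
                 WHERE  job_id = 'PU_CLERK');
```

The two forms agree as long as the subquery returns at least one row and no Null salaries. With an empty subquery, < ALL is true for every row, while the MIN form compares against Null and returns nothing.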
❑ A scalar subquery that returns no rows evaluates to Null
❑ Any comparison involving Null evaluates to unknown, which the WHERE clause treats like False
❑ The same happens when the subquery returns an actual Null value
❑ Manager_id is Null for one employee!
❑ NOT IN must compare against every value in the list → if one value is Null, that comparison is unknown → the condition is never True and no rows are returned
❑ Rewrite the subquery to exclude Null values
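A sketch of the problem and the fix, assuming an HR-style employees table in which at least one manager_id is Null:

```sql
-- Returns no rows when any manager_id is Null.
SELECT last_name
FROM   employees
WHERE  employee_id NOT IN (SELECT manager_id
                           FROM   employees);

-- Fixed: exclude Null values inside the subquery.
SELECT last_name
FROM   employees
WHERE  employee_id NOT IN (SELECT manager_id
                           FROM   employees
                           WHERE  manager_id IS NOT NULL);
```

The second query lists employees who do not manage anyone.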
❑ Using just IN works with Null values since only one value must match
❑ Multiple-column subquery using compound WHERE
❑ Show employees having the same combination of manager_id and department_id as employees 141 and 147
❑ This is called pairwise comparison: each row of the main query must match a row of the subquery
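A sketch of the pairwise form, assuming an HR-style employees table; excluding employees 141 and 147 themselves is an added assumption, not stated on the slide:

```sql
SELECT employee_id, manager_id, department_id
FROM   employees
WHERE  (manager_id, department_id) IN (SELECT manager_id, department_id   -- compared as a pair
                                       FROM   employees
                                       WHERE  employee_id IN (141, 147))
AND    employee_id NOT IN (141, 147);  -- assumption: leave out the two reference employees
```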
❑ Column comparison methods:
Pairwise Comparison:
❑ Valid combinations of manager_id and department_id:
100,20 OR 105,35 OR 121,45
❑ Implementation:
Non-Pairwise Comparison:
❑ Valid combinations:
100,20 OR 100,35 OR 100,45
105,20 OR 105,35 OR 105,45
121,20 OR 121,35 OR 121,45
❑ Implementation:
❑ Non-Pairwise comparison
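A sketch of the non-pairwise form, assuming an HR-style employees table and reusing employees 141 and 147 as illustrative reference rows:

```sql
SELECT employee_id, manager_id, department_id
FROM   employees
WHERE  manager_id    IN (SELECT manager_id          -- each column is checked on its own
                         FROM   employees
                         WHERE  employee_id IN (141, 147))
AND    department_id IN (SELECT department_id
                         FROM   employees
                         WHERE  employee_id IN (141, 147));
```

Because the columns are tested independently, any manager_id from the first list may combine with any department_id from the second.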
10.4 SUBQUERIES AS DATASOURCE EXPRESSIONS
❑ Subquery as datasource in FROM clause, also called inline view:
❑ Break down complex queries into different steps
❑ Change SQL processing order by forcing subquery processing
❑ Rewrite previous example of income categories and GROUP BY clause:
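The earlier income-category example is not part of this text. As a sketch only, with hypothetical category names and salary thresholds, an inline view lets the outer query group by a computed column through its alias:

```sql
-- Thresholds and labels are hypothetical, chosen only for illustration.
SELECT income_category, COUNT(*) AS employee_count
FROM  (SELECT CASE
                WHEN salary <  5000 THEN 'Low'
                WHEN salary < 10000 THEN 'Medium'
                ELSE 'High'
              END AS income_category
       FROM   employees)               -- inline view is processed first
GROUP  BY income_category              -- the alias is now an ordinary column
ORDER  BY income_category;
```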
❑ Show last_name, salary, department_id and average salary per department
❑ Cannot be done in a regular aggregate query since last_name and salary are not included in GROUP BY
❑ Use a subquery to perform the aggregation per department
❑ Use the outer query to add last_name and salary and join on the subquery
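The steps above can be sketched as follows, assuming an HR-style employees table (the alias names are illustrative):

```sql
SELECT e.last_name, e.salary, e.department_id, a.avg_salary
FROM   employees e
JOIN  (SELECT department_id, AVG(salary) AS avg_salary   -- aggregation per department
       FROM   employees
       GROUP  BY department_id) a
ON     a.department_id = e.department_id
ORDER  BY e.department_id, e.last_name;
```

With an inner join, employees whose department_id is Null do not appear in the result.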
10.5 CORRELATED SUBQUERIES
❑ A correlated or linked subquery works fundamentally differently from a simple or nested subquery
❑ A correlated subquery references a column of the main query to establish a link to the main query
❑ Nested/Simple Subquery:
❑ Correlated Subquery:
❑ Correlated subquery can be used as a column, filter, or FROM clause expression
❑ Show employees whose salary is greater than the average salary in their respective department
❑ Similar to previous example
❑ Performance-wise probably worse, since the average salary is calculated for every row of the outer query
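A minimal sketch, assuming an HR-style employees table (the aliases e and i are illustrative):

```sql
SELECT e.last_name, e.salary, e.department_id
FROM   employees e
WHERE  e.salary > (SELECT AVG(i.salary)
                   FROM   employees i
                   WHERE  i.department_id = e.department_id);  -- evaluated per outer row
```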
❑ Show employees who have switched jobs at least twice
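One possible form, assuming an HR-style job_history table with one row per past job of an employee (the table name is an assumption):

```sql
SELECT e.employee_id, e.last_name
FROM   employees e
WHERE  2 <= (SELECT COUNT(*)
             FROM   job_history jh
             WHERE  jh.employee_id = e.employee_id);  -- count of past jobs for this employee
```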
❑ Show all managers
❑ Use a correlated subquery in the WHERE clause with the EXISTS operator
❑ EXISTS tests for the mere existence of a record:
❑ If record exists → TRUE
❑ If no record exists → FALSE
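A sketch of the EXISTS form, assuming an HR-style employees table in which manager_id refers to another employee's employee_id:

```sql
SELECT m.employee_id, m.last_name
FROM   employees m
WHERE  EXISTS (SELECT 1                       -- the selected value is irrelevant
               FROM   employees e
               WHERE  e.manager_id = m.employee_id);
```

The subquery only needs to find one matching row for the condition to be true.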
❑ Instead of EXISTS, use the IN operator
❑ Find all departments that do not have any employees associated with them
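A sketch using NOT EXISTS, assuming HR-style departments and employees tables (department_name is an assumed column):

```sql
SELECT d.department_id, d.department_name
FROM   departments d
WHERE  NOT EXISTS (SELECT 1
                   FROM   employees e
                   WHERE  e.department_id = d.department_id);
```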
❑ Instead of NOT EXISTS, use NOT IN
❑ Again, be aware of the NULL issue!
❑ Need to add a condition in the subquery to exclude NULL values
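A sketch of the NOT IN form, assuming HR-style departments and employees tables in which some employee has a Null department_id:

```sql
SELECT d.department_id, d.department_name
FROM   departments d
WHERE  d.department_id NOT IN (SELECT e.department_id
                               FROM   employees e
                               WHERE  e.department_id IS NOT NULL);  -- without this, no rows
```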
11. Advanced Subqueries
11.1 PIVOT
❑ Use the PIVOT operator to display data horizontally
❑ Oracle PIVOT clause allows you to write a cross-tabulation query
❑ This means that you can aggregate your results and rotate rows into columns
❑ Previous example rewritten using the PIVOT operator
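The example being rewritten is not preserved in the text. As an illustration only, a cross-tabulation of salary totals by job and department could be sketched like this, assuming an HR-style employees table; the department numbers and column aliases are hypothetical:

```sql
-- Oracle 11g or later. One row per job_id, one column per listed department.
SELECT *
FROM  (SELECT job_id, department_id, salary
       FROM   employees)
PIVOT (SUM(salary)
       FOR department_id IN (10 AS dept_10, 20 AS dept_20, 50 AS dept_50))
ORDER  BY job_id;
```

Columns of the inline view that appear neither in the aggregate nor in the FOR clause (here job_id) act as the implicit grouping columns.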
11.2 SUBQUERY FACTORING
❑ Performance improvement can be achieved by using the WITH clause inside a query block
❑ Useful when a SQL statement needs to be evaluated more than once within one statement
❑ Rules:
❑ Next Example:
❑ Rewrite previous query using the WITH clause:
❑ Show last_name, salary, department_id and average salary per department
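A sketch of the WITH form, assuming an HR-style employees table; the name dept_avg is illustrative:

```sql
WITH dept_avg AS (
  SELECT department_id, AVG(salary) AS avg_salary   -- named once, reusable below
  FROM   employees
  GROUP  BY department_id
)
SELECT e.last_name, e.salary, e.department_id, a.avg_salary
FROM   employees e
JOIN   dept_avg a ON a.department_id = e.department_id
ORDER  BY e.department_id, e.last_name;
```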
11.3 TOP-N-ANALYSIS
❑ Retrieves the greatest or least n values based on a particular sort order
❑ Use the ROWNUM pseudo column to implement Top-N-Analysis
❑ Use a subquery with the desired sort order
❑ Use ROWNUM in the main query to limit to n rows
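A sketch of the pattern for the five highest salaries, assuming an HR-style employees table (n = 5 and the alias rn are illustrative):

```sql
SELECT ROWNUM AS rn, last_name, salary
FROM  (SELECT last_name, salary
       FROM   employees
       ORDER  BY salary DESC)   -- sort first, inside the subquery
WHERE  ROWNUM <= 5;             -- then keep the first n rows
```

Placing ROWNUM in the same query block as the ORDER BY would limit the rows before they are sorted, which is why the sort goes into the subquery.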