
Slide 1

NULLs & Outer Joins

Objectives of the Lecture :

• To consider the use of NULLs in SQL.

• To consider Outer Join operations, and their implementation in SQL.

Slide 2

Missing Values : Possible Strategies

Use a special value to represent missing data.

• E.g. 'N/A', 'T.B.A.'

• The special value must have the same data type as the data that is missing, so it can be stored with the data that is known.

• Requires no special facility from the DBMS.

Use NULL to represent missing data.

• NULL is the absence of a value.

• NULL ≠ 0 and NULL ≠ ' ' (the space character).

• NULL is not part of any data type.

• Requires special support from the DBMS.

• SQL DBMSs provide this support.

So most DBs use NULLs to represent missing data.

This is revision based on part of the earlier lecture 'The Data in a Relation'.

Slide 3

Display of SQL NULLs

EmpNo  EName     M-S  Sal
E3     Smith     S    18,000
E5     Robinson  M    24,000
E9     Graham    S
E1     Robson    D    32,500
E2     Atkins         24,000
E6     Blakelaw  M    54,000
E7     Mortimer  D    28,000
E4     Fenwick   S    40,000

A NULL is displayed as a blank space in Oracle, as the keyword NULL in other SQL DBMSs, and in other ways in other DBMSs.

This is how Oracle displays NULLs in a retrieved table. Other SQL DBMSs use different conventions to display a NULL.

Slide 4

Dealing with NULLs in SQL Tables

Three situations arise :

• Comparisons of column values. This occurs in the SQL equivalents of the Restrict and the various Join operations, plus Deletions and Updates.

• Calculations involving column values. This occurs in the SQL equivalents of the GroupBy and Extend operations.

• Comparisons of row values. This occurs in the SQL equivalents of the Project, GroupBy, Union, Intersect, and Difference operations.

Each of these three cases is now considered in turn :-

Case One : Comparison of Column Values

Slide 5

Comparison of Column Values (1)

SQL provides special comparators to check for NULL :-

X IS NULL

X IS NOT NULL

Let X be a numeric column. If X has a value, the comparison

X = 3

makes sense. It should yield true or false.

Suppose X is NULL. Strictly, an error should arise, since there is no value to compare.

In fact SQL treats the NULL as representing an existing but unknown value. The comparison returns maybe.

Rationale : We don't know if X = 3 because X is NULL (= not available).

Note : The NULL in X may represent some other case of missing data (e.g. not applicable, does not exist). The result is still maybe, even though this is then illogical.

NULLs can be used to represent many different cases of missing data. Each different case may require its own rationale for how to handle missing data, and they can vary significantly. So SQL's choice of rationale will generally only be valid in certain cases.
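The behaviour described on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite stands in for the DBMS (the lecture itself refers to Oracle), and the table T with its three rows is hypothetical. SQLite reports maybe as NULL, which Python shows as None.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (X INTEGER)")  # hypothetical table for illustration
con.executemany("INSERT INTO T VALUES (?)", [(3,), (5,), (None,)])

# X = 3 yields 1 (true), 0 (false) or NULL (maybe) for each row.
compared = con.execute("SELECT X, X = 3 FROM T ORDER BY rowid").fetchall()

# X IS NULL always yields true or false, never maybe.
tested = con.execute("SELECT X, X IS NULL FROM T ORDER BY rowid").fetchall()

print(compared)
print(tested)
```

The last row is the interesting one: the comparison gives None, while the IS NULL test gives a definite 1.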

Slide 6

Comparisons of Column Values (2)

Let X and Y be numeric columns. Consider the comparison

X = Y

Suppose X and Y are both NULL.

The result is maybe, not true.

NULL (the absence of a value) is not the same as maybe (a truth value).

SQL uses NULL to mean maybe !

To avoid confusion in the remainder of the lecture, we will still assume the value maybe exists in SQL.
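A small sketch of the X = Y case, again assuming SQLite (via Python's sqlite3) as a stand-in DBMS and a hypothetical two-column table T. It shows both the truth value of the comparison and which rows a WHERE phrase would keep.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (X INTEGER, Y INTEGER)")  # hypothetical table
con.executemany("INSERT INTO T VALUES (?, ?)",
                [(1, 1), (1, 2), (None, None)])

# Truth value of X = Y per row: true, false, then maybe (shown as None).
truth = con.execute("SELECT X = Y FROM T ORDER BY rowid").fetchall()

# Only rows where the condition is true are kept; the NULL/NULL row is not.
kept = con.execute("SELECT X, Y FROM T WHERE X = Y").fetchall()

print(truth)
print(kept)
```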

Slide 7

Restricts, Joins, Updates and Deletions

Restrict   SELECT * FROM TableName WHERE condition ;

Join       SELECT * FROM Table1 NATURAL JOIN Table2 ;   (Similarly for other kinds of Join.)

Delete     DELETE FROM TableName WHERE condition ;

Update     UPDATE TableName SET column(s) = new value(s) WHERE condition ;

In each case a column comparison is used as a condition.

The Restrict / Join / Delete / Update action is taken only where the condition evaluates to true, not where it evaluates to maybe or false.

In principle, there is no problem with this.

In practice, problems arise because the presence of NULLs is forgotten, or it is assumed that SQL will take the same 'reasonable' approach to NULLs that the user does, when in fact SQL doesn't take that approach.

We need to assume NULLs may occur and include appropriate conditions for them.

Consider an example of what happens if this is forgotten :-

Slide 8

Unexpected Results (1)

They arise when forgetting that the condition can evaluate to maybe.

Example :-

SELECT *
FROM EMP
WHERE Sal >= 20000
UNION
SELECT *
FROM EMP
WHERE Sal < 20000 ;

The 2 Restrictions will not necessarily contain all the rows of EMP between them. If column 'Sal' contains any NULLs, the result will not re-create table EMP.

Of course it is not assumed that you will use 2 queries simply to re-create the original table !

Each Restriction condition is the logical inverse of the other. Consequently there is a temptation to think that those rows not retrieved from the table by one query must inevitably be retrieved by the other query. This would be guaranteed to be true if it were not for NULLs. If there are NULLs in the table, there will be a third set of rows, retrieved only by a Restriction condition that explicitly tests for NULLs.

Therefore it is essential to decide, when formulating the required Restriction condition, whether the NULL-including rows should be retrieved as well.

The following shows the Restriction conditions that correspond to the three sets of rows that arise when NULLs are included.

Slide 9

Unexpected Results (2)

To ensure the table is re-created, re-write the query as follows :-

SELECT * FROM EMP WHERE Sal >= 20000
UNION
SELECT * FROM EMP WHERE Sal < 20000
UNION
SELECT * FROM EMP WHERE Sal IS NULL ;

In general, adjust statements to reflect the NULL possibility.

In this particular case, should the rows with "NULL Sals" be retrieved with the salaries that are £20,000 or more, or with those that are less than £20,000; or do we genuinely want to ignore the "NULL Sals" ?
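The two-part and three-part unions can be compared directly. The sketch below assumes SQLite via Python's sqlite3, uses the EMP rows from the 'Display of SQL NULLs' slide, and renames the M-S column to M_S because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, EName TEXT, M_S TEXT, Sal INTEGER)")
rows = [
    ("E3", "Smith", "S", 18000), ("E5", "Robinson", "M", 24000),
    ("E9", "Graham", "S", None), ("E1", "Robson", "D", 32500),
    ("E2", "Atkins", None, 24000), ("E6", "Blakelaw", "M", 54000),
    ("E7", "Mortimer", "D", 28000), ("E4", "Fenwick", "S", 40000),
]
con.executemany("INSERT INTO EMP VALUES (?, ?, ?, ?)", rows)

two = ("SELECT * FROM EMP WHERE Sal >= 20000 "
       "UNION SELECT * FROM EMP WHERE Sal < 20000")
three = two + " UNION SELECT * FROM EMP WHERE Sal IS NULL"

two_part = con.execute(two).fetchall()      # misses the row with a NULL Sal
three_part = con.execute(three).fetchall()  # re-creates the whole table

missing = set(rows) - set(two_part)
print(len(two_part), len(three_part), missing)
```

Only E9 (Graham), whose Sal is NULL, is lost by the two-part query. E2 (Atkins) is retrieved because its NULL is in M_S, not in Sal.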

Slide 10

Join involving NULLs : Example

R :

P#   S#   Qty
P1   S1   5
P2        10
P2   S2   7

S :

S#   Details
S1   ……..
S2   ……..
S3   ……..

SELECT *
FROM R Natural Join S ;

Result :

P#   S#   Qty   Details
P1   S1   5     ………..
P2   S2   7     ………..

The row of R with the NULL S# (P2, Qty 10) does not appear in the result.

If we want to include the missing row, we need to replace the

R Natural Join S

with

R Join S On ( some condition involving NULLs )
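A sketch of the natural join above, assuming SQLite via Python's sqlite3. The column names PNo and SNo replace P# and S# (which are not valid unquoted identifiers in SQLite), and the Details values d1 to d3 are placeholders, since the slide does not give them. It only demonstrates the missing row; it does not attempt the replacement join condition.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE R (PNo TEXT, SNo TEXT, Qty INTEGER)")
con.execute("CREATE TABLE S (SNo TEXT, Details TEXT)")
con.executemany("INSERT INTO R VALUES (?, ?, ?)",
                [("P1", "S1", 5), ("P2", None, 10), ("P2", "S2", 7)])
con.executemany("INSERT INTO S VALUES (?, ?)",
                [("S1", "d1"), ("S2", "d2"), ("S3", "d3")])  # placeholder details

# The join compares R.SNo with S.SNo; NULL = 'S1' etc. is maybe, so that row is dropped.
joined = con.execute(
    "SELECT * FROM R NATURAL JOIN S ORDER BY PNo").fetchall()

print(joined)
```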

Slide 11

Oracle's "NVL" Function

NVL supplies a value to use whenever a NULL is encountered.

It can be used in SELECT and WHERE phrases.

Example : NVL( Sal, 0 )

This yields the value of the 'Sal' column, except that if 'Sal' is NULL, then a value of zero is returned.

• Put a column name in the first position.

• Put a value in the second position.

NVL can be used to ensure a comparison always yields true or false, and never maybe.

Example : ……... WHERE NVL( M-S, 'S' ) <> 'M'

Value 'S' is used in the comparison when 'M-S' is NULL. The comparison can never return maybe.

Most SQL DBMSs have some function analogous to Oracle's NVL, although it may be somewhat different.

Such a function can be very useful in practice, and can make it easier to design conditions that cope with possible NULLs.
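As one example of an analogous function, the sketch below uses COALESCE, which SQLite provides (SQLite has no NVL). It assumes SQLite via Python's sqlite3, a cut-down hypothetical EMP table, and the column name M_S in place of M-S.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, M_S TEXT)")  # cut-down, hypothetical
con.executemany("INSERT INTO EMP VALUES (?, ?)",
                [("E3", "S"), ("E5", "M"), ("E2", None)])

# Without a substitute value, the NULL row gives maybe and is rejected.
plain = con.execute(
    "SELECT EmpNo FROM EMP WHERE M_S <> 'M' ORDER BY EmpNo").fetchall()

# COALESCE(M_S, 'S') plays the role of NVL( M-S, 'S' ): 'S' is compared when M_S is NULL.
with_default = con.execute(
    "SELECT EmpNo FROM EMP WHERE COALESCE(M_S, 'S') <> 'M' ORDER BY EmpNo"
).fetchall()

print(plain)
print(with_default)
```

With the substitute value in place, E2 is treated as if it were 'S' and so is retrieved.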

Case Two : Calculations Involving Column Values

Slide 12

Calculations Involving Column Values

These arise in two situations :

• Scalar calculations along rows : Extend

• Aggregate calculations along columns : GroupBy

EmpNo  EName     M-S  Sal
E3     Smith     S
E5     Robinson  M    24,000
E9     Graham    S
E1     Robson    D    32,500
E2     Atkins    M
E6     Blakelaw  M
E7     Mortimer  D    28,000
E4     Fenwick   S    40,000

We will now consider these two situations :-

Slide 13

Scalar Calculations

Any calculation involving a NULL results in a NULL.

Examples : let n be NULL. Then :-

n + 1                    → NULL
n concatenate "ABCD"     → NULL
n - n                    → NULL (not zero)

Example :-

SELECT Sal + 100 AS NewSal
FROM EMP ;

So "NewSal" will be NULL whenever "Sal" is NULL.

This can give some surprising results. Take care !
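The three examples can be checked in one query. This sketch assumes SQLite via Python's sqlite3, where || is the concatenation operator. Behaviour can differ between DBMSs: Oracle's || operator, for instance, treats a NULL operand as an empty string, so the concatenation case is worth testing on the DBMS actually in use.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# n is NULL; each scalar calculation involving it yields NULL (None in Python).
row = con.execute(
    "SELECT n + 1, n || 'ABCD', n - n FROM (SELECT NULL AS n)"
).fetchone()

print(row)
```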

Slide 14

Aggregate Calculations

If the columns being aggregated contain one or more NULLs, then the answer from :

• Sum, Avg, Min, Max and Count( Distinct ) ignores the NULLs;

• Count(*) includes the NULLs.

Not all of these are mathematically valid. Take note and take care !
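A sketch of these aggregates over the Sal column of the EMP table shown on the 'Calculations Involving Column Values' slide, assuming SQLite via Python's sqlite3. Only the EmpNo and Sal columns are reproduced.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, Sal INTEGER)")
con.executemany("INSERT INTO EMP VALUES (?, ?)", [
    ("E3", None), ("E5", 24000), ("E9", None), ("E1", 32500),
    ("E2", None), ("E6", None), ("E7", 28000), ("E4", 40000),
])

# Sum, Avg, Min, Max and Count(Distinct) see only the 4 non-NULL salaries;
# Count(*) counts all 8 rows.
result = con.execute(
    "SELECT SUM(Sal), AVG(Sal), MIN(Sal), MAX(Sal), "
    "COUNT(DISTINCT Sal), COUNT(*) FROM EMP"
).fetchone()

print(result)
```

Note that the average is the sum divided by 4, not by 8, which is one of the results that may or may not be what is wanted.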

Slide 15

Example : Aggregation in GroupBy

SELECT Sum(Sal) AS Total, M-S
FROM EMP
GROUP BY M-S ;

EmpNo  EName     M-S  Sal
E3     Smith     S
E5     Robinson  M    24,000
E9     Graham    S
E1     Robson    D    32,500
E2     Atkins    M
E6     Blakelaw  M
E7     Mortimer  D    28,000
E4     Fenwick   S    40,000

Result :

Total   M-S
40,000  S
24,000  M
60,500  D

We consider here the aggregate calculation aspects of GroupBy, not the grouping aspects.

Depending on the circumstances, what SQL does about NULLs may or may not be appropriate. See the earlier discussion on the rationale concerning missing data.
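The totals on this slide can be reproduced with the sketch below, which assumes SQLite via Python's sqlite3 and the column name M_S in place of M-S.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, M_S TEXT, Sal INTEGER)")
con.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [
    ("E3", "S", None), ("E5", "M", 24000), ("E9", "S", None),
    ("E1", "D", 32500), ("E2", "M", None), ("E6", "M", None),
    ("E7", "D", 28000), ("E4", "S", 40000),
])

# Within each group, Sum adds up only the non-NULL salaries.
totals = dict(con.execute(
    "SELECT M_S, SUM(Sal) AS Total FROM EMP GROUP BY M_S").fetchall())

print(totals)
```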

Case Three : Comparison of Row Values

Slide 16

Comparisons of Rows

In SQL, two rows are identical if :

• they have the same number of attributes;

• corresponding attributes are of the same data type;

• corresponding attributes have the same value.

In an SQL row comparison, a NULL compared to a NULL → true.

In an SQL column comparison (e.g. for a Join operation), a NULL compared to a NULL → maybe.

Different !!

Slide 17

Example : Row Comparison

E_No  E_Name  M_S  Sal
E2    Atkins       24,000
E2    Atkins       24,000

Comparison of M_S column values :

Row Comparison : 2 NULLs are defined to be identical. A comparison yields true !! So these 2 rows are identical.

Column Comparison : 2 NULLs are not assumed identical. A comparison yields maybe !! So these rows are rejected.

Slide 18

Project, GroupBy, & Set Operators

Project   SELECT DISTINCT ColumnName(s)
          FROM TableName ;

GroupBy   SELECT "Aggregation(s)", GroupingCol(s)
          FROM TableName
          GROUP BY GroupingCol(s) ;

Set Ops   SELECT *
          FROM TableName1
          UNION
          SELECT *
          FROM TableName2 ;

Similarly for the other Set Ops, Intersect & Except/Minus.

The Project / GroupBy (grouping rows) / Union / Intersect / Except action is taken on the basis that all NULLs are identical.

Consider some examples :-

Slide 19

Example : Projection

SELECT DISTINCT M-S
FROM EMP ;

EmpNo  EName     M-S  Sal
E3     Smith     S    18,000
E5     Robinson  M    24,000
E9     Graham         18,000
E1     Robson    D    32,500
E2     Atkins         24,000
E6     Blakelaw  M    54,000
E7     Mortimer  D    28,000
E4     Fenwick   W    40,000

Result :

M-S
S
M
D
W

plus one row holding NULL (displayed as a blank), since all NULLs are treated as identical.
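A sketch of this projection, assuming SQLite via Python's sqlite3 and the column name M_S in place of M-S. Only the EmpNo and M_S columns of the table above are reproduced. The order of the retrieved rows is not guaranteed, so the result is compared as a set.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, M_S TEXT)")
con.executemany("INSERT INTO EMP VALUES (?, ?)", [
    ("E3", "S"), ("E5", "M"), ("E9", None), ("E1", "D"),
    ("E2", None), ("E6", "M"), ("E7", "D"), ("E4", "W"),
])

# The two NULLs are treated as identical, so they collapse into one result row.
distinct = con.execute("SELECT DISTINCT M_S FROM EMP").fetchall()

print(distinct)
```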

Slide 20

Example : Grouping in 'GroupBy'

SELECT "Aggregation", M-S
FROM EMP
GROUP BY M-S ;

EmpNo  EName     M-S  Sal
E3     Smith     S    18,000
E5     Robinson  M    24,000
E9     Graham         18,000
E1     Robson    D    32,500
E2     Atkins         24,000
E6     Blakelaw  M    54,000
E7     Mortimer  D    28,000
E4     Fenwick   S    40,000

Result :

Aggregate    M-S
"Agg-Val1"   S
"Agg-Val2"   M
"Agg-Val3"
"Agg-Val4"   D

This example concerns the grouping of rows, not the calculation of aggregate values.
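To make the grouping visible, the sketch below uses Count(*) as the aggregation, which is an arbitrary choice for illustration. It assumes SQLite via Python's sqlite3 and the column name M_S in place of M-S.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE EMP (EmpNo TEXT, M_S TEXT)")
con.executemany("INSERT INTO EMP VALUES (?, ?)", [
    ("E3", "S"), ("E5", "M"), ("E9", None), ("E1", "D"),
    ("E2", None), ("E6", "M"), ("E7", "D"), ("E4", "S"),
])

# Rows with a NULL M_S form a single group of their own (key None in Python).
groups = dict(con.execute(
    "SELECT M_S, COUNT(*) FROM EMP GROUP BY M_S").fetchall())

print(groups)
```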

Slide 21

Example : Union Operation

EmpNo  EName     M-S  Sal
E3     Smith     S    18,000
E5     Robinson  M    24,000
E9     Graham    S
E1     Robson    D    32,500
E2     Atkins         24,000

Union

EmpNo  EName     M-S  Sal
E1     Robson    D    32,500
E2     Atkins         24,000
E6     Blakelaw  M    54,000

Result :

EmpNo  EName     M-S  Sal
E3     Smith     S    18,000
E5     Robinson  M    24,000
E9     Graham    S
E1     Robson    D    32,500
E2     Atkins         24,000
E6     Blakelaw  M    54,000

The two E2 rows, each with a NULL M-S, are treated as identical, so E2 appears only once in the result.
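The sketch below repeats this Union and contrasts it with a column comparison on the same data. It assumes SQLite via Python's sqlite3; the table names EMP_A and EMP_B and the column name M_S are chosen for illustration only.

```python
import sqlite3

con = sqlite3.connect(":memory:")
for name in ("EMP_A", "EMP_B"):
    con.execute(f"CREATE TABLE {name} (EmpNo TEXT, EName TEXT, M_S TEXT, Sal INTEGER)")
con.executemany("INSERT INTO EMP_A VALUES (?, ?, ?, ?)", [
    ("E3", "Smith", "S", 18000), ("E5", "Robinson", "M", 24000),
    ("E9", "Graham", "S", None), ("E1", "Robson", "D", 32500),
    ("E2", "Atkins", None, 24000),
])
con.executemany("INSERT INTO EMP_B VALUES (?, ?, ?, ?)", [
    ("E1", "Robson", "D", 32500), ("E2", "Atkins", None, 24000),
    ("E6", "Blakelaw", "M", 54000),
])

# Row comparison: the two E2 rows (NULL M_S) count as identical duplicates.
union = con.execute("SELECT * FROM EMP_A UNION SELECT * FROM EMP_B").fetchall()
e2_rows = [r for r in union if r[0] == "E2"]

# Column comparison: NULL = NULL is maybe, so the E2 pair is rejected by the join.
matched = con.execute(
    "SELECT COUNT(*) FROM EMP_A a JOIN EMP_B b "
    "ON a.EmpNo = b.EmpNo AND a.M_S = b.M_S").fetchone()[0]

print(len(union), len(e2_rows), matched)
```

Only the E1 pair survives the join condition, even though both E1 and E2 are removed as duplicates by the Union.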

Outer Joins

Slide 22

Joins - Inner versus Outer

All joins considered so far are Inner Joins.

Only a subset of each operand's tuples appear in the result. These are the tuples that match each other in the 2 operands. (Match means the comparison, of whatever kind, is true.)

The unmatched tuples don't appear in the result.

Sometimes it is useful to have unmatched tuples in the result as well. This requires an Outer Join.

Three kinds of Outer Join, to retain in the result all the unmatched tuples from :

• 'Left' operand,

• 'Right' operand,

• 'Left' and 'Right' operands.

Consider now some illustrations of Inner and Outer Joins.

For convenience, a Natural Join is always assumed, but the same principles apply to every type of join.

Slide 23

Inner Joins

[Diagram : natural inner join of two operands, with the unmatched tuples of each operand marked.]

Unmatched tuples are not in the result.

Slide 24

Outer Join : Left

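[Diagram : natural left outer join. The unmatched tuples of the left operand appear in the result, extended with '?' padding; the unmatched tuples of the right operand do not.]

Some unmatched tuples are in the result.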

Slide 25

Outer Join : Right

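[Diagram : natural right outer join. The unmatched tuples of the right operand appear in the result, extended with '?' padding; the unmatched tuples of the left operand do not.]

Some unmatched tuples are in the result.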

Slide 26

Outer Join : Full

[Diagram : natural full outer join. The unmatched tuples of both operands appear in the result, each extended with '?' padding.]

All unmatched tuples are in the result.

Slide 27

Outer Joins in SQL

What "padding" attribute values are used with the unmatched tuples ?

SQL uses NULLs.

What syntax is used for outer joins ?

An extension of the FROM phrase inner join syntax :

• Natural Join,

• Join Using( ColNames ),

• Join On( condition ).

Each of these can be used for Left, Right and Full outer joins, giving 9 possibilities.

Although 9 possible kinds of outer join may seem a lot to cope with, it is made easier by remembering that they are the same 3 kinds of joins as for inner joins, and each can be used orthogonally with the 3 kinds of 'Outer' facility; i.e. we decide independently on the kind of join and the kind of 'Outer' facility required, and then just put the two together.

Slide 28

SQL2 Outer Natural Joins

SELECT *
FROM R Natural Join S ;

One of Left / Right / Full is inserted between Natural and Join, optionally followed by Outer.

Examples :-

SELECT *
FROM SUPP Natural Left Outer Join SHIP ;

Result retains all the unmatched rows of LHS table, i.e. SUPP.

SELECT *
FROM SUPP Natural Right Join SHIP ;

Result retains all the unmatched rows of RHS table, i.e. SHIP.
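A sketch of the first example, assuming SQLite via Python's sqlite3. The columns and rows of SUPP and SHIP are hypothetical, since the slides do not show them, and SNo stands in for S#. Only the left outer join is run, because older SQLite versions do not support right outer joins.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SUPP (SNo TEXT, SName TEXT)")          # hypothetical columns
con.execute("CREATE TABLE SHIP (SNo TEXT, PNo TEXT, Qty INTEGER)")  # hypothetical columns
con.executemany("INSERT INTO SUPP VALUES (?, ?)",
                [("S1", "Smith"), ("S2", "Jones"), ("S3", "Blake")])
con.executemany("INSERT INTO SHIP VALUES (?, ?, ?)",
                [("S1", "P1", 5), ("S2", "P2", 7), ("S9", "P3", 1)])

# All SUPP rows are retained; S3 has no shipment, so PNo and Qty are padded with NULLs.
left = con.execute(
    "SELECT * FROM SUPP NATURAL LEFT OUTER JOIN SHIP ORDER BY SNo").fetchall()

print(left)
```

The unmatched SHIP row (S9) does not appear, because only the left operand's unmatched rows are retained.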

Slide 29

The Other Two SQL2 Outer Joins

SELECT *
FROM R Join S Using ( attribute(s) ) ;

SELECT *
FROM R Join S On ( condition ) ;

In both, one of Left / Right / Full, optionally followed by Outer, is inserted before Join.

Example :-

SELECT *
FROM SUPP Left Outer Join SHIP Using( S# ) ;

Left and right refer to the tables written to left and right of the join operator.

Logically only left or right is required, but it is convenient to have both.

Useful syntax rules to remember :

• If 'outer' is optionally used, it always comes after the keyword left/right/full.

• The keyword(s) left/right/full (outer) always come before the keyword join.
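The remark that logically only left or right is required can be illustrated by obtaining a right outer join from a left outer join with the operands swapped. The sketch assumes SQLite via Python's sqlite3, hypothetical SUPP and SHIP contents, and SNo in place of S#.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SUPP (SNo TEXT, SName TEXT)")          # hypothetical columns
con.execute("CREATE TABLE SHIP (SNo TEXT, PNo TEXT, Qty INTEGER)")  # hypothetical columns
con.executemany("INSERT INTO SUPP VALUES (?, ?)",
                [("S1", "Smith"), ("S2", "Jones"), ("S3", "Blake")])
con.executemany("INSERT INTO SHIP VALUES (?, ?, ?)",
                [("S1", "P1", 5), ("S2", "P2", 7), ("S9", "P3", 1)])

cols = "SELECT SNo, SName, PNo, Qty FROM "

# SUPP Left Outer Join SHIP: keeps the unmatched SUPP row (S3).
left = con.execute(
    cols + "SUPP LEFT OUTER JOIN SHIP USING (SNo) ORDER BY SNo").fetchall()

# Equivalent of SUPP Right Outer Join SHIP: swap the operands and use Left.
right = con.execute(
    cols + "SHIP LEFT OUTER JOIN SUPP USING (SNo) ORDER BY SNo").fetchall()

print(left)
print(right)
```

Listing the columns explicitly keeps the column order the same in both results, which SELECT * would not.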

Slide 30

Oracle : Outer Joins

Original Oracle syntax is completely non-standard.

The idea is to add a (+) suffix to the name of the column that is in the table whose columns will receive the NULLs as 'padding'. Regarding 'left' and 'right', this is the exact opposite of the SQL standard.

Example :-

SELECT AttributeNames
FROM SUPP, SHIP
WHERE SUPP.S# = SHIP.S#(+) ;

• Old fashioned SQL1 join syntax is required.

• 'Left' & 'right' refer to columns in the WHERE phrase, not tables in the FROM phrase.

• Full join ≡ Union of left and right outer joins.

Do NOT use unless desperate !

This outer join syntax is peculiar to Oracle and used by no other DBMS. It has several disadvantages, and now that Oracle has provided its DBMSs with an SQL2 standard outer join syntax, you are strongly advised to use that syntax and avoid the old Oracle version.

The old version is only mentioned for completeness, and only an overview of it is given here, in case you find that you need to use an old version of an Oracle DBMS that lacks the modern SQL2 syntax.
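The equivalence between a full join and the Union of the left and right outer joins can be sketched as follows. This does not use Oracle's (+) syntax: it assumes SQLite via Python's sqlite3 with the SQL2 join syntax, hypothetical SUPP and SHIP contents, and SNo in place of S#. The right outer join is written as a left outer join with the operands swapped.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE SUPP (SNo TEXT, SName TEXT)")          # hypothetical columns
con.execute("CREATE TABLE SHIP (SNo TEXT, PNo TEXT, Qty INTEGER)")  # hypothetical columns
con.executemany("INSERT INTO SUPP VALUES (?, ?)",
                [("S1", "Smith"), ("S2", "Jones"), ("S3", "Blake")])
con.executemany("INSERT INTO SHIP VALUES (?, ?, ?)",
                [("S1", "P1", 5), ("S2", "P2", 7), ("S9", "P3", 1)])

# Union removes the matched rows that both outer joins have in common.
full = con.execute(
    "SELECT SNo, SName, PNo, Qty FROM SUPP LEFT JOIN SHIP USING (SNo) "
    "UNION "
    "SELECT SNo, SName, PNo, Qty FROM SHIP LEFT JOIN SUPP USING (SNo) "
    "ORDER BY SNo"
).fetchall()

print(full)
```

The result holds the two matched rows plus the unmatched row from each table, padded with NULLs.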