1 Presentation Outline SQL Writing Process SQL Standards Using Indexes The Optimizer FROM, WHERE...

Preview:

DESCRIPTION

3 SQL Writing Process Step 1: What information do I need?  Columns Step 2: Where is it?  Tables Step 3: Write SQL: SELECT columns FROM tables WHERE... (joins, filters, subqueries) I'M FINISHED!

Citation preview

1

Presentation Outline

• SQL Writing Process

• SQL Standards

• Using Indexes

• The Optimizer

• FROM, WHERE Clauses

• EXPLAIN

• SQL Trace

• Sub-Selects and Joins

• Tips and Tricks

2

Caveat

Although many of these principles apply

to all databases, Oracle will be used in

the examples.

3

SQL Writing Process

Step 1: What information do I need? Columns

Step 2: Where is it? Tables

Step 3: Write SQL:

SELECT columnsFROM tables

WHERE ... (joins, filters, subqueries)

I'M FINISHED!

4

SQL Writing Process• YOU'RE NOT FINISHED YET! You've got the results

you want, but at what cost?

• There are many, many ways to get the right results, but only one is the fastest way—1000-to-1 improvements are attainable!

• Inefficient SQL can dramatically degrade the performance of the entire system

• Developers and DBAs must work together to tune the database and the application

5

Pre-Tuning Questions• How long is too long?

• Is the statement running on near-production volumes?

• Is the optimal retrieval path being used?

• How often will it execute?

• When will it execute?

6

SQL StandardsWhy are SQL standards important?

• Maintainability, readability

• Performance: If SQL is the same as a (recently) executed statement, it can be re-used instead of needing to be reparsed

7

SQL StandardsQuestion: which of these statements are the same?

A. SELECT LNAME FROM EMP WHERE EMPNO = 12;

B. SELECT lname FROM emp WHERE empno = 12;

C. SELECT lname FROM emp WHERE empno = :id;

D. SELECT lname FROM emp WHERE empno = 12;

8

SQL Standards• Answer: None

• Whitespace, case, bind variables vs. constants all matter

• Using standards helps to ensure that equivalent SQL can be reused.

9

Tables Used in the Examples

deptno dnameloc

DEPTempno mgrjobdeptnofnamelnamecommhiredategradesal

EMPgrade losalhisal

SALGRADE

10

SELECT E.empno,

D.dname

FROM emp E,

dept D

WHERE E.deptno = D.deptno

AND (D.deptno = :vardept

OR E.empno = :varemp);

SQL Standards: Example

Keywords upper case and left-aligned

Columns on new lines

Use std. table aliases

Separate w/ one space

Use bind variables

AND/OR on new lines

No space before/after parentheses

11

Indexes: What are they?• An index is a database object used to speed retrieval

of rows in a table.

• The index contains only the indexed value--usually the key(s)--and a pointer to the row in the table.

• Multiple indexes may be created for a table

• Not all indexes contain unique values

• Indexes may have multiple columns (e.g., Oracle allows up to 32)

12

• If a column appears in a WHERE clause it is a candidate for being indexed.

• If a column is indexed the database can used the index to find the rows instead of scanning the table.

• If the column is not referenced properly, however, the database may not be able to used the index and will have to scan the table anyway.

• Knowing what columns are and are not indexed can help you write more efficient SQL

Indexes and SQL

13

No index exists for column EMPNO on table EMP, so a table scan must be performed:

Example: Query without Index

empno fname lname...4 lisa baker9 jackie miller1 john larson3 larry jones5 jim clark2 mary smith7 harold simmons8 mark burns6 gene harris

Table: EMPSELECT *FROM empWHERE empno = 8

14

7 8 9

Example: Query with Index

empno fname lname ...4 lisa baker9 jackie miller1 john larson3 larry jones5 jim clark2 mary smith7 harold simmons8 mark burns6 gene harris

Table: EMP

5Index: PK_EMPEMP (EMPNO)

1, 4 5, 9

1 2

3 4

5 6

SELECT *FROM empWHERE empno = 8

Column EMPNO is indexed, so it can be used to find the requested row:

15

• Sometimes a table scan cannot be avoided

• Not every column should be indexed--there is performance overhead on Inserts, Updates, Deletes

• Small tables may be faster with a table scan

• Queries returning a large number (> 5-20%) of the rows in the table may be faster with a table scan

Indexes: Caveats

16

Example: Index on (EMPNO, DEPTNO)

SELECT *FROM emp WHERE deptno = 10;

SELECT *FROM emp WHERE empno > 0AND deptno = 10;

Must use the leading column(s) of the index for the index to be used

Indexes: Column Order

Will NOT use index

WILL use index

17

Using a function, calculation, or other operation on an indexed column disables the use of the Index

SELECT *FROM emp WHERE TRUNC(hiredate) = TRUNC(SYSDATE); ...WHERE fname || lname = 'MARYSMITH';

SELECT *FROM emp WHERE hiredate BETWEEN TRUNC(SYSDATE) AND TRUNC(SYSDATE)+1...WHERE fname = 'MARY' AND lname = 'SMITH';

Indexes: Functions

Will NOT use index

WILL use index

18

Using NOT excludes indexed columns:

SELECT *FROM dept WHERE deptno != 0; ... deptno NOT = 0;... deptno IS NOT NULL;

SELECT *FROM dept WHERE deptno > 0;

Indexes: NOT

Will NOT use index

WILL use index

19

The Optimizer• The WHERE/FROM rules on the following pages apply

to the Rule-based optimizer (Oracle).

• If the Cost-based Optimizer is used, Oracle will attempt to reorder the statements as efficiently as possible (assuming statistics are available).

• DB2 and Sybase use only a Cost-based optimizer

• The Optimizer's access paths can be overridden in Oracle and Sybase (not DB2)

20

The Optimizer: HintsReturn the first rows in the result set as fast as possible:

SELECT /*+ FIRST_ROWS */ empnoFROM emp E dept D, WHERE E.deptno = D.deptno;

Force Optimizer to use index IDX_HIREDATE:

SELECT /*+ INDEX (E idx_hiredate) */ empnoFROM emp EWHERE E.hiredate > TO_DATE('01-JAN-2000');

21

FROM Clause: Driving TableSpecify the driving table last in the FROM Clause:

SELECT *FROM dept D, -- 10 rows emp E -- 1,000 rowsWHERE E.deptno = D.deptno;

SELECT *FROM emp E, -- 1,000 rows dept D -- 10 rowsWHERE E.deptno = D.deptno;

Driving table is EMP

Driving table is DEPT

22

FROM Clause: Intersection Table

When joining 3 or more tables, use the Intersection table (with the most shared columns) as the driving table:

SELECT *FROM dept D, salgrade S, emp EWHERE E.deptno = D.deptnoAND E.grade = S.grade;

EMP shares columns with DEPT and SALGRADE, so use as the driving table

23

WHERE: Discard EarlyUse WHERE clauses first which discard the maximum number of rows:

SELECT *FROM emp EWHERE E.empno IN (101, 102, 103)AND E.deptno > 10;

3 rows 90,000 rows

24

WHERE: AND Subquery FirstWhen using an "AND" subquery, place it first:SELECT *FROM emp EWHERE E.sal > 50000AND 25 > (SELECT COUNT(*) FROM emp M WHERE M.mgr = E.empno)

SELECT *FROM emp EWHERE 25 > (SELECT COUNT(*) FROM emp M WHERE M.mgr = E.empno)AND E.sal > 50000

CPU = 156 sec

CPU = 10 sec

25

WHERE: OR Subquery LastWhen using an "OR" subquery, place it last:SELECT *FROM emp EWHERE 25 > (SELECT COUNT(*) FROM emp M WHERE M.mgr = E.empno)OR E.sal > 50000

SELECT *FROM emp EWHERE E.sal > 50000OR 25 > (SELECT COUNT(*) FROM emp M WHERE M.mgr = E.empno)

CPU = 100 sec

CPU = 30 sec

26

WHERE: Filter First, Join LastWhen Joining and Filtering, specify the Filter condition first, Joins last.

SELECT *FROM emp E, dept DWHERE (E.empno = 123OR D.deptno > 10)AND E.deptno = D.deptno;

Filter criteria

Join criteria

27

Subqueries: IN vs. EXISTSUse EXISTS instead of IN in subqueries:

SELECT E.*FROM emp EWHERE E.deptno IN ( SELECT D.deptno FROM dept D WHERE D.dname = 'SALES');

SELECT *FROM emp EWHERE EXISTS ( SELECT 'X' FROM dept D WHERE D.deptno = E.deptno AND D.dname = 'SALES');

IN: Both tables are scanned

EXISTS: Only outer table is scanned; subquery uses index

28

Subquery vs. JoinUse Join instead of Subquery :

SELECT *FROM emp EWHERE E.deptno IN ( SELECT D.deptno FROM dept D WHERE D.dname = 'SALES');

SELECT E.*FROM emp E, dept DWHERE D.dname = 'SALES'AND D.deptno = E.deptno;

IN: Both tables are scanned

JOIN: Only one table is scanned, other uses index

29

Join vs. EXISTSBest performance depends on subquery/driving table:

SELECT *FROM emp EWHERE EXISTS ( SELECT 'X' FROM dept D WHERE D.deptno = E.deptno AND D.dname = 'SALES');

SELECT E.*FROM emp E, dept DWHERE D.dname = 'SALES'AND D.deptno = E.deptno;

EXISTS: better than Join if the number of matching rows in DEPT is small

JOIN: better than Exists if the number of matching rows in DEPT is large

30

Explain

Display the access path the database will use (e.g., use of indexes, sorts, joins, table scans)

• Oracle: EXPLAIN• Sybase: SHOWPLAN• DB2: EXPLAIN

Oracle Syntax:EXPLAIN PLAN SET STATEMENT_ID = 'statement id'INTO PLAN_TABLE FOR

statement

Requires Select/Insert privileges on PLAN_TABLE

31

Explain

Example 1: “IN” subquerySELECT *FROM emp EWHERE E.deptno IN ( SELECT D.deptno FROM dept D WHERE D.dname = 'SALES');

Result:MERGE JOIN SORT (JOIN) TABLE ACCESS (FULL) OF EMP SORT (JOIN) VIEW SORT (UNIQUE) TABLE ACCESS (FULL) OF DEPT

3 joins1 dynamic view 2 table scans3 sorts

32

Explain Example 2: "EXISTS" subquerySELECT *FROM emp eWHERE EXISTS ( SELECT 'x' FROM dept d WHERE d.deptno = e.deptno AND d.dname = 'SALES');

Result:FILTER TABLE ACCESS (FULL) OF EMP TABLE ACCESS (BY INDEX ROWID) OF DEPT INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

1 table scan1 index scan1 index access

33

Explain

Example 3: Join (no subquery)SELECT E.*FROM emp E, dept DWHERE D.dname = 'SALES'AND D.deptno = E.deptno;

Result:NESTED LOOPS TABLE ACCESS (FULL) OF EMP TABLE ACCESS (BY INDEX ROWID) OF DEPT INDEX (UNIQUE SCAN) OF PK_DEPT (UNIQUE)

1 table scan1 index scan1 index access

34

SQL Trace

Use SQL Trace to determine the actual time and resource costs for for a statement to execute.

Step 1: ALTER SESSION SET SQL_TRACE TRUE;

Step 2: Execute SQL to be traced:SELECT E.*FROM emp E, dept DWHERE D.dname = 'SALES'AND D.deptno = E.deptno;

Step 3: ALTER SESSION SET SQL_TRACE FALSE;

35

SQL Trace

Step 4: Trace file is created in <USER_DUMP_DEST> directory on the server (specified by the DBA).

Step 5: Run TKPROF (UNIX) to create a formatted output file:

tkprof echd_ora_15319.trc $HOME/prof.out table=plan_table explain=dbuser/passwd

Trace fileFormatted output filedestination for Explainuser/passwd for Explain

36

SQL Trace

Step 6: view the output file:

...SELECT E.*FROM emp E, dept DWHERE D.dname = 'SALES' AND D.deptno = E.deptno;

call count cpu elapsed disk query current rows------- ------ -------- ---------- ---------- ---------- ---------- ----------Parse 1 0.00 0.00 0 0 0 0Execute 1 0.00 0.00 0 0 0 0Fetch 2 0.00 0.00 4 19 3 6------- ------ -------- ---------- ---------- ---------- ---------- ----------total 4 0.00 0.00 4 19 3 6

Misses in library cache during parse: 0Optimizer goal: CHOOSEParsing user id: 62 (PMARKS)

Rows Row Source Operation------- --------------------------------------------------- 6 NESTED LOOPS 14 TABLE ACCESS FULL EMP 14 TABLE ACCESS BY INDEX ROWID DEPT 14 INDEX UNIQUE SCAN (object id 4628)

TIMED_STATISTICS must be turned on to get these values

EXPLAIN output

37

Tips and Tricks: UNION ALLUse UNION ALL instead of UNION if there are no duplicate rows (or if you don't mind duplicates):

SELECT * FROM empUNION SELECT * FROM emp_arch;

SELECT * FROM empUNION ALLSELECT * FROM emp_arch;

UNION: requires sort

UNION ALL: no sort

38

Tips and Tricks: HAVING vs. WHEREWith GROUP BY, use WHERE instead of HAVING (if the filter criteria does not apply to a group function):

SELECT deptno, AVG(sal)FROM empGROUP BY deptnoHAVING deptno IN (10, 20);

SELECT deptno, AVG(sal)FROM empWHERE deptno IN (10, 20)GROUP BY deptno;

HAVING: rows are filtered after result set is returned

WHERE: rows are filtered first--possibly far fewer to process

39

Tips and Tricks: EXISTS vs DISTINCTUse EXISTS instead of DISTINCT to avoid implicit sort (if the column is indexed):

SELECT DISTINCT e.deptno, e.lname FROM dept d, emp eWHERE d.deptno = e.deptno;

SELECT e.deptno, e.lname FROM emp eWHERE EXISTS ( SELECT 'X' FROM dept d WHERE d.deptno = e.deptno);

DISTINCT: implicit sort is performed to filter duplicate rows

EXISTS: no sort

40

Tips and Tricks: Consolidate SQL

Select from Sequences and use SYSDATE in the statement in which they are used:

SELECT SYSDATE INTO :vardate FROM dual;

SELECT arch_seq.NEXTVAL INTO :varid FROM dual;

INSERT INTO archiveVALUES (:vardate, :varid, ...)

INSERT INTO emp_archive VALUES (SYSDATE, emp_seq.NEXTVAL, ...)

BEFORE: 3 statements are used to perform 1 Insert

AFTER: only 1 statement is needed

41

Tips and Tricks: Consolidate SQL

Consolidate unrelated statements using outer-joins to the the DUAL (dummy) table:

SELECT dname FROM dept WHERE deptno = 10;SELECT lname FROM emp WHERE empno = 7369;

SELECT d.dname, e.lnameFROM dept d, emp e, dual xWHERE d.deptno (+) = 10 AND e.empno (+) = 7369AND NVL('X', x.dummy) = NVL('X', e.ROWID (+))AND NVL('X', x.dummy) = NVL('X', d.ROWID (+));

BEFORE: 2 round-trips

AFTER: only 1 round-trip

42

Tips and Tricks: COUNT

Use COUNT(*) instead of COUNT(column):

SELECT COUNT(empno)FROM emp;

SELECT COUNT(*)FROM emp; ~ 50% faster

43

Tips and Tricks: Self-Join

Use a self-join (joining a table to itself) instead of two queries on the same table:

SELECT mgr INTO :varmgr FROM emp WHERE deptno = 10;LOOP... SELECT mgr, lname FROM emp WHERE mgr = :varmgr;

SELECT E.mgr, E.lnameFROM emp E, emp MWHERE M.deptno = 10AND E.empno = M.mgr;

BEFORE: 2 round-trips

AFTER: only 1

44

Tips and Tricks: ROWNUM

Use the ROWNUM pseudo-column to return only the first N rows of a result set. (For example, if you just want a sampling of data):

SELECT *FROM emp WHERE ROWNUM <= 10;

Returns only the first 10 employees in the table, in no particular order

45

Tips and Tricks: ROWID

The ROWID pseudo-column uniquely identifies a row, and is the fastest way to access a row:

CURSOR retired_emp_cur IS SELECT ROWID FROM emp WHERE retired = 'Y';...FOR retired_emp_rec IN retired_emp_cur LOOP SELECT fname || ' ' || lname INTO :printable_name FROM emp WHERE ROWID = retired_emp_rec.ROWID; ...

Instead of selecting the key column(s), ROWID is used to identify the row for later use

46

Tips and Tricks: Sequences

Use a Sequence to generate unique values for a table:

SELECT MAX(empno) INTO :new_empno FROM emp;...INSERT INTO emp VALUES (:new_empno + 1, ...);

INSERT INTO emp VALUES (emp_seq.NEXTVAL, ...); or SELECT emp_seq.NEXVAL INTO :new_empno FROM dual;

Using a Sequence ensures that you always have a unique number, and does not require any table reads

MAX(empno) requires a sort and an index scan

INSERT could fail with a Duplicate error if someone else gets there first

47

Tips and Tricks: Connect By

Use CONNECT BY to construct hierarchical queries:

SELECT LPAD(' ',4*(LEVEL-1)) || lname Name, JobFROM emp WHERE job != 'CLERK'START WITH job = 'PRESIDENT' CONNECT BY PRIOR empno = mgr; Name Job

King PRESIDENT Jones MANAGER Scott ANALYST Ford ANALYST Blake MANAGER Allen SALESMAN Ward SALESMAN Martin SALESMAN Turner SALESMAN Clark MANAGER

48

Tips and Tricks: Cartesian Products

Avoid Cartesian products by ensuring that the tables are joined on all shared keys:

SELECT * FROM dept, -- 10 rows salgrade, -- 20 rows emp; -- 1,000 rows

SELECT * FROM dept, -- 10 rows salgrade, -- 20 rows emp -- 1,000 rowsWHERE E.deptno = D.deptnoAND E.grade = S.grade;

10 * 1000 * 20 = 200,000 rows

1,000 rows

49

Tips and Tricks: TOAD

• Tool for Oracle Application Developers

• Oracle only! Requires Oracle SQL*Net client software

• Freeware tool for viewing/updating Oracle objects

• http://www.toadsoft.com or s:\tempfile\toad\toadfree.zip

50

Tips and Tricks: TOAD

CTRL+E displays EXPLAIN PLAN

SQL result set displayed in grid

51

Tips and Tricks: TOAD

Table/view data in an editable grid

Indexes, constraints, grants, etc. for the current table

All tables/views for a selected schema

Recommended