King Saud University
College of Computer & Information Sciences

IS 335 Database Management System

Lecture 6
Query Processing and Optimization (Practice)

Dr. Mourad YKHLEF

The slides content is derived from many references

IS 335 – Query Processing and Optimization - Dr. Mourad Ykhlef

Motivation (1)

• We would like to find the cheapest way to compute the join of three tables:

• Sailors ⋈ Reserves ⋈ Boats

Motivation (2)

• We need to decide on the order of operations:

(Sailors ⋈ Reserves) ⋈ Boats

or

Sailors ⋈ (Reserves ⋈ Boats)

• We need to decide which join algorithm to use for each of the operations

• What information do we need?

Statistics Maintained by DBMS for Relations

• Cardinality NTuples(R): Number of tuples in each relation R

• Size NPages(R) : Number of pages in each relation R

Statistics Maintained by DBMS for Indexes

• Index Cardinality: Number of distinct key values NKeys(I) for each index I

• Index Size: Number of pages INPages(I) in each index I

• Index Height: Number of non-leaf levels IHeight(I) in each B+ Tree index I

• Index Range: The minimum value ILow(I) and maximum value IHigh(I) for each index I

Note

• The statistics are updated periodically (not every time the underlying relations are modified).

• Because the statistics may be out of date, we cannot use the stored cardinality to answer

select count(*) from R

Estimating Result Sizes

• Consider

SELECT attribute-list
FROM relation-list
WHERE term1 and ... and termn

• The maximum number of tuples in the result is the product of the cardinalities of the relations in the FROM clause

• Each term in the WHERE clause has an associated reduction factor, which reflects the impact of that term in reducing the result size.

Result Size

• Estimated result size = maximum size × the product of the reduction factors

Assumptions

• Containment of value sets: if NKeys(I1) < NKeys(I2) for attribute Y, where I1 is an index on R.Y and I2 is an index on S.Y, then every Y-value of R will be a Y-value of S

• A default reduction factor of 1/10, obtained empirically, is used if no additional information is available

Estimating Reduction Factors

• column = value: 1/NKeys(I)
  – There is an index I on column.
  – This assumes a uniform distribution.
  – Otherwise, use 1/10.

• column1 = column2: 1/Max(NKeys(I1), NKeys(I2))
  – There is an index I1 on column1 and an index I2 on column2.
  – This relies on the containment of value sets assumption.
  – If only one column has an index I, use 1/NKeys(I).
  – Otherwise, use 1/10.

Estimating Reduction Factors

• column > value: (IHigh(I) - value)/(IHigh(I) - ILow(I)) if there is an index I on column.
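The reduction-factor rules from these two slides can be sketched as small helper functions. This is only an illustrative sketch: the function names are hypothetical, exact fractions are used to avoid rounding, and the 1/10 fallback for column > value without an index is an assumption (the slide states the fallback only for the equality cases).

```python
from fractions import Fraction

# Default reduction factor from the slides when no index statistics exist.
DEFAULT_RF = Fraction(1, 10)


def rf_equals_value(nkeys=None):
    """column = value: 1/NKeys(I) if an index I exists, otherwise 1/10."""
    return Fraction(1, nkeys) if nkeys else DEFAULT_RF


def rf_equals_column(nkeys1=None, nkeys2=None):
    """column1 = column2: 1/Max(NKeys(I1), NKeys(I2)).

    If only one index exists, its NKeys is used alone; with no index, 1/10.
    """
    known = [n for n in (nkeys1, nkeys2) if n]
    return Fraction(1, max(known)) if known else DEFAULT_RF


def rf_greater_than(value, low=None, high=None):
    """column > value: (IHigh(I) - value) / (IHigh(I) - ILow(I)).

    ASSUMPTION: falls back to 1/10 when no index range is known.
    """
    if low is None or high is None or high == low:
        return DEFAULT_RF
    return Fraction(high - value, high - low)


print(rf_equals_value(100))          # index with 100 distinct keys
print(rf_equals_column(None, 40000)) # only one side has an index
print(rf_greater_than(3, 0, 10))     # values range over 0..10
```

Integer statistics are assumed for the range case; a real optimizer would also clamp the result to the interval [0, 1].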

Example

Reserves(sid, agent), Sailors(sid, rating)

SELECT *
FROM Reserves R, Sailors S
WHERE R.sid = S.sid AND S.rating > 3 AND R.agent = 'Joe'

• Cardinality(R) = 100,000

• Cardinality(S) = 40,000

• NKeys(Index on R.agent) = 100

• IHigh(Index on S.rating) = 10, ILow(Index on S.rating) = 0

Example (cont.)

• Maximum cardinality: 100,000 * 40,000

• Reduction factor of R.sid = S.sid: 1/40,000
  – sid is a primary key of S, so the index on S.sid has 40,000 distinct keys

• Reduction factor of S.rating > 3: (10 - 3)/(10 - 0) = 7/10

• Reduction factor of R.agent = 'Joe': 1/100

• Total estimated size: 100,000 * 40,000 * 1/40,000 * 7/10 * 1/100 = 700
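The arithmetic of this example can be checked with a short script. It is a sketch that simply multiplies the maximum cardinality by the three reduction factors listed above; the variable names are illustrative and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction
from math import prod

# Cardinalities of the relations in the FROM clause.
cardinalities = [100_000, 40_000]      # Reserves R, Sailors S

# One reduction factor per term of the WHERE clause.
reduction_factors = [
    Fraction(1, 40_000),               # R.sid = S.sid (sid is the key of S)
    Fraction(10 - 3, 10 - 0),          # S.rating > 3, with ratings in 0..10
    Fraction(1, 100),                  # R.agent = 'Joe', 100 distinct agents
]

max_size = prod(cardinalities)
estimated_size = max_size * prod(reduction_factors)

print(max_size)         # 4000000000
print(estimated_size)   # 700
```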

Creating Indexes Using Oracle

Index

• An index maps between
  – the row key
  – the row location

• Oracle has two kinds of indexes
  – B* tree
  – Bitmap

• Index entries are kept sorted by key

B* tree

[Figure: example B* tree. Node keys: 14, 19, 24, 33. Leaf entries: 2* 3* 5* 7* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39*]

Creating an Index

• Syntax (an index is either unique or bitmap, not both):

create [unique | bitmap] index iname on table(column [, column] ...)

Unique Indexes

• Create an index that will guarantee the uniqueness of the key. Creation fails if any duplicate already exists.

• When you create a table with a primary key constraint or a unique constraint, a unique index is created automatically

create unique index rating_bit on Sailors(rating);

Bitmap Indexes

• Appropriate for columns that may have very few possible values

• For each value c that appears in the column, a vector v of bits is created, with a 1 in v[i] if the i-th row has the value c
  – Vector length = number of rows

• Oracle can automatically convert bitmap entries to RowIDs during query processing

Bitmap Indexes: Example

create bitmap index rating_bit on Sailors(rating);

Sid  Sname  age  rating
12   Jim    55   3
13   John   46   7
14   Jane   46   10
15   Sam    37   3

• Corresponding bitmaps:
  – 3: <1 0 0 1>
  – 7: <0 1 0 0>
  – 10: <0 0 1 0>
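How the bitmaps follow from the rating column of this example can be sketched in a few lines. This only illustrates the idea of one bit vector per distinct value; it is not how Oracle stores bitmap indexes internally, and the function name is hypothetical.

```python
def build_bitmaps(values):
    """Return {value: bit vector}, with one bit per row in row order."""
    bitmaps = {}
    for row, value in enumerate(values):
        # Create an all-zero vector the first time a value is seen,
        # then set the bit for the current row.
        bitmaps.setdefault(value, [0] * len(values))[row] = 1
    return bitmaps


# rating column of the Sailors rows with sid 12, 13, 14, 15
ratings = [3, 7, 10, 3]
bitmaps = build_bitmaps(ratings)

for value, bits in sorted(bitmaps.items()):
    print(value, bits)
```

Each vector has as many bits as the table has rows, and a query such as rating = 3 only needs the positions of the 1 bits in the vector for 3.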

When to Create an Index

• Large tables, on columns that are likely to appear in where clauses as a simple equality
  – where s.sname = 'John' and s.age = 50
  – where s.age = r.age

Function-Based Indexes

• You can't use an index on sname for the following query:

select *
from Sailors
where UPPER(sname) = 'SAM';

• You can create a function-based index to speed up the query:

create index upp_sname on Sailors(UPPER(sname));

Index-Organized Tables

• An index-organized table keeps its data sorted by the primary key

• Rows do not have physical RowIDs

• The rows are stored as if the table were an index

create table Sailors(

sid number primary key,

sname varchar2(30),

age number,

rating number)

organization index;

Index-Organized Tables (2)

• What advantages does this have?
  – Primary key is not duplicated in the index
  – Improves performance of queries based on the primary key

• What disadvantages?
  – Expensive to add columns
  – Expensive for dynamic (frequently changing) data

• When to use?
  – where clause on the primary key
  – static data

Clustering Tables Together

• You can ask Oracle to store several tables with common columns together on the disk

• This is useful if you often join these tables

• Cluster: area on the disk where the rows of the tables are stored

• Cluster key: the columns by which the tables are usually joined in a query

Clustering Tables Together: Syntax

• create cluster sailor_reserves (X number);
  – Create a cluster with nothing in it

• create table Sailors(

sid number primary key,

sname varchar2(30),

age number,

rating number)

cluster sailor_reserves(sid);

– create the table in the cluster

Clustering Tables Together: Syntax (cont.)

• create index sailor_reserves_index on cluster sailor_reserves;
  – Create an index on the cluster

• create table Reserves(

sid number,

bid number,

day date,

primary key(sid, bid, day) )

cluster sailor_reserves(sid);

– A second table is added to the cluster

Reserves

sid  bid  day
22   102  7/7/97
22   101  10/10/96
58   103  11/12/96

Sailors

sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Stored

sid  sname   rating  age   bid  day
22   Dustin  7       45.0  102  7/7/97
                           101  10/10/96
31   Lubber  8       55.5
58   Rusty   10      35.0  103  11/12/96
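The stored layout can be pictured as grouping the rows of both tables by the cluster key sid. The sketch below reproduces that grouping with plain Python dictionaries; the function name and the dictionary layout are illustrative assumptions and say nothing about Oracle's actual block format.

```python
from collections import defaultdict

# Rows from the two example tables.
sailors = [(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5), (58, "Rusty", 10, 35.0)]
reserves = [(22, 102, "7/7/97"), (22, 101, "10/10/96"), (58, 103, "11/12/96")]


def cluster_by_sid(sailors, reserves):
    """Group each sailor row with its reservation rows under the cluster key."""
    reservations = defaultdict(list)
    for sid, bid, day in reserves:
        reservations[sid].append((bid, day))
    return {
        sid: {"sailor": (sname, rating, age), "reserves": reservations.get(sid, [])}
        for sid, sname, rating, age in sailors
    }


stored = cluster_by_sid(sailors, reserves)
for sid, entry in stored.items():
    print(sid, entry["sailor"], entry["reserves"])
```

A join on sid then finds a sailor and all of that sailor's reservations in the same place, which is why clustering helps tables that are often joined.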

The Oracle Optimizer

Types of Optimizers

• There are different modes for the optimizer

• RULE: Rule-based optimizer (RBO)
  – deprecated

• CHOOSE: uses the cost-based optimizer (CBO) when statistics are available and falls back to the RBO otherwise; the CBO picks a plan based on statistics (e.g. number of rows in a table, number of distinct keys in an index)
  – Need to analyze the data in the database using the analyze command

ALTER SESSION SET optimizer_mode = {choose|rule|first_rows[_n]|all_rows};

Types of Optimizers

• ALL_ROWS: execute the query so that all of the rows are returned as quickly as possible
  – Merge Join has priority over Block Nested Loop Join

• FIRST_ROWS(n): execute the query so that the first n rows are returned as quickly as possible
  – Block Nested Loop Join has priority over Merge Join

Analyzing the Data

analyze {table | index} {<table_name> | <index_name>}
  {compute statistics |
   estimate statistics [sample <integer> {rows | percent}] |
   delete statistics};

analyze table Sailors estimate statistics sample 25 percent;

Viewing the Execution Plan (Option 1)

• You need a PLAN_TABLE table. So, the first time that you want to see execution plans, run the command:

@$ORACLE_HOME/rdbms/admin/utlxplan.sql

or

@C:\oracle\ora92\rdbms\admin\utlxplan.sql

• Set autotrace on to see all plans
  – Displays the execution path for each query, after it has been executed

PLAN_TABLE

create table PLAN_TABLE (
        statement_id       varchar2(30),
        plan_id            number,
        timestamp          date,
        remarks            varchar2(4000),
        operation          varchar2(30),
        options            varchar2(255),
        object_node        varchar2(128),
        object_owner       varchar2(30),
        object_name        varchar2(30),
        object_alias       varchar2(65),
        object_instance    numeric,
        object_type        varchar2(30),
        optimizer          varchar2(255),
        search_columns     number,
        id                 numeric,
        parent_id          numeric,
        depth              numeric,
        position           numeric,
        cost               numeric,
        cardinality        numeric,
        bytes              numeric,
        other_tag          varchar2(255),
        partition_start    varchar2(255),
        partition_stop     varchar2(255),
        partition_id       numeric,
        other              long,
        distribution       varchar2(30),
        cpu_cost           numeric,
        io_cost            numeric,
        temp_space         numeric,
        access_predicates  varchar2(4000),
        filter_predicates  varchar2(4000),
        projection         varchar2(4000),
        time               numeric,
        qblock_name        varchar2(30) );

Viewing the Execution Plan (Option 2)

• Another option:

explain plan set statement_id='<name>' for <statement>

explain plan
set statement_id='test'
for SELECT * FROM Sailors S WHERE sname='Joe';

Select … from Plan_Table where statement_id='test';

Operations that Access Tables

• TABLE ACCESS FULL: sequential table scan
  – Oracle optimizes by reading multiple blocks
  – Used whenever there is no where clause on a query

select * from Sailors

• TABLE ACCESS BY ROWID: access rows by their RowID values.
  – How do you get the rowid? From an index!

select * from Sailors where sid > 10

Types of Indexes

• Unique: each row of the indexed table contains a unique value for the indexed column

• Nonunique: the same indexed value can appear in more than one row

Operations that Use Indexes

• INDEX UNIQUE SCAN: Access of an index that is defined to be unique

• INDEX RANGE SCAN: Access of an index that is not unique or access of a unique index for a range of values

When are Indexes Used/Not Used?

• If you set an indexed column equal to a value, e.g., sname = 'Jim'

• If you specify a range of values for an indexed column, e.g., sname like 'J%'
  – sname like '%m': will not use an index
  – UPPER(sname) like 'J%': will not use an index
  – sname is null: will not use an index, since null values are not stored in the index
  – sname is not null: will not use an index, since every value in the index would have to be accessed

When are Indexes Used? (cont)

• 2*age = 20: Index on age will not be used. Index on 2*age will be used.

• sname != 'Jim': Index will not be used.

• MIN and MAX functions: Index will be used

• Equality of a column in a leading column of a multicolumn index. For example, suppose we have a multicolumn index on (sid, bid, day)
  – sid = 12: Can use the index
  – bid = 101: Cannot use the index
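The leading-column rule for a multicolumn index can be expressed as a one-line check. This sketch models only the rule stated on the slide; the function name is hypothetical, and a real optimizer may have additional access paths that this simplification ignores.

```python
def can_use_index(index_columns, equality_columns):
    """True if there is an equality predicate on the leading index column."""
    return bool(index_columns) and index_columns[0] in equality_columns


index = ("sid", "bid", "day")           # multicolumn index from the slide

print(can_use_index(index, {"sid"}))    # sid = 12
print(can_use_index(index, {"bid"}))    # bid = 101
```

The first call reports that the index is usable because sid is the leading column; the second reports that it is not, because bid appears only in the second position.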

Optimizer Hints

• You can give the optimizer hints about how to perform query evaluation

• Hints are written in /*+ */ right after the select

• Note: These are only hints. The Oracle optimizer can choose to ignore your hints

Hints

• FULL hint: tell the optimizer to perform a TABLE ACCESS FULL operation on the specified table

• ROWID hint: tell the optimizer to perform a TABLE ACCESS BY ROWID operation on the specified table

• INDEX hint: tell the optimizer to use an index-based scan on the specified table

Examples

Select /*+ FULL (sailors) */ sid
From sailors
Where sname='Joe';

Select /*+ INDEX (sailors) */ sid
From sailors
Where sname='Joe';

Select /*+ INDEX (S s_ind) */ S.sid
From sailors S, reserves R
Where S.sid=R.sid AND sname='Joe';
