course:
Database Applications (NDBI026) WS2018/19
RNDr. Michal Kopecký, Ph.D. Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague
Schema modification
▪ Adding and deleting columns
▪ Adding and deleting constraints
▪ Changing column definition
ANSI SQL-92 Joins
Optimizer and Query Optimization
▪ Indexes
▪ Execution Plans
▪ Hinting
M. Kopecký Schema Modification and Query Optimization (NDBI026, Lect. 2) 2
Error in application design
▪ Wrong normal form of the schema
▪ Required data cannot be stored
▪ Wrongly defined constraint
Customer changes his/her requirements
▪ Support for new attributes and entities
▪ Change of constraints in the real world
▪ …
Schema and/or application logic changes are an important part of the application life cycle
Data stored in the database are usually more valuable and more important than the software and hardware
The ability to modify the schema without any data loss is more important than the ability to create it from scratch
Adding a column to an existing table
ALTER TABLE tab_name ADD column_definition;
Example
ALTER TABLE Person ADD Note CHARACTER VARYING(1000);
ALTER TABLE Product ADD EAN NUMERIC(13) CONSTRAINT Product_U_EAN UNIQUE;
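Adding a column to a populated table can be tried out with the following minimal sketch. It uses Python's built-in sqlite3 module only as a convenient stand-in for Oracle or MS SQL; the sample rows are made up, and SQLite spells the statement ALTER TABLE … ADD COLUMN. It shows that existing rows survive and receive NULL in the new column.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the target RDBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name VARCHAR(50))")
conn.executemany(
    "INSERT INTO Person (ID, Name) VALUES (?, ?)",
    [(1, "Francis"), (2, "John")],  # hypothetical sample data
)

# Add the new column to the already populated table.
conn.execute("ALTER TABLE Person ADD COLUMN Note VARCHAR(1000)")

# Existing rows are kept; the new column is NULL (None in Python) for them.
rows_after = conn.execute(
    "SELECT ID, Name, Note FROM Person ORDER BY ID"
).fetchall()
```
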
Dropping a column from an existing table
It is usually necessary to move the data elsewhere before dropping the column!
ALTER TABLE tab_name DROP COLUMN col_name;
Example
ALTER TABLE Person DROP COLUMN ZipCode;
Adding a constraint to an existing table
ALTER TABLE tab_name ADD constraint_definition;
Example
ALTER TABLE Person ADD CONSTRAINT Person_FK_Mother FOREIGN KEY(Mother) REFERENCES Person(ID) ON DELETE SET NULL;
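The effect of ON DELETE SET NULL can be observed with a small sketch. It again uses Python's sqlite3 as a stand-in; SQLite cannot add a foreign key with ALTER TABLE, so the constraint is declared directly in CREATE TABLE here, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    CREATE TABLE Person (
        ID     INTEGER PRIMARY KEY,
        Name   VARCHAR(50),
        Mother INTEGER,
        CONSTRAINT Person_FK_Mother FOREIGN KEY (Mother)
            REFERENCES Person(ID) ON DELETE SET NULL
    )
    """
)
conn.execute("INSERT INTO Person VALUES (1, 'Ann', NULL)")  # hypothetical mother
conn.execute("INSERT INTO Person VALUES (2, 'Bob', 1)")     # hypothetical child

# Deleting the referenced row resets the child's foreign key to NULL.
conn.execute("DELETE FROM Person WHERE ID = 1")
remaining = conn.execute("SELECT ID, Name, Mother FROM Person").fetchall()
```
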
Dropping an unnecessary constraint from the table
ALTER TABLE tab_name DROP CONSTRAINT constraint_name;
Example
ALTER TABLE Person DROP CONSTRAINT Person_U_Name;
Several columns and constraints can be added in one step using the statement
ALTER TABLE tab_name ADD ( column_definition | constraint_definition [, …] );
Example
ALTER TABLE Person ADD ( Note CHARACTER VARYING(1000), CONSTRAINT Person_Chk_Age CHECK (Age>=0) );
Columns can be modified using the statement
ALTER TABLE tab_name MODIFY ( new_incremental_column_definition [, …] ); -- Oracle
ALTER TABLE tab_name ALTER COLUMN new_incremental_column_definition; -- MS SQL
Example
ALTER TABLE Person MODIFY ( Note CHARACTER VARYING(2000) );
Features that are not mentioned remain unchanged
It is possible to change
▪ NULL to NOT NULL and vice versa
▪ Column width: increase the width, or decrease it (usually only if the column is empty)
The ANSI SQL-92 standard introduced more types of table join in the FROM clause (semantics taken from relational algebra)
▪ Cartesian product
▪ Equijoin
▪ Inner join
▪ Natural join
▪ Left/Right/Full outer join
The previous version allowed only
▪ A comma-separated list of data sources (tables and views)
▪ Each source can be followed by an alias, separated by a space
▪ Join conditions only in the WHERE clause
ANSI SQL-92 syntax
Allows usage of the keyword AS between a data source and its alias
… FROM Emp AS E, Dept AS D
Distinguishes semantically different types of join using new keywords in the FROM clause
The WHERE clause remains for additional conditions (row selection)
X CROSS JOIN Y
Cartesian product
Equivalent of previous style X, Y
SELECT EmpNo, Loc FROM Emp CROSS JOIN Dept;
Emp
EmpNo | DeptNo
1111  | 10
2222  | 20

Dept
DeptNo | Loc
20     | NEW YORK
30     | DALLAS

Emp CROSS JOIN Dept
EmpNo | DeptNo | DeptNo | Loc
1111  | 10     | 20     | NEW YORK
1111  | 10     | 30     | DALLAS
2222  | 20     | 20     | NEW YORK
2222  | 20     | 30     | DALLAS
X NATURAL [INNER] JOIN Y
Natural join over all common columns of both tables (here only DeptNo)
SELECT EmpNo, Loc FROM Emp NATURAL JOIN Dept;
Emp
EmpNo | DeptNo
1111  | 10
2222  | 20

Dept
DeptNo | Loc
20     | NEW YORK
30     | DALLAS

Emp NATURAL JOIN Dept
EmpNo | DeptNo | Loc
2222  | 20     | NEW YORK
X [INNER] JOIN Y ON (condition)
Standard join of tables, equivalent to the older FROM … X, Y … WHERE condition
X [INNER] JOIN Y USING (column [,…])
Join over equality of the values in all listed columns (both tables must contain those columns)
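The equivalence of the join forms can be checked on the two-row Emp and Dept tables used in the slides. The sketch below runs them in Python's sqlite3 (an assumption made only so the example is runnable; the column types are guesses). The old comma style, JOIN … ON, JOIN … USING and NATURAL JOIN return the same single row, while CROSS JOIN returns all four combinations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Emp  (EmpNo INTEGER, DeptNo INTEGER);
    CREATE TABLE Dept (DeptNo INTEGER, Loc VARCHAR(20));
    INSERT INTO Emp  VALUES (1111, 10), (2222, 20);
    INSERT INTO Dept VALUES (20, 'NEW YORK'), (30, 'DALLAS');
    """
)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Older style: comma-separated sources, join condition in WHERE.
old_style = run("SELECT EmpNo, Loc FROM Emp, Dept WHERE Emp.DeptNo = Dept.DeptNo")
# SQL-92 style: the join condition moves into the FROM clause.
join_on = run("SELECT EmpNo, Loc FROM Emp INNER JOIN Dept ON (Emp.DeptNo = Dept.DeptNo)")
join_using = run("SELECT EmpNo, Loc FROM Emp JOIN Dept USING (DeptNo)")
natural = run("SELECT EmpNo, Loc FROM Emp NATURAL JOIN Dept")
# Cartesian product: every Emp row paired with every Dept row.
cross = run("SELECT EmpNo, Loc FROM Emp CROSS JOIN Dept")
```
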
One of the following keywords can be used instead of the INNER keyword: LEFT [OUTER], RIGHT [OUTER], FULL [OUTER]
In the case of … X LEFT JOIN Y ON (condition) … the result contains all rows from the left table (X), even if there is no corresponding row in the right table (Y)
INNER can be replaced by one of keywords LEFT [OUTER], RIGHT [OUTER], FULL [OUTER]
SELECT * FROM Emp NATURAL LEFT JOIN Dept;
The result contains all employees, including those that are not assigned to any department
Non-existing fields from the Dept table are empty (contain the NULL value)
Emp NATURAL LEFT JOIN Dept
EmpNo | DeptNo | Loc
1111  | 10     | NULL
2222  | 20     | NEW YORK
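The outer-join behaviour can be reproduced with the same sample data. This is a sketch in Python's sqlite3 (a stand-in, not the lecture's target systems). Because older SQLite versions lack RIGHT JOIN, the right outer join is emulated here by swapping the operands of a LEFT JOIN, which is an assumption of this example rather than something the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Emp  (EmpNo INTEGER, DeptNo INTEGER);
    CREATE TABLE Dept (DeptNo INTEGER, Loc VARCHAR(20));
    INSERT INTO Emp  VALUES (1111, 10), (2222, 20);
    INSERT INTO Dept VALUES (20, 'NEW YORK'), (30, 'DALLAS');
    """
)

# All employees are kept; Loc is NULL where no department matches.
left_rows = conn.execute(
    "SELECT Emp.EmpNo, Emp.DeptNo, Dept.Loc "
    "FROM Emp NATURAL LEFT JOIN Dept ORDER BY Emp.EmpNo"
).fetchall()

# Emp RIGHT JOIN Dept emulated as Dept LEFT JOIN Emp:
# all departments are kept; EmpNo is NULL where no employee matches.
right_rows = conn.execute(
    "SELECT Emp.EmpNo, Dept.DeptNo, Dept.Loc "
    "FROM Dept NATURAL LEFT JOIN Emp ORDER BY Dept.DeptNo"
).fetchall()
```
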
INNER can be replaced by one of keywords LEFT [OUTER], RIGHT [OUTER], FULL [OUTER]
SELECT * FROM Emp NATURAL RIGHT JOIN Dept;
The result contains all departments, including those that have no assigned employees
Non-existing fields from the Emp table are empty (contain the NULL value)
Emp NATURAL RIGHT JOIN Dept
EmpNo | DeptNo | Loc
2222  | 20     | NEW YORK
NULL  | 30     | DALLAS
INNER can be replaced by one of keywords LEFT [OUTER], RIGHT [OUTER], FULL [OUTER]
SELECT * FROM Emp NATURAL FULL JOIN Dept;
Combination of both the left and the right outer join
Non-existing fields from both tables are empty (contain the NULL value)
Emp NATURAL FULL JOIN Dept
EmpNo | DeptNo | Loc
2222  | 20     | NEW YORK
NULL  | 30     | DALLAS
1111  | 10     | NULL
Oracle also has its own native (proprietary) syntax for outer joins
▪ Older, supports only left and right outer joins, and only for equality of values
▪ The ANSI version is better and portable
Left outer join
SELECT * FROM Dept, Emp WHERE Dept.Deptno = Emp.Deptno(+);
Right outer join
SELECT * FROM Dept, Emp WHERE Dept.Deptno(+) = Emp.Deptno;
Full outer join does not exist in this syntax
Indexes serve to speed up data access according to some condition in the WHERE clause
They change neither the syntax nor the semantics of DML statements
Unique vs. non-unique indexes
Single-column vs. multi-column (concatenated) indexes
Clustered vs. unclustered indexes
B-trees vs. bitmaps
Indexes on columns vs. on expressions
Domain indexes (full-text, spatial, XML, …)
Index creation is not standardized in SQL-92
Individual RDBMSs implement indexes in a proprietary way
The following can vary
▪ Syntax
▪ Support of particular type(s) of indexes (bitmap, hash, …)
▪ Their (non)usage for a given query and data content
Usually redundant B+-trees
▪ Values are stored in the leaves
▪ Leaves are linked in a bi-directional list to allow easy range search
Suitable for columns with high selectivity (a high number of different values in the column).
Concatenated indexes can combine several columns to increase selectivity.
▪ Suitable if the query searches rows according to the values of the first k columns of the index. The first k-1 columns have to be restricted by equality to a constant value.
▪ Not suitable if there is no condition on the first column of the index.
It is usually not possible to combine several B-tree indexes. The query is evaluated using one of them (the most selective one) and the other conditions have to be tested programmatically.
Cannot help
▪ If the percentage of matching rows is too high (overhead caused by reading additional index blocks and mainly by non-sequential access to the data blocks)
▪ In queries searching for rows containing NULL values in the indexed column (NULL values are usually not stored in the index)
Can help
▪ In queries searching rows according to equality of a column value to a constant
▪ In queries searching rows with a column value belonging to an interval
For each possible column value, one bitmap (bit string) is created, containing 1 for exactly those rows that have the given value in the column and 0 otherwise
Suitable for columns with low selectivity
Bitmaps from an arbitrary number of indexes can be combined efficiently to increase selectivity
SELECT * FROM Citizen
WHERE Gender='M'
AND State IN ('US-NY','US-WA');
▪ Combination of three bitmaps
[Figure: the bitmaps for Gender='M', State='US-NY' and State='US-WA' are combined as M AND (NY OR WA) to give the bitmap of matching rows]
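The combination of the three bitmaps can be sketched in a few lines of Python. The row values below are invented for illustration, and a real bitmap index stores compressed bit strings rather than Python integers; the sketch only shows the logic M AND (NY OR WA).

```python
# Hypothetical table content: one entry per row of Citizen.
genders = ["F", "M", "M", "F", "M", "M"]
states = ["US-NY", "US-NY", "US-CA", "US-WA", "US-WA", "US-TX"]


def build_bitmap(values, wanted):
    """Return an int whose bit i is 1 exactly when values[i] == wanted."""
    bitmap = 0
    for i, value in enumerate(values):
        if value == wanted:
            bitmap |= 1 << i
    return bitmap


bm_m = build_bitmap(genders, "M")
bm_ny = build_bitmap(states, "US-NY")
bm_wa = build_bitmap(states, "US-WA")

# Gender='M' AND State IN ('US-NY','US-WA')  ->  M AND (NY OR WA)
result = bm_m & (bm_ny | bm_wa)

# Row numbers whose bit is set in the combined bitmap.
matching_rows = [i for i in range(len(genders)) if (result >> i) & 1]
```

Only the rows whose bit survives the combination have to be fetched from the table.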
Both Oracle and MS SQL automatically create unique indexes for
Primary keys
▪ The name is the same as the name of the constraint
Candidate keys (UNIQUE columns)
▪ The name is the same as the name of the constraint
It is important to create indexes suitable for foreign key searches!
They speed up manipulation of the master table
▪ If a master row is deleted, all its child rows have to be found. Without the index, the engine has to find them using a full table scan.
▪ If cascade delete is used on a table containing a hierarchy of entities, a full scan has to be done recursively for each found and deleted child.
A full scan reads all blocks, even empty ones containing only already deleted rows
An index range scan finds all child rows efficiently
Oracle used to take a full table lock when it needed to lock all child rows and no suitable index was available. This restricts parallel access to the data by multiple users at the same time.
In other cases the indexes should be created only if they substantially help to speed-up frequently used queries
Each index speeds up some queries, but slows down data modification
Indexes on columns
CREATE [UNIQUE] INDEX index_name ON tab_name(column1[, column2 [, …]]);
Example
CREATE INDEX Person_Sn_Nm_Inx ON Person(Surname,Name);
The index can be used in a statement that searches data according to the value of the first declared column
SELECT * FROM Person WHERE Surname='Drake';
Indexes on columns
CREATE [UNIQUE] INDEX index_name ON tab_name(column1[, column2 [, …]]);
Example
CREATE INDEX Person_Sn_Nm_Inx ON Person(Surname,Name);
The index cannot be used in a statement that searches data only according to the value of the second declared column
SELECT * FROM Person WHERE Name='Francis';
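The leading-column rule can be observed in practice. The following sketch uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3; SQLite is an assumption chosen for runnability, its plan wording differs from Oracle and MS SQL, and the extra City column is hypothetical (it only keeps the index from covering the whole query). The first plan mentions the concatenated index, the second falls back to scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (ID INTEGER PRIMARY KEY, Surname TEXT, Name TEXT, City TEXT)"
)
conn.execute("CREATE INDEX Person_Sn_Nm_Inx ON Person(Surname, Name)")


def plan(sql):
    """Return SQLite's textual query plan (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Condition on the first index column: the index can be searched.
plan_surname = plan("SELECT * FROM Person WHERE Surname = 'Drake'")
# Condition only on the second index column: the index is not used.
plan_name = plan("SELECT * FROM Person WHERE Name = 'Francis'")
```
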
It is better to declare uniqueness using PRIMARY KEY and UNIQUE constraints
▪ Not only indexes, but also constraints are defined
▪ Constraints have to be used to allow those columns to be the target of foreign key(s)
Indexes with ordering
CREATE [UNIQUE] INDEX index_name ON tab_name(column1 [{ASC|DESC}] [, …]);
▪ Define the ordering for each individual column
▪ Can determine the resulting row ordering in index search queries
Example
CREATE INDEX Employee_Job_Sal_Inx ON Employee(Job, Salary DESC);
Bitmap indexes (only non-unique)
CREATE BITMAP INDEX index_name ON tab_name({column1|expression1}, …);
Example
CREATE BITMAP INDEX Teaching_Day_Inx ON Teaching(DayOfWeek);
CLUSTERED
At most one per table; by default the primary key
If it is defined
▪ Data in the table are ordered according to the index (ISF). In fact, the table forms the leaves of the index tree.
▪ Other indexes point to primary key values instead of row IDs
If it is not defined
▪ Data in the table are not ordered in any particular way (HEAP)
▪ All indexes point to row IDs
NONCLUSTERED
create table onheap(
  id numeric(5) identity (100,10)
    constraint onheap_pk primary key NONCLUSTERED,
  name character varying(10)
    constraint onheap_u_name unique
);
select object_id, name, index_id iid, type typ, type_desc from sys.indexes;
object_id | name |iid|typ| type_desc
1357247890 | category_pk | 1 | 1 | CLUSTERED
1357247890 | category_u_name| 2 | 2 | NONCLUSTERED
1417772108 | NULL | 0 | 0 | HEAP
1417772108 | onheap_u_name | 2 | 2 | NONCLUSTERED
1417772108 | onheap_pk | 3 | 2 | NONCLUSTERED
Equivalent of CLUSTERED in MS SQL (Oracle index-organized table)
▪ The table is ordered according to the primary key, rows form the leaf level of the primary key index
▪ Other indexes point to logical ROWIDs: primary key value + supposed address
CREATE TABLE Person( ID VARCHAR2(11) CONSTRAINT Person_PK PRIMARY KEY, … ) ORGANIZATION INDEX;
Index dropping
ORACLE: DROP INDEX index_name;
MSSQL: DROP INDEX tab_name.index_name;
In Oracle, index information is stored in the views
▪ USER_INDEXES
▪ USER_IND_COLUMNS
In MS SQL, index information is stored in the catalog views
▪ sys.indexes
▪ sys.index_columns
Use the correct type of index for the given selectivity
Do not create all possible indexes over all columns and their combinations
▪ Slows down data updates
▪ Increases the amount of disk space taken
When developing the application, use all available means in the target database
▪ To find the best possible variant of the query
▪ Hint the optimizer only if all other attempts have failed
Optimizers have their limits
▪ Heuristics are used to find the best plan; non-promising branches of the plan space are pruned
▪ Thus, only some combinations of data access paths and table joins are taken into account
One query can be written in many ways
▪ The same semantics
▪ A different way to achieve the result
▪ The time spent can differ many times over!
The query optimizer provides the plan for executing a given query written in a given form
You need to
▪ Know how to find out the plan used
▪ Use the best optimizable form of the query, or help the optimizer with optimization explicitly (when nothing else helps)
A (binary) tree of elementary operations
▪ Evaluated in post-order manner, the root operation provides the complete result
The leaves contain data access paths to sources
▪ Table ROWID direct access
▪ Index UNIQUE SCAN
▪ Index RANGE SCAN
▪ Table FULL SCAN
▪ …
The inner nodes contain
▪ Accesses to table rows according to index-provided addresses
▪ Joins (nested loops, MERGE JOIN, HASH JOIN)
▪ Data sorting operations
▪ Filters for remaining predicates
▪ …
In Oracle
Older RULE BASED optimization (RBO)
▪ Derives the plan from the statement syntax and from available indexes
Newer COST BASED optimization (CBO)
▪ Oracle 8+, recommended for better results
▪ Based on metadata available/computed for tables and columns, computes the overall cost of the plan according to the estimated usage of resources for operation execution (amount of time, space, ordering, data block accesses, …)
▪ Can distinguish the effectiveness of two different index range scans as well as the cost of execution for different constants used in the query
The cost of data access paths in descending order
Table Full-Scan
▪ All data blocks of the table are read one by one. Conditions are checked programmatically for each row.
▪ Can be optimal if the number of matching rows is large enough.
Index-Range-Scan
▪ The interval is located in the index. Other conditions are checked programmatically.
Unique-Index-Scan
▪ At most one matching row is found using a search in the unique index. Other conditions are checked programmatically.
ROWID-Scan
▪ The row is fetched according to its direct address in the database
Join cost for two tables
The optimizer usually tries to use the table with the more expensive data access as the pivotal table (outer loop in nested loops)
Then it searches for corresponding data in the other table for each row found in the pivotal table
If both tables provide only the Full-Scan data access path, data in both tables are temporarily sorted and Merge-Join is used.
How to find out the plan?
In Oracle you should have a table named PLAN_TABLE with the correct schema available (newer versions of Oracle provide it automatically)
▪ It can be created by the script @?\rdbms\admin\utlxplan[.sql]
The optimizer can then store the plan in this table if it is asked to do so
The SQL*Plus client provides the option SET AUTOTRACE {OFF|ON|TRACEONLY}
Oracle provides the statement EXPLAIN PLAN
EXPLAIN PLAN
SET STATEMENT_ID = 'name'
[INTO tab_name]
FOR statement;
Example
EXPLAIN PLAN
SET STATEMENT_ID = 'emp_dept'
FOR
SELECT Emp.*, Dept.Loc
FROM Dept, Emp
WHERE Dept.DeptNo = Emp.Deptno;
Obtaining the execution plan (version 10+)
select plan_table_output
from table(
dbms_xplan.display(
'PLAN_TABLE',{statement_id|null},
{'ALL'|'TYPICAL'|'BASIC'|'SERIAL'}
)
);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 96961 | 1893K| 270 (2)|
| 1 | NESTED LOOPS | | 96961 | 1893K| 270 (2)|
| 2 | INDEX RANGE SCAN | MF_CISPOLATR_SK_ATR_DO_PBCP | 12 | 216 | 3 (0)|
| 3 | COLLECTION ITERATOR PICKLER FETCH| XMLSEQUENCEFROMXMLTYPE | | | |
----------------------------------------------------------------------------------------------------
How to find out the plan? (MS SQL)
The ISQL console can show the plan in textual form
set showplan_text on
go
<statement>
go
Usual recommendation: use placeholders instead of constants in your application and bind application variables to them
Two "different" queries have two distinct (but equal) execution plans. Creating them costs the database time and resources
▪ SELECT * FROM Emp WHERE DeptNo=10;
▪ SELECT * FROM Emp WHERE DeptNo=20;
A query with a placeholder can share one plan
▪ SELECT * FROM Emp WHERE DeptNo=:d;
Sometimes, of course, two CBO plans can be helpful because they are different (if the execution strategy justifiably differs)
▪ SELECT * FROM Soldiers WHERE Gender='M' (about 90% of the data, full scan)
▪ SELECT * FROM Soldiers WHERE Gender='F'
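Binding application variables to a placeholder might look as follows from the application side. This is a sketch using Python's sqlite3, which accepts the same :name placeholder style; the Emp rows and the helper emp_in_dept are hypothetical. The SQL text stays identical for every call and only the bound value changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (EmpNo INTEGER, Ename TEXT, DeptNo INTEGER)")
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(1111, "ADAMS", 10), (2222, "BAKER", 20), (3333, "CLARK", 10)],  # sample data
)

# One statement text with a named placeholder instead of a constant.
QUERY = "SELECT EmpNo FROM Emp WHERE DeptNo = :d ORDER BY EmpNo"


def emp_in_dept(dept_no):
    """Run the shared statement with dept_no bound to the :d placeholder."""
    return [row[0] for row in conn.execute(QUERY, {"d": dept_no})]


dept_10 = emp_in_dept(10)
dept_20 = emp_in_dept(20)
```

Binding also avoids building SQL by string concatenation, which is a separate benefit not discussed in the lecture.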
Write each statement in the same form everywhere in the application
Different styles cause different plans and repeated analysis of statements
▪ SELECT * FROM Emp WHERE Ename LIKE 'A%' AND DeptNo=10;
▪ SELECT * FROM Emp WHERE DeptNo=10 AND Ename LIKE 'A%';
If several non-unique indexes exist on the table, RBO can choose the worse of them
SELECT * FROM Person WHERE Name='John' AND City='Idaho City';
Either all Johns are found and the city is tested programmatically, or vice versa
Usage of one of the indexes can be "disabled" by using some expression in the query
SELECT * FROM Person WHERE CONCAT(Name,'')='John' AND City='Idaho City';
The index on Name cannot be used, the index on City will be used instead
Note: A more sophisticated optimizer could recognize this trick and rewrite the query to its original form.
The overall cost of individual plans is computed using many criteria
▪ Number of I/O operations, rows, bytes, …
▪ The cost of required sorting operations
▪ The cost of HASH operations
The plan with the lowest weighted cost is chosen
Uses statistical information about stored data
▪ Number of different values in indexed columns
▪ Histograms of data values in columns
▪ Lowest/highest values in columns
▪ Number of rows in the table
▪ Average length of one row
▪ Number of data blocks in the table
▪ Number of empty data blocks in the table
▪ Number of NULLs in columns
For a given value or interval, it can estimate
▪ The percentage of matching rows
▪ The percentage of needed blocks
▪ Their volume
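How a histogram yields an estimate of the percentage of matching rows can be illustrated with a toy calculation. The sketch below is not Oracle's or MS SQL's actual algorithm; it assumes simple range buckets with row counts, a uniform distribution of values inside each bucket, and invented numbers.

```python
def estimate_fraction(buckets, lo, hi):
    """Estimate the fraction of rows with a value in the interval [lo, hi).

    buckets is a list of (bucket_lo, bucket_hi, row_count) tuples covering
    half-open ranges; values are assumed uniform within each bucket.
    """
    total = sum(count for _, _, count in buckets)
    matched = 0.0
    for b_lo, b_hi, count in buckets:
        # Length of the part of this bucket that overlaps the query interval.
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        matched += count * overlap / (b_hi - b_lo)
    return matched / total


# Hypothetical histogram of a numeric column: 1000 rows in three buckets.
histogram = [(0, 10, 100), (10, 20, 300), (20, 30, 600)]

selective = estimate_fraction(histogram, 0, 5)       # few rows: index pays off
unselective = estimate_fraction(histogram, 10, 30)   # most rows: full scan likely
```

With such an estimate, a cost-based optimizer can prefer an index range scan for the first interval and a full scan for the second.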
In Oracle, CBO allows creating indexes over expressions, not only columns (RBO cannot use them)
CREATE INDEX Emp_Income_INX ON Emp(Sal+COALESCE(Comm,0));
A query with the identical expression can use the index
SELECT EName FROM Emp WHERE Sal+COALESCE(Comm,0) > 25000;
A query with a modified expression cannot use that index
SELECT EName FROM Emp WHERE COALESCE(Comm,0)+Sal > 25000;
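The same sensitivity to the written form of the expression can be demonstrated outside Oracle. The sketch below uses SQLite through Python's sqlite3, which also supports indexes on expressions (version 3.9 or newer is assumed) and whose planner likewise matches the expression as written; the plan helper and the empty Emp table are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Emp (EmpNo INTEGER PRIMARY KEY, EName TEXT, Sal NUMERIC, Comm NUMERIC)"
)
conn.execute("CREATE INDEX Emp_Income_INX ON Emp(Sal + COALESCE(Comm, 0))")


def plan(sql):
    """Return SQLite's textual query plan (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Expression written exactly as in CREATE INDEX: the index is usable.
plan_same = plan("SELECT EName FROM Emp WHERE Sal + COALESCE(Comm, 0) > 25000")
# Operands swapped: the planner does no algebra, so the index is not matched.
plan_swapped = plan("SELECT EName FROM Emp WHERE COALESCE(Comm, 0) + Sal > 25000")
```
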
Selection of optimizer
ALTER SESSION SET OPTIMIZER_MODE*) =
▪ CHOOSE: the optimizer is chosen according to the presence of statistics
▪ ALL_ROWS: CBO will be used, minimizes the cost of obtaining all rows of the select; indexes are used less. Suitable for batch processing.
▪ FIRST_ROWS: CBO will be used, minimizes the cost of obtaining the first few rows; indexes are used more. Suitable for interactive processing.
▪ RULE: always RBO
*) Note: Older syntax: OPTIMIZER_GOAL
ANALYZE TABLE tab_name {COMPUTE | ESTIMATE | DELETE} STATISTICS [FOR {TABLE | ALL [INDEXED] COLUMNS}];
DBMS_UTILITY.ANALYZE_SCHEMA( 'schema_name',{'compute' | 'delete' | 'estimate'} );
DBMS_STATS.GATHER_SCHEMA_STATS('sch_name');
Views in the data dictionary
▪ INDEX_STATS
▪ USER_TAB_COL_STATISTICS
▪ USER_USTATS
In MS SQL, the option AUTO_CREATE_STATISTICS is enabled by default
▪ Automatic statistics generation
ALTER DATABASE dbname SET AUTO_CREATE_STATISTICS {ON|OFF}
Manually using the procedure sp_createstats
Example: creation of an additional statistic for a two-column value, based on a data sample
CREATE STATISTICS FirstLast ON Person.Contact(FirstName,LastName) WITH SAMPLE 50 PERCENT
Tables
▪ Number of rows
▪ Number of rows in one block
▪ Number of empty/all blocks
▪ …
Columns
▪ Number of different values
▪ Number of NULL values
▪ Histograms of values
▪ …
Hints are written as "plus sign" comments placed immediately after the first keyword of the statement (SELECT/UPDATE/INSERT/DELETE)
▪ SELECT --+ list of hints (seems to be ignored)
▪ SELECT /*+ list of hints */
Can be used for statement-level selection of the optimizer
▪ SELECT /*+ RULE */ * FROM EMP …;
▪ SELECT /*+ FIRST_ROWS */ * FROM EMP …;
Using a hint (except for the RULE hint) always forces the use of CBO based on statistics. If statistics are not computed or are too old, the result can be counterproductive.
General setting for optimizer
CHOOSE
▪ Optimizer chooses the method according to the presence or absence of statistics
RULE
▪ Optimizer uses RBO even if statistics are available. When SQL-92 joins are used in the statement, the RULE hint will be ignored!
ALL_ROWS
▪ Optimizer will minimize the cost of retrieving all rows
FIRST_ROWS, FIRST_ROWS(n)
▪ Optimizer will minimize the cost of retrieving the first / first n rows
Other hints (for data access paths)
FULL(tab_name)
▪ Given table should be full-scanned
INDEX (tab_name index_name)
▪ Given index should be used to retrieve data from the table
NO_INDEX (tab_name index_name)
▪ Given index should not be used to retrieve data from the table
ORDERED
▪ The order of tables in joins should correspond to the order of appearance in the FROM clause
USE_NL, USE_MERGE, USE_HASH
▪ Joins should be implemented using nested loops / merge joins / hash joins
FULL(tab_name)
SELECT /*+ FULL(Emp) */ EmpNo, Ename FROM Emp WHERE EName>'X';
Use FULL SCAN even if the number of retrieved rows is small
If the table has an alias, the hint has to use this alias; this allows using the table several times with different hints
INDEX(tab_name index [index …])
SELECT /*+ INDEX(Emp ENameInx EDeptInx) */ EmpNo, Ename FROM Emp WHERE EName LIKE 'SC%' AND DeptNo>50;
Use one of the listed indexes, do not use other indexes, even if available and suitable
NO_INDEX(tab_name index [index …])
SELECT /*+ NO_INDEX(Emp ENameInx) */ EmpNo, Ename FROM Emp WHERE EName LIKE 'SC%' AND DeptNo>50;
Do not consider the listed indexes during query optimization
ORDERED
SELECT /*+ ORDERED */ EmpNo, Ename FROM Emp, Dept WHERE …;
Tables will be joined in the order of appearance in the FROM clause
It saves time by not considering other join orders
SELECT … OPTION (hint …);
Hints can be chosen from:
{ { HASH | ORDER } GROUP
| { CONCAT | HASH | MERGE } UNION
| { LOOP | MERGE | HASH } JOIN
| FAST number_rows
| FORCE ORDER
| MAXDOP number_of_processors
| OPTIMIZE FOR ( @variable_name { UNKNOWN | = literal_constant } [ , ...n ] )
| …
{ HASH | ORDER } GROUP
▪ Implement GROUP BY using hashing or ordering of the data
{ CONCAT | HASH | MERGE } UNION
▪ Implement UNION without duplicates by simple concatenation, hashing, or merging of the individual results
{ LOOP | MERGE | HASH } JOIN
▪ Implement joins using nested loops / merge joins / hash joins
FAST number_rows
▪ Optimize the query for fast retrieval of the first number_rows rows
FORCE ORDER
▪ Keep the order of tables in joins according to the FROM clause
MAXDOP number_of_processors
▪ Limits the maximal degree of parallelism
OPTIMIZE FOR ( @variable_name { UNKNOWN | = literal_constant } [ , ...n ] )
▪ If the statement contains a variable (placeholder), assume either the given value or an unknown value