View
218
Download
0
Category
Preview:
Citation preview
1
Logical I/O
Julian Dyke
Independent Consultant
Web Version
juliandyke.com
© 2005 Julian Dyke
2
juliandyke.com
© 2005 Julian Dyke
Agenda
Introduction
Logical I/Os
Buffer Cache Behaviour
Statistics
Conclusion
3
juliandyke.com
© 2005 Julian Dyke
Logical I/Os Logical I/Os are read operations
Buffers are cached in shared memory
Most logical I/Os can be satisfied from cache
The remainder will result in physical I/Os
Logical I/Os include current reads consistent reads
4
juliandyke.com
© 2005 Julian Dyke
Current Reads Current reads
Current version of block
Can be updated
Can be dirty
Includes all changes
Only one current version of block in buffer cache
Only one current version of block across all instances
Can be used to construct consistent versions
5
juliandyke.com
© 2005 Julian Dyke
Consistent Reads Consistent reads
Potentially historic version of block
Consistent to a specific System Change Number (SCN)
Cannot be updated
Cannot be dirty
Can be used to construct consistent versions
Can have multiple versions of same block in buffer cache
Can be
single block (sequential reads)
multi block (scattered reads)
Can be traced using events 10200 / 10201
6
juliandyke.com
© 2005 Julian Dyke
Logical I/O statistics session logical reads statistic
Total number of logical reads in session Unreliable at system level At session level
session logical reads = db block gets + consistent gets
db block gets statistic Number of current reads
consistent gets statistic Number of consistent reads
7
juliandyke.com
© 2005 Julian Dyke
Buffer Pools There are up to eight buffer pools
DEFAULT KEEP, RECYCLE Oracle 8.0 and above 2K, 4K, 8K, 16K and 32K Oracle 9.0 and above
32K not available on all platforms
Cannot have non-standard block size same as DEFAULT block size
8
juliandyke.com
© 2005 Julian Dyke
Buffer Pool Headers One for each buffer pool (usable or unusable)
Externalized in V$BUFFER_POOL V$BUFFER_POOL_STATISTICS
Based on X$KCBWBPD
Created in shared pool permanent memory when instance is started
Contain one or more working sets
9
juliandyke.com
© 2005 Julian Dyke
X$KCBWBPD Externalises buffer pool header
ADDR RAW(4)
INDX NUMBER
INST_ID NUMBER
BP_NAME VARCHAR2(20)
BP_ID NUMBER
BP_BLKSZ NUMBER
BP_GRANSZ NUMBER
BP_BUFPERGRAN NUMBER
BP_LO_SID NUMBER
BP_HI_SID NUMBER
BP_SET_CT NUMBER
BP_SIZE NUMBER
BP_STATE NUMBER
BP_CURRGRANS NUMBER
BP_TGTGRANS NUMBER
BP_PREVGRANS NUMBER
Buffer Pool Name
Buffer Pool ID
Block Size
Granule Size
Buffers per Granule
Minimum Working Set ID
Maximum Working Set ID
Number of Working Sets
Number of Buffers
10 juliandyke.co
m
© 2005 Julian Dyke
Hash Buckets Hash value of each block calculated from
Data Block Address (DBA) Block Class
Number of hash buckets dependent on number of buffers in cache e.g.
# buffers 500 6000
# hash buckets 64 1024
Each hash bucket contains Cache Buffers Chains latch Pointer to array of double linked lists
11
juliandyke.com
© 2005 Julian Dyke
Hash Buckets
BH BH
BH
BH
# hash chains
cachebufferschainlatch
12 juliandyke.co
m
© 2005 Julian Dyke
Buffer Headers Each buffer header describes contents of one buffer
All buffers accessed via buffer header
Buffer header contains pointers to Buffer Cache Buffers Chains latch
Buffer header includes double linked lists for Cache Buffers Chain list Replacement list Users list Waiters list
13 juliandyke.co
m
© 2005 Julian Dyke
X$BH Externalises buffer headers
ADDR RAW(4)
INDX NUMBER
INST_ID NUMBER
HLADDR RAW(4)
BLSIZ NUMBER
NXT_HASH RAW(4)
PRV_HASH RAW(4)
NXT_REPL RAW(4)
PRV_REPL RAW(4)
TS# NUMBER
FILE# NUMBER
DBARFIL NUMBER
DBABLK NUMBER
OBJ NUMBER
BA RAW(4)
CR_SCN_BAS NUMBER
Hash List Address
Block Size
Hash List
Replacement List
Tablespace#
Absolute File Number
Relative File Number
Block Number
Object ID
Buffer Address
14 juliandyke.co
m
© 2005 Julian Dyke
Working Sets Introduced in Oracle 8.1.5
Each buffer pool contains one or more working sets
Working set header created in shared pool permanent memory associated with one DBWn process protected by cache buffers lru chain latch
Each working set maintains separate set of LRU lists
15 juliandyke.co
m
© 2005 Julian Dyke
LRU Lists In Oracle 9.2 each working set maintains 4 LRU lists
LRU - replacement list - normal blocks
LRU-W - write list - dirty blocks
LRU-XO - object list - buffers involved in DROP TRUNCATE
LRU-XR - range list - buffers involved in ALTER TABLESPACE BEGIN BACKUP ALTER TABLESPACE END BACKUP ALTER TABLESPACE OFFLINE ALTER TABLESPACE READ ONLY
16 juliandyke.co
m
© 2005 Julian Dyke
Main and Auxiliary Lists Each LRU contains
main list auxiliary list
Auxiliary list includes dirty buffers identified by DBWn processes buffers being written
Buffers are moved from main to auxiliary list by DBWn processes to avoid unnecessary scans
Processes scan auxiliary lists first for free buffers
Buffers also allocated to auxiliary list at startup after FLUSH_CACHE
17 juliandyke.co
m
© 2005 Julian Dyke
Working Set Lists
Hot Cold
BufferHeader
MAIN
AUX
MAIN
AUX
MAIN
AUX
MAIN
AUX
ReplacementList
Working SetHeader
WriteList
ObjectList
RangeList
18 juliandyke.co
m
© 2005 Julian Dyke
Replacement List In Oracle 8.1.5 and above a mid-point insertion algorithm is
used Buffer cache has a hot end and a cold end Buffers are inserted at mid-point Mid-point is head of cold end Starts at hot end - moves down cache Maximum mid-point determined by _db_percent_hot_default Default value is 50%
Head of Hot End
Head of Cold End
Hot End Cold EndReplacement List
19 juliandyke.co
m
© 2005 Julian Dyke
X$KCBWDS Externalises working set header
ADDR RAW(4)
INDX NUMBER
INST_ID NUMBER
SET_ID NUMBER
DBWR_NUM NUMBER
BLK_SIZE NUMBER
Working Set ID
Database Writer Number
MAIN Replacement List
AUX Replacement List
Number of buffers on MAIN Replacement List
Insertion Point
Maximum number of Hot Buffers
Number of Hot Buffers
NXT_REPL RAW(4)
PRV_REPL RAW(4)
NXT_REPLAX RAW(4)
PRV_REPLAX RAW(4)
CNUM_REPL RAW(4)
ANUM_REPL RAW(4)
COLD_HD RAW(4)
HBMAX NUMBER
HBUFS NUMBER
NXT_WRITE RAW(4)
Number of buffers on AUX Replacement List
20 juliandyke.co
m
© 2005 Julian Dyke
Touch Count Each buffer header maintains
touch count timestamp
Touch count represents number of 3 second intervals in which buffer has been accessed since buffer last read into cache touch count last reset
Each time buffer is accessed if timestamp more than 3 seconds ago
increment touch count set timestamp to current time
21 juliandyke.co
m
© 2005 Julian Dyke
Touch Count When buffer reaches tail of cold end
If touch count >= 2 then buffer is moved to hot end
Otherwise used as next free buffer
Hot criteria determined by _db_aging_hot_criteria default value is 2 touches
Time interval determined by _db_aging_touch_time default value is 3 seconds
22 juliandyke.co
m
© 2005 Julian Dyke
Single versus Multi-Block Reads Single block reads
Used with current reads Can be used with consistent reads Waits recorded by db file sequential read
Multi block reads Frequently used with consistent reads Maximum number of physical blocks read specified by
DB_FILE_MULTIBLOCK_READ_COUNT Waits recorded by db file scattered read Blocks moved to cold end of buffer cache
23 juliandyke.co
m
© 2005 Julian Dyke
Head of Cold End
Head of Hot End
Single-Block Reads
92
0
34
3
72
4
52
1
71
2
66
0
49
0
42
1
45
2
52
1
71
2
66
0
42
1
11
1
52
1
71
2
11
1
42
1
42
2
71
0
92
0
34
3
72
4
45
2
11
1
52
1
42
2
33
1
45
2
11
1
42
2
33
1
34
4
92
0
34
4
72
4
45
2
11
1
42
0
33
1
71
0
87
1
87
1
72
4
33
1
45
2
Read Block 42
Get first available buffer from cold endUpdate buffer contentsInsert buffer at head of cold end
Read Block 11
Get first available buffer from cold endUpdate buffer contentsInsert buffer at head of cold end
Read Block 42
Update touch count for block 42
Read Block 33
Move block 71 to head of hot endSet touch count on block 71 to zeroGet first available bufferfrom cold endUpdate buffer contentsInsert buffer at head of coldend
Read Block 34
Update touch countfor block 34
Read Block 87
Move block 42 to headof hot endSet touch counton block 42 to zeroGet first available bufferfrom cold endUpdate buffer contentsInsert buffer at head of coldend
STOP
Block Number
Touch Count
24 juliandyke.co
m
© 2005 Julian Dyke
Read Block 27 - SCN 132
Get first available bufferfrom cold endRead current version of block 27 into bufferApply undo to rollback block to SCN 132Insert buffer at head of cold end
Read Block 27 - SCN 128
Get first available buffer from cold endRead consistent version of block 27 into bufferApply undo to rollback block to SCN 128Insert buffer at head of cold end
Consistent Reads
Head of Cold End
Head of Hot End
Current Block
Consistent Block
5640 85
27
150
3417 95 3327
150
27
150
27
132
333417 9527
150
27
132
27
132
27
132
27
128
3417 9527
150
27
132
27
128
STOP
BlockNumber
SystemChangeNumber
25 juliandyke.co
m
© 2005 Julian Dyke
Multi-Block Reads
Head of Cold End
Head of Hot End
Read Block 1
Get first four available buffers from cold endRead next four blocks into buffers
1 2 3 4
Insert buffers at head of cold end
12 13 2 14 3 2 1
Move block 1 to cold end
121
Read Block 2
Move block 2 to cold end
21 321 3 4
Read Block 3
Move block 3 to cold end
Read Block 4
Move block 4 to cold end
Read Block 5
Get next four available buffers from cold endRead next four blocks into buffersInsert buffers at head of cold endMove block 5 to cold end
4 3 2 15
5 56
76
7 6 5
8
78 5 56 5 65 6 75 6 7 8
Read Block 6
Move block 6 to cold end
Read Block 7
Move block 7 to cold end
Read Block 8
Move block 8 to cold end
STOP
DB_FILE_MULTIBLOCK_READ_COUNT = 4
26 juliandyke.co
m
© 2005 Julian Dyke
Dirty Blocks When blocks are updated they are marked dirty
Changes immediately written to redo buffer
Changes written back to disk asynchronously by DBWn process
DBWn process scans from cold end of MAIN replacement list moves dirty blocks to auxiliary list writes dirty blocks back to disk
Written blocks remain on auxiliary list until re-used
27 juliandyke.co
m
© 2005 Julian Dyke
Buffer Pinning In Oracle 8.0 and above, Oracle uses pinning to reduce
number of logical I/Os
If buffer will be accessed again by the statement, it is pinned in the buffer cache
Frequently used with index scans
Only appears to be used with consistent gets not observed with current gets
If pinning was not implemented, number of logical I/Os would significantly increase
28 juliandyke.co
m
© 2005 Julian Dyke
Buffer Pinning Statistics buffer is not pinned count statistic
Number of pin-able buffers not pinned by this session when visited
Equivalent to number of logical I/Os (for that part of statement)
buffer is pinned count statistic Number of buffers already pinned by this session when
visited
Number of buffers visited = buffer is not pinned count + buffer is pinned count
29 juliandyke.co
m
© 2005 Julian Dyke
Consistent Gets Statistics consistent gets - examination statistic
Number of consistent gets that could be immediately performed without pinning the buffer
Generally apply to indexes Require one latch get Included in consistent gets statistic
no work - consistent read gets statistic Number of consistent gets that could be performed
without requiring rollback or cleanout Generally apply to tables Require two latch gets Included in consistent gets statistic
30 juliandyke.co
m
© 2005 Julian Dyke
Read Block 1
Segment Header 1
Read Block 1
Segment Header 2
Read Block 1
Segment Header 3
Table T1
Segment Header
Data Blocks
EmptyBlocks
UnusedBlocks
High Water Mark
Full Table Scan
session logical reads
consistent gets
no work - consistent read gets
buffer is not pinned count
table scans (short tables)
table scans rows gotten
table scans blocks gotten
1
1
2
2
3
3
1
1
1
4
1
4
4
5
5
2
2
8
2
6
6
33
3
12
7
7
4
4
4
16
8
8
5
5
20
5
9
96
6
24
6
7
10
10
7
28
7
8
11
11
8
32
8
9
12
12
9
36
9
10
13
13
10
40
10
11
14
14
11
44
11
12
15
15
12
48
12
13
16
16
13
52
13
14
17
17
14
56
14
15
15
1
60
15
18
18
16
16
1
64
16
19
19
17
17
1
68
17
20
20
18
18
1
72
18
21
21
18
18
1
72
18
22
22
18
18
1
72
18
23
23
Read Block 2
Data Block
Read Block 3
Data Block
Read Block 4
Data Block
Read Block 5
Data Block
Read Block 6
Data Block
Read Block 7
Data Block
Read Block 8
Data Block
Read Block 9
Data Block
Read Block 10
Data Block
Read Block 11
Data Block
Read Block 12
Data Block
Read Block 13
Data Block
Read Block 14
Data Block
Read Block 15
Data Block
Read Block 16
Data Block
Read Block 17
Data Block
Read Block 18
Data Block
Read Block 19
Data Block
Read Block 20
Empty Block
Read Block 21
Empty Block
SELECT SUM(c2) FROM t1;
0 SELECT STATEMENT1 0 TABLE ACCESS (FULL) OF 'T1'
STOP
31 juliandyke.co
m
© 2005 Julian Dyke
Full Table Scan - Summary In Oracle 9.2
segment header initially read 3 times segment header read again every 10 extents
All blocks are read up to high water mark
For longer tables blocks can be prefetched
Algorithm differs for Automatic Segment Space Managed tablespaces
32 juliandyke.co
m
© 2005 Julian Dyke
Table T1
Unique Scan
SELECT c2 FROM t1 WHERE c1 = 42;
0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (UNIQUE SCAN) OF 'I1'
session logical reads
consistent gets
consistent gets - examination
buffer is not pinned count
index fetch by key
table fetch by rowid
rows fetched by callback
Index I1
Leaf Blocks
Branch Block
1
1
1
2
2
2
1
3
3
3
2
1
1
1
Read Index Block 1
Branch Block
Read Index Block 4
Leaf Block
Read Table Block 3
Data Block
STOP
33 juliandyke.co
m
© 2005 Julian Dyke
Index Organised Table
SELECT c2 FROM t1 WHERE c1 = 42;
0 SELECT STATEMENT1 0 INDEX (UNIQUE SCAN) OF 'I1'
session logical reads
consistent gets
consistent gets - examination
index fetch by key
Index I1
Leaf Blocks
Branch Block
1
1
1
Read Index Block 1
Branch Block
2
2
2
1
Read Index Block 3
Leaf Block
STOP
34 juliandyke.co
m
© 2005 Julian Dyke
Single Table Hash Cluster
SELECT c2 FROM t1 WHERE c1 = 42;
0 SELECT STATEMENT1 0 TABLE ACCESS (HASH) OF 'T1'
session logical reads
consistent gets
no work - consistent read gets
cluster key scans
cluster key scan block gets
buffer is not pinned count
Table T1
Leaf Blocks
1
1
1
Read Table Block 7
Data Block
1
1
1
1
1
1
STOP
35 juliandyke.co
m
© 2005 Julian Dyke
Clustering Factor Measures relationship between index entries and
corresponding data blocks
Used by CBO to calculate cost of using index
Good clustering factor approaches number of blocks in table; Bad clustering factor approaches number of rows in table
CBO will favour indexes with a better clustering factor
Bad Clustering Factor Good Clustering Factor
36 juliandyke.co
m
© 2005 Julian Dyke
Table T1
Index I2
Leaf Blocks
Branch Block
Range Scan - Bad Clustering FactorSELECT c2 FROM t1 WHERE c3 = 42;
0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (RANGE SCAN) OF 'I2'
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
Read Index Block 1
Branch Block
Read Index Block 3
Leaf Block
Read Table Block 2
Data Block
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 6
Data Block
1
11
1
2
21
1
1
3
31
1
1
2
1
3
31
1
1
1
2
1
4
41
1
1
2
3
2
4
41
1
2
2
3
2
5
51
1
2
3
4
3
5
51
1
3
3
4
3
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 10
Data Block
Read Index Block 3
Leaf Block (Pinned)
6
61
1
3
4
5
4
Read Table Block 14
Data Block
6
61
1
4
4
5
4
7
71
1
4
5
6
5
Read Index Block 3
Leaf Block (Pinned)
7
71
1
5
5
6
5
8
81
1
5
6
7
6
Read Table Block 18
Data Block
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 22
Data Block
STOP
37 juliandyke.co
m
© 2005 Julian Dyke
Table T1
Index I3
Leaf Blocks
Branch Block
Range Scan - Good Clustering FactorSELECT c2 FROM t1 WHERE c4 = 42;
0 SELECT STATEMENT1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T1'2 1 INDEX (RANGE SCAN) OF 'I3'
session logical reads
consistent gets
consistent gets - examination
no work - consistent read gets
index scans kdiixs1
buffer is not pinned count
buffer is pinned count
table fetch by rowid
Read Index Block 1
Branch Block
Read Index Block 4
Leaf Block
Read Table Block 8
Data Block
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 8
Data Block (Pinned)
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 8
Data Block (Pinned)
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 8
Data Block (Pinned)
Read Index Block 3
Leaf Block (Pinned)
1
11
1
Read Table Block 9
Data Block
Read Index Block 3
Leaf Block (Pinned)
Read Table Block 9
Data Block (Pinned)
2
21
1
1
3
31
1
1
2
1
3
31
1
1
1
2
1
3
31
1
2
2
2
1
3
31
1
3
2
2
1
3
31
1
4
3
2
1
3
31
1
5
3
2
1
3
31
1
6
4
2
1
3
31
1
7
4
2
1
4
41
1
7
5
3
2
4
41
1
8
5
3
2
4
41
1
9
6
3
2
STOP
38 juliandyke.co
m
© 2005 Julian Dyke
Clustering Factor - SummaryBad
ClusteringFactor
GoodClustering
Factor
session logical reads 8 4
consistent gets 8 4
consistent gets - examination 1 1
no work - consistent gets 6 2
index scans kdiixs1 1 1
buffer is not pinned count 7 3
buffer is pinned count 5 9
table fetch by rowid 6 6
Higher clustering factor Reduces number of logical I/Os required Increases number of buffers that can be pinned
39 juliandyke.co
m
© 2005 Julian Dyke
Row Prefetching For queries returning more than one row specify maximum
number of rows per round trip
If prefetch size too small Increased number of round trips Degrades performance
If prefetch size too large Increased number of packets May degrade performance
40 juliandyke.co
m
© 2005 Julian Dyke
Row Prefetching Applies to
OCI OCI_ATTR_PREFETCH_ROWS Pro*C Host Array JDBC setRowPrefetch () PL/SQL BULK COLLECT SQL*Plus SET ARRAYSIZE
OCI default prefetch value is 1 (returns 2 rows per fetch)
res = OCIAttrSet( (dvoid *)stmt, (ub4)OCI_HTYPE_STMT, (dvoid *)&prefetchRows, (ub4)0, (ub4)OCI_ATTR_PREFETCH_ROWS, (OCIError *)err);
41 juliandyke.co
m
© 2005 Julian Dyke
Row Prefetching Example - full table scan
1000 row table 31 blocks (+ segment header)
Prefetch Size
Co
nsi
sten
t G
ets
Prefetch Size
Consistent Gets
1 1003
2 518
3 337
4 276
5 227
10 130
20 82
50 53
100 43
250 37
500 35
1000 34
42 juliandyke.co
m
© 2005 Julian Dyke
Thank you for your interest
For more information and to provide feedback
please contact me
My e-mail address is:info@juliandyke.com
My website address is:
www.juliandyke.com
Recommended