Inside RAC
Julian Dyke, Independent Consultant
Web Version: juliandyke.co
© 2006 Julian Dyke

Single database on shared storage accessible to all nodes
Instances exchange information over an interconnect network
[Diagram: Node 1 running Instance 1, Node 2 running Instance 2]
Object to which access must be controlled at instance level
Enqueue
Global Resources
Object to which access must be controlled at cluster level
Global Enqueue
Locks and enqueues which need to be consistent between all instances
Contains convert and write queues
Distributed across all instances in cluster
Maintained by GCS and GES
Global Cache Services (GCS)
Coordinates access to database blocks for instances
Global Enqueue Services (GES)
Controls access to other resources (locks) including library cache and dictionary cache
Performs deadlock detection
Up to 20 in Oracle 10.1
LMS0-LMS9, LMSa-LMSj
Up to 36 with the extended range
LMS0-LMS9, LMSa-LMSz
In Oracle 10.1 and above, number of GCS server processes can be configured using gcs_server_processes parameter
Default value is 1 (single CPU system)
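The naming scheme for the GCS server processes can be sketched in a few lines of Python. This is an illustration only: the helper name `lms_process_names` is hypothetical, and the only facts taken from the slide are the suffix ranges 0-9 followed by a-z.

```python
import string

# Suffixes in the order implied by the slide: 0-9 first, then a-z.
_SUFFIXES = string.digits + string.ascii_lowercase  # 36 characters in total


def lms_process_names(count):
    """Return the first `count` LMS process names (hypothetical helper)."""
    if not 1 <= count <= len(_SUFFIXES):
        raise ValueError("count must be between 1 and 36")
    return ["LMS" + suffix for suffix in _SUFFIXES[:count]]
```

With 20 processes the last name is LMSj, and with 36 it is LMSz, which matches the two ranges listed.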
Formerly known as lock process
In 9.0.1 and below, number of lock processes may be configurable using _gc_lck_procs parameter
Manages requests for global enqueues
Updates status of enqueues when granted to / revoked from an instance
One LMD0 process per instance
In 8.1.7 and below number of lock daemons may be configurable using _lm_dlmd_processes parameter
Monitors cluster to maintain global enqueues and resources
Manages
Creates subdirectories in BACKGROUND_DUMP_DEST directory
In Oracle 9.0.1 and above can be disabled using _diag_daemon parameter
Do not try this on a production system
Instance specific
Can contain
X$KSLLD => V$LATCHNAME
X$KSUSD => V$STATNAME
X$KSUSE => V$SESSION
X$KSUPR => V$PROCESS
X$KQLFXPL => V$SQL_PLAN
Dynamic Performance Views
In a RAC environment each V$ view has an equivalent GV$ view
GV$ view includes INST_ID column. For example:

V$SGA
  NAME     VARCHAR2(20)
  VALUE    NUMBER

GV$SGA
  INST_ID  NUMBER
  NAME     VARCHAR2(20)
  VALUE    NUMBER

In Oracle 9.2 and below PARALLEL_MIN_SERVERS must be >= number of hosts to use GV$ views
In Oracle 10.1 and above PZnn background processes are used to return data on remote hosts e.g. PZ99
Synonym Name
View Name
SQL> ORADEBUG LKDEBUG HELP
-r <resource pointer> Resource Object
-b <gcs shadow pointer> GCS shadow Object
-p <process id> client pid
-P <process pointer> Process Object
-O <i1> <i2> <types> Oracle Format resname
-a <res/lock/proc> all <res/lock/proc> pointer
-A <res/lock/proc> all <res/lock/proc> contexts
-a <res> [<type>] all <res> pointers by an optional type
-a convlock all converting enqueue (pointers)
-A convlock all converting enqueue contexts
-a convres all res ptr with converting enqueues
-A convres all res contexts with converting enqueues
-a hashcount list all resource hash bucket counts
-t Traffic controller info
-k GES SGA summary info
-m pkey <objectno> request for remastering this object at current instance
-m dpkey <objectno> request for dissolving remastering of this object at current instance
Shared memory areas can be dumped to trace file using
ORADEBUG SETMYPID
ORADEBUG IPC
$ sqlplus /nolog
Connected
0 0 65537 0x00000020000000 0x00000020000000
Subarea size Segment size
1 2 65537 0x00000020400000 0x00000020400000
Subarea size Segment size
2 1 65537 0x0000002012a000 0x0000002012a000
Subarea size Segment size
3 3 65537 0x00000030400000 0x00000030400000
Subarea size Segment size
Redo buffers
Keep and Recycle cache Oracle 8.0 and above
2K, 4K, 8K, 16K and 32K cache Oracle 9.0.1 and above
NAME                             BYTES     RESIZEABLE
Redo Buffers                     2973696   No
Granule Size                     4194304   No
Startup overhead in Shared Pool  46137344  No
Free SGA Memory Available        0

NAME        VARCHAR2(32)
BYTES       NUMBER
RESIZEABLE  VARCHAR2(3)
Granule size is 4 MB if SGA_MAX_SIZE <= 128 MB
Granule size is 16 MB if SGA_MAX_SIZE > 128 MB
If SGA_MAX_SIZE not set explicitly then defaults to sum of individual pool parameters
SGA_MAX_SIZE cannot be dynamically modified
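The granule size rule above can be written as a small Python sketch. The helper names are hypothetical, and the sketch assumes only the two thresholds quoted on the slide; it does not model the `_ksmg_granule_size` override or any behaviour in other releases.

```python
MB = 1024 * 1024


def default_granule_size(sga_max_size_bytes):
    """Granule size implied by the rule above (hypothetical helper)."""
    return 4 * MB if sga_max_size_bytes <= 128 * MB else 16 * MB


def granules_needed(component_bytes, granule_bytes):
    """Whole granules needed to hold a component, rounding up."""
    return -(-component_bytes // granule_bytes)
```

For an SGA of 128 MB or less this gives 4194304 bytes, the same figure as the Granule Size row shown earlier. At that granule size a 76M shared pool would occupy 19 granules.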
SGA_MAX_SIZE dependent on SGA_TARGET
Granule size can be controlled using _ksmg_granule_size unsupported parameter
Large Pool 4M
Shared Pool 76M
Java Pool 4M
This is an example of an SGA mapped using X$KSMGE
In Oracle 10.2 describes around 700 memory areas
Shared Pool
gcs shadow locks freelist
ges enqueue multiple free
ges resource hash table
Significant RAC areas in Oracle 10.2
In Oracle 9.2 all five structures were stored in segmented arrays
Name
Size(bytes)
Location
Description of chunk type
Some RAC components are stored in the shared pool heap
SELECT ksmchcom, SUM(ksmchsiz), COUNT(*)
Array too large to fit in granule
Array may grow dynamically
enqueues (locks)
Segmented arrays externalised in X$KSMDD
GCS Resources
GCS Shadows
GES Enqueues
GCS Resources
GCS Shadows
GES Resources
GES Shadows
GES Messages
Each heap extent occupies a single granule
Each extent contains one or more chunks
Each heap has a header containing
list of used chunks
list of free chunks
Can only be accessed using KSMCHDS e.g.
SELECT * FROM x$ksmhp
Buffer headers are stored in same granule as buffers
Buffer headers include
[Diagram: a granule containing buffer headers and buffers]
[Diagram: LRU list walkthrough. Captions: Read Block 11; Update buffer contents; Read Block 42; Update touch count for block 42; Read Block 33; Get first available buffer; Set touch count]
hot buffers moved to head of LRU statistic measures the number of hot buffers moved from the cold end to the hot end of the LRU
free buffers requested measures the number of times a current get or consistent get is performed against a buffer that is not already in cache.
[Diagram: LRU list contents as blocks 1 to 8 are read. Captions: Read next four blocks into buffers; Read Block 4; Read Block 5; Insert buffers at head of cold end; Move block 5 to cold end; Read Block 7; Read Block 8]
[Diagram: block transfer between instances, showing resource status and lock modes S, N and X. Captions: Request granted; Transfer block to Instance 1 for exclusive access]
Note that Instance 1 will create a past image (PI) of the dirty block
Past Images
When an instance passes a dirty block to another instance it
Flushes redo buffer to redo log
Retains past image (PI) of block in buffer cache
PI is retained until another instance writes block to disk
Used to reduce recovery times
Recorded in V$BH.STATUS as PI
Based on X$BH.STATE (value 8 in Oracle 10.2)
Assume table t1 contains a single row in block 42
Instance 1 updates column to 1324
  Block 42 is read from disk
  Undo/Redo written to Redo Log 1
Instance 1 updates column to 1325
  Undo/Redo written to Redo Log 1
Instance 1 updates column to 1326
  Undo/Redo written to Redo Log 1
Instance 1 updates column to 1327
  Undo/Redo written to Redo Log 1
Instance 1 updates column to 1328
  Undo/Redo written to Redo Log 1
Instance 2 updates column to 1329
  GCS transfers block from Instance 1 to Instance 2
  Instance 1 makes block 42 a Past Image block
Instance 2 Crashes
  Contents of buffer cache are lost
  DBWR has not written changes to block 42 back to disk yet
Instance 1 must perform recovery for Instance 2
  Block 42 needs recovery
  Undo/redo is applied from
  Block 42 is subsequently written back to disk by DBWR
[Diagram: column value in block 42 advancing from 1323 to 1329]
[Diagram: read transfer involving Instance 3, with lock modes S and N. Caption fragment: "to avoid unnecessary lock conversions"]
Intended to prevent unnecessary lock downgrades when other instances only require read-only copies
For write to read transfers
Writing instance retains X lock
Reading instance retains null lock
If _fairness_threshold reached then
Reading instance receives S lock
_fairness_threshold default value is 4
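The counting behaviour described above can be modelled with a toy Python class. This is a sketch under stated assumptions, not Oracle's implementation: the class and method names are hypothetical, the downgrade is assumed to happen on the request that brings the counter up to the threshold, and what happens after the downgrade is not covered by the slides, so the model simply keeps handing out S.

```python
FAIRNESS_THRESHOLD = 4  # default value of _fairness_threshold quoted above


class WriterBlock:
    """Toy model of one block held in X mode by a writing instance."""

    def __init__(self, threshold=FAIRNESS_THRESHOLD):
        self.threshold = threshold
        self.lock = "X"
        self.counter = 0

    def serve_read(self):
        """Serve one read request and return the lock mode the reader gets."""
        if self.lock == "S":
            return "S"  # assumption: once downgraded, readers share the block
        self.counter += 1
        if self.counter >= self.threshold:
            self.lock = "S"  # writer downgrades from X to S
            return "S"
        return "N"  # reader receives a copy under a null lock


block = WriterBlock()
modes = [block.serve_read() for _ in range(5)]
```

With the default threshold the first three reads are served under a null lock and the fourth triggers the downgrade, so `modes` is N, N, N, S, S.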
[Diagram: Instance 1 holds the block in X mode and serves repeated reads from Instance 2]
Instance 1 sets counter to 1
Instance 1 sends block to Instance 2
Instance 1 sets counter to 2
Instance 1 sends block to Instance 2
Instance 2 receives block with Null lock
Instance 1 sets counter to 3
Instance 1 sends block to Instance 2
Instance 2 receives block with Null lock
Instance 2 requests consistent read
Instance 1 sets counter to 4
Instance 1 downgrades lock from X to S
[Diagram labels: Buffer Header, Lock Element, GCS Client]
_gcs_resources parameter
GCS Enqueues (Shadows/Clients)
_gcs_shadow_locks parameter
ALTER SESSION SET EVENTS
id1: 0x3591 id2: 0x10000 obj: 181 block: (1/13713)
lock: SL rls: 0x0000 acq: 0x0000 latch: 0
flags: 0x41 fair: 0 recovery: 0 fpin: 'kdswh05: kdsgrp'
bscn: 0x0.18a9c bctx: (nil) write: 0 scan: 0x0 xflg: 0 xid: 0x0.0.0
GCS CLIENT 0x21fecd60,1 sq[(nil),(nil)] resp[(nil),0x3591.10000] pkey 181
grant 1 cvt 0 mdrole 0x21 st 0x20 GRANTQ rl LOCAL
master 1 owner 0 sid 0 remote[(nil),0] hist 0x7c
history 0x3c.0x1.0x0.0x0.0x0.0x0. cflag 0x0 sender 2 flags 0x0 replay# 0
disk: 0x0000.00000000 write request: 0x0000.00000000
pi scn: 0x0000.00000000
pkey 181
hv 107 [stat 0x0, 1->1, wm 32767, RMno 0, reminc 6, dom 0]
kjga st 0x4, step 0.0.0, cinc 8, rmno 10, flags 0x0
lb 0, hb 0, myb 178, drmb 178, apifrz 0
id1: 0x6a39 id2: 0x10000 obj: 74 block: (1/27193)
lock: SL rls: 0x0000 acq: 0x0000 latch: 0
flags: 0x41 fair: 0 recovery: 0 fpin: 'kdswh05: kdsgrp'
bscn: 0x0.26992 bctx: (nil) write: 0 scan: 0x0 xflg: 0 xid: 0x0.0.0
GCS SHADOW 0x237f43a0,1 sq[0x2ee64e8c,0x2eff3858] resp[0x2ee64e74,0x6a39.10000] pkey 74
grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
.....
grant 0x2eff3858 cvt (nil) send (nil),0 write (nil),0@65535
.....
GCS SHADOW 0x2eff3858,1 sq[0x237f43a0,0x2ee64e8c] resp[0x2ee64e74,0x6a39.10000] pkey 74
grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
.....
GCS SHADOW 0x237f43a0,1 sq[0x2ee64e8c,0x2eff3858] resp[0x2ee64e74,0x6a39.10000] pkey 74
grant 1 cvt 0 mdrole 0x21 st 0x40 GRANTQ rl LOCAL
.....
Block DBA is reported by X$KJBR.KJBRNAME
Names have the format:
[<block_number>][<file_number>],[BL]
Ordering by X$KJBR.KJBRNAME is difficult because the resource names do not collate e.g.
[0x900][0x70000],[BL]
[0x90][0x70000],[BL]
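One way around the collation problem is to parse the hexadecimal fields and sort numerically on the client side. The Python sketch below assumes resource names arrive as plain strings in the format shown above; the helper names `parse_bl_name` and `sort_key` are hypothetical.

```python
import re

# Matches names such as [0x900][0x70000],[BL]
_BL_NAME = re.compile(r"\[(0x[0-9a-fA-F]+)\]\[(0x[0-9a-fA-F]+)\],\[BL\]")


def parse_bl_name(name):
    """Split a '[block][file],[BL]' resource name into two integers."""
    match = _BL_NAME.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a BL resource name: {name!r}")
    block_field, file_field = (int(value, 16) for value in match.groups())
    return block_field, file_field


def sort_key(name):
    """Order by the file field first, then by the block field."""
    block_field, file_field = parse_bl_name(name)
    return (file_field, block_field)


names = ["[0x900][0x70000],[BL]", "[0x90][0x70000],[BL]"]
ordered = sorted(names, key=sort_key)
```

A plain string sort places [0x900] before [0x90] because the character 0 sorts ahead of the closing bracket, whereas the numeric key places block 0x90 first.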
_lm_contiguous_res_count
Specifies number of contiguous blocks that will hash to the same HV bucket
Defaults to 128
Block Mastering
In Oracle 9.2 (and probably 10.1) block mastering determined by hash function
Algorithm applied to groups of 128 contiguous blocks
In two node cluster
In three node cluster
Beware of small hot tables and indexes....
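The warning about small hot objects follows from the grouping of contiguous blocks. The Python sketch below illustrates the effect only: the slides do not give the real hash function, so a plain modulo on the group index is used as a stand-in, and the function names are hypothetical. Instances are numbered from 0.

```python
CONTIGUOUS_RES_COUNT = 128  # default of _lm_contiguous_res_count quoted above


def block_group(block_number, contiguous=CONTIGUOUS_RES_COUNT):
    """Index of the contiguous group that a block falls into."""
    return block_number // contiguous


def toy_master(block_number, instances, contiguous=CONTIGUOUS_RES_COUNT):
    """Stand-in for the real hash: group index modulo instance count."""
    return block_group(block_number, contiguous) % instances


# A small table in blocks 10..99 sits entirely inside one group.
masters_small = {toy_master(b, instances=2) for b in range(10, 100)}

# A segment of 1024 blocks spans eight groups and both instances.
masters_large = {toy_master(b, instances=2) for b in range(0, 1024)}
```

Under this stand-in every block of the small table is mastered by a single instance, so any other instance working on it must go across the interconnect for each block, while the larger segment is spread over both instances.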
very high threshold so difficult to test
does occur on some customer sites
may cause LMON process to crash in 10.1.0.4
bug 3659289 - patch available
OBJECT_ID
To remaster object at current instance use:
ORADEBUG LKDEBUG -m pkey 52084
All blocks now mastered by the current instance
To redistribute masters to all available instances use:
ORADEBUG LKDEBUG -m dpkey 52084
Blocks mastered by both (all) instances again
FILE_ID          NUMBER
OBJECT_ID        NUMBER
CURRENT_MASTER   NUMBER
PREVIOUS_MASTER  NUMBER
REMASTER_CNT     NUMBER
Instances are internally numbered 0, 1 etc
Initially contains no rows
SELECT object_id, current_master, previous_master
Object ID
Current Master
Previous Master
Information about Dynamic Remastering operations is also recorded in the following fixed views
X$KJDRMREQ
WHERE c3 = 42 AND c4 < 2004
Parent
Name
Externalized in X$KGLLK
Externalized in X$KGLPN
Each KGLHD structure has a set of doubly linked lists including:
Locks
Pins
[Diagram labels: KGLHD, Child Handle, Child Object, KGLOB, Lock and Pin lists, X$KGLOB, X$KGLPN, X$KGLLK]
Externalized by KGLHDNSP in X$KGLOB
CRSR
LOB
REIP
RMGR
JVSD
RELS
MVOBINX
NSCPD
TABL
DIR
CPOB
XDBS
STFG
RELD
STBO
JSLV
BODY
QUEU
EVNT
PPLN
TRANS
IFSD
HTSO
MODL
TRGR
OBJG
SUMM
PCLS
RELC
XDBC
JSGA
Unused
INDX
PROP
DIMM
SUBS
RULE
USAG
JSET
Unused
CLST
JVSC
CTS
LOCS
STRM
MVOBTBL
TABLE
Unused
KGLT
JVRE
OUTL
RMOB
REVC
JSQI
CLST
Unused
PIPE
ROBJ
RULS
RSMD
STAP
CDC
INDX
Unused
ALTER SESSION SET EVENTS
For example:
-------------- --------- --------- --------- --------- ---------- ----------
....
Contains 11 rows in Oracle 10.2
RAC Specific
kglstrld,kglstinv, kglstlrq,kglstprq,kglstprl,kglstirq,kglstmiv
FROM x$kglst
WHERE indx<8 OR indx=13 OR indx=14 OR indx=32
Names are generated in dynamic performance view
Only selected rows from X$KGLST
Contains one row for each namespace (59 rows in 10.2)
RAC Specific
Row Cache Locks
ALTER SESSION SET EVENTS
BUCKET 127469:
name=US01.T1
hash=b2f454b86387761e02fc7e686e37f1ed timestamp=01-14-2006 22:04:06
namespace=TABL flags=KGHP/TIM/MED/[40000000]
kkkk-dddd-llll=0000-0701-0701 lock=0 pin=0 latch#=1 hpc=0002 hlc=0002
lwt=0x2bb8e018[0x2bb8e018,0x2bb8e018] ltm=0x2bb8e020[0x2bb8e020,0x2bb8e020]
pwt=0x2bb8dffc[0x2bb8dffc,0x2bb8dffc] ptm=0x2bb8e004[0x2bb8e004,0x2bb8e004]
ref=0x2bb8e038[0x2bb8e038,0x2bb8e038] lnd=0x2bb8e044[0x2bb7a7ac,0x2bb8e410]
LOCK INSTANCE LOCK: id=LBb2f454b86387761e
PIN INSTANCE LOCK: id=NBb2f454b86387761e mode=S release=F flags=[00
INVALIDATION INSTANCE LOCK: id=IV0000c9890e170507 mode=S
LIBRARY OBJECT: object=2caede30
type=TABL flags=EXS/LOC[0005] pflags=[0000] status=VALD load=0
BUCKET 127469 total object count=1
id=NBb2f454b86387761e
[0xb2f454b8][0x6387761e],[NB]
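The mapping between the instance lock id in the library cache dump and the resource name can be sketched in Python. The function name is hypothetical. The layout is inferred from the single example above (a two-character lock type followed by two 32-bit values in hexadecimal), and dropping leading zeros is an assumption based on how other resource names in this document are printed.

```python
def instance_lock_resource_name(lock_id):
    """Convert 'NBb2f454b86387761e' to '[0xb2f454b8][0x6387761e],[NB]'."""
    if len(lock_id) != 18:
        raise ValueError("expected a 2-character type plus 16 hex digits")
    lock_type, hex_part = lock_id[:2], lock_id[2:]
    # int() raises ValueError if either half is not valid hexadecimal.
    id1 = int(hex_part[:8], 16)
    id2 = int(hex_part[8:], 16)
    # Assumption: fields are printed without leading zeros.
    return f"[0x{id1:x}][0x{id2:x}],[{lock_type}]"
```

Applied to the LOCK INSTANCE LOCK id from the same dump, LBb2f454b86387761e, the sketch yields the same two fields with the type LB.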
34 Parent Cache
8 Subordinate Caches
ALTER SESSION SET EVENTS
For example:
-------------------------- ------- ------- ------ --------- -------
....
BUCKET 48205:
own=0x2bb8dd44[0x2bb8dd44,0x2bb8dd44]
status=VALID/-/-/-/-/-/-/-/-
set=0, complete=FALSE
......
00000000 00000006
ALTER SESSION SET EVENTS
For example:
2 x 32 bit integer tag fields
Used with
V$/GV$ dynamic performance views
KJBR
KJIRFT
Other structures reference the resource names in these structures including
KJBL
KJILFKT
_lm_ress parameter
_lm_locks parameter
ADDR       RAW(4)
INDX       NUMBER
INST_ID    NUMBER
KJIRFTRP   RAW(4)
KJIRFTRN   VARCHAR2(30)
KJIRFTCQ   NUMBER
KJIRFTGQ   NUMBER
KJIRFTPR   NUMBER
KJIRFTRDN  VARCHAR2(25)
KJIRFTMN   NUMBER
KJIRFTNCL  VARCHAR2(9)
KJIRFTVS   VARCHAR2(32)
KJIRFTVB   VARCHAR2(64)
Synonym for V$DLM_RESS
kjirftvs, kjirftvb
FROM x$kjirft
UNION ALL
'KJUSERVS_NOVALUE', '0x0'
FROM x$kjbr
HANDLE                       RAW(4)
GRANT_LEVEL                  VARCHAR2(9)
REQUEST_LEVEL                VARCHAR2(9)
RESOURCE_NAME1               VARCHAR2(30)
RESOURCE_NAME2               VARCHAR2(30)
PID                          NUMBER
TRANSACTION_ID0              NUMBER
TRANSACTION_ID1              NUMBER
GROUP_ID                     NUMBER
OPEN_OPT_DEADLOCK            NUMBER
OPEN_OPT_PERSISTENT          NUMBER
OPEN_OPT_PROCESS_OWNED       NUMBER
OPEN_OPT_NO_XID              NUMBER
CONVERT_OPT_GETVALUE         NUMBER
CONVERT_OPT_PUTVALUE         NUMBER
CONVERT_OPT_NOVALUE          NUMBER
CONVERT_OPT_DUBVALUE         NUMBER
CONVERT_OPT_NOQUEUE          NUMBER
CONVERT_OPT_EXPRESS          NUMBER
CONVERT_OPT_NODEADLOCKWAIT   NUMBER
CONVERT_OPT_NODEADLOCKBLOCK  NUMBER
WHICH_QUEUE                  NUMBER
STATE                        VARCHAR2(64)
AST_EVENT0                   NUMBER
OWNER_NODE                   NUMBER
BLOCKED                      NUMBER
BLOCKER                      NUMBER
SELECT
kjilkftxid0, kjilkftxid1, kjilkftgid, kjilkftoodd, kjilkftoopt, kjilkftoopo,
kjilkftoonxid, kjilkftcogv, kjilkftcopv, kjilkftconv, kjilkftcodv, kjilkftconq,
kjilkftcoep, kjilkftconddw, kjilkftconddb, kjilkftwq, kjilkftls, kjilkftaste0,
kjilkfton, kjilkftblked, kjilkftblker
FROM x$kjilkft
UNION ALL
SELECT inst_id,
kjbllockp, kjblgrant, kjblrequest, kjblname, kjblname2, 0, 0, 0, 0, 0, 1, 0, 1,
0, 0, 0, 0, 0, 0, 0, 0, kjblqueue, kjbllockst, 0, kjblowner, kjblblocked, kjblblocker
FROM x$kjbl
For more information and to provide feedback please contact me