SGA Access

Preview:

Citation preview

Oaktable Jonathan Lewis and ORACLE_TRACE Oracle_Trace crashes my Database I start the SGA attach by searching every offset Anjo Kolk says James Morle wrote a program using

x$ksmmem I show James my first draft using x$ksmmem James is baffled by why I'm hard coding offsets James says the offsets are in some X$ table I search, turn up a mail by Jonathan Lewison

x$kqfco Goldmine – all the offsets Thanks Mogens Nogard! Thanks to TomKyte's Decimal to Hex

http://oraperf.sourceforge.net

Direct Oracle SGA Memory Access

Reading data directly from Oracle’s shared memory segment using C code

Sunday, April 9, 2023

SGA on UNIX

oracle oracle sqlplussqlplus

DBWRDBWR

CKPTCKPT

LGWRLGWR

ARCHARCH

PMONPMON

SMONSMON SSnnnnnnPPnnnnnn

SGAShared PoolShared Pool Database Buffer CacheDatabase Buffer Cache

Redo LogRedo LogBufferBuffer

DDnnnnnn

Machine Memory

SGA on NT

Shared PoolShared Pool Database Buffer CacheDatabase Buffer CacheRedo LogRedo Log

BufferBuffer

SMONSMON

PMONPMON

DBWRDBWRLGWRLGWR

SSnnnnnn DDnnnnnn PPnnnnnn CKPTCKPT

ARCHARCH oracle oracle

sqlplussqlplusProcess Space

Machine Memory

What is the SGA

Memory CacheOften Used Data Rapid AccessShareable Concurrently Access

SGA 4 main regions

Fixed information– Users info– Database statistics– X$dual– etc

Data block cache SQL cache ( library cache/shared pool) Redo log buffer

How is the SGA info Used?

Automatically– data blocks cached– Log buffer– Sql cache– Updates of system and user statistics

User Queries– User info v$session– System info v$parameter– Performance statistics v$sysstat, v$latch, v$system_event– Buffer cache headers, x$bh

Why Direct Access with C?

Reading Hidden Information– Sort info on version 7 – OPS locking info version 8 – Contents of data blocks (only the headers or visible in X$)

Access while Database is Hung High Speed Access

– Sampling User Waits, catch ephemeral data– Scan LRU chain in X$bh– Statistically approximate statistics

SQL statistics per user

Low overhead

Database Slow or Hung

Often happens at the largest sites when cutting edge support is expected.

Shared Pool errors ORA 4031 Archiver or Log file Switch Hangs Hang Bugs Library Cache Latch contention ORA-00379: no free buffers available in

buffer pool DEFAULT

Statistical Sampling

By Rapidly Sampling SQL statistics and the users who have the statistics open, one can see how much work a particular user does with a particular SQL statement

Low Overhead

Marketing Appeal Clients are sensitive about their

production databases Heisenberg uncertainty affect – less

overhead less affect monitoring has on performance which we are monitoring  

SGA made visible through x$tables

Most of the SGA is not visible X$KSMMEM Exception, Raw Dump of SGA Information Externalized through X$ tables Useful or Necessary information is Externalized Externalized publicly through V$ Tables

Machine Memory

0x80000000

SGASGA

Graphic SGA

SGA0x80000000

Buffer Cache

Buffer Cache

Fixed Area

Shared Pool

Log Buffer

Fixed Area

0x85251EF4

X$KSUSECST- user waitsSGA0x80000000

X$KSUSECST

170 Records2328 bytes

Row 10x85251EF4 Row 2Row 3 …

X$KSUSECST Record

One Record in X$KSUSECST

2328 bytes

1276

X$KSUSECST Fields

1276 1278 1280 1284 1288

Seq # Event # p3p2p1

Externalization of C structs: X$ tables

If Structure foo was externalized in a X$

SQL> describe x$fooColumn Name Type------------------------------ --------ADDR RAW(8)INDX NUMBERID NUMBERB NUMBER

SGA is One Large C Struct

struct foo{

int id; int A; int B; int C;};

struct foo foo[N];

Struct C code

#include <stdio.h>

#include <fcntl.h>

#define N 20

/* structure definition: */

struct foo

{

int id;

int a;

int b;

int c;

};/* end structure definition */

Struct Recordmain(){

struct foo foo[20];

int fptr;

/* zero out memory of struct */

memset(foo,0,sizeof(foo));

foo[0].id=1; /* row 0 */

foo[0].a=12;

foo[0].b=13;

foo[0].c=13;

Struct Write to File

foo[1].id=2; /* row 1 */

foo[1].a=22;

foo[1].b=23;

foo[1].c=24;/* write to file, simulate SGA */

if ((fptr = open("foo.out",O_WRONLY | O_CREAT,0777)) < 0 )

return -1;

write(fptr,foo,sizeof(foo)); return 0;

}

Simulate SGA with a File

write(fp,foo,sizeof(foo));

Simulate SGA with a File

Memory address Increasing

0 16

32

48

64

80

AID B IDC A …

Row 0 Row 1

0 4 8 12

16

20

bitsbytes

0 4 8 C 10

14

hex bytes0 4 1

014

20

24

oct bytes

Struct File Contents

$ ./foo

$ ls -l foo.out

-rw-r--r-- joe dba 320 Feb 10 19:41 foo.out

int = 32 bits

Int = 4 bytes

20 entries * 4 int * 4 bytes/int = 320 bytes

od – octal dump

$ od -l foo.out

0000000 1 12 13 13

0000020 2 22 23 24

0000040 0 0 0 0

*

0000500

Struct File Contents

Address is in HexColumn 2 is the IDColumn 3 is field AColumn 4 is field BColumn 5 is field C

X$ tables ?

Ok, x$foo =~ foo[20]How do I get a list of x$ tables?Where is each X$ located?V$Fixed_Tables

V$Fixed_Table – list of X$ tables

SQL> desc v$fixed_table;

Name Null? Type

----------------------------------------- -------- -----------------

NAME VARCHAR2(30)

OBJECT_ID NUMBER

TYPE VARCHAR2(5)

TABLE_NUM NUMBER

Graphic: X$ Addresses

0x8????????

X$????

SGA0x80000000

V$Fixed_Table

spool addr.sqlselect 'select 'addr, ||''''||name||''''||' from ' || name ||'

where rownum < 2;' from v$fixed_table where name like 'X%'/spool off@addr.sql

Example: finding the address

select a.addr ,'X$KSUSE'

from X$KSUSE

where rownum < 2 ;

X$ layout6802B244 X$KSLEMAP

6802B7EC X$KSLEI

6820B758 X$KSURU

6820B758 X$KSUSE - v$session

6820B758 X$KSUSECST – v$session_wait

6820B758 X$KSUSESTA – v$session_stat

6820B758 X$KSUSIO

6826FBD0 X$KSMDD

6831EA0C X$KSRCHDL

What's in these X$ views

V$ views are documented V$ views are based often on X$ tables The map from v$ to X$ is described in :

V$Fixed_View_Definition

V$Fixed_View_Definition

SQL> desc V$Fixed_View_Definition Name Type ----------------------------------- -------------- VIEW_NAME VARCHAR2(30) VIEW_DEFINITION

VARCHAR2(4000)

Definition of V$Session_Wait

SQL> select VIEW_DEFINITION from V$FIXED_VIEW_DEFINITION where view_name='GV$SESSION_WAIT';VIEW_DEFINITION-----------------------------------------------------------------------select s.inst_id,s.indx,s.ksussseq,e.kslednam, e.ksledp1,s.ksussp1,s.ksussp1r,e.ksledp2, s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r, decode(s.ksusstim,0,0,-1,-1,-2,-2, decode(round(s.ksusstim/10000),0,-1,round(s.ksusstim/10000))), s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME', -1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME') from x$ksusecst s, x$ksled e where bitand(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussopc=e.indx

The Fields in X$ tables

OK, I've picked an X$ I've got the starting address Now, how do I get the fields?

X$KQFTA

Kernel Query Fixed_view Table INDX use to find column information KQFTANAM X$ table names

X$KQFCO

Kernel Query Fixed_view Column KQFCOTAB Join with X$KQFTA.INDX KQFCONAM Column name KQFCOOFF Offset from beginning of the

row KQFCOSIZ Columns size in bytes

X$KSUSECST Fields

1276 1278 1280 1284 1288

Seq # Event # p3p2p1

2 2 4 4 4 BYTES

Address

SGA Contents in Resume

Fixed SGA

Buffer Cache

Redo Buffer

Library Cache

Memory address Increasing0x8000000

In resume:

Oracle takes the C structure defining the SGA and maps it onto a shared memory segment

Oracle provides access to some of the SGA contents via X$ tables

**** Procedure *****

1. Choose a V$ view

2. Find base X$ Tables for v$ view

3. Map X$ fields to V$ fields

4. Get address of X$ table in SGA

5. Get the size of each record in X$ table

6. Get the number of records in X$ table

7. Get offsets for each desired field in X$ table

8. Get the base address of SGA

1) V$SESSION_WAIT Example

List of all users waiting Detailed information on the waits Data is ephemeral Useful in Bottleneck diagnostics High sampling rate candidate Event 10046 captures this info

Good table for SGA sampling

V$SESSION_WAIT DescriptionSQL> desc v$session_wait

Name Type

----------------------------------------- --------------------------

SID ,NUMBER

SEQ# ,NUMBER

EVENT ,VARCHAR2(64)

P1TEXT ,VARCHAR2(64)

P1 ,NUMBER

P1RAW ,RAW(4)

P2TEXT ,VARCHAR2(64)

P2 ,NUMBER

P2RAW ,RAW(4)

P3TEXT ,VARCHAR2(64)

P3 ,NUMBER

P3RAW ,RAW(4)

WAIT_TIME ,NUMBER

SECONDS_IN_WAIT ,NUMBER

STATE ,VARCHAR2(19)

)

V$SESSION_WAIT Short

SQL> desc v$session_wait

Name Type

---------------------------- -------------

SID NUMBER

SEQ# NUMBER

EVENT VARCHAR2(64)

P1 NUMBER

P2 NUMBER

P3 NUMBER)

V$FIXED_VIEW_DEFINITION

Gives mappings of V$ views to X$ tables

SQL> select

VIEW_DEFINITION

from

V$FIXED_VIEW_DEFINITION

where

view_name='V$SESSION_WAIT‘;

V$SESSION_WAIT View DefinitionVIEW_DEFINITION---------------------------------------------------------------------select s.inst_id, s.indx, s.ksussseq, e.kslednam,e.ksledp1,s.ksussp1,s.ksussp1r,e.ksledp2,s.ksussp2,s.ksussp2r,e.ksledp3,s.ksussp3,s.ksussp3r,round(s.ksusstim / 10000), s.ksusewtm, decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME', -1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME') from x$ksusecst s, x$ksled e where bitand(s.ksspaflg,1)!=0 and bitand(s.ksuseflg,1)!=0 and s.ksussseq!=0 and s.ksussopc=e.indx

View Definition ShortVIEW_DEFINITION---------------------------------------------------------------------select

s.indx,s.ksussseq,e.kslednam,s.ksussp1,s.ksussp2,s.ksussp3

from x$ksusecst s, x$ksled e

where s.ksussopc=e.indx

2) V$SESSION_WAIT Based on X$KSUSECT

VIEW_DEFINITION----------------------------------------------------select

indx,ksussseq,ksussopc,ksussp1,ksussp2,ksussp3

from x$ksusecst

Equivalent SQL Statements

select indx,ksussseq,ksussopc,ksussp1,ksussp2,ksussp3

from x$ksusecst

select sid seq# event p1 p2 p3 from v$session_wait )

Note: x$ksusecst. Ksussopc is the event #

x$ksled.kslednam is a list of the event names where

x$ksled.indx = x$ksusecst. ksussopc

3) V$ to X$ Field Mapping

4) Get base SGA address for X$ table

Find the location of X$KSUSECST in the SGA

SQL> select addr from x$ksusecst where rownum < 2

ADDR

--------

85251EF4

5) Find the Size of Each Record

SQL> select

((to_dec(e.addr)-to_dec(s.addr))) row_size

from

(select addr from x$ksusecst where rownum < 2) s,

(select max(addr) addr from x$ksusecst where rownum < 3) e ;

 

ROW_SIZE

----------------

2328

6) Find the Number of Records in the structure

SQL> select count(*) from x$ksusecst ;

COUNT(*)

--------------

170

Get Offsets for Each Desired Field in X$ table

SQL> select c.kqfconam field_name, c.kqfcooff offset, c.kqfcosiz szfrom x$kqfco c, x$kqfta twhere t.indx = c.kqfcotab and t.kqftanam='X$KSUSECST'order by offset;

X$KQFTA - X$ Tables Names

List of X$ tables

INDX use to find column information

KQFTANAM X$ table names

To get Column information join with X$KQFCO

X$KQFTA.INDX = X$KQFCO.KQFCOTAB

X$KQFCO – X$ Table Columns

List of all the columns in X$ Tables

KQFCOTAB Join with X$KQFTA.INDX KQFCONAM Column name KQFCOOFF Offset from beginning of the

row KQFCOSIZ Columns size in bytes

Field OffsetsFIELD_NAME OFFSET SZ------------------------------ ---------- ----------ADDR 0 4INDX 0 4KSUSEWTM 0 4INST_ID 0 4KSSPAFLG 1 1KSUSSSEQ 1276 2KSUSSOPC 1278 2KSUSSP1 1280 4KSUSSP1R 1280 4KSUSSP2 1284 4KSUSSP2R 1284 4KSUSSP3 1288 4KSUSSP3R 1288 4KSUSSTIM 1292 4KSUSENUM 1300 2KSUSEFLG 1308 4

What are all the fields at OFFSET 0?

These are all calculated values and not stored explicitly in the SGA.

ADDR memory address

INDX record number, like rownum

INST_ID database instance ID

KSUSEWTM calculated field

Unexposed Fields

What happens between OFFSET 1 and 1276?

• Unexposed Fields• Sometimes exposed elsewhere, in our case

• V$SESSION• V$SESSTAT

Fields at Same Address

Why do some fields start at the same address?KSUSSP1KSUSSP1R

Are at the same addressEquivalent of

V$SESSION_WAIT.P1 V$SESSION_WAIT.P1RAW

These are the same data, just exposed asHexDecimal

7) Offsets of Fields

8) Get Base SGA Address

 

SQL> select addr from x$ksmmem where rownum < 2

 

ADDR

--------------

80000000

Results X$KSUSECST

Machine Memory

0x80000000

SGASGA

Fixed Area

0x85251EF4

X$KSUSECST- user waitsSGA0x80000000

X$KSUSECST

170 Records2328 bytes

Row 10x85251EF4 Row 2Row 3 …

X$KSUSECST Record

One Record in X$KSUSECST

2328 bytes

1276

X$KSUSECST Fields

1276 1278 1280 1284 1288

Seq # Event # p3p2p1

Attaching to the SGA

UNIX System Call “shmat”

To attach to shared memory Unix as a system call

void *shmat( int shmid,

const void *shmaddr,

int shmflg );

ID and Address arguments to “shmat”

The arguments are:

shmid – shared memory identifier specified shmaddr – starting address of the shared memory shmflg - flags

The argument shmflg can be set to SHM_RDONLY . To avoid any possible data corruption the SGA should only be attached read only.

The arguments shmid and shmaddr need to be set to Oracle’s SGA id and address.

Finding Oracle SGA’s ID and Address

Use ORADEBUG to find the SGA id

SQL> oradebug setmypid

Statement processed.

SQL> oradebug ipc

Information written to trace file.

 

Finding Trace File

SQL> show parameters user_dumpNAME VALUE

----------------------- --------------------------------

user_dump_dest /u02/app/oracle/admin/V901/udump

SQL> exit

$ cd /u02/app/oracle/admin/V901/udump

$ ls -ltr | tail -1

-rw-r----- usupport dba Aug 24 18:01 v901_ora_23179.trc

Finding SHMID in Trace File

$ vi v901_ora_23179.trc

Total size 004456c Minimum Subarea size 00000000

Area Subarea Shmid Stable Addr Actual Addr

0 0 34401 0080000000 0080000000

Attaching to the SGA

       Shmid 34401Shmaddr0x80000000Shmflg SHM_RDONLY The SGA attach call in C would be: Shmat(34401, 0x80000000, SHM_RDONLY); This call needs to be executed as a UNIX user who has

read permission to the Oracle SGA

C Code Headers

#include <stdio.h>

#include <sys/ipc.h>

#include <sys/shm.h>

#include <errno.h>

#include "event.h"

 

event.h is for translating the event #s into event names

Events.h

Spool events.h

select 'char event[][100]={' from dual;

select '"'||name||'",' from v$event_name;

select ' "" };' from dual;

spool off

Define Base Addresses and Sizes

/* SGA BASE ADDRESS */

#define SGA_BASE 0x80000000

/* START ADDR of KSUSECST(V$SESSION_WAIT) */

#define KSUSECST_ADDR 0x85251EF4

/* NUMBER of ROWS/RECORDS in KSUSECST */

#define SESSIONS 150

/* SIZE in BYTES of a ROW in KSUSECST */

#define RECORD_SZ 2328

Define Offsets to Fields

 

#define KSUSSSEQ 1276 /* sequence # */

#define KSUSSOPC 1278 /* event # */

#define KSUSSP1R 1280 /* p1 */

#define KSUSSP2R 1284 /* p2 */

#define KSUSSP3R 1288 /* p3 */

Set Up Variables

main(argc, argv) int argc; char **argv; { void *addr; int shmid; int shmaddr; void *current_addr; long p1r, p2r, p3r; unsigned int i, seq, tim, flg, evn;

Attach to SGA

 /* ATTACH TO SGA */ shmid=atoi(argv[1]); shmaddr=SGA_BASE; if (

(void *)shmat(shmid,(void *)shmaddr,SHM_RDONLY)

== (void *)-1 ) { printf("shmat: error attatching to SGA\n"); exit(); }

Set Up Sampling Loop

/* LOOP OVER ALL SESSIONS until CANCEL */

while (1) {

/* set current address to beginning of Table */

current_addr=(void *)KSUSECST_ADDR;

sleep(1);

printf("^[[H ^[[J"); /* clear screen */

/* print page heading */

printf("%4s %8s %-20.20s %10s %10s %10s \n",

"sid", "seq", "wait","p1","p2","p3");

Loop over all Sessions

for ( i=0; i < SESSIONS ; i++ ) { seq=*(unsigned short *)((int)current_addr+KSUSSSEQ); evn=*(short *) ((int)current_addr+KSUSSOPC); p1r=*(long *) ((int)current_addr+KSUSSP1R); p2r=*(long *) ((int)current_addr+KSUSSP2R); p3r=*(long *) ((int)current_addr+KSUSSP3R); if ( evn != 0 ) { printf("%4d %8u %-20.20s %10X %10X %10X \n", i, seq, event[evn] ,p1r, p2r,p3r ); } current_addr=(void *)((int)current_addr+RECORD_SZ); } } }

Output

$ sga_read_session_wait 34401

sid seq wait p1 p2 p3

0 40582 pmon timer 12C 0 0

1 40452 rdbms ipc message 12C 0 0

2 43248 rdbms ipc message 12C 0 0

3 24706 rdbms ipc message 12C 0 0

4 736 smon timer 12C 0 0

5 88 rdbms ipc message 2BF20 0 0

8 178 SQL*Net message from 6265710 1 0

Pitfalls

Byte Swapping 32 bit vs 64 bit Multiple Shared Memory Segments Segmented Memory Addresses are "unsigned int" Misaligned Access

Little Endian vs Big Endian

Is low byte values first or high byte values first ? a byte is 8 bits

– 00000000-11111111 bits,0 – 31 dec, 0x0 - 0xFF hex Big Endian is "normal" , highest bit first In ascii, the word "byte" is stored as

– b = 62, y = 79, t = 74, e = 65 echo 'byte' | od -x

– b y t e– 62 79 74 65

Little Endian, ie byte swapped (Linux, OSF, Sequent, ? )

– y b e t – 79 62 65 74

Byte Swap Example

Short = 2 bytes ie 16 bitsGoal, get the flag in the "second" byte

#ifdef __linux uflg=*(short *)

((int)sga_address)>>8;#else uflg=*(short *)((int)sga_address);#endif

Byte Swap

Big Endian:00 00 00 00 00 00 00 01Little Endian00 00 00 01 00 00 00 00Solution, push the value over 8 places, to

the right, ie >>8

64 bit vs 32 bit

SQL> desc x$ksmmem Name Type

------------------------------------- --------- ADDR RAW(4) INDX NUMBER INST_ID NUMBER KSMMMVAL RAW(4)-> 32 bitRaw(8) -> 64 bit

Segmented Memory

x$ksuse – can be dis-contiguous

Work around:select 'int users[]={' from dual;select '0x'||addr||',' from x$ksuse;select '0x0};' from dual;

Misaligned Access

Some platforms seg fault when addressing misaligned bytes, need to read in even bytes or units of 4 bytes depending on platform

1 2 3 4 5 6 7 8

x$ksusecst Record: What's Missing?

One Record in X$KSUSECST

2328 bytes

1276

??? ???

Select Addr from X$? where Rownum< 2;

6802B244 X$KSLEMAP

6802B7EC X$KSLEI

6820B758 X$KSURU

6820B758 X$KSUSE – v$session

6820B758 X$KSUSECST – v$session_wait

6820B758 X$KSUSESTA – v$sesstat

6820B758 X$KSUSIO

6826FBD0 X$KSMDD

6831EA0C X$KSRCHDL

x$ksuse Record Contains x$ksusecst

One Record in X$Ksusecst

2328 bytes

1276

v$session_wait v$sessionv$session v$sesstat

x$ksusecstx$ksusesta

x$ksuse

236

Getting v$sesstat addresses

select '#define '|| upper(translate(s.name,' :-()/*''','________'))||' '|| to_char(c.kqfcooff + STATISTIC# * 4 )from x$kqfco c, x$kqfta t, v$statname swhere t.indx = c.kqfcotab and ( t.kqftanam='X$KSUSESTA' ) and

c.kqfconam='KSUSESTV' and kqfcooff > 0order by c.kqfcooff/

User Drilldown Query: 4 joins select w.sid sid, w.seq# seq, w.event event, w.p1raw p1, w.p2raw p2, w.p3raw p3, w.SECONDS_IN_WAIT ctime, s.sql_hash_value sqlhash, s.prev_hash_value psqlhash, st.value cpu from v$session s, v$sesstat st, v$statname sn, v$session_wait w where w.sid = s.sid and st.sid = s.sid and st.statistic# = sn.statistic# and

sn.name = 'CPU used when call started' andw.event != 'SQL*Net message from client'

order by w.sid;

Other Fun Stuff

The next example is output from an SGA program that follows the LRU of the Buffer Cache

The program demonstrates the • insertion point of LRU• cold end of LRU• hot end of the LRU• Full Table Scan Insertion Point

LRU HOT

LRU COLD