11gRAC Architecture Day3

8/3/2019 11gRAC Architecture Day3


11g RAC Workshop by Ram

RAC Architecture

Oracle Real Application Clusters (RAC) allows multiple instances, running on multiple nodes, to access a single database. In a standard Oracle configuration a database can be mounted by only one instance, but in a RAC environment many instances can access a single database.




RAC Components

The major components of an Oracle RAC system are:

Shared disk system

Oracle Clusterware

Cluster Interconnects

Oracle Kernel Components




Disk architecture

With today's SAN and NAS disk storage systems, sharing storage is fairly easy, and shared storage is required for a RAC environment. You can use the following storage setups:

* SAN (Storage Area Network) - generally using fibre to connect to the SAN

* NAS (Network Attached Storage) - generally using a network to connect to the NAS, using either NFS or iSCSI

* JBOD - direct attached storage, the old traditional way and still used by many companies as a cheap option




RAC Concepts

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction as fast as possible, and to ensure data integrity, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD).

The GRD contents are distributed across all of the active instances, which increases the size of the SGA for an Oracle RAC instance.




After one instance caches data, any other instance within the same cluster database can acquire a block image from that instance faster than by reading the block from disk. Therefore, Cache Fusion moves current blocks between instances rather than re-reading the blocks from disk. When a consistent block is needed or a changed block is required on another instance, Cache Fusion transfers the block image directly between the affected instances.

Oracle RAC uses the private interconnect for inter-instance communication and block transfers. The GES Monitor and the Instance Enqueue Process manage access to Cache Fusion resources and enqueue recovery processing.




Cache Fusion Model

The Cache Fusion model depends on three services:

Global Resource Directory (GRD)

Global Cache Service (GCS)

Global Enqueue Service (GES)




Cache Fusion is a diskless cache coherency mechanism in Oracle RAC that provides copies of data blocks directly from one instance's memory cache (in which the block is available) to another instance (the one requesting the block). Over the interconnect, Cache Fusion makes the buffer caches of all instances in the cluster behave as a single buffer cache.

In a single-node Oracle database, an instance looking for a data block first checks its cache; if the block is not in the cache, it reads the block from disk into the cache and returns it to the client.

In a RAC database there are also remote caches, so an instance should look not only in its local cache but also in the caches of remote instances. If the block is in the local cache, the instance returns it from there; if it is not, then instead of going to disk the instance first checks, via the interconnect, whether the block is available in a remote cache.

This is because accessing a data block from a remote cache is faster than accessing it from disk.
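The lookup order described above (local cache, then remote cache, then disk) can be sketched in a few lines of Python. This is only an illustrative model: the `Instance` class and `read_block` function are hypothetical names, and a real RAC instance asks the GCS where a block is rather than polling each remote cache in turn.

```python
from typing import Dict, List, Tuple


class Instance:
    """Hypothetical stand-in for a RAC instance with its own buffer cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[int, bytes] = {}  # block id -> block contents


def read_block(block_id: int, local: Instance, remotes: List[Instance],
               disk: Dict[int, bytes]) -> Tuple[bytes, str]:
    """Return the block and where it was found: local cache, remote cache, or disk."""
    # 1. Local buffer cache first.
    if block_id in local.cache:
        return local.cache[block_id], "local cache"
    # 2. Then the remote caches (in RAC this transfer uses the interconnect).
    for remote in remotes:
        if block_id in remote.cache:
            data = remote.cache[block_id]
            local.cache[block_id] = data
            return data, f"remote cache ({remote.name})"
    # 3. Only as a last resort, read from disk.
    data = disk[block_id]
    local.cache[block_id] = data
    return data, "disk"
```

For example, if instance A reads block 1 first, it comes from disk; a later read of block 1 by instance B is served from A's cache, and a second read by B is served locally.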




A RAC database system has two important services: the Global Cache Service (GCS) and the Global Enqueue Service (GES). Each is basically a collection of background processes. Together these two services cover and manage the whole Cache Fusion process, resource transfers, and resource escalations among the instances.




Global Resource Directory

GES and GCS together maintain a Global Resource Directory (GRD) to record information about resources and enqueues. The GRD remains in memory and is stored on all the instances. Each instance manages a portion of the directory. This distributed nature is a key point for the fault tolerance of RAC.

The GRD is the internal database that records and stores the current status of the data blocks. Whenever a block is transferred out of a local cache to another instance's cache, the GRD is updated. The following resource information is available in the GRD:

* Data block identifiers (DBA)

* Location of the most current version

* Modes of the data blocks: (N) Null, (S) Shared, (X) Exclusive

* The roles of the data blocks (local or global) held by each instance

* Buffer caches on multiple nodes in the cluster
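As a rough illustration of the information listed above, a single GRD entry could be modelled as follows. The `GrdEntry` class, its field names and the `record_transfer` function are assumptions made for this sketch, not Oracle's internal structures, and the mode handling is deliberately simplified.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Mode(Enum):
    NULL = "N"
    SHARED = "S"
    EXCLUSIVE = "X"


class Role(Enum):
    LOCAL = "local"
    GLOBAL = "global"


@dataclass
class GrdEntry:
    """Hypothetical GRD record for one data block."""
    dba: int                       # data block identifier
    current_holder: str            # instance holding the most current version
    holders: Dict[str, Mode] = field(default_factory=dict)  # mode per instance
    role: Role = Role.LOCAL


def record_transfer(entry: GrdEntry, to_instance: str, mode: Mode) -> None:
    """Update the entry when a block is shipped to another instance's cache."""
    if mode is Mode.EXCLUSIVE:
        # Simplification: every previous holder drops to Null mode.
        for inst in list(entry.holders):
            entry.holders[inst] = Mode.NULL
        entry.current_holder = to_instance
    entry.holders[to_instance] = mode
    # The block now lives in more than one cache, so its role becomes global.
    entry.role = Role.GLOBAL
```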




Global Cache Service (GCS)

GCS is the main controlling service that implements Cache Fusion. GCS tracks the location and the status (mode and role) of the data blocks, as well as the access privileges of the various instances. GCS is the mechanism that guarantees data integrity by employing global access levels. GCS maintains the block modes for data blocks in the global role. It is also responsible for block transfers between the instances. Upon a request from an instance, GCS organizes the block shipping and the appropriate lock mode conversions. The Global Cache Service is implemented by various background processes, such as the Global Cache Service Processes (LMSn) and the Global Enqueue Service Daemon (LMD).




Global Enqueue Service

The Global Enqueue Service (GES) manages or tracks the status of all the Oracle enqueuing mechanisms. This involves all non-Cache Fusion inter-instance operations. GES performs concurrency control on dictionary cache locks, library cache locks, and transactions. GES does this for resources that are accessed by more than one instance.




Enqueues are shared memory structures (locks) that serialize access to database resources. They can be associated with a session or a transaction. Enqueue names are displayed in the LOCK_TYPE column of the DBA_LOCK and DBA_LOCK_INTERNAL data dictionary views.

A resource uniquely identifies an object that can be locked by different sessions within an instance (local resource) or between instances (global resource). Each session that tries to lock the resource will have an enqueue on the resource.
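The relationship between a resource and the enqueues on it can be sketched as a holder plus a first-in, first-out queue of waiters. This is a minimal model that assumes exclusive locks only; the `Resource` class and the session names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class Resource:
    """Hypothetical lockable resource with one holder and queued waiters."""
    name: str
    holder: Optional[str] = None
    waiters: Deque[str] = field(default_factory=deque)

    def enqueue(self, session: str) -> bool:
        """Grant the lock if the resource is free, otherwise queue the session."""
        if self.holder is None:
            self.holder = session
            return True
        self.waiters.append(session)
        return False

    def release(self) -> Optional[str]:
        """Release the lock and hand it to the next waiter, if any."""
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder
```

Access is serialized because only the holder may proceed; every other session that tried to lock the resource sits in the queue until its turn.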




Background Process Explanation (continued)

LMS: Global Cache Service Process

The LMS process maintains records of the data file statuses and of each cached block by recording information in a Global Resource Directory (GRD). The LMS process also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.




Global Cache Service Process (lock manager server, LMSn)

As noted earlier, in a RAC environment each instance of Oracle runs on a different machine in a cluster, and they all access, in a read-write fashion, the exact same set of database files.

To achieve this, the SGA block buffer caches must be kept consistent with respect to each other. This is one of the main goals of the LMSn process.

In earlier releases, under Oracle Parallel Server (OPS), this was accomplished via a ping. That is, if a node in the cluster needed a read-consistent view of a block that was locked in exclusive mode by another node, the exchange of data was done via a disk flush (the block was pinged). This was a very expensive operation just to read data.

Now, with the LMSn, this exchange is done via a very fast cache-to-cache exchange over the cluster's high-speed connection.

You may have up to ten LMSn processes per instance.




LMON: Global Enqueue Service Monitor

The LMON process monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.

* Lock monitor (LMON) process: The LMON process monitors all instances in a cluster to detect the failure of an instance. It then facilitates the recovery of the global locks held by the failed instance. It is also responsible for reconfiguring locks and other resources when instances leave or are added to the cluster (as they fail and come back online, or as new instances are added to the cluster in real time).




LMD: Global Enqueue Service Daemon

The LMD process manages incoming remote resource requests within each instance.

* Lock manager daemon (LMD) process: The LMD process handles lock manager service requests for the global cache service (keeping the block buffers consistent between instances). It works primarily as a broker, sending requests for resources to a queue that is handled by the LMSn processes. The LMD handles global deadlock detection and resolution and monitors for lock timeouts in the global environment.
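To make the idea of global deadlock detection concrete, here is a generic wait-for-graph cycle check. It is a textbook technique shown for illustration, not a description of how LMD is implemented; `find_deadlock` and the session names are hypothetical, and the sketch assumes each session waits for at most one other session.

```python
from typing import Dict, List, Optional


def find_deadlock(waits_for: Dict[str, str]) -> Optional[List[str]]:
    """Return the sessions forming a wait cycle, or None if there is none.

    waits_for maps a waiting session to the session holding the lock it needs.
    """
    for start in waits_for:
        path = [start]
        seen = {start}
        current = start
        # Follow the chain of "waits for" edges from this session.
        while current in waits_for:
            current = waits_for[current]
            if current == start:
                return path  # came back to where we began: a deadlock
            if current in seen:
                break  # a cycle exists, but it does not include `start`
            seen.add(current)
            path.append(current)
    return None
```

In a RAC setting the sessions in the cycle may belong to different instances, which is why the check has to be made globally rather than per instance.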




LCK0: Instance Enqueue Process

This process is very similar in functionality to the LMD process described earlier, but it handles requests for all global resources other than database block buffers. The LCK0 process manages non-Cache Fusion resource requests such as library and row cache requests.

ACMS: Atomic Controlfile to Memory Service

In an Oracle RAC environment, ACMS is an agent that ensures a distributed SGA memory update, i.e. SGA updates are globally committed on success or globally aborted in the event of a failure.

RMSn: Oracle RAC Management Processes

The RMSn processes perform manageability tasks for Oracle RAC. Tasks accomplished by an RMSn process include the creation of resources related to Oracle RAC when new instances are added to the cluster.

RSMN: Remote Slave Monitor

RSMN manages background slave process creation and communication on remote instances. These background slave processes




RAC Background Processes: High-level Functionality

Each node has its own background processes and memory structures. There are additional processes beyond the norm to manage the shared resources; these additional processes maintain cache coherency across the nodes.

Cache coherency is the technique of keeping multiple copies of a buffer consistent between different Oracle instances on different nodes. Global cache management ensures that access to a master copy of a data block in one buffer cache is coordinated with the copy of the block in another buffer cache. The sequence of operations goes as follows:

When instance A needs a block of data to modify, it reads the block from disk; before reading, it must inform the GCS (DLM). GCS keeps track of the lock status of the data block by keeping an exclusive lock on it on behalf of instance A.

Now instance B wants to modify that same data block. It must inform GCS, and GCS then requests instance A to release the lock. GCS thus ensures that instance B gets the latest version of the data block (including instance A's modifications) and then exclusively locks it on instance B's behalf.

At any one point in time, only one instance has the current copy of the block, thus keeping the integrity of the block.
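The instance A / instance B sequence above can be walked through with a toy model. The `Gcs` class and its methods are hypothetical, and the sketch simplifies real behaviour: it moves the block outright from A to B, whereas a real instance may retain an older image of a block it has shipped.

```python
from typing import Dict


class Gcs:
    """Toy coordinator that tracks which instance holds a block exclusively."""

    def __init__(self, disk: Dict[int, str]) -> None:
        self.disk = disk
        self.owner: Dict[int, str] = {}              # block -> exclusive holder
        self.caches: Dict[str, Dict[int, str]] = {}  # instance -> cached blocks

    def request_exclusive(self, instance: str, block: int) -> str:
        """Grant an exclusive lock and deliver the latest version of the block."""
        cache = self.caches.setdefault(instance, {})
        previous = self.owner.get(block)
        if previous is None:
            # Nobody holds the block yet: read it from disk.
            cache[block] = self.disk[block]
        elif previous != instance:
            # Ask the previous holder to release; ship its current copy over.
            cache[block] = self.caches[previous].pop(block)
        self.owner[block] = instance
        return cache[block]

    def modify(self, instance: str, block: int, contents: str) -> None:
        """Change a block; only the exclusive holder is allowed to do so."""
        if self.owner.get(block) != instance:
            raise PermissionError(f"{instance} does not hold block {block}")
        self.caches[instance][block] = contents
```

With a disk holding block 7 as "v0", instance A requests it exclusively and changes it to "v1"; when instance B then requests the same block it receives "v1" from A's cache rather than the stale "v0" on disk, and A no longer holds the current copy.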




GCS maintains data coherency and coordination by keeping track of the lock status of each block that can be read or written by any node in the RAC. GCS is an in-memory database that contains information about current locks on blocks and instances waiting to acquire locks. This is known as Parallel Cache Management (PCM).

The Global Resource Manager (GRM) helps to coordinate and communicate the lock requests from Oracle processes between instances in the RAC. Each instance has a buffer cache in its SGA. To ensure that each RAC instance obtains the block that it needs to satisfy a query or transaction, RAC uses two processes, the GCS and GES, which maintain records of the lock status of each data file and each cached block using a GRD.

So what is a resource? It is an identifiable entity: it basically has a name or a reference, and it can be an area in memory, a disk file, or an abstract entity. A resource can be owned or locked in various states (exclusive or shared). Any shared resource is lockable; if a resource is not shared, no access conflict will occur.




A global resource is a resource that is visible to all the nodes within the cluster. Data buffer cache blocks are the most obvious and most heavily used global resource; transaction enqueues and database data structures are other examples. GCS handles data buffer cache blocks and GES handles all the non-data block resources.

All caches in the SGA are either global or local: the dictionary and buffer caches are global, while the large pool and Java pool buffer caches are local. Cache Fusion is used to read a data block from another instance's buffer cache instead of getting the block from disk. Cache Fusion thus moves current copies of data blocks between instances (which is why you need a fast private network), and GCS manages the block transfers between the instances.




What are the major RAC wait events?

In a RAC environment the buffer cache is global across all instances in the cluster, and hence the processing differs. The most common wait events related to this are gc cr request and gc buffer busy.

gc cr request: the time it takes to retrieve the data from the remote cache. Reason: RAC traffic using a slow connection, or inefficient queries (poorly tuned queries increase the number of data blocks requested by an Oracle session; the more blocks requested, the more often a block will typically need to be read from a remote instance via the interconnect).

gc buffer busy: the time the remote instance locally spends accessing the requested data block.




Oracle Clusterware Software Component Processing Details

The following list describes the functions of some of the major Oracle Clusterware components. This list includes these components, which are processes on Unix and Linux operating systems or services on Windows.




Cluster Synchronization Services (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using third-party clusterware, then the css process interfaces with your clusterware to manage node membership information.

Cluster Ready Services (CRS): The primary program for managing high availability operations within a cluster. Anything that the crs process manages is known as a cluster resource, which could be a database, an instance, a service, a Listener, a virtual IP (VIP) address, an application process, and so on. The crs process manages cluster resources based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor and failover operations. The crs process generates events when a resource status changes. When you have installed Oracle RAC, crs monitors the Oracle instance, Listener, and so on, and automatically restarts these components when a failure occurs. By default, the crs process makes five attempts to restart a resource and then does not make further restart attempts if the resource does not restart.
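The bounded restart behaviour described for crs can be pictured as a simple retry loop. The `restart_resource` function and its `start` callback are hypothetical names for this sketch; only the default of five attempts comes from the text above.

```python
from typing import Callable, Tuple


def restart_resource(start: Callable[[], bool],
                     max_attempts: int = 5) -> Tuple[bool, int]:
    """Try to start a resource up to max_attempts times.

    Returns (succeeded, attempts_made); after the last failed attempt
    no further restarts are tried.
    """
    for attempt in range(1, max_attempts + 1):
        if start():
            return True, attempt
    return False, max_attempts
```

A resource that comes up on its third try yields `(True, 3)`; one that never starts yields `(False, 5)` and is then left alone.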




Event Management (EVM): A background process that publishes events that crs creates.

Oracle Notification Service (ONS): A publish and subscribe service for communicating Fast Application Notification (FAN) events.

RACG: Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.

Process Monitor Daemon (OPROCD): This process is locked in memory to monitor the cluster and provide I/O fencing. OPROCD performs its check, stops running, and if the wake up is beyond the expected time, then OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
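The OPROCD check (sleep, wake up, compare the wake-up time with what was expected) can be sketched as below. The function name `overslept` and the `interval_s` and `margin_s` parameters are assumptions for illustration; the actual intervals and margins OPROCD uses are not given here, and the real process would reset the node rather than return a value.

```python
import time
from typing import Callable


def overslept(interval_s: float, margin_s: float,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> bool:
    """Sleep for interval_s and report whether the wake-up came too late.

    A wake-up later than interval_s + margin_s suggests the node was hung.
    """
    start = clock()
    sleep(interval_s)
    late_by = clock() - start - interval_s
    return late_by > margin_s
```

Passing the clock and sleep functions as parameters is a design choice made so the check can be exercised with a fake clock instead of real waiting.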




Thank You

-Ram



References:

Oracle Real Application Clusters Administration and Deployment Guide 11g Release 1 (11.1) B28254-07

http://www.datadisk.co.uk/html_docs/rac/architecture.htm

http://ora-dba.com/Oracle_RAC_Information.html

http://www.franklinfaces.com/Topic22-79-1.aspx
