11gRAC Architecture Day3

Embed Size (px)

Citation preview

  • 8/3/2019 11gRAC Architecture Day3

    1/40

    11g RAC Workshop by Ram

    RAC Architecture

    Oracle Real Application clusters allows multiple instances to access a

    single database, the instances will be running on multiple nodes. In an

    standard Oracle configuration a database can only be mounted by one

    instance but in a RAC environment many instances can access a single

    database.

  • 8/3/2019 11gRAC Architecture Day3

    2/40

    11g RAC Workshop by Ram

    RAC Components

    The major components of a Oracle RAC

    system are

    Shared disk system

    Oracle Clusterware

    Cluster Interconnects

    Oracle Kernel Components

  • 8/3/2019 11gRAC Architecture Day3

    3/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    4/40

    11g RAC Workshop by Ram

    Disk architecture With today's SAN and NAS disk storage systems,

    sharing storage is fairly easy and is required for aRAC environment, you can use the below storagesetups

    SAN (Storage Area Networks) - generally using

    fibre to connect to the SAN NAS ( Network Attached Storage) - generally

    using a network to connect to the NAS usingeither NFS, ISCSI

    JBOD - direct attached storage, the old traditionalway and still used by many companies as a cheapoption

  • 8/3/2019 11gRAC Architecture Day3

    5/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    6/40

    11g RAC Workshop by Ram

    RAC Concepts

    To ensure that each Oracle RAC database instance obtainsthe block that it needs to satisfy a query or transaction asfast as possible, and to ensure data integrity,

    Oracle RAC instances use two processes, the Global Cache

    Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each

    data file and each cached block using a Global ResourceDirectory (GRD).

    The GRD contents are distributed across all of the active

    instances, which increases the size of the SGA for an OracleRAC instance.

  • 8/3/2019 11gRAC Architecture Day3

    7/40

    11g RAC Workshop by Ram

    After one instance caches data, any other instance within

    the same cluster database can acquire a block image from

    another instance in the same database faster than byreading the block from disk. Therefore, Cache Fusion moves

    current blocks between instances rather than re-reading

    the blocks from disk. When a consistent block is needed or

    a changed block is required on another instance, CacheFusion transfers the block image directly between the

    affected instances.

    Oracle RAC uses the private interconnect for inter-instance

    communication and block transfers. The GES Monitor and the Instance Enqueue Process

    manages access to Cache Fusion resources and enqueue

    recovery processing.

  • 8/3/2019 11gRAC Architecture Day3

    8/40

    11g RAC Workshop by Ram

    Cache Fusion Model

    Cache fusion Model is dependent on three

    services

    Global Resource Directory (GRD)

    Global Cache Service (GCS)

    Global En-queue Service (GES)

  • 8/3/2019 11gRAC Architecture Day3

    9/40

    11g RAC Workshop by Ram

    Cache Fusion - is disk less cache coherency mechanism in Oracle RAC that

    provides copies ofdata blocks directly from one instances memory cache(in which that block is available) to other instance (instance which isrequest for specific data block). Cache Fusion provides single buffer cache(for all instances in cluster) through interconnect.

    In Single Node oracle database, an instance looking for data block firstchecks in cache, if block is not in cache then goes to disk to pull block fromdisk to cache and return block to client.

    In RAC Database there is remote cache so instance should look not only inlocal cache (cache local to instance) but on remote cache (cache onremote instance). If cache is available in local cache then it should returndata block from local cache; if data block is not in local cache, instead ofgoing to disk it should first go to remote cache (remote instance) to checkif block is available in local cache (via interconnect)

    This is because accessing data block from remote cache is faster thanaccessing it from disk

  • 8/3/2019 11gRAC Architecture Day3

    10/40

    11g RAC Workshop by Ram

    RAC Database System has two importantservices. They are Global Cache Service (GCS)

    and Global Enqueue Service (GES). These are

    basically collections of backgroundprocesses. These two processes together

    cover and manage the total Cache Fusion

    process, resource transfers, and resource

    escalations among the instances

  • 8/3/2019 11gRAC Architecture Day3

    11/40

    11g RAC Workshop by Ram

    Global Resource Directory GES and GCS together maintain a Global Resource Directory (GRD)

    to record the information about the resources and the enqueues.GRD remains in the memory and is stored on all the instances. Eachinstance manages a portion of the directory. This distributed natureis a key point for fault tolerance of the RAC.

    Global Resource Directory (GRD) is the internal database thatrecords and stores the current status of the data blocks. Whenever

    a block is transferred out of a local cache to another instancescache the GRD is updated. The following resources information isavailable in GRD.

    * Data Block Identifiers (DBA)

    * Location of most current version

    * Modes of the data blocks: (N)Null, (S)Shared, (X)Exclusive * The Roles of the data blocks (local or global) held by each instance

    * Buffer caches on multiple nodes in the cluster

  • 8/3/2019 11gRAC Architecture Day3

    12/40

    11g RAC Workshop by Ram

    Global Cache Service (GCS)GCS is the main controlling process that implements Cache Fusion. GCS

    tracks the location and the status (mode and role) of the data blocks, as

    well as the access privileges of various instances. GCS is the mechanism

    which guarantees the data integrity by employing global access levels.

    GCS maintains the block modes for data blocks in the global role. It is also

    responsible for block transfers between the instances. Upon a request

    from an Instance GCS organizes the block shipping and appropriate lock

    mode conversions. The Global Cache Service is implemented by various

    background processes, such as the Global Cache Service Processes

    (LMSn) and Global Enqueue Service Daemon (LMD).

  • 8/3/2019 11gRAC Architecture Day3

    13/40

    11g RAC Workshop by Ram

    Global Enqueue ServiceThe Global Enqueue Service (GES) manages ortracks the status of all the Oracle enqueuing

    mechanism. This involves all non Cache fusionintra-instance operations. GES performsconcurrency control on dictionary cache locks,library cache locks, and the transactions. GES

    does this operation for resources that areaccessed by more than one instanc

  • 8/3/2019 11gRAC Architecture Day3

    14/40

    11g RAC Workshop by Ram

    Enqueues are shared memory structures (locks)that serialize access to database resources. Theycan be associated with a session or transaction.Enqueue names are displayed in the LOCK_TYPE

    column of the DBA_LOCK andDBA_LOCK_INTERNAL data dictionary views.

    A resource uniquely identifies an object that canbe locked by different sessions within an instance

    (local resource) or between instances (globalresource). Each session that tries to lock theresource will have an enqueue on the resource.

  • 8/3/2019 11gRAC Architecture Day3

    15/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    16/40

    11g RAC Workshop by Ram

    Background process Explanation Continued..

    LMSGlobal Cache Service Process

    The LMS process maintains records of thedatafile statuses and each cached block by

    recording information in a Global Resource

    Directory (GRD). The LMS process alsocontrols the flow of messages to remote

    instances and manages global data block

    access and transmits block images between

    the buffer caches of different instances. This

    processing is part of the Cache Fusion feature.

  • 8/3/2019 11gRAC Architecture Day3

    17/40

    11g RAC Workshop by Ram

    Global Cache Service Process(Lock manager server):(LMSn):

    As noted earlier, in a RAC environment, each instance of Oracle isrunning on a different machine in a cluster, and they all access, in aread-write fashion, the same exact set of database files.

    To achieve this, the SGA block buffer caches must be kept consistentwith respect to each other.

    This is one of the main goals of the LMSn process In earlier releases ofOracle Parallel Server (OPS) this was accomplished via a ping.

    That is, if a node in the cluster needed a read-consistent view of a

    block that was locked in exclusive mode by another node, theexchange of data was done via a disk flush (the block was pinged).This was a very expensive operation just to read data.

    Now, with the LMSn, this exchange is done via very fast cache-to-cacheexchange over the clusters high-speed connection.

    You may have up to ten LMSn processes per instance

  • 8/3/2019 11gRAC Architecture Day3

    18/40

    11g RAC Workshop by Ram

    LMONGlobal Enqueue Service Monitor

    The LMON process monitors global enqueues and resources across the

    cluster and performs global enqueue recovery operations.

    * Lock monitor (LMON) process: The LMON process

    monitors all instances in a cluster to detect the

    failure of an instance. It then facilitates the recoveryof the global locks held by the failed instance. It is

    also responsible for reconfiguring locks and other

    resources when instances leave or are added to the

    cluster (as they fail and come back online, or as newinstances are added to the cluster in real time).

  • 8/3/2019 11gRAC Architecture Day3

    19/40

    11g RAC Workshop by Ram

    LMDGlobal Enqueue Service Daemon

    The LMD process manages incoming remote resourcerequests within each instance.

    * Lock manager daemon (LMD) process: The LMDprocess handles lock manager service requests for theglobal cache service (keeping the block buffersconsistent between instances). It works primarily as abroker sending requests for resources to a queue that

    is handled by the LMSn processes. The LMD handlesglobal deadlock detection/resolution and monitors forlock timeouts in the global environment

  • 8/3/2019 11gRAC Architecture Day3

    20/40

    11g RAC Workshop by Ram

    LCK0Instance Enqueue Process

    This process is very similar in functionality to the LMD process described earlier,but it handles requests for all global resources other than database block buffers.

    The LCK0 process manages non-Cache Fusion resource requests such as library androw cache requests.

    ACMS:-Atomic Controlfile to Memory Service (ACMS)

    ACMS stands for Atomic Controlfile Memory Service. In an Oracle RACenvironment ACMS is an agent that ensures a distributed SGA memoryupdate(ie)SGA updates are globally committed on success or globally aborted inevent of a failure

    RMSnOracle RAC Management Processes (RMSn)

    The RMSn processes perform manageability tasks for Oracle RAC. Tasks accomplished by an RMSn process include creation of resources related Oracle

    RAC when new instances are added to the clusters.

    RSMNRemote Slave Monitor manages background slave process creation and

    communication on remote instances. These background slave processes

  • 8/3/2019 11gRAC Architecture Day3

    21/40

    11g RAC Workshop by Ram

    RAC Background Processes High-level Functionality

    Each node has its own background processes and memory structures,there are additional processes than the norm to manage the sharedresources, theses additional processes maintain cache coherency acrossthe nodes.

    Cache coherency is the technique of keeping multiple copies of a bufferconsistent between different Oracle instances on different nodes. Globalcache management ensures that access to a master copy of a data block inone buffer cache is coordinated with the copy of the block in another

    buffer cache. The sequence of a operation would go as below

    When instance A needs a block of data to modify, it reads the bock fromdisk, before reading it must inform the GCS (DLM). GCS keeps track of thelock status of the data block by keeping an exclusive lock on it on behalf ofinstance A

    Now instance B wants to modify that same data block, it must inform GCS,GCS will then request instance A to release the lock, thus GCS ensures thatinstance B gets the latest version of the data block (including instance Amodifications) and then exclusively locks it on instance B behalf.

    At any one point in time, only one instance has the current copy of theblock, thus keeping the integrity of the block.

  • 8/3/2019 11gRAC Architecture Day3

    22/40

    11g RAC Workshop by Ram

    GCS maintains data coherency and coordination by keeping track of alllock status of each block that can be read/written to by any nodes in theRAC. GCS is an in memory database that contains information aboutcurrent locks on blocks and instances waiting to acquire locks. This isknown as Parallel Cache Management(PCM).

    The Global Resource Manager (GRM) helps to coordinate andcommunicate the lock requests from Oracle processes between instancesin the RAC. Each instance has a buffer cache in its SGA, to ensure that each

    RAC instance obtains the block that it needs to satisfy a query ortransaction.

    RAC uses two processes the GCS and GES which maintain records of lockstatus of each data file and each cached block using a GRD.

    So what is a resource, it is an identifiable entity, it basically has a name ora reference, it can be a area in memory, a disk file or an abstract entity. Aresource can be owned or locked in various states (exclusive or shared).Any shared resource is lockable and if it is not shared no access conflictwill occur.

  • 8/3/2019 11gRAC Architecture Day3

    23/40

    11g RAC Workshop by Ram

    A global resource is a resource that is visible to all the nodes within the

    cluster. Data buffer cache blocks are the most obvious and most heavilyglobal resource, transaction enqueue's and database data structures are

    other examples. GCS handle data buffer cache blocks and GES handle all

    the non-data block resources.

    All caches in the SGA are either global or local, dictionary and buffer

    caches are global, large and java pool buffer caches are local. Cache fusion

    is used to read the data buffer cache from another instance instead of

    getting the block from disk, thus cache fusion moves current copies of

    data blocks between instances (hence why you need a fast private

    network), GCS manages the block transfers between the instances.

  • 8/3/2019 11gRAC Architecture Day3

    24/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    25/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    26/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    27/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    28/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    29/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    30/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    31/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    32/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    33/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    34/40

    11g RAC Workshop by Ram

  • 8/3/2019 11gRAC Architecture Day3

    35/40

    11g RAC Workshop by Ram

    What are the major RAC wait events? In a RAC environment the buffer cache is global across all instances in the

    cluster and hence the processing differs.The most common wait eventsrelated to this are gc cr request and gc buffer busy

    GC CR request :the time it takes to retrieve the data from the remote

    cacheReason: RAC Traffic Using Slow Connection or Inefficient queries (poorlytuned queries will increase the amount of data blocks requested by anOracle session. The more blocks requested typically means the more oftena block will need to be read from a remote instance via the interconnect.)

    GC BUFFER BUSY: It is the time the remote instance locally spendsaccessing the requested data block.

  • 8/3/2019 11gRAC Architecture Day3

    36/40

    11g RAC Workshop by Ram

    Oracle Clusterware Software Component

    Processing Details

    The following list describes the functions of someof the major Oracle Clusterware components. This

    list includes these components which are

    processes on Unix and Linux operating systems or

    services on Windows.

  • 8/3/2019 11gRAC Architecture Day3

    37/40

    11g RAC Workshop by Ram

    Cluster Synchronization Services (CSS)Manages the cluster

    configuration by controlling which nodes are members of the cluster and

    by notifying members when a node joins or leaves the cluster. If you are

    using third-party clusterware, then the css process interfaces with your

    clusterware to manage node membership information. Cluster Ready Services (CRS)The primary program for managing high

    availability operations within a cluster. Anything that the crs process

    manages is known as a cluster resource which could be a database, an

    instance, a service, a Listener, a virtual IP (VIP) address, an application

    process, and so on. The crs process manages cluster resources based onthe resource's configuration information that is stored in the OCR. This

    includes start, stop, monitor and failover operations. The crs process

    generates events when a resource status changes. When you have

    installed Oracle RAC, crs monitors the Oracle instance, Listener, and so

    on, and automatically restarts these components when a failure occurs.By default, the crs process makes five attempts to restart a resource and

    then does not make further restart attempts if the resource does not

    restart.

  • 8/3/2019 11gRAC Architecture Day3

    38/40

    11g RAC Workshop by Ram

    Event Management (EVM): A background process that publishesevents that crs creates.

    Oracle Notification Service (ONS): A publish and subscribe servicefor communicating Fast Application Notification (FAN) events.

    RACGExtends clusterware to support Oracle-specific

    requirements and complex resources. Runs server callout scriptswhen FAN events occur.

    Process Monitor Daemon (OPROCD): This process is locked inmemory to monitor the cluster and provide I/O fencing. OPROCDperforms its check, stops running, and if the wake up is beyond theexpected time, then OPROCD resets the processor and reboots the

    node. An OPROCD failure results in Oracle Clusterware restartingthe node. OPROCD uses the hangcheck timer on Linux platforms.

  • 8/3/2019 11gRAC Architecture Day3

    39/40

    11g RAC Workshop by Ram

    Thank You

    -Ram

  • 8/3/2019 11gRAC Architecture Day3

    40/40

    11 RAC W k h b R

    References:

    Oracle Real Application Clusters Administration and Deployment Guide 11g Release 1 (11.1) B28254-07

    http://www.datadisk.co.uk/html_docs/rac/architecture.htm

    http://ora-dba.com/Oracle_RAC_Information.html

    http://www.franklinfaces.com/Topic22-79-1.aspx

    http://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://ora-dba.com/Oracle_RAC_Information.htmlhttp://ora-dba.com/Oracle_RAC_Information.htmlhttp://ora-dba.com/Oracle_RAC_Information.htmlhttp://ora-dba.com/Oracle_RAC_Information.htmlhttp://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://www.datadisk.co.uk/html_docs/rac/architecture.htmhttp://www.datadisk.co.uk/html_docs/rac/architecture.htm