Two Node Architecture, Unprotected
Two Node Architecture, Protected Apps tier and unprotected DB tier
Two Node Architecture, Protected Apps tier and DB tier
Failover Cluster
• Detecting failure by monitoring the heartbeat and checking the status of resources
• Reorganizing cluster membership in the cluster manager
• Transferring disk ownership from the primary node to the secondary node
• Mounting the file system on the secondary node
• Starting the DB instance
• Recovering the database and rolling back uncommitted data
• Reestablishing client connections to the failover node
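The takeover sequence above can be sketched as the steps a cluster agent runs on the secondary node. This is an illustrative, non-runnable sketch: the volume group, mount point, and SID names (vg01, /u02, PROD) are assumptions, and commercial frameworks such as those listed below drive these steps automatically.

```shell
# Illustrative only: manual equivalent of a failover, run on the secondary
# node. vg01, /u02, and ORACLE_SID=PROD are assumed names.
vgchange -a y vg01               # take ownership of the shared disk group
fsck -y /dev/vg01/lvol1          # check the file system after the crash
mount /dev/vg01/lvol1 /u02      # mount the FS on the secondary node

# STARTUP performs crash recovery: committed redo is rolled forward and
# uncommitted transactions are rolled back automatically.
export ORACLE_SID=PROD
sqlplus "/ as sysdba" <<'EOF'
STARTUP
EOF
# Clients then reconnect via a virtual IP or a TNS entry listing both nodes.
```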
FAILOVER CLUSTER OFFERINGS
• Veritas cluster server
• HP Service Guard
• Microsoft Cluster Service with Oracle Failsafe
• RedHat Linux Advanced Server 2.1
• Sun Cluster Oracle Agent
• Compaq, now HP, Segregated Cluster
• HACMP
RAC
Scalable RAC
Real Application Clusters
• Many instances of Oracle running on many nodes
• Multiple instances share a single physical database
• All instances have common data, control, and initialization files
• Each instance has its own redo log files (on shared storage) and rollback segments or undo tablespaces
• All instances can simultaneously execute transactions against the single database
• Caches are synchronized using Oracle’s Global Cache Management technology (Cache Fusion)
RAC Building Blocks
• Instance and Database files
• Shared storage with OCFS, CFS or raw devices
• Redundant HBA cards per host
• Redundant NIC cards per host: one for the cluster interconnect and one for LAN connectivity
• Local RAID-protected drives for ORACLE_HOMEs (OCFS does not support ORACLE_HOME installs)
CLUSTER INTERCONNECT FUNCTION
• Monitoring health, status, and message synchronization
• Transporting Distributed Lock Manager messages
• Accessing remote file systems
• Moving application-specific traffic
• Providing cluster alias routing

Interconnect Requirements
• Low latency for short messages
• High speed and sustained data rates for large messages
• Low host CPU utilization
• Flow control, error control, and heartbeat continuity monitoring
• Switched networks that scale well
INTERCONNECT PRODUCTS
• Memory Channel
• SMP Bus
• Myrinet
• Sun SCI
• Gigabit Ethernet
• InfiniBand Interconnect
INTERCONNECT PROTOCOL
• TCP/IP
• UDP
• VIA
• RDG
• HMP
IO CHANNEL HBA Products
• Adaptec
• DPT
• LSI Logic
• Interphase
• Qlogic
• Emulex
• JNI
FABRIC SWITCHES
• McDATA
• EMC
• QLOGIC
• BROCADE
CLUSTER NODES
NUMA / SMP
• Shared system bus and I/O
• Expensive, with scalability problems
• Adding more CPUs can require upgrading other architecture components
• Vendors: Dell and HP-Compaq
Blade Servers
• BladeFrame system from Egenera
• Egenera: 24 two-way and four-way SMP processing resources
• Egenera: redundant central controllers, redundant high-speed interconnects, PAN Manager
• Egenera: PAN Manager handles external storage mapping and virtualization
• Egenera: PAN Manager handles I/O and network traffic to and from individual servers
Oracle’s High Availability (HA) Solution Stack

Unplanned Downtime
• System Failure → Real Application Clusters: continuous availability for all applications
• Data Failure & Disaster → Data Guard: zero data loss
• Human Error → Flashback Query: enables users to correct their mistakes

Planned Downtime
• System Maintenance → Dynamic Reconfiguration: capacity on demand without interruption
• Data Maintenance → Online Redefinition: adapt to change online
Shared Storage Options
• NFS-mounted storage (NetApp)
• SCSI shared storage with OCFS, CFS, or raw devices
• Fibre Channel storage with a fabric architecture
11i Steps - 1
• Install Red Hat Advanced Server 2.1 on all nodes
• Install 11i as a single-node install on the Apps tier
• Attach shared storage and install drivers for the HBAs
11i Steps -2 ( install OS Patches)
• rpm -Uv tar-1.13.25-9.i386.rpm
• This provides an updated version of tar
• Allows a user to tar files from a running database on OCFS
• Example:
• tar --o_direct -cvf /tmp/backup.tar *
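The patched tar can be exercised end to end with a short script. A hedged sketch: it probes for the --o_direct option and falls back to a plain archive on stock tar, and the source directory is a temporary stand-in for an OCFS path such as /ocfs/oracle/proddata.

```shell
#!/bin/sh
# Hedged sketch: --o_direct exists only in Oracle's updated tar RPM; on a
# stock tar the option probe fails and we fall back to a normal archive.
# SRC stands in for an OCFS data directory; the archive path is illustrative.
set -e
SRC=$(mktemp -d)
echo data > "$SRC/system01.dbf"
ARCHIVE=/tmp/backup.$$.tar

if tar --o_direct --version >/dev/null 2>&1; then
    tar --o_direct -cvf "$ARCHIVE" -C "$SRC" .   # O_DIRECT reads, safe on OCFS
else
    tar -cvf "$ARCHIVE" -C "$SRC" .              # stock tar fallback
fi
echo "archive written to $ARCHIVE"
```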
11i Steps -2 (install OS Patches, continued)
• rpm -Uv fileutils-4.1-4.2.i386.rpm
• This provides updated versions of cp and dd
• Allows a user to copy files from a running database on OCFS
• Examples:
  cp --o_direct /ocfs/quorum.dbf /tmp/backup/quorum.dbf
  dd o_direct=yes if=/ocfs/quorum.dbf of=/tmp/backup/quorum.dbf
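The same pattern works for the patched cp and dd. A hedged sketch: stock GNU cp rejects --o_direct, so the script falls back to a plain dd copy; the files are temporary stand-ins for /ocfs/quorum.dbf and its backup.

```shell
#!/bin/sh
# Hedged sketch: cp --o_direct and dd o_direct=yes exist only in Oracle's
# patched fileutils RPM; stock GNU cp rejects the flag, so we fall back to
# a plain dd copy. SRC/DST stand in for real OCFS paths.
set -e
SRC=$(mktemp)
DST=$(mktemp)
echo "quorum data" > "$SRC"

if cp --o_direct "$SRC" "$DST" 2>/dev/null; then
    echo "copied with O_DIRECT"
else
    dd if="$SRC" of="$DST" 2>/dev/null           # stock fallback, no O_DIRECT
fi
cmp -s "$SRC" "$DST" && echo "copy verified"
```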
11i Steps -3: Install Oracle-provided RPMs
• ocfs-support-1.0.9-11.i686.rpm
• ocfs-tools-1.0.9-11.i686.rpm
• j2sdk-1_3_1_09-linux-i586.rpm.bin
• unzip-5.50-30.i386.rpm
• zip-2.3-10.i386.rpm
• wu-ftpd-2.6.1-21.i386.rpm
• hangcheck-timer-2.4.9-e.10-0.4.0-2.i686.rpm
• hangcheck-timer-2.4.9-e.10-enterprise-0.4.0-2.i686.rpm
11i Steps -3 (interconnect)
• ifconfig eth0:0 192.168.2.100
• route add -host 192.168.2.100 dev eth0:0
• Do this on each node
• Create the watchdog file (the Oracle installer checks for this to install the cluster option):
  # touch /dev/watchdog
• Set up the hangcheck-timer module:
  # vi /etc/modules.conf
  options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
  # modprobe hangcheck-timer
11i Steps -5 (OCFS.conf)
• # ocfstool (from X Windows)
• OCFS config: ensure this file exists in /etc
  node_name = linux3.home.com
  node_number =
  ip_address = 192.168.1.100
  ip_port = 7000
  comm_voting = 1
  guid = 9D3B77AF2FF26E92E25D00E04CA44B58
11i Steps -6 install OCFS
• mkfs.ocfs -F -b 128 -L /s01 -m /s01 -u 500 -g 500 -p 0755 /dev/sda1
• srvconfig_loc=/s01/oragsd-config (touch this file)
11i Steps -7 (OCM)
• Configure and start the Cluster Manager:
• $ cd $HOME/product/9.2/oracm/admin
• $ ls
• If cmcfg.ora exists:
• $ cp cmcfg.ora cmcfg.ora.original
• If cmcfg.ora does not exist:
• $ cp cmcfg.ora.tmp cmcfg.ora
• $ echo HostName=dc1node3inter >> cmcfg.ora
• $ vi cmcfg.ora
• [comment out WatchdogSafetyMargin and WatchdogTimerMargin]
• PrivateNodeNames=linux22 linux33
• PublicNodeNames=linux2 linux3
• MissCount=210
• KernelModuleName=hangcheck-timer
• CmDiskFile=/u02/oracm-qourum
• $ vi ocmargs.ora
• [comment out the first line, which contains the word “watchdogd”]
• $ cd ../bin
• $ cp ocmstart.sh ocmstart.sh.original
• $ vi ocmstart.sh
• [remove the words “watchdog and” from the line containing “Sample startup script...”]
• [remove every line containing “watchdogd”, uppercase or lowercase; if it is inside an if/then/fi, remove the whole if/then/fi]
• $ su - root
• # export ORACLE_HOME=/d02/oracle/proddb/9.2.0
• # /d02/oracle/proddb/9.2.0/oracm/bin/ocmstart.sh
11i Steps -4 (cp/dd DB files to shared storage)
• cp --o_direct /d03/oracle/proddata/* /s01/oracle/proddata/
• Recreate the controlfile
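Recreating the controlfile after the copy is also the point to raise MAXINSTANCES so a second redo thread fits. A non-runnable sketch; the file names, sizes, and limits are assumptions, not from the slides:

```shell
# Illustrative only; run as SYSDBA on node 1. All paths and limits below
# are assumed, not taken from the original steps.
sqlplus "/ as sysdba" <<'EOF'
STARTUP NOMOUNT
CREATE CONTROLFILE REUSE DATABASE "PROD" RESETLOGS NOARCHIVELOG
    MAXINSTANCES 4
    MAXLOGFILES 32
    LOGFILE GROUP 1 ('/s01/oracle/proddata/redo01.log') SIZE 100M,
            GROUP 2 ('/s01/oracle/proddata/redo02.log') SIZE 100M
    DATAFILE '/s01/oracle/proddata/system01.dbf';
ALTER DATABASE OPEN RESETLOGS;
EOF
```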
11i steps 8 – init.ora / spfile
• Create UNDO TBS for each instance
• Enable (and disable) the redo thread for instance 2 from instance 1, and vice versa
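The two bullets above map to DDL like the following. A non-runnable sketch: the tablespace name, file paths, and sizes are assumptions; each instance's init.ora/spfile then points undo_tablespace at its own undo tablespace.

```shell
# Illustrative only; run as SYSDBA from instance 1. Names, paths, and
# sizes are assumed, not from the slides.
sqlplus "/ as sysdba" <<'EOF'
CREATE UNDO TABLESPACE UNDOTBS2
    DATAFILE '/s01/oracle/proddata/undotbs02.dbf' SIZE 500M;
ALTER DATABASE ADD LOGFILE THREAD 2
    GROUP 3 ('/s01/oracle/proddata/redo03.log') SIZE 100M,
    GROUP 4 ('/s01/oracle/proddata/redo04.log') SIZE 100M;
ALTER DATABASE ENABLE PUBLIC THREAD 2;
-- to take the thread offline again: ALTER DATABASE DISABLE THREAD 2;
EOF
```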
11i steps 9 – instance 1
• # RAC-specific parameters
• cluster_database = true
• cluster_database_instances = 2
• thread = 1
• instance_number = 1
• instance_name = PRODi1
• service_names = PROD
• local_listener = PRODi1
• remote_listener = PRODi2
11i steps 10 – instance 2
• cluster_database = true
• cluster_database_instances = 2
• thread = 2
• instance_number = 2
• instance_name = PRODi2
• service_names = PROD
• local_listener = PRODi2
• remote_listener = PRODi1
11i Apps tier – 806/iAS tnsnames.ora
PROD = (DESCRIPTION=
  (ADDRESS_LIST=
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  )
  (CONNECT_DATA=(SERVICE_NAME=PROD)(SERVER=DEDICATED))
)

PRODi1 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
  (CONNECT_DATA=(INSTANCE_NAME=PRODi1)(SERVICE_NAME=PROD))
)

PRODi2 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  (CONNECT_DATA=(INSTANCE_NAME=PRODi2)(SERVICE_NAME=PROD))
)
Modify DBC file for Failover
• APPS_JDBC_DRIVER_TYPE=THIN
• FND_MAX_JDBC_CONNECTIONS=100
• # Setup at Apps tier
• APPS_JDBC_URL=jdbc:oracle:thin:@(DESCRIPTION=
    (ADDRESS_LIST=(LOAD_BALANCE=ON)
      (ADDRESS=(PROTOCOL=TCP)(HOST=linux1)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=linux2)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=prod)))
What Can and Cannot Fail Over
• SQL*Plus connections will fail over using TAF
• JDBC connections will fail over
• Forms runtime connections will not; users will have to reconnect
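For the SQL*Plus failover to work, the PROD alias needs a FAILOVER_MODE clause that the tnsnames.ora shown earlier omits. A hedged fragment: the BASIC/SELECT settings and retry values are typical TAF choices, not taken from the slides.

```
PROD = (DESCRIPTION=
  (ADDRESS_LIST=
    (LOAD_BALANCE=ON)
    (FAILOVER=ON)
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  )
  (CONNECT_DATA=
    (SERVICE_NAME=PROD)
    (FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=20)(DELAY=5))
  )
)
```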