
Oracle Traffic Director

Instances, processes and high availability explained


OTD instances & processes

• Two 'logical' OTD instances per OTD host vServer

• The admin node is a special instance that synchronizes with the OTD admin server and manages the life-cycle of configuration instances

• One instance per OTD configuration deployed on the admin node

• Three trafficd processes per OTD instance

• Watchdog: spawns the primordial process and restarts the primordial or acceptor process if it crashes

• Primordial process: the main process of the instance, which spawns one or more load balancer acceptor threads

• Acceptor process: runs the load balancer acceptor threads

[Diagram: an OTD host vServer running an admin node with two configuration instances; each instance consists of a watchdog, a primordial process, and an acceptor process.]

[oracle@otd-server1 ~]$ ps -ef | grep trafficd
oracle  5665  5100 0 12:24 pts/1 00:00:00 grep trafficd
oracle 19259     1 0 May05 ?     00:00:00 trafficd-wdog -d /u01/appl/oracle/otd/otdnode2/admin-server/config -r /u01/appl/oracle/products/otd -t /tmp/admin-server-f79dbba8 -u oracle
oracle 19260 19259 0 May05 ?     00:00:36 trafficd -d /u01/appl/oracle/otd/otdnode2/admin-server/config -r /u01/appl/oracle/products/otd -t /tmp/admin-server-f79dbba8 -u oracle
oracle 19261 19260 0 May05 ?     00:02:52 trafficd -d /u01/appl/oracle/otd/otdnode2/admin-server/config -r /u01/appl/oracle/products/otd -t /tmp/admin-server-f79dbba8 -u oracle
oracle 23543     1 0 May05 ?     00:00:00 trafficd-wdog -d /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config -r /u01/appl/oracle/products/otd -t /tmp/net-testotd-configuration-f7683315 -u oracle
oracle 23544 23543 0 May05 ?     00:00:36 trafficd -d /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config -r /u01/appl/oracle/products/otd -t /tmp/net-testotd-configuration-f7683315 -u oracle
oracle 23545 23544 0 May05 ?     00:00:52 trafficd -d /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config -r /u01/appl/oracle/products/otd -t /tmp/net-testotd-configuration-f7683315 -u oracle
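The watchdog/primordial/acceptor parent-child chain is easier to see with pstree; a sketch using the watchdog PID of the net-testotd-configuration instance from the listing above (exact output varies by pstree version and thread display):

[oracle@otd-server1 ~]$ pstree -p 23543
trafficd-wdog(23543)---trafficd(23544)---trafficd(23545)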

OTD fail over group processes

• The VRRP protocol, implemented by the Linux keepalived program, controls the active-passive VIP; the VIP is active on only one OTD node at a time (see the configuration sketch below)

• Per OTD failover group, a keepalived daemon is started

• VIPs cannot be shared across OTD configurations
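Each daemon reads a keepalived.conf that OTD generates per failover group, using standard keepalived VRRP syntax. A minimal sketch of such a file; the instance name, interface, router id, and VIP below are illustrative assumptions, the real file is generated by OTD:

vrrp_instance otd_failover_group {    # illustrative instance name
    state MASTER                      # BACKUP on the other node
    interface eth0                    # NIC that carries the VIP (illustrative)
    virtual_router_id 51              # must match on both nodes (illustrative)
    priority 100                      # higher priority wins the master election
    advert_int 1                      # heartbeat advertisement interval in seconds
    virtual_ipaddress {
        192.0.2.10                    # the failover-group VIP (illustrative)
    }
}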

[Diagram: OTDHost1 runs keepalived as master, OTDHost2 as backup; the two daemons exchange VRRP heartbeats, and the master keepalived daemon mounts the active VIP on its NIC.]

[oracle@otd-server1 ~]$ ps -ef | grep keepalived
oracle 16320  5100 0 14:39 pts/1 00:00:00 grep keepalived
root   18678     1 0 Mar27 ?     00:00:41 /usr/sbin/keepalived --vrrp --use-file /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config/keepalived.conf --pid /tmp/net-testotd-configuration-f7683315/keepalived.pid --vrrp_pid /tmp/net-testotd-configuration-f7683315/vrrp.pid --log-detail
root   18679 18678 0 Mar27 ?     00:08:51 /usr/sbin/keepalived --vrrp --use-file /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config/keepalived.conf --pid /tmp/net-testotd-configuration-f7683315/keepalived.pid --vrrp_pid /tmp/net-testotd-configuration-f7683315/vrrp.pid --log-detail
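To verify on which node the VIP is currently mounted, inspect the NIC's address list on each host; keepalived adds the VIP as an extra address on the master node. A sketch, where the interface name eth0 and VIP 192.0.2.10 are illustrative assumptions:

[oracle@otd-server1 ~]$ ip addr show eth0 | grep 192.0.2.10
    inet 192.0.2.10/32 scope global eth0
# the same command on the backup node returns nothing until failover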


Application availability vs. Node availability

Application availability with the watchdog process

Provides availability of OTD in case of a software failure within OTD, ensuring OTD continues to front-end requests to back-end applications.

Node availability with the keepalived VRRP daemon

Provides availability of OTD if the vServer crashes, ensuring OTD continues to front-end requests to back-end applications.
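As a sketch of the first case, killing the primordial process of the net-testotd-configuration instance (PID 23544 in the earlier listing) exercises only the watchdog; the VIP does not move because keepalived keeps running:

[oracle@otd-server1 ~]$ kill -9 23544
[oracle@otd-server1 ~]$ ps -ef | grep trafficd | grep net-testotd
# the watchdog (PID 23543) has respawned the primordial and acceptor
# processes under new PIDs; keepalived never stopped, so no VIP failover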


OTD high availability

[Diagram: OTDHost1 and OTDHost2, each running a configuration instance (watchdog, primordial process, acceptor process) and a keepalived daemon (master on host 1, backup on host 2), with the VIP bound to a NIC on the active host.]

0. Node 1 crashes

1. Keepalived on node 2 detects the node failure

2. The keepalived daemon activates the VIP on the remaining host

If only an OTD process crashes, the watchdog process restarts the OTD processes in place.
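The failover can be observed from the backup node by following the keepalived log (see the logging note further below); the host name, timestamps, and instance name here are illustrative:

[oracle@otd-server2 ~]$ tail -f /var/log/messages | grep -i vrrp
May  5 12:31:02 otd-server2 Keepalived_vrrp: VRRP_Instance(otd_failover_group) Transition to MASTER STATE
May  5 12:31:05 otd-server2 Keepalived_vrrp: VRRP_Instance(otd_failover_group) Entering MASTER STATE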

Things you should know - 1

OTD high availability

• When the OTD instance runs as root
– The life-cycle of the keepalived daemon is managed automatically under the covers by OTD when starting/stopping instances or toggling primary/backup nodes

• When the OTD instance is started as non-root
– The keepalived daemon has to be started and stopped separately as root (or via sudo) through a script on each OTD node:

$ORACLE_HOME/tadm <start|stop>-failover --instance-home=/u01/OTDInstances/OTDNode1 --config=test

– Check this blog for more configuration details: http://exablurb.blogspot.nl/2013/06/running-otd-ha-with-minimal-root-usage.html
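One way to handle the non-root case is a small root-owned wrapper per node, so operators only need sudo rights on a single script. A hypothetical sketch; the ORACLE_HOME and instance paths are taken from the examples above:

#!/bin/sh
# Hypothetical wrapper: /usr/local/bin/otd-failover.sh <start|stop>
# Run as root, or grant just this script via sudoers.
ORACLE_HOME=/u01/appl/oracle/products/otd
exec "$ORACLE_HOME"/tadm "$1"-failover \
    --instance-home=/u01/OTDInstances/OTDNode1 --config=test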

Things you should know - 2

OTD high availability

• When the OTD instance is started as non-root
– Be careful when shutting down the primary instance from the OTD console; it will NOT fail over the VIP to the backup node. To fail over, first stop the keepalived daemon on the node corresponding to the configuration

• VIP failover takes roughly 3 seconds
– The default heartbeat advertisement interval is 1 second (the advert_int setting in keepalived.conf), and the failover duration is roughly 3 times the advertisement interval

• An OTD process crash will not trigger VIP failover, as the OTD watchdog process restarts the OTD process in place and the keepalived daemon keeps running

• Keepalived logging can be found in /var/log/messages

• VIPs cannot be shared across multiple configurations
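The 3-second figure follows directly from the generated configuration, and the interval can be checked per failover group; the path below is from the earlier process listing, the output is illustrative:

[oracle@otd-server1 ~]$ grep advert_int /u01/appl/oracle/otd/otdnode2/net-testotd-configuration/config/keepalived.conf
    advert_int 1
# failover duration ~ 3 x advert_int = 3 seconds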