Temenos T24 and Microsoft SQL Server HADR White Paper

The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24

A deployment reference architecture and guidance for implementing a high-availability and disaster-recovery solution for TEMENOS T24 running on the Microsoft Application Platform

Technical White Paper

Published: May 2012

Applies to: Microsoft SQL Server 2012

Authors: Igor Pagliai (Microsoft), Dammika Wickramasinghe (Temenos)

Abstract

Temenos and Microsoft worked together to define a deployment architecture/topology that provides high availability and disaster recovery for the TEMENOS T24 core banking solution using the Microsoft Application Platform and Microsoft technologies.

This white paper describes the results of this joint effort.

©2012 Microsoft Corporation. All rights reserved. This document is provided “as-is.” Information and views

expressed in this document, including URL and other Internet Web site references, may change without notice.

You bear the risk of using it.

This document does not provide you with any legal rights to any intellectual property in any Microsoft product.

You may copy and use this document for your internal, reference purposes.

Table of Contents

Introduction
Technical Overview of TEMENOS T24
SQL Server AlwaysOn
Recovery Objectives
Fault Tolerance and Disaster Recovery Architecture
High Availability and Disaster Recovery Solution
Setup and Configuration
SQL Server 2012 HADR Configuration
Windows Server Firewall Configurations
T24 File Share Configuration
Active Directory Domain Services DNS Configuration
Application-Tier NLB Configuration
T24 Application Server Configuration
Web-Tier NLB Configuration
T24Browser Configuration
Disaster Recovery Procedures
DNS Switching
SQL Server 2012 HADR Failover
Findings and Carryovers
Recommended Hotfixes and Service Packs
Additional Resources
SQL Server 2012
Windows Server Failover Cluster
Network Load Balancing
About Temenos
About Microsoft

Introduction

TEMENOS T24 (T24) is a fully integrated, modular core banking solution that covers a broad

spectrum of functional requirements for the retail, private, corporate, universal, and Islamic

banking and microfinance sectors. T24 provides a single, real-time view of clients across

the entire enterprise, making it possible for banks to maximize returns while also streamlining costs.

Microsoft SQL Server 2012 data management software provides an ideal data management

framework for T24. With this foundation, T24 customers can experience faster funds transfers,

higher security-trades volumes, and quicker close-of-business processes; they can benefit from

open, state-of-the-art technologies to accelerate innovation, which helps to greatly increase the

speed and effectiveness with which new products and services are created.

As part of their strategic alliance, Microsoft and Temenos worked together to define a

recommended deployment architecture that provides high availability and disaster recovery (HADR)

for T24 running on the Microsoft Application Platform and using Microsoft technologies. This joint

effort was conducted in the Temenos Hemel Hempstead lab.

One of the main drivers for developing the architecture/topology was to reduce the cost of

Microsoft software licenses and the use of specialized hardware (such as load balancers) to

minimize the total cost of ownership (TCO). Therefore, the recommended software topologies can

be customized to meet each customer’s needs.

The following considerations apply to the recommended architecture:

The SQL Server 2012 Availability Group feature, part of the AlwaysOn technology set, was

selected instead of storage area network (SAN)–level synchronous storage replication to

avoid the cost of an additional SAN device and the licensing cost for SAN replication

software.

A SQL Server 2012 Failover Cluster Instance (FCI) was adopted for the primary site instead

of two standalone instances to reduce licensing cost, minimize management and

performance overhead, and make it easier to reuse an existing deployment based

on a typical Windows Server Failover Clustering (WSFC) configuration.

The Network Load Balancing (NLB) feature of Windows Server 2008 R2 was chosen to

eliminate the need for an expensive hardware load balancer device in front of the JBoss

servers.

The NLB feature of Windows Server 2008 R2 was chosen to provide better load balancing

performance than the native T24 capabilities in front of T24 servers.

Two cluster nodes in the primary site with shared SAN storage were used to provide high

availability for the T24 application file share.


The implementation/requirements of HADR solutions can vary based on a variety of factors, including

service level agreements (SLAs), cost, number of sites, and network infrastructure. Therefore, the

requirements of individual HADR solutions need to be determined on a case-by-case basis for each

deployment.

Alternatives to the Recommended Architecture

The architecture proposed in this white paper is not the only one possible using SQL Server 2012

AlwaysOn features, but this architecture has been thoroughly tested. Possible alternatives to the

recommended schema can include the following:

Use two standalone SQL Server 2012 instances (in an AlwaysOn Availability Group) instead

of a single SQL Server 2012 Failover Cluster Instance. This lets you avoid using shared SAN

for the cluster nodes in the primary site.

o If you are using an availability group, all nodes must still be part of a cluster,

and a standalone SQL Server 2012 instance must still be installed on each node. The

cost savings with this alternative come from eliminating the need for shared

storage.

o To ensure that there is no local data loss if there is local failover between instances

in the primary site, the two standalone SQL Server 2012 instances, along with the

one (or more) in the disaster recovery site, must be configured for synchronous

replication.

o In this configuration, automatic failover can be provided by the AlwaysOn

Availability Group feature, but extra care must be taken to avoid unwanted failover

to the remote disaster recovery site.

Use existing highly available network storage for the cluster file share witness. Used in

combination with the previous option, highly available network storage for the cluster file

share witness can render the installation of a Windows Server Failover Cluster unnecessary.

o NOTE Distributed File System Replication (DFS-R) can be used to replicate files

from the primary site to the disaster recovery site on a less frequent schedule.

However, using DFS-R with continuous replication to local folders as a way to

avoid a clustered file share is not recommended because of the possible

performance impact.

Use an additional node in the disaster recovery site with shared SAN storage between the

nodes. With this alternative, a second SQL Server 2012 FCI can be used, providing high

availability at the level of the disaster recovery site as well.

o This second instance must be installed only on the nodes in the disaster recovery

site.

o This instance is distinct from the one used in the primary site.


o This instance should be configured for synchronous replication in the availability group.

o The shared SAN storage between the nodes in the disaster recovery site is not

linked/replicated to the shared storage between the nodes in the primary site.

IMPORTANT In the proposed scenario, the minimum number of servers has been used in the

disaster recovery site to reduce costs. This means that in the case of a complete primary site

disaster, the disaster recovery site will operate in an exposed configuration that is not highly

available. For this reason, it is highly recommended that you recover the primary site as soon as

possible or use an additional node in the disaster recovery site with shared SAN storage between

the nodes, as mentioned previously.

Additional SQL Server 2012 HADR Capabilities for Future Consideration

Note that the following SQL Server 2012 HADR capabilities have not been tested prior to publication

of this white paper because of time, resource, and configuration constraints. They should be

considered future enhancements to the recommended architecture, and should be tested in

custom deployments and/or lab testing sessions:

Readable secondary for Availability Group replicated databases

This feature presents no theoretical risks and could be used to better utilize hardware

resources in the disaster recovery site (including read-only queries, reporting, backups, and

integrity checks), but T24 would need to be modified to take advantage of this capability

(for read-only queries only). The following links provide more information:

o Active Secondaries: Readable Secondary Replicas

(http://msdn.microsoft.com/en-us/library/ff878253.aspx)

o Configure Read-Only Access on an Availability Replica

(http://msdn.microsoft.com/en-us/library/hh213002.aspx)

NOTE In the recommended configuration, read-only access is not enabled on the

secondary replicas of the availability group, but it can be easily activated with no

downtime.

Availability Group Read-Only Routing and Application Intent

These features cannot be used because they require the SQL Server 2012 Native Open

Database Connectivity (ODBC) client to be installed on the T24 servers. As a future

enhancement, this version of the client should be tested for T24 use. The following links

provide more information:

o Configure Read-Only Routing for an Availability Group (SQL Server)


o Client Connection Access to Availability Replicas (SQL Server)



Multi-subnet failover clustering

Windows Server 2008 R2 and SQL Server 2012 support this type of configuration, but it

has not been tested for use in reducing downtime because of Domain Name System (DNS)

replication latency. The following links provide more information.

o SQL Server Multi-Subnet Clustering (http://msdn.microsoft.com/en-us/library/ff878716.aspx)


o SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance

(http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx)

Flexible failover policy

SQL Server 2012 introduces a new health detection mechanism for clustered installations

that can be tuned so that Windows Server Failover Clustering is more sensitive to possible

SQL Server 2012 health problems. The following links provide more information.

o Failover Policy for Failover Cluster Instances (http://msdn.microsoft.com/en-us/library/ff878664(v=SQL.110).aspx)


o Configure FailureConditionLevel Property Settings


Document Scope

The following are considered in the scope of this white paper:

This document applies to T24 R11 and R12 (Temenos Application Framework C) with

T24Browser as a channel.

This document focuses only on HADR functionality.

This document applies to the following software:

o Windows Server 2008 R2 with Service Pack 1 (SP1)

o Windows Server 2008 R2 Network Load Balancing (NLB)

o Windows Server 2008 R2 clustering

o Windows Server 2008 R2 clustered file share

o Windows Server 2008 R2 Distributed File System (DFS) Replication

o SQL Server 2012 AlwaysOn Availability Group

o Windows Server 2008 R2 domain controller

o JBoss 5.1.0 GA

The following are considered out of the scope of this white paper:

Performance tuning recommendations.

T24 channels other than T24Browser, such as TWS.NET, TOCF.NET, and BizTalk Adapter.



Administration and monitoring of the software.

Hardware configurations, such as RAID and network adapter teaming.

Security.

Local area network (LAN)/wide area network (WAN) configurations and recommendations.

Technical Overview of TEMENOS T24

The various components of a T24-based solution are shown in Figure 1.

Figure 1. T24 logical component view

Table 1 provides a description of the components.

Note that the HADR solution recommended in this white paper focuses on T24 with T24Browser as

a channel.

[Figure 1 diagram. Labels recoverable from the image: Channels (T24 Browser, ARC IB, ARC Mobile); Connectivity (TWS.NET, TOCF.NET, TOCF (EE), TWS (EE)); Application (Message Queue, T24 Agent, TAFC Agent, T24, TAFC, DCD, Database Driver, C / C++, and the labels FX, EB, AA, DX, AC); T24 Monitor; Management; Security; SQL Server 2012; Internet Information Services (IIS) 7.5; Active Directory; Windows Server 2008 R2.]


Table 1. Component descriptions

Component | Description

T24 Agent | T24 Agent is a server-side jBASE component that is responsible for accepting and processing incoming client requests. Communication is established via TCP socket connections and by means of a well-defined protocol. T24 Agent is a socket server listening on a user-defined TCP port, and has the capability to serve a wide range of client applications as long as they speak the same protocol.

T24 | T24 is the banking business logic written by using jBC, which is used to generate C / C++ code.

TAFC | The Temenos Application Framework C (TAFC) version provides additional runtime services that are currently not available in jBC.

Database Driver | Direct Connect Driver (DCD) is the T24 data abstraction layer that decouples T24 business logic from the underlying data storage/structure.

T24 Monitor | T24 Monitor is a Java Management Extensions (JMX) and web-based online monitoring tool for T24, offering real-time statistics, as well as historical views of a particular T24 system.

Message Queue | Message Queue is an optional middleware infrastructure that lets T24 use message-driven communication with the channel layer.

Database | The jBASE or vendor-provided relational database management system (RDBMS); currently supported platforms are Oracle, Microsoft SQL Server, and IBM DB2.

SQL Server AlwaysOn

SQL Server AlwaysOn is a new integrated, flexible, and cost-efficient HADR solution. AlwaysOn can

provide data and hardware redundancy within and across data centers, and it can improve

application failover time to increase the availability of mission-critical applications. AlwaysOn is

flexible and lets you reuse existing hardware investments.

A solution using AlwaysOn can take advantage of two major SQL Server 2012 features for

configuring availability at both the database and the instance level:


AlwaysOn Availability Groups

AlwaysOn Availability Groups are new in SQL Server 2012. They greatly enhance the

capabilities of database mirroring, help ensure availability of application databases, and

enable zero data loss through log-based data movement for data protection without shared

disks. Availability groups provide an integrated set of options, including automatic and

manual failover of a logical group of databases, support for up to four secondary replicas,

fast application failover, and automatic page repair.

AlwaysOn Failover Cluster Instances (FCIs)

FCIs enhance the SQL Server failover clustering feature and support multi-site clustering

across subnets, which enables cross-data-center failover of SQL Server instances. Faster and

more predictable instance failover is another key benefit that enables faster application

recovery.

Recovery Objectives

Data redundancy is a key component of a high-availability database solution. Transactional activity

on your primary SQL Server instance is synchronously or asynchronously applied to one or more

secondary instances. When an outage occurs, transactions that were in-flight might be rolled back,

or they might be lost on the secondary instances because of delays in data propagation.

You can measure the impact and set recovery goals in terms of how long it takes to get back in

business and how much latency there is in the last transaction recovered:

Recovery Time Objective (RTO)

The RTO is the duration of the outage. The initial goal is to get the system back online in at

least a read-only capacity to facilitate investigation of the failure. However, the primary goal

is to restore full service to the point that new transactions can take place.

Recovery Point Objective (RPO)

The RPO is often referred to as a measure of acceptable data loss. It is the time gap or

latency between the last committed data transaction before the failure and the most recent

data recovered after the failure. The actual data loss can vary depending on the workload

on the system at the time of the failure, the type of failure, and the type of high availability

solution used.

You should use RTO and RPO values as goals that indicate business tolerance for downtime and

acceptable data loss, and as metrics for monitoring availability health.
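
The two definitions above can be turned into a small calculation. The following is a minimal sketch, assuming you have recorded four timestamps for an outage; the function names, the timestamps, and the goal values are hypothetical and are not taken from the test environment.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_commit_before_failure, last_recovered_commit,
                     outage_start, service_restored):
    """Return (achieved RPO, achieved RTO) as timedeltas.

    RPO: gap between the last commit before the failure and the most
         recent commit that is present after recovery.
    RTO: time from the start of the outage until service is restored.
    """
    rpo = last_commit_before_failure - last_recovered_commit
    rto = service_restored - outage_start
    return rpo, rto


def meets_objectives(rpo, rto, rpo_goal, rto_goal):
    """True only when both achieved values are within the business goals."""
    return rpo <= rpo_goal and rto <= rto_goal


# Hypothetical outage: 4 seconds of transactions lost, 3 minutes of downtime.
failure = datetime(2012, 5, 1, 10, 0, 0)
rpo, rto = recovery_metrics(
    last_commit_before_failure=failure,
    last_recovered_commit=failure - timedelta(seconds=4),
    outage_start=failure,
    service_restored=failure + timedelta(minutes=3),
)
print(rpo, rto)
```

With a zero-data-loss goal, the hypothetical outage above would miss the RPO target even though it meets a five-minute RTO target, which is the kind of gap that monitoring against these metrics is meant to reveal.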

The business goals for RTO and RPO should be key drivers in selecting a SQL Server technology for

your high-availability and disaster-recovery solution.

Table 2 offers a rough comparison of the type of results that the different SQL Server HADR solutions may achieve.


Table 2. Comparison of SQL Server HADR solutions

SQL Server HADR Solution | Potential Data Loss (RPO) | Potential Recovery Time (RTO) | Automatic Failover | Readable Secondaries [1]

AlwaysOn Availability Group, synchronous-commit | Zero | Seconds | Yes [2] | 0–2

AlwaysOn Availability Group, asynchronous-commit | Seconds | Minutes | No | 0–4

AlwaysOn Failover Cluster Instance | NA [3] | Seconds to minutes | Yes | NA

Database Mirroring [4], High-safety (sync + witness) | Zero | Seconds | Yes | NA

Database Mirroring [4], High-performance (async) | Seconds [5] | Minutes [5] | No | NA

Log Shipping | Minutes [5] | Minutes to hours [5] | No | Not during a restore

Backup, Copy, Restore [6] | Hours [5] | Hours to days [5] | No | Not during a restore

[1] An AlwaysOn Availability Group can have no more than a total of four secondary replicas, regardless of type.

[2] Automatic failover of an Availability Group is not supported to or from a failover cluster instance.

[3] The FCI itself does not provide data protection; data loss is dependent upon the storage system implementation.

[4] This feature will be removed in future versions of Microsoft SQL Server. Use AlwaysOn Availability Groups instead.

[5] This is highly dependent upon the workload, data volume, and failover procedures.

[6] Backup, Copy, Restore is appropriate for disaster recovery, but not for high availability.

Fault Tolerance and Disaster Recovery Architecture

SQL Server AlwaysOn solutions help provide fault tolerance and disaster recovery across several logical and physical layers of infrastructure and application components. Historically, it has been common practice to separate duties and responsibilities for the various audiences and roles involved, so that each was predominately concerned with only a portion of those solution layers.

This section describes each of those layers and offers guidance for your design discussions and implementation decisions.

A successful SQL Server AlwaysOn solution requires understanding and collaboration across these solution layers:


Infrastructure level

Server-level fault-tolerance and inter-node network communication use Windows Server

Failover Clustering (WSFC) features for health monitoring and failover coordination.

SQL Server instance level

A SQL Server AlwaysOn Failover Cluster Instance (FCI) is a SQL Server instance that is

installed across and can fail over to server nodes in a WSFC cluster. The nodes that host the

FCI are attached to robust symmetric shared storage (SAN or SMB).

Database level

An availability group is a set of user databases that fail over together. An availability group

consists of a primary replica and one to four secondary replicas. Each replica is hosted by an

instance of SQL Server (FCI or non-FCI) on a different node of the WSFC cluster.

Client connectivity

Database client applications can connect directly to a SQL Server instance network name, or

they may connect to a virtual network name (VNN) that is bound to an availability group

listener. The VNN abstracts the WSFC cluster and Availability Group topology, logically

redirecting connection requests to the appropriate SQL Server instance and database

replica.

Figure 2 shows a logical topology of a representative AlwaysOn solution.

Figure 2. Logical representation of an AlwaysOn solution


High Availability and Disaster Recovery Solution

The recommended HADR solution for T24 deployments was designed based on the following:

Incurring zero data loss when failing over to the disaster recovery site, assuming that there

is a compatible network connection between the sites that is capable of synchronous data

replication.

Reducing the cost of Microsoft software licenses and specialized hardware (such as load

balancers) to minimize the total cost of ownership.

Maximizing use of any Windows Server 2008 R2 features and capabilities that complement

T24.

The following decisions were made in the solution design. Refer to Figure 3 for further information.

The disaster recovery site used for testing had only one server for each tier. If the disaster

recovery site also requires high availability, the configuration used in the primary site

should be used for the disaster recovery site.

The Windows Server 2008 R2 NLB feature is used to load balance the traffic into the JBoss

application servers in the primary site. The same feature can be used for the disaster

recovery site if there will be two or more disaster recovery nodes.

A DNS host record was created for the web-tier NLB IP to make the failover to the disaster

recovery site transparent to the users (for example, T24Browser.CoE.Temenos.com).

T24Browser is a stateful application that normally deploys with a sticky-session

configuration. Although this configuration provides the required functionality, it reduces the

scalability of the T24 web tier. The user might lose the session if an application server goes

down, reducing the availability. The solution presented in this white paper eliminates these

limitations by removing sticky sessions. This is achieved by persisting the JBoss session state

in the SQL Server database and configuring NLB to “Affinity: None”.

Using NLB and a DNS host record and avoiding the use of sticky sessions lets you add or

remove web-tier servers transparently, without affecting users.

T24Browser is capable of performing simple load balancing among the available T24

application servers when a load balancing solution is not available in the application tier.

This feature is disabled in the recommended solution, and NLB is used instead with the

“Affinity: None” configuration to achieve the best possible load balancing.

A DNS host record was created for the application-tier NLB IP so that you have the option of

failing over only the application tier to the disaster recovery site if necessary (for example,

T24Server.CoE.Temenos.com).

This is an optional configuration that is only required if a facility needs to simplify server

maintenance and keep the T24Browser configurations identical in both sites. However, this

option does create an additional step in the disaster recovery procedures.


Using the NLB “Affinity: None” configuration makes it possible to add or remove

application-tier servers transparently, without affecting online transactions.

The SQL Server 2012 HADR AlwaysOn (HADRON) configuration with a SQL Server 2012

Failover Cluster instance for the primary site is used to reduce the number of required

SQL Server 2012 licenses.

The primary site can have two standalone instances of SQL Server 2012 instead of the

failover cluster instance if you need to remove the shared storage; however, this will

require licenses for each SQL Server 2012 instance, while the failover cluster instance

requires only one license regardless of the number of nodes in the cluster.

The disaster recovery instance of SQL Server 2012 is configured as a SQL Server 2012

HADRON synchronous AlwaysOn replica for zero data loss.

Synchronous replication requires a fast and stable network connection in order to work as

expected. This needs to be taken into account when setting up the network. If you do not

have a fast and stable network connection, implement asynchronous replication instead,

but understand that asynchronous replication does have a possibility of data loss.

The same Windows Server Failover Cluster that hosts the SQL Server 2012 clustered

instance is used to host a clustered file share to keep T24 shared files and folders. The

clustered file share increases the availability of the T24 shared files and folders.

The disaster recovery site has a local folder for T24 shared files/folders. Windows Server

2008 R2 Distributed File System Replication (DFS-R) is implemented with an Active Directory

Domain Services (AD DS)–published namespace to make the file share failover to the

disaster recovery site transparent and to replicate T24 shared files/folders.

Making the T24 shared files available in the disaster recovery site is not mandatory because

T24 can recover without them. However, having the T24 shared files available has a positive

impact. DFS-R is therefore scheduled to run several times per day, rather than continuously, to reduce the

overhead of the replication.

T24 typically accesses shared files and folders via a mapped drive letter in each T24 server.

Since accidentally removing or changing the mapped drive letter can cause failures, file and

folder symbolic links were created by using the “mklink” utility of Windows and used

instead of the mapped drive letters to avoid unintended mistakes. Symbolic links make the

shared files and folders imitate local entities, and therefore T24 can access them directly.

A JBoss session persistence database was created in the same SQL Server 2012 HADRON

configuration as the T24 database, therefore having the same high availability and disaster

recovery capabilities. This makes management easier and reduces the steps in disaster-recovery

procedures. You can, however, implement the JBoss session persistence database

in a different instance, if required.
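
The symbolic-link approach described above for the T24 shared files can be sketched as follows. This is only an illustration: the link path and the share path are hypothetical, and the white paper does not state which mklink switches were used in the test environment. The sketch assumes a directory link (/D).

```python
import subprocess


def mklink_command(link_path, target_path):
    """Build the `mklink /D` command line for a directory symbolic link.

    link_path is the local path that T24 will open; target_path is the
    shared folder it should resolve to. Both values are caller-supplied.
    """
    return f'mklink /D "{link_path}" "{target_path}"'


def create_link(link_path, target_path):
    """Run the command. Windows only, and it needs an elevated prompt.

    mklink is a cmd.exe built-in rather than an executable, so it has to
    be run through the shell.
    """
    subprocess.run(mklink_command(link_path, target_path), shell=True, check=True)


# Hypothetical paths, for illustration only.
cmd = mklink_command(r"C:\T24\shared", r"\\T24FS\T24Share")
print(cmd)
```

Because the link is a file system object rather than a per-user drive mapping, it does not depend on a drive letter that could be remapped or removed by accident.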


Figure 3. HADR solution


Setup and Configuration

This section describes how to configure the HADR solution.

SQL Server 2012 HADR Configuration

SQL Server 2012 HADR is configured with a clustered instance for the primary site and a standalone

instance in the disaster recovery site. The configuration uses the AlwaysOn Availability Group to

replicate database content and to provide transparent failover. The disaster recovery instance is

configured as a synchronous replica for zero data loss. Figure 4 shows a schematic of the solution.

Figure 4. SQL Server 2012 HADR solution
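
As a rough illustration of the synchronous replica setting, the sketch below builds the Transact-SQL statement that switches a replica's availability mode. It assumes the availability group name T24AG and the instance name SQL11DR from Table 3; the host name DRNODE1 is hypothetical, and the white paper does not say whether the setting was applied through Transact-SQL or through SQL Server Management Studio.

```python
def set_replica_mode_sql(group, replica, synchronous=True):
    """Build an ALTER AVAILABILITY GROUP statement for one replica.

    group   -- availability group name
    replica -- server instance that hosts the replica, e.g. HOST\\INSTANCE
    The statement is meant to be run on the primary replica.
    """
    mode = "SYNCHRONOUS_COMMIT" if synchronous else "ASYNCHRONOUS_COMMIT"
    group_ident = group.replace("]", "]]")        # escape for [ ] quoting
    replica_literal = replica.replace("'", "''")  # escape for N'...' literal
    return (
        f"ALTER AVAILABILITY GROUP [{group_ident}] "
        f"MODIFY REPLICA ON N'{replica_literal}' "
        f"WITH (AVAILABILITY_MODE = {mode});"
    )


# DRNODE1 is a hypothetical host name for the disaster recovery node.
stmt = set_replica_mode_sql("T24AG", r"DRNODE1\SQL11DR")
print(stmt)
```

Passing synchronous=False produces the asynchronous-commit variant, which is the fallback when the link between the sites cannot sustain synchronous replication.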

The Windows Server Failover Cluster consists of three nodes: two nodes in the

primary site and one node in the disaster recovery site, with a SAN shared only between the two

nodes in the primary site. The disaster recovery instance has only local storage where the database

content is replicated by using the availability group.

The cost of the solution is reduced because there is no shared storage between nodes in the

primary site and the node in the disaster recovery site, because there is no SAN in the secondary

site, and because you do not need an expensive storage-level synchronization mechanism to

replicate disk data content.

A clustered SQL Server 2012 instance is primarily used to reduce the number of SQL Server 2012

licenses that are required. The primary site could have two standalone instances of SQL Server

2012 instead of the failover cluster instance if this is required to remove the shared storage;

however, this option requires licenses for each SQL Server 2012 instance, while the failover

cluster instance requires only one license regardless of the number of nodes in the cluster.


If the disaster recovery site also requires high availability, the same configuration used in the

primary site needs to be available in the disaster recovery site.

When the recommended solution was tested, all of the SQL Server instances were created as

named instances to make them easy to identify during maintenance and monitoring. Table 3 lists

the names that were used in the test environment during setup; these names can be used as a

reference guideline.

Table 3. Names of SQL Server instances

Name Description

SQL11HA SQL Server 2012 instance name of the primary site.

Since the named instance uses a dynamic TCP port, static TCP

port 1533 was configured via the SQL Server Configuration

Manager.

SQL11DR SQL Server 2012 instance name of the disaster recovery site.

Since the named instance uses a dynamic TCP port, static TCP

port 1533 was configured via the SQL Server Configuration

Manager.

T24AG SQL Server 2012 AlwaysOn Availability Group name. This name is

not used by T24, and is used in SQL Server Management Studio

when required to fail over to the disaster recovery instance.

The JBoss session persistence database was added to the same

availability group in the test environment. This makes

management easier, and disaster recovery failover becomes a

single process for both the databases.

T24AgListener SQL Server 2012 AlwaysOn Availability Group listener name. This

is the name T24 uses to connect to the SQL Server 2012 HADRON

instance.

When creating the listener, 1433 (the SQL Server default port)

was used as the TCP port number to avoid having to change the

T24 connection parameters to use a different port number.

Windows Server Firewall Configurations

The Windows Server Firewall is on by default; therefore, you need to create relevant inbound

firewall exceptions in the servers for the configuration to work as expected.

Table 4 shows the inbound firewall rules that need to be created in all the database servers.


Table 4. Firewall rules

Name Description

SQL11 (1533) Inbound firewall exception rule for TCP port 1533, which is the

static port configured for the SQL Server instance.

SQL11 Browser (1434) Inbound firewall exception rule for UDP port 1434, which is

required for the SQL Server Browser when named instances

exist.

SQL11 AG (5022) Inbound firewall exception rule for TCP port 5022, which is

required for the SQL Server 2012 HADRON Availability Group.

SQL11 AG Listener (1433) Inbound firewall exception rule for TCP port 1433, which is

configured for the SQL Server 2012 HADRON Availability Group

Listener.
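The rules in Table 4 can also be created from the command line instead of the firewall console. The following Python sketch only builds the corresponding `netsh advfirewall firewall add rule` command strings and does not run them. The rule names and ports come from Table 4; the helper name `netsh_command` is illustrative and not part of the tested solution.

```python
# Inbound rules from Table 4: (rule name, protocol, local port).
RULES = [
    ("SQL11 (1533)", "TCP", 1533),
    ("SQL11 Browser (1434)", "UDP", 1434),
    ("SQL11 AG (5022)", "TCP", 5022),
    ("SQL11 AG Listener (1433)", "TCP", 1433),
]


def netsh_command(name, protocol, port):
    """Build one netsh command that allows inbound traffic on a port."""
    return (
        "netsh advfirewall firewall add rule "
        f'name="{name}" dir=in action=allow '
        f"protocol={protocol} localport={port}"
    )


# One command per rule; review them before running on each database server.
commands = [netsh_command(*rule) for rule in RULES]

if __name__ == "__main__":
    for command in commands:
        print(command)
```

The commands would need to be run from an elevated prompt on every database server in both sites.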

T24 File Share Configuration

In the multi-server configuration, T24 is required to have a shared location for its working files and

folders. Any single file is created or written by only one T24 instance and is read by all instances.

There is no concern about file write locks; however, the share needs to be resilient for the multi-

server configuration to function properly.

If T24 fails over to the disaster recovery site, making the T24 shared files available in the

disaster recovery site is not mandatory because T24 can recover without them. However,

having the shared files available does have a positive impact.

A resilient file share solution with less frequent (once or twice a day) file replication to the disaster

recovery site is therefore a good solution. Windows Server Clustered File Server, in conjunction with

DFS-R, provides an optimal solution and does not require any additional licenses.

For simplicity, a DFS Namespace published in Active Directory Domain Services (AD DS) is used to refer to the shared file folder. Therefore, T24 can refer to the same path (namespace) for shared files, whether it is in the primary site or in the disaster recovery site.

Figure 5 shows the T24 file share and file replication configuration.


Figure 5. File share and file replication

Windows Server Clustered File Share Configuration

The recommended SQL Server 2012 HADR configuration uses a Windows Server Cluster. Using the

same cluster to host the file server reduces the complexity of the solution and simplifies

management and monitoring.

Since only the primary site servers in the cluster have access to the shared storage, the only

possible owners of the file server are the servers in the primary site. The file server, therefore,

does not fail over to the disaster recovery site, and the disaster recovery instance of T24 will

only have access to its local folders.

A shared folder called “T24FileShare” was created in the file server and used as the resilient file

share location of the primary site.


If the disaster recovery site also uses a T24 multi-server configuration, the same type of file

share needs to be created in the disaster recovery site. However, because the test environment

had only a single T24 instance, a local folder was created with the same shared folder name.

Distributed File System Replication Configuration

DFS-R was used to periodically replicate T24 shared files between the primary site and the disaster

recovery site. The replication frequency was set to the lowest possible (once or twice per day) to

avoid any performance implications, and because having the shared files available in the disaster

recovery site is not mandatory to T24.

The disaster recovery site of the test environment had a single instance of T24; therefore, the folder

for the shared files was created locally in the same server. The DFS replication was set up to

replicate the files between the clustered file share in the primary site and the local folder in the T24

disaster recovery instance.

Active Directory Domain Services DNS Configuration

To make the web-tier failover transparent to the users, you must have a DNS host record that users can refer to in order to reach T24Browser instead of the load balancer IP. Failover to disaster

recovery will therefore only require changing the IP address of the DNS host record, and users do

not need to use a different URL. In the test environment, the DNS host record

“T24Browser.CoE.Temenos.com” was created for the web-tier Network Load Balancing IP.

You can also create a DNS host record for the application-tier servers if you need to be able to transparently fail over the application tier independently of the web tier.

Note that this is an optional configuration that is helpful if you need to ease server maintenance

and keep the T24Browser configurations identical in both sites. However, this configuration

does add a step to the disaster recovery procedures.

The DNS host record “T24Server.CoE.Temenos.com” was created for the application-tier NLB IP in

the test environment.

One drawback of using DNS host records is that the client application using the name caches the IP

address. Therefore, even if the IP address of the DNS host record is changed at the server-side in a

disaster recovery failover, the client application might still use the old IP address, and this old IP

address might no longer be available.

To minimize the chance of this happening, the "time to live" (TTL) value of the DNS host record needs to be reduced. In the test environment, the TTL value was set to one minute, which means that the client application verified the DNS host record IP address with the server every minute.


While shorter TTL values can increase the load on the DNS server, they can be useful with

critical services like web servers, application servers, and load balancers. TTL values are often

reduced by the DNS administrator before service is moved to minimize disruptions.
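After a DNS host record is changed, it can be useful to confirm from a client machine that the name already resolves to the new address. The following Python sketch is one possible check; the helper name `resolves_to` and the IP address in the usage note are hypothetical. It relies on the operating system resolver, so a cached old address is reported in the same way a client application would see it.

```python
import socket


def resolves_to(host, expected_ip, resolver=socket.gethostbyname):
    """Return True when 'host' currently resolves to 'expected_ip'.

    'resolver' can be replaced for testing; by default the operating
    system resolver (including its cache) is used.
    """
    try:
        return resolver(host) == expected_ip
    except OSError:
        # Name resolution failed (socket.gaierror is a subclass of OSError).
        return False
```

For example, `resolves_to("T24Browser.CoE.Temenos.com", "192.0.2.20")` could be polled until it returns True, where 192.0.2.20 stands in for the disaster recovery web-tier address.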

Table 5 shows the DNS host records that were created in the test environment.

Table 5. DNS host records

DNS Host Record Description

T24Browser.CoE.Temenos.com The Domain Name System (DNS) host record of the T24

web-tier load balancer that was used in the web

browser URL to connect to T24Browser.

The TTL value was set to one minute for testing.

T24Server.CoE.Temenos.com An optional DNS host record created for the T24

application-tier load balancer to test transparent

failover of the application tier independently of the web

tier.

This was used by the T24Browser (configured in t24-

ds.xml) to connect to the load balancer in the test

environment.

The TTL value was set to one minute for testing.

Application-Tier NLB Configuration

T24Browser is capable of performing simple load balancing among the available T24 application

servers when a load balancing solution is not available in the application tier. However, specialized

load balancing solutions can provide better load balancing capabilities.

The NLB feature in Windows Server is a software load balancing solution that does not require

additional licenses and complements T24 by providing a specialized load balancing solution.

In the recommended solution, the NLB feature in Windows Server is enabled and configured on the T24 application servers in the primary site, creating an NLB cluster that consists of the two servers.

Figure 6 shows the application-tier NLB cluster.


Figure 6. Application-tier NLB cluster

If the disaster recovery site has multiple T24 application servers, an NLB cluster needs to be

configured in those servers as well.

Table 6 shows the NLB configurations used.

Table 6. NLB configurations

Configuration Description

Cluster operation mode “Multicast” operation mode was used to keep the network

adapter’s built-in media access control (MAC) address. This

was because the test servers had only one network adapter,

and this network adapter had to be used for server

management as well.

If the server has multiple network adapters, the cluster

operation mode can be set to “Unicast.”

Protocol The protocol used for communication with T24 was TCP/IP.

Port range The port range was limited to 20002, which is the T24 agent

port configuration.

Filtering mode “Affinity: None” was selected to achieve best possible load

balancing.

The simple load balancing feature of T24Browser was disabled, and the NLB cluster name (T24Server.CoE.Temenos.com) was used as the T24 instance. This lets Network Load Balancing route the connections to the T24 instances in the cluster.


Using NLB with the “Affinity: None” configuration lets you add or remove application-tier

servers transparently, without affecting online transactions.

T24 Application Server Configuration

The T24 application tier is configured with two T24 instances (nodes: App Node 1 and App Node 2) in

the primary site and a single instance (node: App Node 3) in the disaster recovery site.

Note that it is possible to have multiple T24 instances (application server nodes) in the disaster

recovery site if high availability is a requirement for the disaster recovery site.

The Windows Server 2008 R2 NLB feature was used to balance the T24 application servers.

The HADR solution for the T24 file share is implemented by using a Windows Server 2008 R2

clustered file share and DFS-R.

Figure 7 shows the T24 application tier configuration.

Figure 7. T24 application tier

The Temenos Application Framework ‘C’ (TAFC) is the execution environment for the T24

application. Install TAFC and the T24 application on all application servers (for installation guidance,

contact Temenos).


Following is a description of how the T24 application servers were configured:

All the T24 instances in the test environment used multiple server configurations with the

required licenses. To use one instance of T24 on multiple servers, install the multiple

application server module.

When using multiple application servers, define port ranges for each T24 application server

to avoid conflicts or deadlock situations during close of business.

Ports can be assigned by using the following variable in each application server:

JBCPORTNO= port range

The same jbase_agent port must be used on all T24 application servers. The default

jbase_agent port 20002 was used in the test environment.

The same port must be used because requests to the T24 servers are controlled by the

load balancer, and therefore T24Browser sees only a single instance of T24 (load

balancing cluster name), regardless of the number of T24 applications servers available.

An inbound Windows firewall exception rule for TCP port 20002 was created to make the

jbase_agent port accessible from T24Browser.

The T24 database driver (Direct Connect Driver [DCD]) requires the SQL Server client to be

installed on the server. At the time of testing, the DCD for the SQL Server 2012 Native Client

was still in development. For this reason, the SQL Server 2008 R2 Native Client was used.

Because the SQL Server 2012 HADR configuration is used for the database tier, the T24

database must be accessed via the SQL Server 2012 AlwaysOn Availability Group. Therefore,

the availability group listener name was used in the T24 configuration instead of the

database server IP address.

File jedi_config , Record 'XMLMSSQL_FRMWRK'

Command->

0001 R12.100203 Direct connect driver version.

0002 T24AgListener]T24R12 DB Server name] DB name

0003 T24User]uHdE9oJj8B5Y0cUF0hGh0A==] DB User/Password encrypted

Default database locking (SQL Server application lock) was used for the testing.
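The port-range guidance above (a separate JBCPORTNO range for each T24 application server) can be checked before deployment. The following Python sketch is a minimal example that assumes each range is written as "start-end"; that format, the server names, and the port numbers used in the usage note are assumptions made for illustration and are not taken from the tested configuration.

```python
def parse_range(value):
    """Parse 'start-end' into an inclusive (start, end) tuple."""
    start, end = (int(part) for part in value.split("-"))
    if start > end:
        raise ValueError(f"invalid port range: {value}")
    return start, end


def overlapping_servers(ranges_by_server):
    """Return pairs of server names whose port ranges overlap."""
    # Sort by range start so that only later entries need to be compared.
    parsed = sorted(
        (parse_range(value), name) for name, value in ranges_by_server.items()
    )
    conflicts = []
    for i, ((_, end_a), name_a) in enumerate(parsed):
        for (start_b, _), name_b in parsed[i + 1:]:
            if start_b <= end_a:
                conflicts.append((name_a, name_b))
    return conflicts
```

Calling `overlapping_servers({"AppNode1": "1000-1999", "AppNode2": "2000-2999"})` returns an empty list, which indicates that the two hypothetical ranges do not conflict.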


Limitations of Using the SQL 2008 R2 Native Client

During the testing, the SQL Server 2008 R2 Native Client was used with the T24 database driver

because the DCD did not support SQL Server 2012 client libraries.

The following limitations therefore apply to the SQL Server 2012 HADR AlwaysOn functionalities:

Read-only routing for the availability group is not available.

Application intent is not available.

Optimizations for fast multi-subnet failover clustering are not available.

When the SQL Server 2012 Native Client is certified for use with T24, the considerations for client

availability features shown in Table 7 will apply.

Table 7. Client type considerations

Columns: (1) Multi-subnet failover; (2) Application intent; (3) Read-only routing; (4) Multi-subnet failover: faster single subnet endpoint failover; (5) Multi-subnet failover: named instance resolution for SQL Server clustered instances.

Driver | (1) | (2) | (3) | (4) | (5)

SQL Server Native Client 11.0 ODBC | Yes | Yes | Yes | Yes | Yes

SQL Server Native Client 11.0 OLE DB | No | Yes | Yes | No | No

ADO.NET with Microsoft .NET Framework 4.0 update 4.0.2* | Yes | Yes | Yes | Future date | Future date

ADO.NET with .NET Framework 3.5 | Future date | Future date | Future date | Future date | Future date

Microsoft Java Database Connectivity (JDBC) driver 4.0 for SQL Server | Yes | Yes | Yes | Yes | Future date

*ADO.NET with .NET Framework 4.0.2 patch download for connectivity improvement

(http://support.microsoft.com/kb/2544514).

For more information about connection string keywords, see:

Using Connection String Keywords with SQL Server Native Client (http://msdn.microsoft.com/en-us/library/ms130822(v=sql.110).aspx).
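As an illustration of the client features listed in Table 7, the following Python sketch assembles an ODBC connection string for SQL Server Native Client 11.0 that targets the availability group listener. It applies to a generic ODBC client, not to the T24 DCD, which is configured through jedi_config. The listener name, port, and database name come from the test environment; the use of Windows authentication (`Trusted_Connection`) and the helper name are assumptions.

```python
def build_connection_string(server, database, port=1433,
                            multi_subnet_failover=False, read_only=False):
    """Assemble an ODBC connection string for SQL Server Native Client 11.0."""
    parts = [
        "Driver={SQL Server Native Client 11.0}",
        f"Server=tcp:{server},{port}",
        f"Database={database}",
        "Trusted_Connection=Yes",  # assumption: Windows authentication
    ]
    if multi_subnet_failover:
        # Enables faster reconnection to the listener after a failover.
        parts.append("MultiSubnetFailover=Yes")
    if read_only:
        # Declares read intent so that read-only routing can apply.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(build_connection_string("T24AgListener", "T24R12",
                                  multi_subnet_failover=True))
```

The two optional keywords are the ones that are unavailable when the SQL Server 2008 R2 Native Client is used.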



T24 Shared Files

For T24 multiple server installation, it is necessary to share certain files and folders among T24

application servers.

T24 typically accesses shared files and folders via a mapped drive letter in each T24 server.

However, accidentally removing or changing the mapped drive letter can cause failures. Therefore,

file and folder symbolic links were created by using the Windows “mklink” utility instead of mapped

drive letters to avoid unintended mistakes. Symbolic links make the shared files and folders act as

local entities, so T24 can directly access them. If there are additional folders/files that need to be

shared, appropriate symbolic links should be created.
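The symbolic links described above can also be created from a script. The following Python sketch uses `os.symlink`, which corresponds to `mklink /D` for directories; the helper name and the paths in the usage note are hypothetical. On Windows, creating symbolic links typically requires an elevated prompt or the appropriate privilege.

```python
import os


def link_shared_folder(share_path, local_link):
    """Create a directory symbolic link at 'local_link' pointing to 'share_path'.

    Returns False without changing anything if the link already exists.
    """
    if os.path.islink(local_link):
        return False
    os.symlink(share_path, local_link, target_is_directory=True)
    return True
```

A call such as `link_shared_folder(r"\\CoE.Temenos.com\T24\T24FileShare", r"C:\T24\shared")` would make the share appear as a local folder; both paths are placeholders for the actual DFS namespace path and T24 directory.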

Web-Tier NLB Configuration

When the web tier has multiple servers (nodes), there needs to be a mechanism to route the

requests to the servers and to provide a single address to the requester (web browser), regardless

of the number of servers in the tier. This functionality is typically provided by a proxy server and/or load balancer with redundancy to increase the availability of the service.

The Network Load Balancing (NLB) feature of Windows Server does not have a single point of

failure because the service works on the network layer of all the servers. Because it is a readily

available feature in Windows Server, the NLB feature does not require additional licenses.

The NLB feature is enabled and configured on the web servers in the primary site, creating an NLB cluster that consists of the two servers.

Figure 8 shows the web-tier NLB cluster.

Figure 8. Web-tier NLB cluster


If the disaster recovery site has multiple web-tier servers, an NLB cluster needs to be configured

in those servers as well.

Table 8 shows the NLB configurations used.

Table 8. NLB configurations

Configuration Description

Cluster operation mode “Multicast” operation mode was used to keep the network

adapter’s built-in media access control (MAC) address. This

was because the test servers had only one network adapter,

and this network adapter had to be used for server

management as well.

If the server has multiple network adapters, the cluster

operation mode can be set to “Unicast.”

Protocol TCP was used as the HTTP traffic transport over TCP/IP.

Port range The port range was limited to 8080, which was the JBoss web

site port range configured in the test environment.

Filtering mode “Affinity: None” was selected to achieve best possible load

balancing.

Typically, the T24Browser requires “Affinity: Single” (sticky-

session) configuration because it is a stateful application.

However, in the recommended solution, JBoss is configured

to persist session states in the SQL Server database;

therefore, it is possible to use the “Affinity: None”

configuration in the load balancer.

To make it possible to fail over the web tier to the disaster recovery site transparently, the DNS

host record (T24Browser.CoE.Temenos.com) is used for the NLB cluster IP address. Therefore, the

web browser URL remains unchanged, even if there is a failover to the disaster recovery site.

Not using sticky-sessions increases the availability of the site; in addition, using NLB with the

DNS host record allows for adding or removing web-tier servers transparently and without

affecting the users.


T24Browser Configuration

The T24 web tier is configured with two JBoss instances with T24Browser (nodes: Web Node 1 and

Web Node 2) in the primary site and a single instance (node: Web Node 3) in the disaster recovery

site.

It is possible to have multiple JBoss/T24Browser instances (web server nodes) in the disaster

recovery site if high availability is a requirement for the disaster recovery site.

The Windows Server 2008 R2 NLB feature was used to balance the loads on the JBoss server nodes.

Figure 9 shows the T24 web tier.

Figure 9. T24 web tier

JBoss Configuration

JBoss Application Server 5.1.0 GA was used in the test environment to host the T24Browser Java Servlet application. No clustered instance of JBoss was installed on the web-tier servers.

Following is the list of configurations that were made after successfully installing JBoss:

Because of the limitations of JBoss cluster session replication and to avoid using sticky

sessions, JBoss session persistence functionality was implemented using a SQL Server

database. A JBoss session persistence database was created in the same SQL Server 2012

HADR configuration as the T24 database. Therefore, the JBoss session persistence database

has the same high availability and disaster recovery capabilities as the T24 database. This

makes management easier and reduces the number of steps in the disaster recovery

procedures. (Note that the JBoss session persistence database can be implemented as a

different instance if required.)


An inbound Windows firewall exception rule for TCP port 8080 was created to make JBoss

accessible to users.

T24Browser with AGENT Connection Method

After successful installation of the JBoss application server, T24Browser can be deployed and

configured to use one of the two types of supported configurations, AGENT or JMS. Detailed step-

by-step setup and configuration can be requested from Temenos.

For the online transactions used in this testing, the AGENT configuration is recommended. Tables 9

and 10 show the settings that were configured in the T24Browser.

Table 9. Settings in browserParameters.xml

Parameter name Description

Server Connection Method Configuration of the connection to the T24 server.

AGENT connection method was used for the testing.

ConnectionTimeout The connection expiration time if T24Browser does not get a response from the

The connection expiration time if T24Browser does not get a response from the

T24 application server.

This was set to 20 seconds.

RetryCount The number of retry attempts the T24Browser should make if it can’t reach T24

to successfully execute a transaction.

This was set to 20 times.

RetryWait When retrying, the number of seconds to wait before attempting to retry the

transaction.

This was set to 5 seconds.
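The retry settings in Table 9 determine how long T24Browser keeps trying during an application-tier outage. The following Python sketch works through the arithmetic for the tested values. How T24Browser combines the timeout with the retry wait is not described in this paper, so the second function is an upper bound under the stated assumption that every attempt also runs into the connection timeout.

```python
# Values from Table 9 (test environment).
CONNECTION_TIMEOUT_S = 20  # ConnectionTimeout
RETRY_COUNT = 20           # RetryCount
RETRY_WAIT_S = 5           # RetryWait


def retry_wait_total(retry_count, retry_wait_s):
    """Total time spent waiting between retries, in seconds."""
    return retry_count * retry_wait_s


def worst_case_total(retry_count, retry_wait_s, timeout_s):
    """Upper bound, assuming every attempt also hits the connection timeout."""
    return retry_count * (retry_wait_s + timeout_s)


if __name__ == "__main__":
    print(retry_wait_total(RETRY_COUNT, RETRY_WAIT_S))
    print(worst_case_total(RETRY_COUNT, RETRY_WAIT_S, CONNECTION_TIMEOUT_S))
```

With the tested values, the waits between retries add up to 100 seconds, and the assumed upper bound is 500 seconds. These figures can be compared with the expected failover time when tuning the parameters.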


Table 10. Settings in t24-ds.xml

Property name Description

Host A comma-separated list of available T24 servers.

Because the NLB feature in Windows Server 2008 R2 is configured at the

application tier, the name of the load balancing cluster needs to be used instead

of the names of the T24 servers.

The load balancing cluster “T24Server.CoE.Temenos.com” was used in the test

environment.

Ports The jbase_agent TCP port number.

All T24 instances in the test environment are configured to use TCP port 20002;

therefore, 20002 is used as the jbase_agent port number.

loadBalancing To enable or disable the simple load balancing feature in T24Browser.

This is set to “false” because the NLB feature in Windows Server 2008 R2

performs the load balancing in the recommended solution.

actionTimeout The number of seconds that the jbase_agent waits for a response from T24

application server.

This was set to 60 seconds in the test environment.

Disaster Recovery Procedures

The high availability solution described in this document implements “automatic failover” between

the primary site servers (nodes). Human intervention is therefore not required. However, the

disaster recovery failover is intentionally designed to be manual, because this is typically part of the

business continuity plan. Therefore, the disaster recovery failover might require additional

procedures to be followed.

This section describes the disaster recovery procedures that were successfully tested for the

recommended solution.

Figure 10 shows the three failover activities that are required. Note that the second failover activity

is optional, and can be used if application-tier failover is implemented to ease maintenance

activities. In addition, if the optional DFS-R is implemented, the “DFS Namespace” fails over

automatically and manual failover is not required.


Figure 10. Failover to disaster recovery site

The steps required for the failover activities are described in detail in the sections that follow. Note

that the steps in all sections need to be completed to successfully fail over to the disaster recovery

site.


DNS Switching

Web-tier and application-tier DNS switching require changing the IP address of the DNS host

records to the IP address of the relevant server (node) in the disaster recovery site.

Following are the steps that need to be followed to change the IP addresses of the DNS host

records:

1. Log on to the domain controller as the administrator.

2. Navigate to Server Manager.

3. Expand Roles, expand DNS Server, expand DNS, expand Server Name, and then expand

Forward Lookup Zones.

4. Select the domain name (for example, CoE.Temenos.com). Note that T24Browser and

optional T24Server are the DNS host records that require the IP changes (Figure 11).

Figure 11. Select the DNS host record


5. Right-click the DNS host record T24Browser, and then select Properties (Figure 12).

Figure 12. T24Browser DNS host record properties

6. Change the address in the IP address field to the IP address of the web-tier server in the

disaster recovery site, and then click OK.

If the disaster recovery site has more than one web-tier server, the IP address should be the IP address of the web-tier load balancer (NLB cluster).


7. If the “T24Server” DNS host record is also available, right-click the DNS host record, and

then select Properties. Change the address in the IP address field to the IP address of the

application-tier server in the disaster recovery site, and then click OK (Figure 13).

Figure 13. T24Server DNS host record properties

If the disaster recovery site has more than one application-tier server, the IP address

should be the IP address of the application-tier load balancer (NLB cluster).
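The DNS switching steps above use the DNS console. As a possible scripted alternative, the following Python sketch builds `dnscmd` commands that delete the existing host (A) record and add it again with the disaster recovery address. It only produces the command strings. The DNS server name and IP addresses are placeholders, and the exact `dnscmd` syntax should be verified against the Windows Server version in use before it is adopted in a recovery runbook.

```python
def dns_switch_commands(dns_server, zone, host, old_ip, new_ip, ttl_seconds=60):
    """Build dnscmd commands that repoint a host (A) record to a new IP."""
    return [
        # Remove the record that points to the primary site.
        f"dnscmd {dns_server} /RecordDelete {zone} {host} A {old_ip} /f",
        # Add the record for the disaster recovery site with a short TTL.
        f"dnscmd {dns_server} /RecordAdd {zone} {host} {ttl_seconds} A {new_ip}",
    ]


if __name__ == "__main__":
    # Placeholder server name and documentation-range IP addresses.
    for command in dns_switch_commands(
        "DC01", "CoE.Temenos.com", "T24Browser", "192.0.2.10", "192.0.2.20"
    ):
        print(command)
```

The same helper can be called with "T24Server" as the host if the optional application-tier record is used.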

SQL Server 2012 HADR Failover

The SQL Server 2012 HADR failover to the disaster recovery site might be required for the following

two scenarios:

Planned manual failover

The primary site database servers are available, but a failover to the disaster recovery site is required.

Unplanned forced failover

The complete primary site or the primary site database servers have failed, and the database servers in the primary site are not accessible.


Planned Manual Failover

When the failover is planned, there is no server downtime in the primary site, the Windows Server

Failover Cluster (WSFC) is active, and databases are in “Synchronized” state in both primary and

disaster recovery instances of SQL Server.

Therefore, before starting the failover procedure, make sure that the databases are in

“Synchronized” state in both primary and disaster recovery instances of SQL Server (Figure 14 and

Figure 15).

Figure 14. SQL Server primary instance database status


Figure 15. SQL Server disaster recovery instance database status

For more information about planned manual failover, see:

Perform a Planned Manual Failover of an Availability Group (SQL Server)

(http://msdn.microsoft.com/en-us/library/hh231018.aspx).

Limitations and Restrictions

A failover command returns as soon as the target secondary replica has accepted the

command. However, database recovery occurs asynchronously after the availability group

has finished failing over.

Cross-database consistency across databases within the availability group is not maintained

during failover.

Cross-database transactions and distributed transactions are not supported by

AlwaysOn Availability Groups.

For more information, see:

Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn

Availability Groups (SQL Server)

(http://msdn.microsoft.com/en-us/library/ms366279.aspx).




Prerequisites and Restrictions

The target secondary replica and the primary replica must both be running in synchronous-

commit availability mode.

The target secondary replica must currently be synchronized with the primary replica. This

requires that all the secondary databases on this secondary replica must have been joined

to the availability group and must be synchronized with their corresponding primary

databases (that is, the local secondary databases must be synchronized).

To determine the failover readiness of a secondary replica, query the is_failover_ready

column in the sys.dm_hadr_database_replica_cluster_states dynamic management view (see:

http://msdn.microsoft.com/en-us/library/hh213319.aspx) or look at the Failover

Readiness column of the AlwaysOn Group Dashboard (see:

http://msdn.microsoft.com/en-us/library/hh213474.aspx).

This task is supported only on the target secondary replica. You must be connected to the

server instance that hosts the target secondary replica.
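The failover readiness check described in the prerequisites can be scripted. The following Python sketch holds a T-SQL query against the `sys.dm_hadr_database_replica_cluster_states` and `sys.availability_replicas` views and a small helper that evaluates the returned rows. Running the query requires a database client, which is not shown; the helper name, the replica server name, and the sample database names in the usage note are hypothetical.

```python
# T-SQL to list failover readiness for each database on each replica.
FAILOVER_READY_QUERY = """
SELECT ar.replica_server_name, drcs.database_name, drcs.is_failover_ready
FROM sys.dm_hadr_database_replica_cluster_states AS drcs
JOIN sys.availability_replicas AS ar
  ON drcs.replica_id = ar.replica_id
"""


def databases_not_ready(rows, target_replica):
    """Return the databases on 'target_replica' that are not failover ready.

    'rows' is an iterable of (replica_server_name, database_name,
    is_failover_ready) tuples, as returned by FAILOVER_READY_QUERY.
    """
    return [
        database
        for replica, database, ready in rows
        if replica == target_replica and not ready
    ]
```

An empty result for the disaster recovery replica (for example, a replica named `DRNODE\SQL11DR`) would indicate that a planned failover without data loss is possible.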

Failover Procedure

Following are the steps that need to be followed to fail over the SQL Server 2012 HADR to the

disaster recovery site.

1. Connect to the primary or secondary (disaster recovery) instance of SQL Server by using the

SQL Server 2012 Management Studio (Figure 16).

Figure 16. SQL Server 2012 primary instance




2. Right-click on the Availability Group (for example, T24AG), and then select Failover

(Figure 17).

Figure 17. Select "Failover"

3. In the Fail Over Availability Group Wizard, click Next (Figure 18).

Figure 18. Fail Over Availability Group wizard – Introduction page


4. In the Select New Primary Replica page, select the secondary SQL Server instance if it is not

already selected, and then click Next (Figure 19).

Figure 19. Fail Over Availability Group wizard – Select New Primary Replica page

5. In the Connect to Replica page, connect to the secondary instance by providing the

credentials, and then click Next (Figure 20).

Figure 20. Fail Over Availability Group wizard – Connect to Replica page


6. Click Finish at the Summary page to start the failover (Figure 21).

Figure 21. Fail Over Availability Group wizard – Summary page

7. After the successful failover, the wizard will show a Results page similar to the following

(Figure 22).

Figure 22. Fail Over Availability Group wizard – Results page


The “Validating WSFC quorum vote configuration” warning appears because of the

special quorum configuration used in this solution and is safe to ignore (Figure 23).

Figure 23. Fail Over Availability Group wizard – WSFC quorum configuration warning

8. Check the database status and Availability Group status in SQL Server 2012 Management

Studio to verify the failover (Figure 24).

Figure 24. Management Studio after Fail Over Availability Group wizard
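The wizard steps above have a Transact-SQL equivalent, `ALTER AVAILABILITY GROUP ... FAILOVER`, which must be run on the server instance that hosts the target secondary replica. The following Python sketch only builds that statement for a given availability group name; the helper name is illustrative, and executing the statement requires a database client that is not shown here.

```python
def planned_failover_statement(group_name):
    """Build the T-SQL for a planned manual failover of an availability group.

    The statement must be executed on the target secondary replica.
    """
    # Escape a closing bracket so that the name stays a valid identifier.
    escaped = group_name.replace("]", "]]")
    return f"ALTER AVAILABILITY GROUP [{escaped}] FAILOVER;"


if __name__ == "__main__":
    print(planned_failover_statement("T24AG"))
```

For the test environment, the generated statement is `ALTER AVAILABILITY GROUP [T24AG] FAILOVER;`.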


Unplanned Forced Failover

When the primary site or the database servers (nodes) in the primary site are not available, the

Windows Server Failover Cluster (WSFC) will not have quorum to bring the cluster online. Therefore, WSFC needs to be deliberately started (forced) before the database failover.

After bringing the WSFC online with a forced quorum, the SQL Server 2012 AlwaysOn Availability

Group needs to force failover to the disaster recovery instance.

For more information about unplanned forced failover, see:

Perform a Forced Manual Failover of an Availability Group (SQL Server)

(http://msdn.microsoft.com/en-us/library/ff877957(SQL.110).aspx).

Limitations and Restrictions

Data loss is possible during the forced failover of an availability group. In addition, if the

primary replica is running when you initiate a forced failover, client computers might still

be connected to former primary databases. Therefore, it is strongly recommended that you

force failover only if the primary replica is no longer running and if you are willing to risk

losing data to restore access to databases in the availability group.

When a database on a secondary replica is in the REVERTING or INITIALIZING state, forcing

failover causes the database to fail to start as a primary database. If the database was in

the INITIALIZING state, you will need to apply the missing log records from a database

backup or fully restore the database from scratch. If the database was in the REVERTING

state, you will need to fully restore the database from backups.

A failover command returns as soon as the target secondary replica has accepted the

command. However, database recovery occurs asynchronously after the availability group

has finished failing over.

Cross-database consistency across databases within the availability group is not maintained

upon failover.

Cross-database transactions and distributed transactions are not supported by

AlwaysOn Availability Groups.

For more information, see:

Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn

Availability Groups (SQL Server)

(http://msdn.microsoft.com/en-us/library/ms366279.aspx).




Prerequisites and Restrictions

Windows Server Failover Cluster (WSFC) needs to be brought online with a forced quorum.

For more information about the forced quorum procedure, see:

WSFC Disaster Recovery through Forced Quorum (SQL Server)

(http://msdn.microsoft.com/en-us/library/hh270277.aspx).

You must be able to connect to the server instance that hosts the target secondary replica.

Failover Procedure

When the primary site or the primary site database servers are not available, the only accessible
database server is the disaster recovery instance.

Figure 25 and Figure 26 show how the Windows Server Failover Cluster and the SQL Server instance
appear on the disaster recovery database server.

Figure 25. Windows Server Failover Cluster without quorum



Figure 26. SQL Server 2012 – primary site failure

To bring the database online in the disaster recovery site, you first need to start Windows Server
Failover Cluster with forced quorum, and then perform a forced failover of the SQL Server 2012
availability group. The following subsections provide the required steps. Complete the steps in
all subsections to fail over successfully to the disaster recovery site.

Force Cluster Start with Force Quorum

Follow these steps to force the cluster to start in the disaster recovery site with forced quorum:

1. Log on to the disaster recovery database server with a domain account that has

administrator privileges to the local computer.


2. Open Server Manager, expand Features, and then expand Failover Cluster Manager. Select

the cluster (Figure 27).

Figure 27. Failed cluster due to quorum vote

3. Click Force Cluster Start in the Actions pane (Figure 28).

Figure 28. Cluster Manager - force cluster start option


4. Confirm the action by selecting the Yes – Force my cluster to start option (Figure 29).

Figure 29. Confirm force cluster start

5. The cluster takes some time to start. Wait until it starts successfully (Figure 30).

Figure 30. Cluster force start in progress


6. After the cluster starts, it appears in Failover Cluster Manager as shown in Figure 31.

Figure 31. Cluster started with force quorum
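The same forced start can also be issued from PowerShell instead of Failover Cluster Manager. The following is a minimal sketch that only builds the PowerShell lines for review and does not run them. It assumes the FailoverClusters module is installed on the node, and the node name DRSQL01 is a hypothetical placeholder for the disaster recovery database server.

```python
from typing import List


def force_quorum_script(node_name: str) -> List[str]:
    """Build PowerShell lines that start the cluster service on one node with forced quorum.

    The lines are returned, not executed, so that they can be reviewed first.
    """
    # Reject anything that is not a plain host name, to keep quoting trivial.
    if not node_name or not all(ch.isalnum() or ch in "-." for ch in node_name):
        raise ValueError(f"unexpected node name: {node_name!r}")
    return [
        "Import-Module FailoverClusters",
        f"Start-ClusterNode -Name '{node_name}' -FixQuorum",
    ]


if __name__ == "__main__":
    # "DRSQL01" is a hypothetical name for the disaster recovery database server.
    print("\n".join(force_quorum_script("DRSQL01")))
```

Returning the lines rather than executing them keeps the sketch safe to run anywhere. An operator would paste the output into an elevated PowerShell session on the disaster recovery node.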

Force Failover SQL Server 2012 Availability Group

Once the Windows Server Failover Cluster is online with forced quorum, follow these steps to
force failover of the SQL Server 2012 availability group:

1. Open SQL Server 2012 Management Studio and connect to the SQL Server disaster

recovery instance (Figure 32).

Figure 32. SQL Server instance before forced failover


2. Right-click on the Availability Group (for example, T24AG), and then select Failover (Figure 33).

Figure 33. Start force failover

3. In the Fail Over Availability Group Wizard, click Next (Figure 34).

Figure 34. Fail Over Availability Group wizard – Introduction page


4. In the Select New Primary Replica page, select the secondary SQL Server instance if it is not

already selected. Also note the warning. Click Next (Figure 35 and Figure 36).

Because the cluster quorum is forced, the quorum status is shown as “Forced Quorum”.

Figure 35. Fail Over Availability Group wizard – Select New Primary Replica page

Figure 36. Fail Over Availability Group wizard – Select New Primary Replica page warning


5. Select the option to confirm failover with potential data loss, and then click Next (Figure 37).

Because the database status is not synchronized, SQL Server warns about potential data
loss. However, no data is lost if the databases were in the “Synchronized” state at the
time of the site failure.

Figure 37. Fail Over Availability Group wizard – Potential Data Loss confirmation


6. Click Finish on the Summary page to start the failover (Figure 38).

Figure 38. Fail Over Availability Group wizard – Force Failover Summary page

7. After a successful forced failover, the wizard shows the “Results” page (Figure 39).

Figure 39. Fail Over Availability Group wizard – Results page


The “Validating WSFC quorum vote configuration” warning appears because of the

special quorum configuration that is used in the recommended solution and is safe to

ignore (Figure 40).

Figure 40. Fail Over Availability Group wizard – WSFC Quorum Configuration warning

8. After a successful forced failover, the database status and availability group status in
SQL Server 2012 Management Studio appear as shown in Figure 41.

Figure 41. Management Studio after Fail Over Availability Group wizard
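The wizard steps above correspond to a single Transact-SQL statement, run while connected to the disaster recovery instance that hosts the target secondary replica. The following is a minimal sketch that only builds that statement as text. The helper name is hypothetical, and T24AG is the example availability group name used in the steps above.

```python
def forced_failover_statement(group_name: str) -> str:
    """Build the T-SQL that forces failover of an availability group, allowing data loss."""
    if not group_name:
        raise ValueError("availability group name is required")
    # Quote the identifier with brackets; a literal ']' is escaped by doubling it.
    quoted = "[" + group_name.replace("]", "]]") + "]"
    return f"ALTER AVAILABILITY GROUP {quoted} FORCE_FAILOVER_ALLOW_DATA_LOSS;"


if __name__ == "__main__":
    print(forced_failover_statement("T24AG"))
```

As with the wizard, the statement accepts potential data loss, so the limitations listed earlier in this section apply equally to the scripted form.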

Additional Considerations

It is highly recommended that you change the cluster quorum configuration if all cluster nodes in
the primary site are shut down, whether planned (scheduled maintenance) or unplanned (primary
site disaster), and the disaster recovery SQL Server 2012 instance acts as the primary instance
for an extended period of time. If you do not change the cluster quorum configuration, the
entire cluster might shut down because too few quorum votes are available.


Change the value for the disaster recovery cluster node property “NodeWeight” to 1, and

change the value for the cluster nodes in the primary site to 0. For more information, see

the Microsoft Support article at http://support.microsoft.com/kb/2494036/en-us.

Shutting down only one node in the primary site does not affect cluster availability as
long as the second node in the primary site is still up and running along with the
File Share Witness (FSW).

If the FSW in the primary site is not available and cannot be contacted by the cluster
node in the disaster recovery site, change the FSW location to the disaster recovery
site.
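The effect of these NodeWeight changes can be illustrated with simple vote arithmetic. The following is a minimal sketch under stated assumptions: the original weights (two voting primary nodes, a voting FSW, and a non-voting disaster recovery node) are inferred from the text above rather than quoted from it, the member names are hypothetical, and quorum is modeled as a strict majority of all configured votes.

```python
from typing import Dict, Set


def has_quorum(votes: Dict[str, int], online: Set[str]) -> bool:
    """Return True if the online voters hold a strict majority of all configured votes."""
    total = sum(votes.values())
    present = sum(weight for member, weight in votes.items() if member in online)
    return present >= total // 2 + 1


# Hypothetical member names. The original weights are an assumption based on the
# text: two voting primary nodes, a voting FSW, and a non-voting DR node.
original = {"PRIMARY1": 1, "PRIMARY2": 1, "FSW": 1, "DR1": 0}
# Weights after the NodeWeight change recommended above.
adjusted = {"PRIMARY1": 0, "PRIMARY2": 0, "FSW": 1, "DR1": 1}

if __name__ == "__main__":
    print(has_quorum(original, {"PRIMARY2", "FSW"}))  # one primary node down
    print(has_quorum(original, {"DR1"}))              # primary site lost
    print(has_quorum(adjusted, {"DR1", "FSW"}))       # DR node plus reachable FSW
    print(has_quorum(adjusted, {"DR1"}))              # DR node alone, FSW unreachable
```

Under these assumptions, the first and third scenarios keep quorum and the second and fourth do not. This is why the forced start is needed after a primary site failure, and why the FSW should be moved to the disaster recovery site if the original FSW cannot be reached.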

Running the entire system with only one node in the disaster recovery site will not guarantee high

availability. Therefore, this should only be done for a limited amount of time. Otherwise, it is highly

recommended that you add a second node in the disaster recovery site and modify the cluster

quorum configuration accordingly.

Findings and Carryovers

The following findings and carryovers were noted during the testing of the solution proposed in this
document.

Using the NLB feature in Windows Server provides better stability, better scalability, and

faster failover with no additional cost. NLB also lets you transparently add or remove

nodes in the web and application tiers.

JBoss session persistence increases reliability and provides better scalability for the
solution.

Removing the sticky-session requirement in T24Browser makes the solution more reliable

and scalable.

A JBoss session persistence database in the same SQL Server 2012 AlwaysOn Availability

Group reduces the administrative work and reduces the steps in the disaster recovery

procedures.

SQL Server 2012 HADR and AlwaysOn provide simplified disaster recovery failover while
maintaining a database replica in the disaster recovery site.

T24 works well with a configuration that uses the NLB feature in Windows Server and

provides faster application-tier failover.



Windows Server DFS-R with a DFS Namespace published in Active Directory Domain Services
provides a unique URL that can be used to refer to the file share, regardless of whether the
system is operating in the primary or the disaster recovery environment.

File and folder symbolic links make the shared file/folder access more resilient.

A clustered instance of SQL Server 2012 for high availability reduces licensing

requirements.

A SQL Server 2012 AlwaysOn Availability Group eliminates SAN replications.

DNS host records used for the load balancer IP addresses make disaster recovery failover

transparent at the web and application tiers.

Recommended Hotfixes and Service Packs

The following best practices apply to the recommended configuration:

Regularly check and apply all the security hotfixes for Windows Server 2008 R2.

Regularly check and apply the latest available service pack for Windows Server 2008 R2

after checking with Temenos about the supportability.

o NOTE Currently, Service Pack 1 (SP1) for Windows Server 2008 R2 is available and

certified by both Microsoft and Temenos.

Regularly check and apply the pertinent hotfixes mentioned in the following knowledge

base (KB) article to enhance stability and fix known critical bugs (not security related).

Recommended hotfixes and updates for Windows Server 2008 R2–based server

clusters


As a special “out-of-band” recommended hotfix for Windows Server 2008 R2, please install

the following hotfix on all the cluster nodes in the primary and disaster recovery sites.

A hotfix that improves the performance of the "AlwaysOn Availability Group"

feature in SQL Server 2012 is available for Windows Server 2008 R2


Regularly check and apply all the security hotfixes for SQL Server 2012.

o NOTE Currently, SQL Server 2012 does not have any security hotfixes released.

Regularly check and apply the latest available service pack for SQL Server 2012 after

checking with Temenos about the supportability.

o NOTE Currently there is no released service pack for SQL Server 2012.




As a special “out-of-band” recommended hotfix for SQL Server 2012, install the following

update package on all the SQL Server 2012 instances in the primary and disaster recovery

sites.

Cumulative update package 1 for SQL Server 2012


NOTE If a more recent update is available, it is not necessary to install the previous hotfix.

Regularly check for the latest “cumulative update” (CU) release for SQL Server 2012, review the
fixed bugs, and install it only if you are affected and after checking with Temenos about
supportability. For a list of released CUs for SQL Server 2012, see the following KB article.

The SQL Server 2012 builds that were released after SQL Server 2012 was released


Finally, it is highly recommended that you check periodically with the Microsoft Support Service for

any recommended non-security related hotfixes for Windows Server 2008 R2 and SQL Server 2012.

Additional Resources

Following are links for further information.

SQL Server 2012

Books Online for SQL Server 2012


Database Availability Key Capabilities and Concepts:

o Failover Clustering and AlwaysOn Availability Groups (SQL Server)


o Active Secondaries: Readable Secondary Replicas (AlwaysOn Availability Groups)


Database Availability Step-by-Step Guide:

o Deploying a new Availability Group

http://msdnstage.redmond.corp.microsoft.com/en-us/library/ff877884.aspx#RelatedTasks

o Create or Configure an Availability Group Listener (SQL Server)

http://go.microsoft.com/fwlink/?LinkId=201271

o Perform a Forced Manual Failover of an Availability Group (SQL Server)


Instance Availability Key Capabilities and Concepts:









o Failover Policy for Failover Cluster Instances


Instance Availability Step-by-Step Guide:

o SQL Server Multi-Subnet Clustering


o Configure FailureConditionLevel Property Settings


o View and Read Failover Cluster Instance Diagnostics Log


AlwaysOn FAQ for SQL Server 2012

http://msdn.microsoft.com/en-us/sqlserver/gg508768(l=en-us)

Hardware and Software Requirements for Installing SQL Server 2012


Introducing SQL Server AlwaysOn

http://msdn.microsoft.com/en-us/sqlserver/gg490638

Overview of AlwaysOn Availability Groups


Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups

http://msdn.microsoft.com/en-us/library/ff878487.aspx#SystemReqsForAOAG

Before Installing Failover Clustering


Create a New SQL Server Failover Cluster (Setup)


Add or Remove Nodes in a SQL Server Failover Cluster (Setup)


Microsoft SQL Server AlwaysOn Solutions Guide for High Availability and Disaster Recovery

http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/Microsoft%20SQL%20Server%20AlwaysOn%20Solutions%20Guide%20for%20High%20Availability%20and%20Disaster%20Recovery.docx

Availability Modes











AlwaysOn Failover Cluster Instances


Enable and Disable AlwaysOn Availability Groups (SQL Server)


Creating an Availability Group (SQL Server)


Create or Configure an Availability Group Listener (SQL Server)


Monitor Availability Groups


AlwaysOn Availability Groups Dynamic Management Views and Functions


Manually Prepare a Secondary Database for an Availability Group (SQL Server)


SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance

http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx

Perform a Forced Manual Failover of an Availability Group


Availability Group Listeners, Client Connectivity, and Application Failover (SQL Server)


Configure Read-Only Access on an Availability Replica (SQL Server)


Configure Read-Only Routing on an Availability Group (SQL Server)


Client Connection Access to Availability Replicas (SQL Server)


Configure Read-Only Access on an Availability Replica


Configure the Windows Firewall to Allow SQL Server Access

http://msdn.microsoft.com/en-us/library/cc646023.aspx










How to use Kerberos authentication in SQL Server


How to transfer the logins and the passwords between instances of SQL Server 2005 and

SQL Server 2008


SQL Server Web site

http://www.microsoft.com/sqlserver

SQL Server Tech Center

http://technet.microsoft.com/en-us/sqlserver

SQL Server Dev Center

http://msdn.microsoft.com/en-us/sqlserver

Windows Server Failover Cluster

Windows Server | Failover Clustering and Node Balancing

http://www.microsoft.com/windowsserver2008/en/us/failover-clustering-main.aspx

Checklist: Create a Failover Cluster

http://technet.microsoft.com/en-us/library/cc755009.aspx

Failover Cluster Step-by-Step Guide: Validating Hardware for a Failover Cluster

http://technet.microsoft.com/en-us/library/cc732035(WS.10).aspx

Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster

http://technet.microsoft.com/en-us/library/cc770620(v=ws.10).aspx

Failover Cluster Step-by-Step Guide: Configuring Accounts in Active Directory


Configure Cluster Quorum NodeWeight Settings

http://msdn.microsoft.com/en-us/library/hh270281(SQL.110).aspx

Force a WSFC Cluster to Start Without a Quorum

http://msdn.microsoft.com/en-us/library/hh270275(v=SQL.110).aspx

Failover Policy for Failover Cluster Instances


Checklist: Create a Clustered File Server








Recommended hotfixes and updates for Windows Server 2008 R2-based server clusters


A hotfix that improves the performance of the "AlwaysOn Availability Group" feature in

SQL Server 2012 is available for Windows Server 2008 R2


Network Load Balancing

Network Load Balancing


NLB 101: How NLB balances network traffic

http://blogs.technet.com/b/networking/archive/2008/10/01/nlb-101-how-nlb-balances-network-traffic.aspx

Network Load Balancing parameters


Specifying the Affinity and Load-Balancing Behavior of the Custom Port Rule


Upgrading the Network Load Balancing Cluster (to 2008)


Network Load Balancing: Configuration Best Practices for Windows 2000 and Windows

Server 2003

http://www.microsoft.com/downloadS/details.aspx?FamilyID=d24c373e-bafc-4e31-b1b2-d86584a12ca4&displaylang=en






About Temenos

Founded in 1993 and listed on the Swiss Stock Exchange (SIX: TEMN), Temenos Group AG is the

market-leading provider of banking software systems to retail, corporate, universal, private,

Islamic, and microfinance and community banks. Headquartered in Geneva with more than 60

offices worldwide, Temenos serves more than 1,500 customers in 125 countries. Temenos’

software products provide advanced technology and rich functionality, incorporating best-practice

processes that take advantage of Temenos’ experience in 700 implementations around the globe.

For more information, visit: www.temenos.com

About Microsoft

Founded in 1975, Microsoft (Nasdaq "MSFT") is the worldwide leader in software, services, and

solutions that help people and businesses realize their full potential.

For more information, visit: www.microsoft.com
