The Microsoft High Availability and Disaster Recovery Solution for TEMENOS T24
A deployment reference architecture and guidance for implementing a high-availability and disaster-recovery solution for TEMENOS T24 running on the Microsoft Application Platform
Technical White Paper
Published: May 2012
Applies to: Microsoft SQL Server 2012
Authors: Igor Pagliai (Microsoft), Dammika Wickramasinghe (Temenos)
Abstract
Temenos and Microsoft worked together to define a deployment architecture/topology that provides high availability and disaster recovery for the TEMENOS T24 core banking solution using the Microsoft Application Platform and Microsoft technologies.
This white paper describes the results of this joint effort.
©2012 Microsoft Corporation. All rights reserved. This document is provided “as-is.” Information and views
expressed in this document, including URL and other Internet Web site references, may change without notice.
You bear the risk of using it.
This document does not provide you with any legal rights to any intellectual property in any Microsoft product.
You may copy and use this document for your internal, reference purposes.
Table of Contents
Introduction
Technical Overview of TEMENOS T24
SQL Server AlwaysOn
Recovery Objectives
Fault Tolerance and Disaster Recovery Architecture
High Availability and Disaster Recovery Solution
Setup and Configuration
SQL Server 2012 HADR Configuration
Windows Server Firewall Configurations
T24 File Share Configuration
Active Directory Domain Services DNS Configuration
Application-Tier NLB Configuration
T24 Application Server Configuration
Web-Tier NLB Configuration
T24Browser Configuration
Disaster Recovery Procedures
DNS Switching
SQL Server 2012 HADR Failover
Findings and Carryovers
Recommended Hotfixes and Service Packs
Additional Resources
SQL Server 2012
Windows Server Failover Cluster
Network Load Balancing
About Temenos
About Microsoft
Introduction
TEMENOS T24 (T24) is a fully integrated, modular core banking solution that covers a broad
spectrum of functional requirements for the retail, private, corporate, universal, and Islamic
banking and microfinance sectors. T24 provides a single, real-time view of clients across
the entire enterprise, making it possible for banks to maximize returns while also streamlining costs.
Microsoft SQL Server 2012 data management software provides an ideal data management
framework for T24. With this foundation, T24 customers can experience faster funds transfers,
higher security-trades volumes, and quicker close-of-business processes; they can benefit from
open, state-of-the-art technologies to accelerate innovation, which helps to greatly increase the
speed and effectiveness with which new products and services are created.
As part of their strategic alliance, Microsoft and Temenos worked together to define a
recommended deployment architecture that provides high availability and disaster recovery (HADR)
for T24 running on the Microsoft Application Platform and using Microsoft technologies. This joint
effort was conducted in the Temenos Hemel Hempstead lab.
One of the main drivers for developing the architecture/topology was to reduce the cost of
Microsoft software licenses and the use of specialized hardware (such as load balancers) to
minimize the total cost of ownership (TCO). Therefore, the recommended software topologies can
be customized to meet each customer’s needs.
The following considerations apply to the recommended architecture:
The SQL Server 2012 Availability Group feature, part of the AlwaysOn technology set, was
selected instead of storage area network (SAN)–level synchronous storage replication to
avoid the cost of an additional SAN device and the licensing cost for SAN replication
software.
A SQL Server 2012 Failover Cluster Instance (FCI) was adopted for the primary site instead
of two standalone instances to reduce licensing cost, minimize management and
performance overhead, and make it more likely that an existing deployment based
on a typical Windows Server Failover Clustering (WSFC) configuration can be reused.
The Network Load Balancing (NLB) feature of Windows Server 2008 R2 was chosen to
eliminate the need for an expensive hardware load balancer device in front of the JBoss
servers.
The NLB feature of Windows Server 2008 R2 was chosen to provide better load balancing
performance than the native T24 capabilities in front of T24 servers.
Two cluster nodes in the primary site with shared SAN storage were used to provide high
availability for the T24 application file share.
The implementation/requirements of HADR solutions can vary based on a variety of factors, including
service level agreements (SLAs), cost, number of sites, and network infrastructure. Therefore, the
requirements of individual HADR solutions need to be determined on a case-by-case basis for each
deployment.
Alternatives to the Recommended Architecture
The architecture proposed in this white paper is not the only one possible using SQL Server 2012
AlwaysOn features, but this architecture has been thoroughly tested. Possible alternatives to the
recommended schema can include the following:
• Use two standalone SQL Server 2012 instances (in an AlwaysOn Availability Group) instead
  of a single SQL Server 2012 Failover Cluster Instance. This lets you avoid using shared SAN
  storage for the cluster nodes in the primary site.
  o If you are using an availability group, all nodes must still be part of a cluster,
    and a standalone SQL Server 2012 instance must still be installed on each node. The
    cost savings with this alternative come from eliminating the need for shared
    storage.
  o To ensure that no data is lost on a local failover between instances
    in the primary site, the two standalone SQL Server 2012 instances, along with the
    one (or more) in the disaster recovery site, must be configured for synchronous
    replication.
  o In this configuration, automatic failover can be provided by the AlwaysOn
    Availability Group feature, but extra care must be taken to avoid unwanted failover
    to the remote disaster recovery site.
• Use existing highly available network storage for the cluster file share witness. Used in
  combination with the previous option, highly available network storage for the cluster file
  share witness can render the installation of a Windows Server Failover Cluster unnecessary.
  o NOTE Distributed File System Replication (DFS-R) can be used to replicate files
    from the primary site to the disaster recovery site on a less frequent schedule.
    However, using DFS-R with continuous replication to local folders as a way to
    avoid a clustered file share is not recommended because of the
    possible performance impact.
• Use an additional node in the disaster recovery site with shared SAN storage between the
  nodes. With this alternative, a second SQL Server 2012 FCI can be used, providing high
  availability at the level of the disaster recovery site as well.
  o This second instance must be installed only on the nodes in the disaster recovery
    site.
  o This instance is distinct from the one used in the primary site.
  o This instance should be configured for synchronous replication in the availability
    group.
  o The shared SAN storage between the nodes in the disaster recovery site is not
    linked/replicated to the shared storage between the nodes in the primary site.
IMPORTANT In the proposed scenario, the minimum number of servers has been used in the
disaster recovery site to reduce costs. This means that in the case of a complete primary site
disaster, the disaster recovery site will operate in an exposed configuration that is not highly
available. For this reason, it is highly recommended that you recover the primary site as soon as
possible or use an additional node in the disaster recovery site with shared SAN storage between
the nodes, as mentioned previously.
Additional SQL Server 2012 HADR Capabilities for Future Consideration
Note that the following SQL Server 2012 HADR capabilities have not been tested prior to publication
of this white paper because of time, resource, and configuration constraints. They should be
considered future enhancements to the recommended architecture, and should be tested in
custom deployments and/or lab testing sessions:
• Readable secondary for Availability Group replicated databases
  This feature presents no theoretical risks and could be used to better utilize hardware
  resources in the disaster recovery site (including read-only queries, reporting, backups, and
  integrity checks), but T24 would need to be modified to take advantage of this capability
  (for read-only queries only). The following links provide more information:
  o Active Secondaries: Readable Secondary Replicas
    (http://msdn.microsoft.com/en-us/library/ff878253.aspx)
  o Configure Read-Only Access on an Availability Replica
    (http://msdn.microsoft.com/en-us/library/hh213002.aspx)
  NOTE In the recommended configuration, read-only access is not enabled on the secondary
  replicas for the databases replicated by the availability group, but it can be easily
  activated with no downtime.
• Availability Group Read-Only Routing and Application Intent
  These features cannot be used because they require the SQL Server 2012 Native Open
  Database Connectivity (ODBC) client to be installed on the T24 servers. As a future
  enhancement, this version of the client should be tested for T24 use. The following links
  provide more information:
  o Configure Read-Only Routing for an Availability Group (SQL Server)
    (http://msdn.microsoft.com/en-us/library/hh710054.aspx)
  o Client Connection Access to Availability Replicas (SQL Server)
    (http://msdn.microsoft.com/en-us/library/hh510184.aspx)
• Multi-subnet failover clustering
  Windows Server 2008 R2 and SQL Server 2012 support this type of configuration, but its use
  in reducing the downtime caused by Domain Name System (DNS) replication latency has not
  been tested. The following links provide more information:
  o SQL Server Multi-Subnet Clustering
    (http://msdn.microsoft.com/en-us/library/ff878716.aspx)
  o SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance
    (http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx)
• Flexible failover policy
  SQL Server 2012 introduces a new health detection mechanism for clustered installations
  that can be configured so that Windows Server Failover Clustering is more sensitive to
  possible SQL Server 2012 health problems. The following links provide more information:
  o Failover Policy for Failover Cluster Instances
    (http://msdn.microsoft.com/en-us/library/ff878664.aspx)
  o Configure FailureConditionLevel Property Settings
    (http://msdn.microsoft.com/en-us/library/ff878667.aspx)
Document Scope
The following are considered in the scope of this white paper:
• This document applies to T24 R11 and R12 (Temenos Application Framework C) with
  T24Browser as a channel.
• This document focuses only on HADR functionality.
• The document applies to the following software:
  o Windows Server 2008 R2 with Service Pack 1 (SP1)
  o Windows Server 2008 R2 Network Load Balancing (NLB)
  o Windows Server 2008 R2 clustering
  o Windows Server 2008 R2 clustered file share
  o Windows Server 2008 R2 Distributed File System (DFS) Replication
  o SQL Server 2012 AlwaysOn Availability Group
  o Windows Server 2008 R2 domain controller
  o JBoss 5.1.0 GA
The following are considered out of the scope of this white paper:
• Performance tuning recommendations.
• T24 channels other than T24Browser, such as TWS.NET, TOCF.NET, and BizTalk Adapter.
• Administration and monitoring of the software.
• Hardware configurations, such as RAID and network adapter teaming.
• Security.
• Local area network (LAN)/wide area network (WAN) configurations and recommendations.
Technical Overview of TEMENOS T24
The various components of a T24-based solution are shown in Figure 1.
Figure 1. T24 logical component view
Table 1 provides a description of the components.
Note that the HADR solution recommended in this white paper focuses on T24 with T24Browser as
a channel.
[Figure 1 diagram. Labels recoverable from the image: Channels (T24 Browser, ARC IB, ARC Mobile); Connectivity (TWS.NET, TOCF.NET, TOCF (EE), TWS (EE)); Application (T24 Agent/TAFC Agent, T24, TAFC, DCD/Database Driver, Message Queue, T24 Monitor; module labels FX, EB, AA, DX, AC); SQL Server 2012; Windows Server 2008 R2; Internet Information Services (IIS) 7.5; Active Directory; Management; Security.]
Table 1. Component descriptions
Component | Description
T24 Agent | T24 Agent is a server-side jBASE component that is responsible for accepting and processing incoming client requests. Communication is established via TCP socket connections and by means of a well-defined protocol. T24 Agent is a socket server listening on a user-defined TCP port, and has the capability to serve a wide range of client applications as long as they speak the same protocol.
T24 | T24 is the banking business logic written by using jBC, which is used to generate C / C++ code.
TAFC | The Temenos Application Framework C (TAFC) version provides additional runtime services that are currently not available in jBC.
Database Driver | Direct Connect Driver (DCD) is the T24 data abstraction layer that decouples T24 business logic from the underlying data storage/structure.
T24 Monitor | T24 Monitor is a Java Management Extensions (JMX) and web-based online monitoring tool for T24, offering real-time statistics, as well as historical views of a particular T24 system.
Message Queue | Message Queue is an optional middleware infrastructure that lets T24 use message-driven communication with the channel layer.
Database | The jBASE or vendor-provided relational database management system (RDBMS); currently supported platforms are Oracle, Microsoft SQL Server, and IBM DB2.
SQL Server AlwaysOn
SQL Server AlwaysOn is a new integrated, flexible, and cost-efficient HADR solution. AlwaysOn can
provide data and hardware redundancy within and across data centers, and it can improve
application failover time to increase the availability of mission-critical applications. AlwaysOn is
flexible and lets you reuse existing hardware investments.
A solution using AlwaysOn can take advantage of two major SQL Server 2012 features for
configuring availability at both the database and the instance level:
AlwaysOn Availability Groups
AlwaysOn Availability Groups are new in SQL Server 2012. They greatly enhance the
capabilities of database mirroring, help ensure availability of application databases, and
enable zero data loss through log-based data movement for data protection without shared
disks. Availability groups provide an integrated set of options, including automatic and
manual failover of a logical group of databases, support for up to four secondary replicas,
fast application failover, and automatic page repair.
AlwaysOn Failover Cluster Instances (FCIs)
FCIs enhance the SQL Server failover clustering feature and support multi-site clustering
across subnets, which enables cross-data-center failover of SQL Server instances. Faster and
more predictable instance failover is another key benefit that enables faster application
recovery.
Recovery Objectives
Data redundancy is a key component of a high-availability database solution. Transactional activity
on your primary SQL Server instance is synchronously or asynchronously applied to one or more
secondary instances. When an outage occurs, transactions that were in-flight might be rolled back,
or they might be lost on the secondary instances because of delays in data propagation.
You can measure the impact and set recovery goals in terms of how long it takes to get back in
business and how much time latency there is in the last transaction recovered:
Recovery Time Objective (RTO)
The RTO is the duration of the outage. The initial goal is to get the system back online in at
least a read-only capacity to facilitate investigation of the failure. However, the primary goal
is to restore full service to the point that new transactions can take place.
Recovery Point Objective (RPO)
The RPO is often referred to as a measure of acceptable data loss. It is the time gap or
latency between the last committed data transaction before the failure and the most recent
data recovered after the failure. The actual data loss can vary depending on the workload
on the system at the time of the failure, the type of failure, and the type of high availability
solution used.
You should use RTO and RPO values as goals that indicate business tolerance for downtime and
acceptable data loss, and as metrics for monitoring availability health.
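As a minimal sketch of how these two metrics could be tracked, the following Python fragment derives RTO and RPO for a single outage from four timestamps. The function names, the timestamps, and the goal values are hypothetical and are not part of the tested solution; only the definitions follow the descriptions above.

```python
from datetime import datetime, timedelta


def recovery_metrics(failure_time, service_restored,
                     last_commit_before_failure, last_commit_recovered):
    """Return (rto, rpo) for one outage, both as timedelta values."""
    # RTO: duration of the outage, until new transactions can take place again.
    rto = service_restored - failure_time
    # RPO: gap between the last transaction committed before the failure
    # and the most recent transaction recovered after the failure.
    rpo = last_commit_before_failure - last_commit_recovered
    return rto, rpo


def meets_goals(rto, rpo, rto_goal, rpo_goal):
    """True when the measured values stay within the business goals."""
    return rto <= rto_goal and rpo <= rpo_goal


# Hypothetical outage in which every committed transaction was recovered.
failure = datetime(2012, 5, 1, 10, 0, 0)
rto, rpo = recovery_metrics(
    failure_time=failure,
    service_restored=failure + timedelta(seconds=40),
    last_commit_before_failure=failure - timedelta(seconds=2),
    last_commit_recovered=failure - timedelta(seconds=2),
)
print(rto, rpo)
print(meets_goals(rto, rpo, rto_goal=timedelta(minutes=1), rpo_goal=timedelta(0)))
```

Recording both values for each failover test is one way to compare measured behavior against the agreed goals over time.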
The business goals for RTO and RPO should be key drivers in selecting a SQL Server technology for
your high-availability and disaster-recovery solution.
Table 2 offers a rough comparison of the types of results that the different SQL Server HADR solutions may achieve.
Table 2. Comparison of SQL Server HADR solutions
SQL Server HADR Solution | Potential Data Loss (RPO) | Potential Recovery Time (RTO) | Automatic Failover | Readable Secondaries [1]
AlwaysOn Availability Group, synchronous-commit | Zero | Seconds | Yes [2] | 0–2
AlwaysOn Availability Group, asynchronous-commit | Seconds | Minutes | No | 0–4
AlwaysOn Failover Cluster Instance | NA [3] | Seconds-to-minutes | Yes | NA
Database Mirroring [4], high-safety (sync + witness) | Zero | Seconds | Yes | NA
Database Mirroring [4], high-performance (async) | Seconds [5] | Minutes [5] | No | NA
Log Shipping | Minutes [5] | Minutes-to-hours [5] | No | Not during a restore
Backup, Copy, Restore [6] | Hours [5] | Hours-to-days [5] | No | Not during a restore
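To illustrate how RTO and RPO goals can drive the choice of technology, the following Python sketch encodes the rows of Table 2 and filters them against a pair of goals. It is an illustration only: the class and function names are hypothetical, and the ordering of the qualitative ratings (Zero, Seconds, Minutes, and so on) is an assumption made here for comparison purposes, not something the table defines.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HadrOption:
    name: str
    rpo: Optional[str]        # potential data loss; None where Table 2 says NA
    rto: str                  # potential recovery time
    automatic_failover: bool


# Rows transcribed from Table 2 (footnotes omitted).
TABLE_2 = [
    HadrOption("AlwaysOn Availability Group, synchronous-commit", "Zero", "Seconds", True),
    HadrOption("AlwaysOn Availability Group, asynchronous-commit", "Seconds", "Minutes", False),
    HadrOption("AlwaysOn Failover Cluster Instance", None, "Seconds-to-minutes", True),
    HadrOption("Database Mirroring, high-safety (sync + witness)", "Zero", "Seconds", True),
    HadrOption("Database Mirroring, high-performance (async)", "Seconds", "Minutes", False),
    HadrOption("Log Shipping", "Minutes", "Minutes-to-hours", False),
    HadrOption("Backup, Copy, Restore", "Hours", "Hours-to-days", False),
]

# Assumed ordering of the qualitative ratings, from best to worst.
RANK = {rating: position for position, rating in enumerate([
    "Zero", "Seconds", "Seconds-to-minutes", "Minutes",
    "Minutes-to-hours", "Hours", "Hours-to-days",
])}


def candidates(max_rpo, max_rto, need_automatic_failover=False):
    """Return the names of the options whose ratings stay within the goals."""
    result = []
    for option in TABLE_2:
        if option.rpo is None:
            # FCI: data loss depends on the storage system, so it cannot
            # be compared against an RPO goal in this sketch.
            continue
        if RANK[option.rpo] > RANK[max_rpo] or RANK[option.rto] > RANK[max_rto]:
            continue
        if need_automatic_failover and not option.automatic_failover:
            continue
        result.append(option.name)
    return result


print(candidates("Zero", "Seconds", need_automatic_failover=True))
```

A zero-data-loss goal with automatic failover leaves only the synchronous options, which is consistent with the synchronous-commit availability group chosen for this solution.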
Fault Tolerance and Disaster Recovery Architecture
SQL Server AlwaysOn solutions help provide fault tolerance and disaster recovery across several
logical and physical layers of infrastructure and application components. Historically, it has been
common practice to separate duties and responsibilities for the various audiences and roles
involved, so that each was predominately concerned with only a portion of those solution layers.
This section describes each of those layers and offers guidance for your design discussions and
implementation decisions.
A successful SQL Server AlwaysOn solution requires understanding and collaboration across these
solution layers:
Notes for Table 2:
[1] An AlwaysOn Availability Group can have no more than a total of four secondary replicas, regardless of type.
[2] Automatic failover of an Availability Group is not supported to or from a failover cluster instance.
[3] The FCI itself does not provide data protection; data loss is dependent upon the storage system implementation.
[4] This feature will be removed in future versions of Microsoft SQL Server. Use AlwaysOn Availability Groups instead.
[5] This is highly dependent upon the workload, data volume, and failover procedures.
[6] Backup, Copy, Restore is appropriate for disaster recovery, but not for high availability.
Infrastructure level
Server-level fault-tolerance and inter-node network communication use Windows Server
Failover Clustering (WSFC) features for health monitoring and failover coordination.
SQL Server instance level
A SQL Server AlwaysOn Failover Cluster Instance (FCI) is a SQL Server instance that is
installed across and can fail over to server nodes in a WSFC cluster. The nodes that host the
FCI are attached to robust symmetric shared storage (SAN or SMB).
Database level
An availability group is a set of user databases that fail over together. An availability group
consists of a primary replica and one to four secondary replicas. Each replica is hosted by an
instance of SQL Server (FCI or non-FCI) on a different node of the WSFC cluster.
Client connectivity
Database client applications can connect directly to a SQL Server instance network name, or
they may connect to a virtual network name (VNN) that is bound to an availability group
listener. The VNN abstracts the WSFC cluster and Availability Group topology, logically
redirecting connection requests to the appropriate SQL Server instance and database
replica.
Figure 2 shows a logical topology of a representative AlwaysOn solution.
Figure 2. Logical representation of an AlwaysOn solution
High Availability and Disaster Recovery Solution
The recommended HADR solution for T24 deployments was designed based on the following:
Incurring zero data loss when failing over to the disaster recovery site, assuming that there
is a compatible network connection between the sites that is capable of synchronous data
replication.
Reducing the cost of Microsoft software licenses and specialized hardware (such as load
balancers) to minimize the total cost of ownership.
Maximizing use of any Windows Server 2008 R2 features and capabilities that complement
T24.
The following decisions were made in the solution design. Refer to Figure 3 for further information.
The disaster recovery site used for testing had only one server for each tier. If the disaster
recovery site also requires high availability, the configuration used in the primary site
should be used for the disaster recovery site.
The Windows Server 2008 R2 NLB feature is used to load balance the traffic into the JBoss
application servers in the primary site. The same feature can be used for the disaster
recovery site if there will be two or more disaster recovery nodes.
A DNS host record was created for the web-tier NLB IP to make the failover to the disaster
recovery site transparent to the users (for example, T24Browser.CoE.Temenos.com).
T24Browser is a stateful application that normally deploys with a sticky-session
configuration. Although this configuration provides the required functionality, it reduces the
scalability of the T24 web tier. The user might lose the session if an application server goes
down, reducing the availability. The solution presented in this white paper eliminates these
limitations by removing sticky sessions. This is achieved by persisting the JBoss session state
in the SQL Server database and configuring NLB to “Affinity: None”.
Using NLB and a DNS host record and avoiding the use of sticky sessions lets you add or
remove web-tier servers transparently, without affecting users.
T24Browser is capable of performing simple load balancing among the available T24
application servers when a load balancing solution is not available in the application tier.
This feature is disabled in the recommended solution, and NLB is used instead with the
“Affinity: None” configuration to achieve the best possible load balancing.
A DNS host record was created for the application-tier NLB IP so that you have the option of
failing over only the application tier to the disaster recovery site if necessary (for example,
T24Server.CoE.Temenos.com).
This is an optional configuration that is only required if a facility needs to simplify server
maintenance and keep the T24Browser configurations identical in both sites. However, this
option does create an additional step in the disaster recovery procedures.
Using the NLB “Affinity: None” configuration makes it possible to add or remove
application-tier servers transparently, without affecting online transactions.
The SQL Server 2012 HADR AlwaysOn (HADRON) configuration with a SQL Server 2012
Failover Cluster instance for the primary site is used to reduce the number of required
SQL Server 2012 licenses.
The primary site can have two standalone instances of SQL Server 2012 instead of the
failover cluster instance if you need to remove the shared storage; however, this will
require licenses for each SQL Server 2012 instance, while the failover cluster instance
requires only one license regardless of the number of nodes in the cluster.
The disaster recovery instance of SQL Server 2012 is configured as a SQL Server 2012
HADRON synchronous AlwaysOn replica for zero data loss.
Synchronous replication requires a fast and stable network connection in order to work as
expected. This needs to be taken into account when setting up the network. If you do not
have a fast and stable network connection, implement asynchronous replication instead,
but understand that asynchronous replication does have a possibility of data loss.
The same Windows Server Failover Cluster that hosts the SQL Server 2012 clustered
instance is used to host a clustered file share to keep T24 shared files and folders. The
clustered file share increases the availability of the T24 shared files and folders.
The disaster recovery site has a local folder for T24 shared files/folders. Windows Server
2008 R2 Distributed File System Replication (DFS-R) is implemented with an Active Directory
Domain Services (AD DS)–published namespace to make the file share failover to the
disaster recovery site transparent and to replicate T24 shared files/folders.
Making the T24 shared files available in the disaster recovery site is not mandatory because
T24 can recover without them. However, having the T24 shared files available has a positive
impact. Therefore, DFS-R is scheduled to occur several times per day to reduce the
overhead of the replication.
T24 typically accesses shared files and folders via a mapped drive letter in each T24 server.
Since accidentally removing or changing the mapped drive letter can cause failures, file and
folder symbolic links were created by using the “mklink” utility of Windows and used
instead of the mapped drive letters to avoid unintended mistakes. Symbolic links make the
shared files and folders imitate local entities, and therefore T24 can access them directly.
A JBoss session persistence database was created in the same SQL Server 2012 HADRON
configuration as the T24 database, therefore having the same high availability and disaster
recovery capabilities. This makes management easier and reduces the steps in disaster-recovery
procedures. You can, however, implement the JBoss session persistence database
in a different instance, if required.
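The symbolic-link approach described in the list above can be sketched as follows. This Python fragment only builds the mklink invocation; the link path D:\T24\shared and the share path \\T24FS\T24Share are hypothetical placeholders, not the names used in the test environment. Because mklink is a built-in command of cmd.exe rather than a separate executable, the sketch calls it through cmd /c.

```python
import subprocess


def mklink_command(link_path, target_path, directory=True):
    """Build the cmd.exe argument list that creates a symbolic link."""
    command = ["cmd", "/c", "mklink"]
    if directory:
        command.append("/D")  # /D creates a directory symbolic link
    command += [link_path, target_path]
    return command


def create_link(link_path, target_path, directory=True):
    """Run mklink; only meaningful on Windows with sufficient privileges."""
    subprocess.run(mklink_command(link_path, target_path, directory), check=True)


# Hypothetical paths: a local folder name on the T24 server that points
# at the clustered file share instead of a mapped drive letter.
LINK = r"D:\T24\shared"
TARGET = r"\\T24FS\T24Share"

print(" ".join(mklink_command(LINK, TARGET)))
```

On a T24 server, create_link would typically have to be run from an elevated prompt; the sketch itself only prints the command so that it can be reviewed first.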
Figure 3. HADR solution
Setup and Configuration
This section describes how to configure the HADR solution.
SQL Server 2012 HADR Configuration
SQL Server 2012 HADR is configured with a clustered instance for the primary site and a standalone
instance in the disaster recovery site. The configuration uses the AlwaysOn Availability Group to
replicate database content and to provide transparent failover. The disaster recovery instance is
configured as a synchronous replica for zero data loss. Figure 4 shows a schematic of the solution.
Figure 4. SQL Server 2012 HADR solution
The Windows Server Failover Cluster consists of three nodes: two nodes in the
primary site and one node in the disaster recovery site with a SAN shared only between the two
nodes in the primary site. The disaster recovery instance has only local storage where the database
content is replicated by using the availability group.
The cost of the solution is reduced because there is no shared storage between nodes in the
primary site and the node in the disaster recovery site, because there is no SAN in the secondary
site, and because you do not need an expensive storage-level synchronization mechanism to
replicate disk data content.
A clustered SQL Server 2012 instance is primarily used to reduce the number of SQL Server 2012
licenses that are required. The primary site could have two standalone instances of SQL Server
2012 instead of the failover cluster instance if this is required to remove the shared storage;
however, this option requires licenses for each SQL Server 2012 instance, while the failover
cluster instance requires only one license regardless of the number of nodes in the cluster.
If the disaster recovery site also requires high availability, the same configuration used in the
primary site needs to be available in the disaster recovery site.
When the recommended solution was tested, all of the SQL Server instances were created as
named instances to make them easy to identify during maintenance and monitoring. Table 3 lists
the names that were used in the test environment during setup; these names can be used as a
reference guideline.
Table 3. Names of SQL Server instances
Name | Description
SQL11HA | SQL Server 2012 instance name of the primary site. Because a named instance uses a dynamic TCP port by default, static TCP port 1533 was configured via SQL Server Configuration Manager.
SQL11DR | SQL Server 2012 instance name of the disaster recovery site. Because a named instance uses a dynamic TCP port by default, static TCP port 1533 was configured via SQL Server Configuration Manager.
T24AG | SQL Server 2012 AlwaysOn Availability Group name. This name is not used by T24; it is used in SQL Server Management Studio when a failover to the disaster recovery instance is required. The JBoss session persistence database was added to the same availability group in the test environment. This makes management easier, and disaster recovery failover becomes a single process for both databases.
T24AgListener | SQL Server 2012 AlwaysOn Availability Group listener name. This is the name T24 uses to connect to the SQL Server 2012 HADR instance. When the listener was created, 1433 (the SQL Server default port) was used as the TCP port number to avoid having to change the T24 connection parameters to use a different port number.
Windows Server Firewall Configurations
The Windows Server Firewall is on by default; therefore, you need to create relevant inbound
firewall exceptions in the servers for the configuration to work as expected.
Table 4 shows the inbound firewall rules that need to be created in all the database servers.
Table 4. Firewall rules
Name | Description
SQL11 (1533) | Inbound firewall exception rule for TCP port 1533, which is the static port configured for the SQL Server instance.
SQL11 Browser (1434) | Inbound firewall exception rule for UDP port 1434, which is required for the SQL Server Browser when named instances exist.
SQL11 AG (5022) | Inbound firewall exception rule for TCP port 5022, which is required for the SQL Server 2012 HADR Availability Group.
SQL11 AG Listener (1433) | Inbound firewall exception rule for TCP port 1433, which is configured for the SQL Server 2012 HADR Availability Group Listener.
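The rules in Table 4 can also be created from the command line. The following is a minimal sketch, not part of the tested procedure, that generates one "netsh advfirewall firewall add rule" command per rule. The rule names and ports come from Table 4; the helper names are hypothetical, and the generated commands would need to be run from an elevated prompt on each database server.

```python
# Firewall rules from Table 4: (rule name, protocol, local port).
RULES = [
    ("SQL11 (1533)", "TCP", 1533),
    ("SQL11 Browser (1434)", "UDP", 1434),
    ("SQL11 AG (5022)", "TCP", 5022),
    ("SQL11 AG Listener (1433)", "TCP", 1433),
]


def netsh_command(name, protocol, port):
    """Return a netsh command that adds one inbound allow rule."""
    return (
        "netsh advfirewall firewall add rule "
        f'name="{name}" dir=in action=allow '
        f"protocol={protocol} localport={port}"
    )


def all_commands(rules=RULES):
    """Return one netsh command per rule, in table order."""
    return [netsh_command(*rule) for rule in rules]


if __name__ == "__main__":
    for command in all_commands():
        print(command)
```

Generating the commands from a single list keeps the rule set identical on every database server.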
T24 File Share Configuration
In the multi-server configuration, T24 is required to have a shared location for its working files and
folders. Any single file is created or written by only one T24 instance and is read by all instances.
There is no concern about file write locks; however, the share needs to be resilient for the multi-
server configuration to function properly.
If T24 fails over to the disaster recovery site, making the T24 shared files available in the
disaster recovery site is not mandatory because T24 can recover without them. However,
having the shared files available does have a positive impact.
A resilient file share solution with less frequent (once or twice a day) file replication to the disaster
recovery site is therefore a good solution. Windows Server Clustered File Server, in conjunction with
DFS-R, provides an optimal solution and does not require any additional licenses.
For simplicity, a DFS Namespace published in Active Directory Domain Services (AD DS) is used to
refer to the shared file folder. Therefore, T24 can refer to the same path (namespace) for shared
files, whether it is in the primary site or in the disaster recovery site.
Figure 5 shows the T24 file share and file replication configuration.
Figure 5. File share and file replication
Windows Server Clustered File Share Configuration
The recommended SQL Server 2012 HADR configuration uses a Windows Server Cluster. Using the
same cluster to host the file server reduces the complexity of the solution and simplifies
management and monitoring.
Since only the primary site servers in the cluster have access to the shared storage, the only
possible owners of the file server are the servers in the primary site. The file server, therefore,
does not fail over to the disaster recovery site, and the disaster recovery instance of T24 will
only have access to its local folders.
A shared folder called “T24FileShare” was created in the file server and used as the resilient file
share location of the primary site.
If the disaster recovery site also uses a T24 multi-server configuration, the same type of file
share needs to be created in the disaster recovery site. However, because the test environment
had only a single T24 instance, a local folder was created with the same shared folder name.
Distributed File System Replication Configuration
DFS-R was used to periodically replicate T24 shared files between the primary site and the disaster
recovery site. The replication frequency was set to the lowest possible (once or twice per day) to
avoid any performance implications, and because having the shared files available in the disaster
recovery site is not mandatory to T24.
The disaster recovery site of the test environment had a single instance of T24; therefore, the folder
for the shared files was created locally in the same server. The DFS replication was set up to
replicate the files between the clustered file share in the primary site and the local folder in the T24
disaster recovery instance.
Active Directory Domain Services DNS Configuration
To make the web-tier failover transparent to the users, you must have a DNS host record that users
can refer to in order to reach T24Browser, instead of the load balancer IP address. Failover to
disaster recovery then only requires changing the IP address of the DNS host record, and users do
not need to use a different URL. In the test environment, the DNS host record
“T24Browser.CoE.Temenos.com” was created for the web-tier Network Load Balancing IP address.
You can also create a DNS host record for the application-tier servers if you need to be able
to transparently fail over the application tier independently of the web tier.
Note that this is an optional configuration that is helpful if you need to ease server maintenance
and keep the T24Browser configurations identical in both sites. However, this configuration
does add a step to the disaster recovery procedures.
The DNS host record “T24Server.CoE.Temenos.com” was created for the application-tier NLB IP in
the test environment.
One drawback of using DNS host records is that the client application using the name caches the IP
address. Therefore, even if the IP address of the DNS host record is changed at the server-side in a
disaster recovery failover, the client application might still use the old IP address, and this old IP
address might no longer be available.
To minimize the chance of this happening, the “time to live” (TTL) value of the DNS host record
needs to be reduced. In the test environment, the TTL value was set to one minute, which means
that the client application rechecked the IP address of the DNS host record with the server about
once a minute.
While shorter TTL values can increase the load on the DNS server, they can be useful with
critical services like web servers, application servers, and load balancers. TTL values are often
reduced by the DNS administrator before service is moved to minimize disruptions.
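The effect of the TTL on a disaster recovery failover can be illustrated with a toy model. The sketch below is a simplified simulation, not a real resolver: the class name is hypothetical, the addresses are placeholders from the 192.0.2.0/24 documentation range, and time is passed in explicitly so the behavior is easy to follow.

```python
class CachingResolver:
    """Toy client-side DNS cache that honors a record's TTL."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable: name -> (ip, ttl_seconds)
        self._cache = {}       # name -> (ip, expiry_time)

    def resolve(self, name, now):
        cached = self._cache.get(name)
        if cached and now < cached[1]:
            return cached[0]   # still within the TTL: reuse cached address
        ip, ttl = self._lookup(name)
        self._cache[name] = (ip, now + ttl)
        return ip


# Hypothetical addresses; the TTL of 60 seconds matches the test environment.
records = {"T24Browser.CoE.Temenos.com": ("192.0.2.10", 60)}
resolver = CachingResolver(lambda name: records[name])

first = resolver.resolve("T24Browser.CoE.Temenos.com", now=0)
# Disaster recovery failover: the administrator repoints the host record.
records["T24Browser.CoE.Temenos.com"] = ("192.0.2.20", 60)
stale = resolver.resolve("T24Browser.CoE.Temenos.com", now=30)
fresh = resolver.resolve("T24Browser.CoE.Temenos.com", now=61)
```

In this model the client keeps using the old address until its cached entry expires, so a one-minute TTL bounds the stale window at roughly one minute, provided the client honors the TTL.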
Table 5 shows the DNS host records that were created in the test environment.
Table 5. DNS host records
DNS host record | Description
T24Browser.CoE.Temenos.com | The Domain Name System (DNS) host record of the T24 web-tier load balancer that was used in the web browser URL to connect to T24Browser. The TTL value was set to one minute for testing.
T24Server.CoE.Temenos.com | An optional DNS host record created for the T24 application-tier load balancer to test transparent failover of the application tier independently of the web tier. This was used by the T24Browser (configured in t24-ds.xml) to connect to the load balancer in the test environment. The TTL value was set to one minute for testing.
Application-Tier NLB Configuration
T24Browser is capable of performing simple load balancing among the available T24 application
servers when a load balancing solution is not available in the application tier. However, specialized
load balancing solutions can provide better load balancing capabilities.
The NLB feature in Windows Server is a software load balancing solution that does not require
additional licenses and complements T24 by providing a specialized load balancing solution.
In the recommended solution, the NLB feature in Windows Server is enabled and configured on the
T24 application servers in the primary site, and an NLB cluster consisting of the two servers was
created.
Figure 6 shows the application-tier NLB cluster.
Figure 6. Application-tier NLB cluster
If the disaster recovery site has multiple T24 application servers, an NLB cluster needs to be
configured in those servers as well.
Table 6 shows the NLB configurations used.
Table 6. NLB configurations
Configuration | Description
Cluster operation mode | “Multicast” operation mode was used to keep the network adapter’s built-in media access control (MAC) address. This was because the test servers had only one network adapter, and this network adapter had to be used for server management as well. If the server has multiple network adapters, the cluster operation mode can be set to “Unicast.”
Protocol | The protocol used for communication with T24 was TCP/IP.
Port range | The port range was limited to 20002, which is the T24 agent port configuration.
Filtering mode | “Affinity: None” was selected to achieve the best possible load balancing.
The simple load balancing feature of T24Browser was disabled, and the NLB cluster name
(T24Server.CoE.Temenos.com) was used as the T24 instance. This lets Network Load Balancing route
the connections to the T24 instances in the cluster.
Using NLB with the “Affinity: None” configuration lets you add or remove application-tier
servers transparently, without affecting online transactions.
T24 Application Server Configuration
The T24 application tier is configured with two T24 instances (nodes: App Node 1 and App Node 2) in
the primary site and a single instance (node: App Node 3) in the disaster recovery site.
Note that it is possible to have multiple T24 instances (application server nodes) in the disaster
recovery site if high availability is a requirement for the disaster recovery site.
The Windows Server 2008 R2 NLB feature was used to balance the T24 application servers.
The HADR solution for the T24 file share is implemented by using a Windows Server 2008 R2
clustered file share and DFS-R.
Figure 7 shows the T24 application tier configuration.
Figure 7. T24 application tier
The Temenos Application Framework ‘C’ (TAFC) is the execution environment for the T24
application. Install TAFC and the T24 application on all application servers (for installation
guidance, contact Temenos).
Following is a description of how the T24 application servers were configured:
All the T24 instances in the test environment used multiple server configurations with the
required licenses. To use one instance of T24 on multiple servers, install the multiple
application server module.
When using multiple application servers, define port ranges for each T24 application server
to avoid conflicts or deadlock situations during close of business.
Ports can be assigned by using the following variable in each application server:
JBCPORTNO= port range
The same jbase_agent port must be used on all T24 application servers. The default
jbase_agent port 20002 was used in the test environment.
The same port must be used because requests to the T24 servers are controlled by the
load balancer, and therefore T24Browser sees only a single instance of T24 (load
balancing cluster name), regardless of the number of T24 application servers available.
An inbound Windows firewall exception rule for TCP port 20002 was created to make the
jbase_agent port accessible from T24Browser.
The T24 database driver (Direct Connect Driver [DCD]) requires the SQL Server client to be
installed on the server. At the time of testing, the DCD for the SQL Server 2012 Native Client
was still in development. For this reason, the SQL Server 2008 R2 Native Client was used.
Because the SQL Server 2012 HADR configuration is used for the database tier, the T24
database must be accessed via the SQL Server 2012 AlwaysOn Availability Group. Therefore,
the availability group listener name was used in the T24 configuration instead of the
database server IP address.
File jedi_config , Record 'XMLMSSQL_FRMWRK'
Command->
0001 R12.100203 Direct connect driver version.
0002 T24AgListener]T24R12 DB Server name] DB name
0003 T24User]uHdE9oJj8B5Y0cUF0hGh0A==] DB User/Password encrypted
Default database locking (SQL Server application lock) was used for the testing.
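The port range guidance in the list above can be checked with a small helper. The following sketch assigns each application server its own non-overlapping range. The starting value, the range size, and the "start-end" value format printed for JBCPORTNO are assumptions made for illustration only; confirm the actual syntax and suitable values with Temenos.

```python
def allocate_port_ranges(servers, first_port, ports_per_server):
    """Give each server its own contiguous, non-overlapping port range."""
    ranges = {}
    start = first_port
    for server in servers:
        end = start + ports_per_server - 1
        ranges[server] = (start, end)
        start = end + 1
    return ranges


def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one port."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical starting value and range size.
ranges = allocate_port_ranges(["App Node 1", "App Node 2"], 21000, 500)
for server, (start, end) in ranges.items():
    # The "start-end" value format is an assumption, not confirmed syntax.
    print(f"{server}: JBCPORTNO={start}-{end}")
```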
Limitations of Using the SQL 2008 R2 Native Client
During the testing, the SQL Server 2008 R2 Native Client was used with the T24 database driver
because the DCD did not support SQL Server 2012 client libraries.
The following limitations therefore apply to the SQL Server 2012 HADR AlwaysOn functionalities:
Read-only routing for the availability group is not available.
Application intent is not available.
Optimizations for fast multi-subnet failover clustering are not available.
When the SQL Server 2012 Native Client is certified for use with T24, the considerations for client
availability features shown in Table 7 will apply.
Table 7. Client type considerations
Driver | Multi-subnet failover | Application intent | Read-only routing | Multi-subnet failover: faster single subnet endpoint failover | Multi-subnet failover: named instance resolution for SQL Server clustered instances
SQL Server Native Client 11.0 ODBC | Yes | Yes | Yes | Yes | Yes
SQL Server Native Client 11.0 OLE DB | No | Yes | Yes | No | No
ADO.NET with Microsoft .NET Framework 4.0 update 4.0.2* | Yes | Yes | Yes | Future date | Future date
ADO.NET with .NET Framework 3.5 | Future date | Future date | Future date | Future date | Future date
Microsoft Java Database Connectivity (JDBC) driver 4.0 for SQL Server | Yes | Yes | Yes | Yes | Future date
*ADO.NET with .NET Framework 4.0.2 patch download for connectivity improvement
(http://support.microsoft.com/kb/2544514).
For more information about connection string keywords, see:
Using Connection String Keywords with SQL Server Native Client (http://msdn.microsoft.com/en-us/library/ms130822(v=sql.110).aspx).
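As an illustration of the connection string keywords referenced above, the following sketch builds an ODBC connection string for SQL Server Native Client 11.0 that targets the availability group listener. It is intended for a generic ODBC client such as a monitoring script; T24 itself connects through the DCD and the jedi_config record shown earlier. The function name and the use of Windows authentication (Trusted_Connection) are assumptions.

```python
def build_odbc_connection_string(server, port, database,
                                 multi_subnet_failover=True,
                                 read_only_intent=False):
    """Build an ODBC connection string for SQL Server Native Client 11.0."""
    parts = [
        "Driver={SQL Server Native Client 11.0}",
        f"Server=tcp:{server},{port}",
        f"Database={database}",
        "Trusted_Connection=yes",  # assumption: Windows authentication
    ]
    if multi_subnet_failover:
        # Enables the faster failover behavior listed in Table 7.
        parts.append("MultiSubnetFailover=Yes")
    if read_only_intent:
        # Declares read-only application intent for read-only routing.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


if __name__ == "__main__":
    print(build_odbc_connection_string("T24AgListener", 1433, "T24R12"))
```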
T24 Shared Files
For T24 multiple server installation, it is necessary to share certain files and folders among T24
application servers.
T24 typically accesses shared files and folders via a mapped drive letter in each T24 server.
However, accidentally removing or changing the mapped drive letter can cause failures. Therefore,
file and folder symbolic links were created by using the Windows “mklink” utility instead of mapped
drive letters to avoid unintended mistakes. Symbolic links make the shared files and folders act as
local entities, so T24 can directly access them. If there are additional folders/files that need to be
shared, appropriate symbolic links should be created.
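The symbolic link approach described above can be sketched as follows. This example uses Python's os.symlink, which creates the same kind of directory link as "mklink /D" on Windows; it was not part of the tested procedure. The function name and folder names are hypothetical, and on Windows the call typically requires an elevated process.

```python
import os
import tempfile


def link_shared_folder(share_path, link_path):
    """Create a directory symbolic link at link_path that points to share_path."""
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists")
    os.symlink(share_path, link_path, target_is_directory=True)
    return link_path


if __name__ == "__main__":
    # Self-contained demo that uses temporary folders in place of the real share.
    with tempfile.TemporaryDirectory() as root:
        share = os.path.join(root, "T24FileShare")
        os.mkdir(share)
        link = link_shared_folder(share, os.path.join(root, "shared"))
        print(os.path.islink(link))
```

Refusing to overwrite an existing path avoids silently replacing a local folder that T24 already uses.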
Web-Tier NLB Configuration
When the web tier has multiple servers (nodes), there needs to be a mechanism to route the
requests to the servers and to provide a single address to the requester (web browser), regardless
of the number of servers in the tier. This functionality is typically provided by a proxy server
and/or a load balancer with redundancy to increase the availability of the service.
The Network Load Balancing (NLB) feature of Windows Server does not have a single point of
failure because the service works on the network layer of all the servers. Because it is a readily
available feature in Windows Server, the NLB feature does not require additional licenses.
The NLB feature is enabled and configured on the web servers in the primary site, and an
NLB cluster consisting of the two servers was created.
Figure 8 shows the web-tier NLB cluster.
Figure 8. Web-tier NLB cluster
If the disaster recovery site has multiple web-tier servers, an NLB cluster needs to be configured
in those servers as well.
Table 8 shows the NLB configurations used.
Table 8. NLB configurations
Configuration | Description
Cluster operation mode | “Multicast” operation mode was used to keep the network adapter’s built-in media access control (MAC) address. This was because the test servers had only one network adapter, and this network adapter had to be used for server management as well. If the server has multiple network adapters, the cluster operation mode can be set to “Unicast.”
Protocol | TCP was used because HTTP traffic is transported over TCP/IP.
Port range | The port range was limited to 8080, which was the JBoss web site port configured in the test environment.
Filtering mode | “Affinity: None” was selected to achieve the best possible load balancing. Typically, the T24Browser requires the “Affinity: Single” (sticky-session) configuration because it is a stateful application. However, in the recommended solution, JBoss is configured to persist session states in the SQL Server database; therefore, it is possible to use the “Affinity: None” configuration in the load balancer.
To make it possible to fail over the web tier to the disaster recovery site transparently, the DNS
host record (T24Browser.CoE.Temenos.com) is used for the NLB cluster IP address. Therefore, the
web browser URL remains unchanged, even if there is a failover to the disaster recovery site.
Not using sticky-sessions increases the availability of the site; in addition, using NLB with the
DNS host record allows for adding or removing web-tier servers transparently and without
affecting the users.
T24Browser Configuration
The T24 web tier is configured with two JBoss instances with T24Browser (nodes: Web Node 1 and
Web Node 2) in the primary site and a single instance (node: Web Node 3) in the disaster recovery
site.
It is possible to have multiple JBoss/T24Browser instances (web server nodes) in the disaster
recovery site if high availability is a requirement for the disaster recovery site.
The Windows Server 2008 R2 NLB feature was used to balance the loads on the JBoss server nodes.
Figure 9 shows the T24 web tier.
Figure 9. T24 web tier
JBoss Configuration
JBoss Application Server 5.1.0 GA was used in the test environment to host the T24Browser
Java Servlet application. No clustered instance of JBoss was installed on the web-tier servers.
Following is the list of configurations that were made after successfully installing JBoss:
Because of the limitations of JBoss cluster session replication and to avoid using sticky
sessions, JBoss session persistence functionality was implemented using a SQL Server
database. A JBoss session persistence database was created in the same SQL Server 2012
HADR configuration as the T24 database. Therefore, the JBoss session persistence database
has the same high availability and disaster recovery capabilities as the T24 database. This
makes management easier and reduces the number of steps in the disaster recovery
procedures. (Note that the JBoss session persistence database can be implemented as a
different instance if required.)
An inbound Windows firewall exception rule for TCP port 8080 was created to make JBoss
accessible to users.
T24Browser with AGENT Connection Method
After successful installation of the JBoss application server, T24Browser can be deployed and
configured to use one of the two types of supported configurations, AGENT or JMS. Detailed step-
by-step setup and configuration can be requested from Temenos.
For the online transactions used in this testing, the AGENT configuration is recommended. Tables 9
and 10 show the settings that were configured in the T24Browser.
Table 9. Settings in browserParameters.xml
Parameter name | Description
Server Connection Method | Configuration of the connection to the T24 server. The AGENT connection method was used for the testing.
ConnectionTimeout | The connection expiration time if T24Browser does not get a response from the T24 application server. This was set to 20 seconds.
RetryCount | The number of retry attempts the T24Browser should make if it can’t reach T24 to successfully execute a transaction. This was set to 20 times.
RetryWait | When retrying, the number of seconds to wait before attempting to retry the transaction. This was set to 5 seconds.
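The retry settings in Table 9 together determine roughly how long T24Browser keeps trying before it gives up, which is useful to compare against the expected failover time. The following back-of-the-envelope sketch assumes that every retry first waits RetryWait seconds and then blocks for the full ConnectionTimeout; how T24Browser actually combines these values is not documented here, so treat the result as an estimate only.

```python
def worst_case_retry_seconds(connection_timeout, retry_count, retry_wait):
    """Upper bound on time spent retrying, if every retry times out.

    Assumes each retry waits retry_wait seconds and then blocks for the
    full connection_timeout; the initial attempt is not counted.
    """
    return retry_count * (retry_wait + connection_timeout)


# Values from Table 9: 20 s timeout, 20 retries, 5 s wait between retries.
budget = worst_case_retry_seconds(connection_timeout=20, retry_count=20, retry_wait=5)
print(budget)  # 500
```

Under these assumptions the retry window is about 500 seconds, or a little over eight minutes.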
Table 10. Settings in t24-ds.xml
Property name | Description
Host | A comma-separated list of available T24 servers. Because the NLB feature in Windows Server 2008 R2 is configured at the application tier, the name of the load balancing cluster needs to be used instead of the names of the T24 servers. The load balancing cluster “T24Server.CoE.Temenos.com” was used in the test environment.
Ports | The jbase_agent TCP port number. All T24 instances in the test environment are configured to use TCP port 20002; therefore, 20002 is used as the jbase_agent port number.
loadBalancing | Enables or disables the simple load balancing feature in T24Browser. This is set to “false” because the NLB feature in Windows Server 2008 R2 performs the load balancing in the recommended solution.
actionTimeout | The number of seconds that the jbase_agent waits for a response from the T24 application server. This was set to 60 seconds in the test environment.
Disaster Recovery Procedures
The high availability solution described in this document implements “automatic failover” between
the primary site servers (nodes). Human intervention is therefore not required. However, the
disaster recovery failover is intentionally designed to be manual, because this is typically part of the
business continuity plan. Therefore, the disaster recovery failover might require additional
procedures to be followed.
This section describes the disaster recovery procedures that were successfully tested for the
recommended solution.
Figure 10 shows the three failover activities that are required. Note that the second failover activity
is optional, and can be used if application-tier failover is implemented to ease maintenance
activities. In addition, if the optional DFS-R is implemented, the “DFS Namespace” fails over
automatically and manual failover is not required.
Figure 10. Failover to disaster recovery site
The steps required for the failover activities are described in detail in the sections that follow. Note
that the steps in all sections need to be completed to successfully fail over to the disaster recovery
site.
DNS Switching
Web-tier and application-tier DNS switching require changing the IP address of the DNS host
records to the IP address of the relevant server (node) in the disaster recovery site.
Following are the steps that need to be followed to change the IP addresses of the DNS host
records:
1. Log on to the domain controller as the administrator.
2. Navigate to Server Manager.
3. Expand Roles, expand DNS Server, expand DNS, expand Server Name, and then expand
Forward Lookup Zones.
4. Select the domain name (for example, CoE.Temenos.com). Note that T24Browser and
optional T24Server are the DNS host records that require the IP changes (Figure 11).
Figure 11. Select the DNS host record
5. Right-click the DNS host record T24Browser, and then select Properties (Figure 12).
Figure 12. T24Browser DNS host record properties
6. Change the address in the IP address field to the IP address of the web-tier server in the
disaster recovery site, and then click OK.
If the disaster recovery site has more than one web-tier server, the IP address
should be the IP address of the web-tier load balancer (NLB cluster).
7. If the “T24Server” DNS host record is also available, right-click the DNS host record, and
then select Properties. Change the address in the IP address field to the IP address of the
application-tier server in the disaster recovery site, and then click OK (Figure 13).
Figure 13. T24Server DNS host record properties
If the disaster recovery site has more than one application-tier server, the IP address
should be the IP address of the application-tier load balancer (NLB cluster).
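The DNS switching steps above can also be scripted with the dnscmd command-line tool instead of the Server Manager console. The following sketch only generates the commands and was not part of the tested procedure. The DNS server name (DC01) and all IP addresses are hypothetical placeholders, and the commands should be validated in a test environment before use.

```python
def dns_switch_commands(dns_server, zone, host, old_ip, new_ip, ttl_seconds=60):
    """Return dnscmd commands that repoint one A record to a new address."""
    return [
        # Remove the record that points to the primary site.
        f"dnscmd {dns_server} /RecordDelete {zone} {host} A {old_ip} /f",
        # Add the record that points to the disaster recovery site.
        f"dnscmd {dns_server} /RecordAdd {zone} {host} {ttl_seconds} A {new_ip}",
    ]


if __name__ == "__main__":
    # Hypothetical server name and addresses (192.0.2.0/24 is a documentation range).
    for command in dns_switch_commands(
        "DC01", "CoE.Temenos.com", "T24Browser", "192.0.2.10", "192.0.2.20"
    ):
        print(command)
```

The same function can be called a second time for the optional T24Server host record.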
SQL Server 2012 HADR Failover
The SQL Server 2012 HADR failover to the disaster recovery site might be required in the following
two scenarios:
Planned manual failover
The primary site database servers are available, but a failover to the disaster
recovery site is required.
Unplanned forced failover
The complete primary site or the primary site database servers have failed, and the database
servers in the primary site are not accessible.
Planned Manual Failover
When the failover is planned, there is no server downtime in the primary site, the Windows Server
Failover Cluster (WSFC) is active, and databases are in “Synchronized” state in both primary and
disaster recovery instances of SQL Server.
Therefore, before starting the failover procedure, make sure that the databases are in
“Synchronized” state in both primary and disaster recovery instances of SQL Server (Figure 14 and
Figure 15).
Figure 14. SQL Server primary instance database status
Figure 15. SQL Server disaster recovery instance database status
For more information about planned manual failover, see:
Perform a Planned Manual Failover of an Availability Group (SQL Server)
(http://msdn.microsoft.com/en-us/library/hh231018.aspx).
Limitations and Restrictions
A failover command returns as soon as the target secondary replica has accepted the
command. However, database recovery occurs asynchronously after the availability group
has finished failing over.
Cross-database consistency across databases within the availability group is not maintained
during failover.
Cross-database transactions and distributed transactions are not supported by
AlwaysOn Availability Groups.
For more information, see:
Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn
Availability Groups (SQL Server)
(http://msdn.microsoft.com/en-us/library/ms366279.aspx).
Prerequisites and Restrictions
The target secondary replica and the primary replica must both be running in synchronous-
commit availability mode.
The target secondary replica must currently be synchronized with the primary replica. This
requires that all the secondary databases on this secondary replica must have been joined
to the availability group and must be synchronized with their corresponding primary
databases (that is, the local secondary databases must be synchronized).
To determine the failover readiness of a secondary replica, query the is_failover_ready
column in the sys.dm_hadr_database_replica_cluster_states dynamic management view (see:
http://msdn.microsoft.com/en-us/library/hh213319.aspx) or look at the Failover
Readiness column of the AlwaysOn Group Dashboard (see:
http://msdn.microsoft.com/en-us/library/hh213474.aspx).
This task is supported only on the target secondary replica. You must be connected to the
server instance that hosts the target secondary replica.
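The failover readiness check in the prerequisites above can be sketched as a query plus a small evaluation helper. The query joins the database replica cluster state view to sys.availability_replicas so that each row carries the replica name. The replica server names and the JBoss database name in the sample rows are hypothetical, and executing the query against SQL Server is left to whichever client library is in use.

```python
FAILOVER_READY_QUERY = """
SELECT ar.replica_server_name, drcs.database_name, drcs.is_failover_ready
FROM sys.dm_hadr_database_replica_cluster_states AS drcs
JOIN sys.availability_replicas AS ar
    ON ar.replica_id = drcs.replica_id;
"""


def databases_not_ready(rows, replica_server_name):
    """Return databases on the given replica whose is_failover_ready flag is not set."""
    return [
        database
        for replica, database, ready in rows
        if replica == replica_server_name and not ready
    ]


# Hypothetical result rows, shaped like the output of the query above.
sample_rows = [
    ("PRNODE\\SQL11HA", "T24R12", 1),
    ("DRNODE\\SQL11DR", "T24R12", 1),
    ("DRNODE\\SQL11DR", "JBossSession", 0),
]
blocked = databases_not_ready(sample_rows, "DRNODE\\SQL11DR")
```

An empty result for the target secondary replica indicates that all of its databases report as ready for a planned failover.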
Failover Procedure
Following are the steps that need to be followed to fail over the SQL Server 2012 HADR to the
disaster recovery site.
1. Connect to the primary or secondary (disaster recovery) instance of SQL Server by using
SQL Server 2012 Management Studio (Figure 16).
Figure 16. SQL Server 2012 primary instance
2. Right-click on the Availability Group (for example, T24AG), and then select Failover
(Figure 17).
Figure 17. Select "Failover"
3. In the Fail Over Availability Group Wizard, click Next (Figure 18).
Figure 18. Fail Over Availability Group wizard – Introduction page
4. In the Select New Primary Replica page, select the secondary SQL Server instance if it is not
already selected, and then click Next (Figure 19).
Figure 19. Fail Over Availability Group wizard – Select New Primary Replica page
5. In the Connect to Replica page, connect to the secondary instance by providing the
credentials, and then click Next (Figure 20).
Figure 20. Fail Over Availability Group wizard – Connect to Replica page
6. Click Finish at the Summary page to start the failover (Figure 21).
Figure 21. Fail Over Availability Group wizard – Summary page
7. After the successful failover, the wizard will show a Results page similar to the following
(Figure 22).
Figure 22. Fail Over Availability Group wizard – Results page
The “Validating WSFC quorum vote configuration” warning appears because of the
special quorum configuration used in this solution and is safe to ignore (Figure 23).
Figure 23. Fail Over Availability Group wizard – WSFC quorum configuration warning
8. Check the database status and Availability Group status in SQL Server 2012 Management
Studio to verify the failover (Figure 24).
Figure 24. Management Studio after Fail Over Availability Group wizard
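The wizard steps above have a Transact-SQL equivalent, the ALTER AVAILABILITY GROUP ... FAILOVER statement, which must be run while connected to the target secondary (disaster recovery) instance. The following sketch only builds the statement; the helper names are hypothetical, and the statement was not part of the tested wizard-based procedure.

```python
def quote_identifier(name):
    """Bracket-quote a SQL Server identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def planned_failover_statement(availability_group):
    """T-SQL for a planned manual failover; run it on the target secondary."""
    return f"ALTER AVAILABILITY GROUP {quote_identifier(availability_group)} FAILOVER;"


if __name__ == "__main__":
    print(planned_failover_statement("T24AG"))
```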
Unplanned Forced Failover
When the primary site or the database servers (nodes) in the primary site are not available, the
Windows Server Failover Cluster (WSFC) will not have quorum to bring the cluster online. Therefore,
the WSFC needs to be deliberately started (forced) before the database failover.
After the WSFC is brought online with a forced quorum, the SQL Server 2012 AlwaysOn Availability
Group must be forced to fail over to the disaster recovery instance.
For more information about unplanned forced failover, see:
Perform a Forced Manual Failover of an Availability Group (SQL Server)
(http://msdn.microsoft.com/en-us/library/ff877957(SQL.110).aspx).
Limitations and Restrictions
Data loss is possible during the forced failover of an availability group. In addition, if the
primary replica is running when you initiate a forced failover, client computers might still
be connected to former primary databases. Therefore, it is strongly recommended that you
force failover only if the primary replica is no longer running and if you are willing to risk
losing data to restore access to databases in the availability group.
When a database on a secondary replica is in the REVERTING or INITIALIZING state, forcing
failover causes the database to fail to start as a primary database. If the database was in
the INITIALIZING state, you will need to apply the missing log records from a database
backup or fully restore the database from scratch. If the database was in the REVERTING
state, you will need to fully restore the database from backups.
A failover command returns as soon as the target secondary replica has accepted the
command. However, database recovery occurs asynchronously after the availability group
has finished failing over.
Cross-database consistency across databases within the availability group is not maintained
upon failover.
Cross-database transactions and distributed transactions are not supported by
AlwaysOn Availability Groups.
For more information, see:
Cross-Database Transactions Not Supported for Database Mirroring or AlwaysOn
Availability Groups (SQL Server)
(http://msdn.microsoft.com/en-us/library/ms366279.aspx).
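The restrictions above amount to a short pre-failover checklist. The sketch below encodes them as a small decision function so they can be reviewed in one place. It is illustrative only: the function name and the returned wording are hypothetical, and the state names are assumed to be the values reported for a secondary database (for example in the synchronization_state_desc column of sys.dm_hadr_database_replica_states).

```python
def forced_failover_outlook(secondary_db_state, primary_running):
    """Summarize the documented restrictions for one secondary database.

    secondary_db_state: synchronization state of the database on the target
    secondary replica (assumed input, e.g. "SYNCHRONIZING").
    primary_running: True if the primary replica is still running.
    """
    state = secondary_db_state.strip().upper()
    if state == "INITIALIZING":
        # The database will not start as primary after a forced failover.
        return "blocked: apply missing log records from a backup or fully restore"
    if state == "REVERTING":
        # The database will not start as primary after a forced failover.
        return "blocked: fully restore the database from backups"
    if primary_running:
        # Clients might still be connected to the former primary databases.
        return "not recommended: primary replica is still running"
    return "allowed: data loss is possible"


print(forced_failover_outlook("REVERTING", primary_running=False))
print(forced_failover_outlook("SYNCHRONIZING", primary_running=False))
```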
Prerequisites and Restrictions
Windows Server Failover Cluster (WSFC) needs to be brought online with a forced quorum.
For more information about the forced quorum procedure, see:
WSFC Disaster Recovery through Forced Quorum (SQL Server)
(http://msdn.microsoft.com/en-us/library/hh270277.aspx).
You must be able to connect to the server instance that hosts the target secondary replica.
Failover Procedure
When the primary site or the primary site database servers are not available, the only accessible
database server will be the disaster recovery instance.
Figure 25 and Figure 26 show how the Windows Server Failover Cluster and the SQL Server instance
appear on the disaster recovery database server.
Figure 25. Windows Server Failover Cluster without quorum
Figure 26. SQL Server 2012 – primary site failure
To bring the database online in the disaster recovery site, you first need to start Windows Server
Failover Cluster with forced quorum, and then force failover of the SQL Server 2012 availability group.
The following sub-sections provide the steps required to bring the database online. The steps in
all sections need to be completed to successfully fail over to the disaster recovery site.
Force Cluster Start with Force Quorum
Follow these steps to force the cluster to start in the disaster recovery site with forced
quorum:
1. Log on to the disaster recovery database server with a domain account that has
administrator privileges to the local computer.
2. Open Server Manager, expand Features, and then expand Failover Cluster Manager. Select
the cluster (Figure 27).
Figure 27. Failed cluster due to quorum vote
3. Click Force Cluster Start in the Actions pane (Figure 28).
Figure 28. Cluster Manager - force cluster start option
4. Confirm the action by selecting the Yes – Force my cluster to start option (Figure 29).
Figure 29. Confirm force cluster start
5. The cluster start takes some time. Wait until the cluster has started successfully (Figure 30).
Figure 30. Cluster force start in progress
6. After the cluster starts, the cluster will look like the following figure in the Failover Cluster
Manager (Figure 31).
Figure 31. Cluster started with force quorum
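The Failover Cluster Manager steps above have a command-line counterpart. As a rough sketch, the snippet below only assembles the PowerShell lines and does not run them. Start-ClusterNode with the -FixQuorum switch is the FailoverClusters cmdlet for starting a node without quorum; the helper name and the node name DRNODE1 are hypothetical, and the commands should be validated in a test environment before use.

```python
def force_quorum_commands(node_name):
    """Return the PowerShell lines that force-start the cluster on one node.

    node_name is the disaster recovery cluster node (hypothetical value in
    the example below). The lines are returned as text, not executed.
    """
    # Reject characters that would break out of the single-quoted argument.
    if not node_name or any(ch in node_name for ch in "'\"`$; "):
        raise ValueError("unexpected character in node name")
    return [
        "Import-Module FailoverClusters",
        f"Start-ClusterNode -Name '{node_name}' -FixQuorum",
    ]


for line in force_quorum_commands("DRNODE1"):
    print(line)
```

Keeping the commands as generated text makes it easy to place them in a reviewed runbook rather than typing them during an outage.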
Force Failover SQL Server 2012 Availability Group
Once the Windows Server Failover Cluster is online with forced quorum, follow these steps to
force failover of the SQL Server 2012 availability group:
1. Open SQL Server 2012 Management Studio and connect to the SQL Server disaster
recovery instance (Figure 32).
Figure 32. SQL Server instance before forced failover
2. Right-click on the Availability Group (for example, T24AG), and then select Failover (Figure 33).
Figure 33. Start force failover
3. In the Fail Over Availability Group Wizard, click Next (Figure 34).
Figure 34. Fail Over Availability Group wizard – Introduction page
4. In the Select New Primary Replica page, select the secondary SQL Server instance if it is not
already selected. Also note the warning. Click Next (Figure 35 and Figure 36).
Because the cluster quorum is forced, the quorum status is showing as “Forced Quorum”.
Figure 35. Fail Over Availability Group wizard – Select New Primary Replica page
Figure 36. Fail Over Availability Group wizard – Select New Primary Replica page warning
5. Select and confirm failover with potential data loss, and then click Next (Figure 37).
Because the database status is not synchronized, SQL Server warns about potential data
loss. However, there is no data loss if the databases were in the “Synchronized” state at the
time of the site failure.
Figure 37. Fail Over Availability Group wizard – Potential Data Loss confirmation
6. Click Finish on the Summary page to start the failover (Figure 38).
Figure 38. Fail Over Availability Group wizard – Force Failover Summary page
7. After the forced failover succeeds, the wizard shows the Results page (Figure 39).
Figure 39. Fail Over Availability Group wizard – Results page
The “Validating WSFC quorum vote configuration” warning appears because of the
special quorum configuration that is used in the recommended solution and is safe to
ignore (Figure 40).
Figure 40. Fail Over Availability Group wizard – WSFC Quorum Configuration warning
8. After successful force failover, the database status and availability group status in
SQL Server 2012 Management Studio will look like the following figure (Figure 41).
Figure 41. Management Studio after Fail Over Availability Group wizard
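The wizard steps above correspond to a single Transact-SQL statement run on the disaster recovery instance. The sketch below builds that statement as text; it is a minimal illustration, not the procedure tested for this paper. ALTER AVAILABILITY GROUP with FORCE_FAILOVER_ALLOW_DATA_LOSS (forced) or FAILOVER (planned) is SQL Server 2012 syntax, while the helper names are hypothetical and T24AG is the example group name used in this document.

```python
def quote_identifier(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def failover_statement(availability_group, allow_data_loss):
    """Build the T-SQL failover statement for the target secondary replica.

    allow_data_loss=True gives the forced failover used after a site failure;
    False gives the planned manual failover.
    """
    action = "FORCE_FAILOVER_ALLOW_DATA_LOSS" if allow_data_loss else "FAILOVER"
    return f"ALTER AVAILABILITY GROUP {quote_identifier(availability_group)} {action};"


print(failover_statement("T24AG", allow_data_loss=True))
# prints: ALTER AVAILABILITY GROUP [T24AG] FORCE_FAILOVER_ALLOW_DATA_LOSS;
```

The statement must be executed while connected to the server instance that hosts the target secondary replica, after WSFC is online with forced quorum.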
Additional Considerations
It is highly recommended that you change the cluster quorum configuration if a planned (scheduled
maintenance) or unplanned (primary site disaster) shutdown of all cluster nodes in the primary site
occurs and the disaster recovery SQL Server 2012 instance remains active as the primary
instance for an extended period of time. If you do not change the cluster quorum configuration, the
entire cluster might shut down because too few quorum votes are available.
Change the value for the disaster recovery cluster node property “NodeWeight” to 1, and
change the value for the cluster nodes in the primary site to 0. For more information, see
the Microsoft Support article at http://support.microsoft.com/kb/2494036/en-us.
Shutting down only one node in the primary site does not affect cluster availability as
long as the second node in the primary site remains up and running along with the
File Share Witness (FSW).
If the FSW in the primary site is not available and cannot be contacted by the cluster
node in the disaster recovery site, move the FSW to the disaster recovery
site.
Running the entire system with only one node in the disaster recovery site will not guarantee high
availability. Therefore, this should only be done for a limited amount of time. Otherwise, it is highly
recommended that you add a second node in the disaster recovery site and modify the cluster
quorum configuration accordingly.
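The vote arithmetic behind these recommendations can be illustrated with a simplified majority model. The sketch below is an assumption-laden illustration, not a description of WSFC internals: it assumes two primary-site nodes with NodeWeight 1, one disaster recovery node with NodeWeight 0, and a File Share Witness that contributes one vote, and it treats quorum as "more than half of all configured votes". The node names and the function name are hypothetical.

```python
def has_quorum(node_weights, nodes_up, witness_configured=True,
               witness_reachable=True):
    """Simplified majority check: reachable votes must exceed half of all votes."""
    total = sum(node_weights.values()) + (1 if witness_configured else 0)
    available = sum(w for node, w in node_weights.items() if node in nodes_up)
    if witness_configured and witness_reachable:
        available += 1
    return 2 * available > total


# Assumed normal configuration: DR node has no vote, FSW is in the primary site.
normal = {"PRIMARY1": 1, "PRIMARY2": 1, "DR1": 0}
# Assumed configuration after the recommended NodeWeight change.
reweighted = {"PRIMARY1": 0, "PRIMARY2": 0, "DR1": 1}

# One primary node down, FSW reachable: 2 of 3 votes.
print(has_quorum(normal, {"PRIMARY2", "DR1"}))
# Whole primary site lost, including the FSW: 0 of 3 votes.
print(has_quorum(normal, {"DR1"}, witness_reachable=False))
# Reweighted, but FSW still in the unreachable primary site: 1 of 2 votes.
print(has_quorum(reweighted, {"DR1"}, witness_reachable=False))
# Reweighted and FSW moved to the disaster recovery site: 2 of 2 votes.
print(has_quorum(reweighted, {"DR1"}, witness_reachable=True))
```

Under this model, changing NodeWeight alone is not sufficient while the witness remains unreachable, which is consistent with the recommendation to relocate the FSW to the disaster recovery site.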
Findings and Carryovers
The following findings and carryovers were noted during the testing of the proposed solution in this
document.
Using the NLB feature in Windows Server provides better stability, better scalability, and
faster failover with no additional cost. NLB also lets you transparently add or remove
nodes in the web and application tiers.
JBoss session persistence increases reliability and provides better scalability for the
solution.
Removing the sticky-session requirement in T24Browser makes the solution more reliable
and scalable.
A JBoss session persistence database in the same SQL Server 2012 AlwaysOn Availability
Group reduces the administrative work and reduces the steps in the disaster recovery
procedures.
SQL Server 2012 HADR and AlwaysOn provide simplified disaster recovery failover while
maintaining a database replica in the disaster recovery site.
T24 works well with a configuration that uses the NLB feature in Windows Server and
provides faster application-tier failover.
Windows Server DFS-R with DFS Namespace published in Active Directory Domain Services
provides a unique URL that can be used to refer to the file share, regardless of whether the
system is operating in the primary or the disaster recovery environment.
File and folder symbolic links make the shared file/folder access more resilient.
A clustered instance of SQL Server 2012 for high availability reduces licensing
requirements.
A SQL Server 2012 AlwaysOn Availability Group eliminates SAN replications.
DNS host records used for the load balancer IP addresses make disaster recovery failover
transparent at the web and application tiers.
Recommended Hotfixes and Service Packs
The following best practices apply to the recommended configuration:
Regularly check and apply all the security hotfixes for Windows Server 2008 R2.
Regularly check and apply the latest available service pack for Windows Server 2008 R2
after checking with Temenos about the supportability.
o NOTE Currently, Service Pack 1 (SP1) for Windows Server 2008 R2 is available and
certified by both Microsoft and Temenos.
Regularly check and apply the pertinent hotfixes mentioned in the following knowledge
base (KB) article to enhance stability and fix known critical bugs (not security related).
Recommended hotfixes and updates for Windows Server 2008 R2–based server
clusters
http://support.microsoft.com/kb/980054/en-us
As a special “out-of-band” recommended hotfix for Windows Server 2008 R2, please install
the following hotfix on all the cluster nodes in the primary and disaster recovery sites.
A hotfix that improves the performance of the "AlwaysOn Availability Group"
feature in SQL Server 2012 is available for Windows Server 2008 R2
http://support.microsoft.com/kb/2687741/en-us
Regularly check and apply all the security hotfixes for SQL Server 2012.
o NOTE Currently, SQL Server 2012 does not have any security hotfixes released.
Regularly check and apply the latest available service pack for SQL Server 2012 after
checking with Temenos about the supportability.
o NOTE Currently there is no released service pack for SQL Server 2012.
As a special “out-of-band” recommended hotfix for SQL Server 2012, install the following
update package on all the SQL Server 2012 instances in the primary and disaster recovery
sites.
Cumulative update package 1 for SQL Server 2012
http://support.microsoft.com/kb/2679368/en-us
NOTE If a more recent update is available, it is not necessary to install the previous hotfix.
Regularly check for the latest “cumulative update” (CU) release for SQL Server 2012, review the
fixed bugs, and install it only if you are affected and after checking with Temenos about
supportability. For a list of released CUs for SQL Server 2012, see the following KB article.
The SQL Server 2012 builds that were released after SQL Server 2012 was released
http://support.microsoft.com/kb/2692828/en-us
Finally, it is highly recommended that you check periodically with the Microsoft Support Service for
any recommended non-security related hotfixes for Windows Server 2008 R2 and SQL Server 2012.
Additional Resources
Following are links for further information.
SQL Server 2012
Books Online for SQL Server 2012
http://msdn.microsoft.com/en-us/library/ms130214.aspx
Database Availability Key Capabilities and Concepts:
o Failover Clustering and AlwaysOn Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/ff929171.aspx
o Active Secondaries: Readable Secondary Replicas (AlwaysOn Availability Groups)
http://msdn.microsoft.com/en-us/library/ff878253.aspx
Database Availability Step-by-Step Guide:
o Deploying a new Availability Group
http://msdnstage.redmond.corp.microsoft.com/en-us/library/ff877884.aspx#RelatedTasks
o Create or Configure an Availability Group Listener (SQL Server)
http://go.microsoft.com/fwlink/?LinkId=201271
o Perform a Forced Manual Failover of an Availability Group (SQL Server)
http://msdn.microsoft.com/en-us/library/ff877957.aspx
Instance Availability Key Capabilities and Concepts:
o Failover Policy for Failover Cluster Instances
http://msdn.microsoft.com/en-us/library/ff878664.aspx
Instance Availability Step-by-Step Guide:
o SQL Server Multi-Subnet Clustering
http://msdn.microsoft.com/en-us/library/ff878716.aspx
o Configure FailureConditionLevel Property Settings
http://msdn.microsoft.com/en-us/library/ff878667.aspx
o View and Read Failover Cluster Instance Diagnostics Log
http://msdn.microsoft.com/en-us/library/ff878700.aspx
AlwaysOn FAQ for SQL Server 2012
http://msdn.microsoft.com/en-us/sqlserver/gg508768(l=en-us)
Hardware and Software Requirements for Installing SQL Server 2012
http://msdn.microsoft.com/en-us/library/ms143506.aspx
Introducing SQL Server AlwaysOn
http://msdn.microsoft.com/en-us/sqlserver/gg490638
Overview of AlwaysOn Availability Groups
http://msdn.microsoft.com/en-us/library/ff877884.aspx
Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups
http://msdn.microsoft.com/en-us/library/ff878487.aspx#SystemReqsForAOAG
Before Installing Failover Clustering
http://msdn.microsoft.com/en-us/library/ms189910.aspx
Create a New SQL Server Failover Cluster (Setup)
http://msdn.microsoft.com/en-us/library/ms179530.aspx
Add or Remove Nodes in a SQL Server Failover Cluster (Setup)
http://msdn.microsoft.com/en-us/library/ms191545.aspx
Microsoft SQL Server AlwaysOn Solutions Guide for High Availability and Disaster Recovery
http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/Microsoft%20SQL%20Server%20AlwaysOn%20Solutions%20Guide%20for%20High%20Availability%20and%20Disaster%20Recovery.docx
Availability Modes
http://msdn.microsoft.com/en-us/library/ff877931.aspx
AlwaysOn Failover Cluster Instances
http://msdn.microsoft.com/en-us/library/ms189134.aspx
Enable and Disable AlwaysOn Availability Groups (SQL Server)
http://msdn.microsoft.com/en-us/library/ff878259.aspx
Creating an Availability Group (SQL Server)
http://msdn.microsoft.com/en-us/library/ff878176.aspx
Create or Configure an Availability Group Listener (SQL Server)
http://msdn.microsoft.com/en-us/library/hh213080.aspx
Monitor Availability Groups
http://msdn.microsoft.com/en-us/library/ff878305.aspx
AlwaysOn Availability Groups Dynamic Management Views and Functions
http://msdn.microsoft.com/en-us/library/ff877943.aspx
Manually Prepare a Secondary Database for an Availability Group (SQL Server)
http://msdn.microsoft.com/en-us/library/ff878349.aspx
SQL Server 2012 AlwaysOn: Multisite Failover Cluster Instance
http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx
Perform a Forced Manual Failover of an Availability Group
http://msdn.microsoft.com/en-us/library/ff877957.aspx
Availability Group Listeners, Client Connectivity, and Application Failover (SQL Server)
http://msdn.microsoft.com/en-us/library/hh213417.aspx
Configure Read-Only Access on an Availability Replica (SQL Server)
http://msdn.microsoft.com/en-us/library/hh213002.aspx
Configure Read-Only Routing on an Availability Group (SQL Server)
http://msdn.microsoft.com/en-us/library/hh710054.aspx
Client Connection Access to Availability Replicas (SQL Server)
http://msdn.microsoft.com/en-us/library/hh510184.aspx
Configure the Windows Firewall to Allow SQL Server Access
http://msdn.microsoft.com/en-us/library/cc646023.aspx
How to use Kerberos authentication in SQL Server
http://support.microsoft.com/kb/319723/en-us
How to transfer the logins and the passwords between instances of SQL Server 2005 and
SQL Server 2008
http://support.microsoft.com/kb/918992/en-us
SQL Server Web site
http://www.microsoft.com/sqlserver
SQL Server Tech Center
http://technet.microsoft.com/en-us/sqlserver
SQL Server Dev Center
http://msdn.microsoft.com/en-us/sqlserver
Windows Server Failover Cluster
Windows Server | Failover Clustering and Node Balancing
http://www.microsoft.com/windowsserver2008/en/us/failover-clustering-main.aspx
Checklist: Create a Failover Cluster
http://technet.microsoft.com/en-us/library/cc755009.aspx
Failover Cluster Step-by-Step Guide: Validating Hardware for a Failover Cluster
http://technet.microsoft.com/en-us/library/cc732035(WS.10).aspx
Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster
http://technet.microsoft.com/en-us/library/cc770620(v=ws.10).aspx
Failover Cluster Step-by-Step Guide: Configuring Accounts in Active Directory
http://technet.microsoft.com/en-us/library/cc731002(WS.10).aspx
Configure Cluster Quorum NodeWeight Settings
http://msdn.microsoft.com/en-us/library/hh270281(SQL.110).aspx
Force a WSFC Cluster to Start Without a Quorum
http://msdn.microsoft.com/en-us/library/hh270275(v=SQL.110).aspx
Failover Policy for Failover Cluster Instances
http://msdn.microsoft.com/en-us/library/ff878664(SQL.110).aspx
Checklist: Create a Clustered File Server
http://technet.microsoft.com/en-us/library/cc753969.aspx
Recommended hotfixes and updates for Windows Server 2008 R2-based server clusters
http://support.microsoft.com/kb/980054/en-us
A hotfix that improves the performance of the "AlwaysOn Availability Group" feature in
SQL Server 2012 is available for Windows Server 2008 R2
http://support.microsoft.com/kb/2687741/en-us
Network Load Balancing
Network Load Balancing
http://technet.microsoft.com/en-us/library/cc770558(v=ws.10).aspx
NLB 101: How NLB balances network traffic
http://blogs.technet.com/b/networking/archive/2008/10/01/nlb-101-how-nlb-balances-network-traffic.aspx
Network Load Balancing parameters
http://technet.microsoft.com/en-us/library/cc778263.aspx
Specifying the Affinity and Load-Balancing Behavior of the Custom Port Rule
http://technet.microsoft.com/en-us/library/cc759039.aspx
Upgrading the Network Load Balancing Cluster (to 2008)
http://technet.microsoft.com/en-us/library/cc755161.aspx
Network Load Balancing: Configuration Best Practices for Windows 2000 and Windows
Server 2003
http://www.microsoft.com/downloadS/details.aspx?FamilyID=d24c373e-bafc-4e31-b1b2-d86584a12ca4&displaylang=en
About Temenos
Founded in 1993 and listed on the Swiss Stock Exchange (SIX: TEMN), Temenos Group AG is the
market-leading provider of banking software systems to retail, corporate, universal, private,
Islamic, and microfinance and community banks. Headquartered in Geneva with more than 60
offices worldwide, Temenos serves more than 1,500 customers in 125 countries. Temenos’
software products provide advanced technology and rich functionality, incorporating best-practice
processes that take advantage of Temenos’ experience in 700 implementations around the globe.
For more information, visit: www.temenos.com
About Microsoft
Founded in 1975, Microsoft (Nasdaq "MSFT") is the worldwide leader in software, services, and
solutions that help people and businesses realize their full potential.
For more information, visit: www.microsoft.com