7/31/2019 HA Cluster
HA CLUSTER
Prepared By:
Dhairya Giri
Rahul Mehta
Smit Gohel
Shruj Dabhi
Zishan Murji
INTRODUCTION
High-availability (HA) clustering is a solution that uses clustering software and special-purpose hardware to minimize system downtime
HA clusters are groups of computing resources implemented to provide high availability of software and hardware computing services
Basic Work Done
Putting together a group of computers which trust each other to provide a service even when system components fail
When one machine goes down, others take over its work. This involves IP address takeover, service takeover, etc.
If one node shuts down or fails, another node takes over its application load; the same mechanism facilitates planned maintenance
The cluster keeps performing its function continuously for a significantly longer period of time than a standalone system
WHY HA CLUSTER???
HA clusters usually use a private heartbeat network connection to monitor the health and status of each node in the cluster
An HA cluster provides R.A.S.:
Reliability: A high degree of protection for corporate data, as information is a crucial business asset
Availability: Continuous data access
Serviceability: Procedures to correct problems with minimal business impact
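The heartbeat monitoring mentioned above can be sketched roughly as follows. This is a minimal illustration, not how any particular cluster product works: the class name `HeartbeatMonitor`, the `timeout_s` parameter and the injectable `clock` are all assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each peer node (hypothetical sketch)."""

    def __init__(self, peers, timeout_s=3.0, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed silence threshold, in seconds
        self.clock = clock              # injectable so the logic can be tested
        now = clock()
        self.last_seen = {peer: now for peer in peers}

    def record_heartbeat(self, peer):
        # Called whenever a heartbeat arrives over the private network.
        self.last_seen[peer] = self.clock()

    def suspected_down(self):
        # A peer is only *suspected* down: silence may also mean a broken link.
        now = self.clock()
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

Note that a silent peer is reported as suspected rather than confirmed down, because the monitor cannot tell a failed node from a failed heartbeat link.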
HA Cluster Categories
There are two main categories of HA cluster:
Shared Disk: There is only one shared disk, and all nodes have access to that same storage. A locking mechanism protects against race conditions.
Shared Nothing: At any given time, only one node owns a disk. When that node fails, another node takes ownership of it.
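As a rough illustration of the shared-nothing rule, the sketch below models a disk that accepts writes only from its current owner and changes owner on failover. The class `SharedNothingDisk` and its methods are hypothetical names chosen for this example, not part of any real clustering API.

```python
class SharedNothingDisk:
    """A disk that exactly one node owns at a time (illustrative model)."""

    def __init__(self, owner):
        self.owner = owner
        self.blocks = []

    def write(self, node, data):
        # Only the current owner may touch the disk.
        if node != self.owner:
            raise PermissionError(f"{node} does not own this disk")
        self.blocks.append(data)

    def fail_over(self, failed_node, new_owner):
        # Ownership moves only if the failed node really was the owner.
        if self.owner == failed_node:
            self.owner = new_owner
        return self.owner
```

Because ownership is exclusive, no lock manager is needed in this model; the cost is that the disk is unavailable until ownership has moved.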
CONCEPTS & COMPLICATIONS
HA Clusters introduce concepts and complications around:
Split-Brain
Quorum
Fencing
One subtle but serious condition that all clustering software must be able to handle is split-brain
Split Brain
Split-brain occurs when all of the private links go down simultaneously,
but the cluster nodes are still running
If that happens, each node in the cluster may mistakenly decide that every other node has gone down and attempt to start services that other nodes are still running
Having duplicate instances of services may cause data corruption on
the shared storage
This is called the split-brain condition
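The failure mode can be shown with a small sketch. It assumes a deliberately naive takeover rule (silence from every peer is read as "all peers are dead"); the function `naive_takeover` and the `reachable` mapping are hypothetical and exist only for illustration.

```python
def naive_takeover(nodes, reachable):
    """Decide what each node does under a naive rule.

    reachable[n] is the set of peers that node n can still hear over
    the private links (hypothetical input for this sketch).
    """
    decisions = {}
    for node in nodes:
        peers = set(nodes) - {node}
        heard = reachable.get(node, set()) & peers
        # Naive rule: hearing no peer at all is treated as "everyone else died".
        decisions[node] = "start_services" if not heard else "keep_running"
    return decisions


# All private links are down, but both nodes are still alive.
decisions = naive_takeover(["node-a", "node-b"], reachable={})
duplicates = [n for n, d in decisions.items() if d == "start_services"]
```

With every link down, `duplicates` contains both nodes, so both would start the same services against the shared storage.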
Quorum
Quorum is an attempt to avoid split-brain for most kinds of failures
Typically one tries to make sure that only one partition can be active; quorum is the term for the methods that ensure this
One disadvantage is that quorum doesn't work very well for two-node clusters, since with a majority-based vote neither node can form a majority on its own
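One common way to implement quorum is a strict-majority vote; the slides do not say which method is meant, so the sketch below is an assumption. The function name `has_quorum` is hypothetical.

```python
def has_quorum(partition_size, cluster_size):
    """True if a partition holds a strict majority of the cluster's nodes."""
    return 2 * partition_size > cluster_size


# Three-node cluster split 2 + 1: exactly one side may stay active.
three_node = (has_quorum(2, 3), has_quorum(1, 3))

# Two-node cluster split 1 + 1: neither side has a majority.
two_node = (has_quorum(1, 2), has_quorum(1, 2))
```

In the two-node case both partitions lose quorum, so the whole cluster stops rather than risking split-brain, which is why the two-node case usually needs an extra tie-breaker.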
Fencing
Fencing tries to put a fence around an errant node or nodes to keep them from accessing cluster resources
This way one doesn't have to rely on the correct behaviour or timing of the errant node
We use STONITH to do this
STONITH: Shoot The Other Node In The Head
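The ordering that fencing enforces can be sketched as "fence first, take over second". In the example below, `fence` and `start_services` are hypothetical callbacks standing in for a real fencing device (for instance a power switch) and a real service manager; neither is an actual STONITH API.

```python
def take_over(failed_node, fence, start_services):
    """Start the failed node's services only after it has been fenced.

    fence(node) must return True once the node is confirmed cut off;
    start_services(node) starts that node's services locally.
    Both callbacks are assumptions made for this sketch.
    """
    if not fence(failed_node):
        # Fencing is unconfirmed, so the node may still be using shared
        # resources; taking over now would risk duplicate services.
        return False
    start_services(failed_node)
    return True
```

The takeover never depends on the errant node cooperating: if fencing cannot be confirmed, nothing is started.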
NODE CONFIGURATION
The most common size for an HA cluster is two nodes. Such configurations can be categorized as:
Active/Active: Traffic intended for the failed node is either passed
onto an existing node or load balanced across the remaining nodes
Active/Passive: Provides a fully redundant instance of each node,
which is only brought online when its associated primary node fails
N-to-1: Allows the failover standby node to become the active one
temporarily, until the original node can be restored or brought back
online
Virtualization
The usual goal of virtualization is to centralize administrative tasks while improving scalability and workload handling
It allows multiple virtual servers to run on a single physical machine
By combining virtualization and HA clustering, it is possible to benefit
from increased manageability and savings from server consolidation
through virtualization without decreasing uptime of critical services.
FAILOVER STRATEGIES
Systems that handle failures use different strategies to cope with a failure. There are three ways to configure a failover:
FAIL_FAST: The attempt fails if the first node cannot be reached
ON_FAIL_TRY_ONE_NEXT_AVAILABLE: Tries one more host before
giving up
ON_FAIL_TRY_ALL_AVAILABLE: Tries all existing nodes before giving
up
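The three strategies differ only in how many nodes are tried before giving up, which the sketch below makes explicit. The strategy names come from the slide; the function `call_with_failover`, the `request` callback and the use of `ConnectionError` to signal an unreachable node are assumptions made for this illustration.

```python
from enum import Enum


class FailoverStrategy(Enum):
    FAIL_FAST = "FAIL_FAST"
    ON_FAIL_TRY_ONE_NEXT_AVAILABLE = "ON_FAIL_TRY_ONE_NEXT_AVAILABLE"
    ON_FAIL_TRY_ALL_AVAILABLE = "ON_FAIL_TRY_ALL_AVAILABLE"


def call_with_failover(nodes, request, strategy):
    """Call request(node) on nodes in order, as far as the strategy allows."""
    if strategy is FailoverStrategy.FAIL_FAST:
        limit = 1                 # first node only
    elif strategy is FailoverStrategy.ON_FAIL_TRY_ONE_NEXT_AVAILABLE:
        limit = 2                 # first node plus one more
    else:
        limit = len(nodes)        # every node in the list

    last_error = None
    for node in nodes[:limit]:
        try:
            return request(node)
        except ConnectionError as exc:
            last_error = exc      # remember the failure and try the next node
    raise ConnectionError("no node could be reached") from last_error
```

For example, with nodes `["n1", "n2", "n3"]` where only `n3` answers, `FAIL_FAST` and `ON_FAIL_TRY_ONE_NEXT_AVAILABLE` both raise, while `ON_FAIL_TRY_ALL_AVAILABLE` returns the response from `n3`.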
Benefits
Supports many operating systems like Windows, Linux, Sun Solaris,
etc.
Simple to install, configure and maintain
Often used for critical databases, file-sharing on a network, business
applications, etc.
Handles the split-brain condition through quorum and fencing
Provides facilities such as a private heartbeat network to monitor the health of cluster nodes