Galera Cluster for High Availability

MySQL Consulting Team

About MyDBOPS

• MySQL Consulting
• MySQL Support
• Remote DBA support
• Expert MySQL solutions
• 24/7 MySQL Monitoring and Support
• MariaDB, Percona, Galera and TokuDB are supported too.

Agenda

• Need for HA
• Principle of Distributed Computing
• MySQL HA Solutions Available
• Introduction to Galera
• Percona Cluster
• Node Recovery
• Backup and Recovery
• Load Balancers
• Things to be considered

Need For HA

• Site reliability
• Ensure uptime
• Failover
• Disaster recovery
• Scheduled / unscheduled downtime
• Avoid a single point of failure

Principle of Distributed Computing

CAP Theorem

Consistency, Availability and Partition tolerance: only two out of the three are possible in any distributed computing system.
• AP: MySQL Replication
• CA: Galera Cluster

MySQL HA Solutions Available

• Master-Slave
• Master-Master
• Single Writer
• NDB Cluster (Oracle MySQL)
• Galera Cluster
• Tungsten Replicator
• Storage-level replication (DRBD)

Introduction to Galera

• Developed by Codership
• Synchronous replication
• Parallel replication
• Multi-threaded
• Automated node recovery
• Zero slave lag
• Read/write scalable
• WAN-based optimization (Galera 3.0)
• Truly open source
• Supports InnoDB and TokuDB (MyISAM is experimental)

Introduction to Galera

What is Galera?
Galera is a replication plugin that provides synchronous, multi-master replication to achieve HA. The wsrep_provider_options variable controls the Galera library.

Available distributions:
• Percona XtraDB Cluster
• MariaDB Galera Cluster
• As a plugin over MySQL

Introduction to Galera

• Shared-nothing architecture.
• Network is the heart.

What is wsrep? (Write Set REPlication)

It is an API that connects the database server to the Galera library and controls its characteristics. It helps to implement synchronous replication and certification-based multi-master replication.

Introduction to Galera

Simple Architecture

• Uses the Galera library
• XtraDB
• Xtrabackup

Percona Cluster

Why Percona Server?
• Enhanced InnoDB (XtraDB)
• Performance improvement
• Xtrabackup-v2 (makes SST better)
• Better bug fixes.
• A better MySQL for scalability.
• Aligned with Oracle MySQL with better instrumentation and performance patches.

Transaction in Galera

• Transactions are handled by the Galera plugin.
• Commit does not use a traditional two-phase commit across nodes; the write set is replicated and certified at commit time.
• It also handles locking.
• Uses the optimistic locking method.
• Commits are based on certification (keys).
• Smaller transactions are always better.
• Increased network latency increases query time.
• Supports InnoDB and TokuDB. MyISAM is still experimental.
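
The key-based certification mentioned above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Galera's implementation: the `WriteSet` and `Certifier` classes, the key format and the sequence numbering are all hypothetical. It shows the "first committer wins" outcome of optimistic locking: a write set fails certification if any of its keys was modified by a write set its originating node had not yet seen.

```python
from dataclasses import dataclass, field


@dataclass
class WriteSet:
    """A replicated transaction: the row keys it touched and the last
    cluster sequence number its originating node had applied."""
    keys: frozenset
    last_seen_seqno: int


@dataclass
class Certifier:
    """Toy certification index: row key -> seqno of the last write set
    that modified it. Every node applies the same test in the same order,
    so every node reaches the same verdict."""
    seqno: int = 0
    index: dict = field(default_factory=dict)

    def certify(self, ws: WriteSet) -> bool:
        self.seqno += 1  # each replicated write set gets the next seqno
        # Conflict: a key was changed by a write set the sender had not seen.
        if any(self.index.get(k, 0) > ws.last_seen_seqno for k in ws.keys):
            return False  # certification fails, the transaction rolls back
        for k in ws.keys:
            self.index[k] = self.seqno
        return True


certifier = Certifier()
# Two nodes update the same row concurrently; both had seen seqno 0.
first = certifier.certify(WriteSet(frozenset({"t1:42"}), last_seen_seqno=0))
second = certifier.certify(WriteSet(frozenset({"t1:42"}), last_seen_seqno=0))
# A later transaction that has seen seqno 1 can update the row again.
third = certifier.certify(WriteSet(frozenset({"t1:42"}), last_seen_seqno=1))
print(first, second, third)  # True False True
```

The rejected transaction is the reason applications should check for errors at commit time: a conflict only surfaces when the write set is certified.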

Transaction in Galera

• Synchronous (virtual) replication.
• wsrep_causal_reads=ON (true synchronous reads).
• auto_increment_* settings are handled by the cluster.
• Uses its own GTID for transactions (not the GTID in MySQL 5.6).
• Transaction latency increases as nodes are added.
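
To see why cluster-managed auto-increment settings avoid key collisions, here is a small sketch. It assumes the cluster sets auto_increment_increment to the number of nodes and gives each node a distinct auto_increment_offset; the helper name `auto_increment_values` is hypothetical.

```python
def auto_increment_values(node_index: int, cluster_size: int, count: int) -> list[int]:
    """IDs a node would generate with auto_increment_offset=node_index
    and auto_increment_increment=cluster_size (node_index starts at 1)."""
    return [node_index + i * cluster_size for i in range(count)]


for node in (1, 2, 3):
    print(node, auto_increment_values(node, cluster_size=3, count=4))
# 1 [1, 4, 7, 10]
# 2 [2, 5, 8, 11]
# 3 [3, 6, 9, 12]
```

Each node draws from its own interleaved sequence, so concurrent inserts on different nodes never pick the same value. The visible side effect is gaps in the IDs.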

Galera Ports

Galera is more complex than standard MySQL and needs several ports to operate:
• 3306: Standard MySQL port
• 4567: Group communication
• 4568: IST
• 4444: SST

Firewall rules can be designed around these ports.
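
A quick way to verify firewall rules is to try a TCP connection to each port from another node. The sketch below uses only the Python standard library; the host `127.0.0.1` is a placeholder, and `port_is_open` and `check_node` are hypothetical helper names. Note that the IST and SST ports normally listen only while a transfer is in progress, so a closed result there does not necessarily mean the firewall is blocking them.

```python
import socket

# Ports listed on the slide, with their roles.
GALERA_PORTS = {
    3306: "MySQL client",
    4567: "group communication",
    4568: "IST",
    4444: "SST",
}


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_node(host: str) -> dict:
    """Map each Galera port to whether it accepted a connection."""
    return {port: port_is_open(host, port) for port in GALERA_PORTS}


if __name__ == "__main__":
    for port, reachable in check_node("127.0.0.1").items():
        status = "open" if reachable else "closed/filtered"
        print(f"{port} ({GALERA_PORTS[port]}): {status}")
```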

Node Recovery

• Node recovery is automated.
• Validates the gcache for state files.
• Chooses the state transfer method:

1) IST (Incremental State Transfer)
2) SST (State Snapshot Transfer)

Node Recovery

IST:
• Recovers from write sets in the gcache (memory).
• Faster recovery method.
• Keep the gcache large enough.
• Does affect the node state.

Node Recovery

SST:
State Snapshot Transfer is the complete transfer (cloning) of data to recreate a node. It is used:
• when the required write sets are no longer in the gcache (wsrep_local_cached_downto)
• when adding a new node

Different methods of SST: xtrabackup-v2 (best), rsync, mysqldump.
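
The choice between IST and SST can be sketched as a comparison between the joiner's last applied sequence number and the oldest write set still held in the donor's gcache (reported by wsrep_local_cached_downto). This is a simplified illustration, not the actual Galera logic: the function name and parameters are hypothetical, and the real decision also requires both nodes to share the same cluster state UUID.

```python
def choose_state_transfer(joiner_seqno: int,
                          donor_cached_downto: int,
                          donor_seqno: int) -> str:
    """Pick the transfer a rejoining node would need.

    joiner_seqno        -- last write set applied by the joiner (-1 if none)
    donor_cached_downto -- oldest seqno still available in the donor's gcache
    donor_seqno         -- last write set committed on the donor
    """
    if joiner_seqno < 0:
        return "SST"   # brand-new node: nothing to build on
    if joiner_seqno >= donor_seqno:
        return "none"  # already up to date
    if joiner_seqno + 1 >= donor_cached_downto:
        return "IST"   # every missing write set is still in the gcache
    return "SST"       # gap is older than the gcache: full copy required


print(choose_state_transfer(100, 50, 200))  # IST
print(choose_state_transfer(10, 50, 200))   # SST
print(choose_state_transfer(-1, 50, 200))   # SST
```

This is why a larger gcache makes IST more likely: it lowers the oldest cached seqno, so a node can stay down longer and still catch up incrementally.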

Node Recovery

SST:
• SST puts one node of the cluster into the Donor state.
• Donor selection is automatic.
• Donor selection can be forced with wsrep_sst_donor.

Hack: SST can be avoided by restoring full and incremental hot backups during node recovery (this forces an IST).

Blog by Jay Janssen on bypassing SST.

Node Recovery

Validate the node after recovery: wsrep_local_state=4

wsrep_local_state_comment values:
1) Joining: initial state of the node
2) Donor/Desynced: huge delay in replicating write sets
3) Joined: delay of less than 1000 write sets
4) Synced: in sync with all nodes
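
A post-recovery check can be scripted around these values. The sketch below assumes the status rows have already been fetched (for example with `SHOW GLOBAL STATUS LIKE 'wsrep_local_state%'`) and collected into a dictionary. The helper names and the sample rows are hypothetical.

```python
# State numbers and labels as listed on the slide.
WSREP_STATES = {
    1: "Joining",
    2: "Donor/Desynced",
    3: "Joined",
    4: "Synced",
}


def describe_state(status: dict) -> str:
    """Translate wsrep_local_state into its label."""
    code = int(status.get("wsrep_local_state", 0))
    return WSREP_STATES.get(code, f"Unknown ({code})")


def is_ready(status: dict) -> bool:
    """A node is ready for traffic only when it is Synced (state 4)."""
    return int(status.get("wsrep_local_state", 0)) == 4


# Sample rows as a client library would return them (values are strings).
recovered = {"wsrep_local_state": "4", "wsrep_local_state_comment": "Synced"}
catching_up = {"wsrep_local_state": "3", "wsrep_local_state_comment": "Joined"}

print(describe_state(recovered), is_ready(recovered))      # Synced True
print(describe_state(catching_up), is_ready(catching_up))  # Joined False
```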

Load Balancers

• HAProxy
• Pen Proxy
• Galera Load Balancer (based on Pen)
• MaxScale
• ProxySQL

Note: Make sure the load balancer is aware of the Galera node state. Use the appropriate load-balancing algorithm.
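
What "aware of the Galera state" means can be shown with a minimal round-robin sketch that only routes to nodes whose wsrep_local_state is 4 (Synced). It illustrates the idea only, not how any of the load balancers above are implemented. The node names and function names are hypothetical.

```python
from itertools import cycle

SYNCED = 4  # wsrep_local_state value of a fully synced node


def synced_nodes(states: dict) -> list:
    """Return the nodes that are safe to receive traffic."""
    return sorted(node for node, state in states.items() if state == SYNCED)


def round_robin(states: dict, requests: int) -> list:
    """Assign each request to the next synced node in turn."""
    nodes = synced_nodes(states)
    if not nodes:
        raise RuntimeError("no synced node available")
    pool = cycle(nodes)
    return [next(pool) for _ in range(requests)]


# node2 is acting as a donor, so it is skipped.
states = {"node1": 4, "node2": 2, "node3": 4}
print(round_robin(states, 4))  # ['node1', 'node3', 'node1', 'node3']
```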

Things to be considered

• Supports only transactional engines.
• Row-based replication.
• Read-committed isolation.
• innodb_autoinc_lock_mode=2.
• Avoid huge transactions.
• wsrep_max_ws_rows (128K)
• wsrep_max_ws_size (1G)
• Network is the heart.
• Keep the DB design simple.
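
One common way to avoid huge transactions is to split a bulk change into batches and commit each batch separately, so that every write set stays well below limits such as wsrep_max_ws_rows. The sketch below shows only the batching step. The `batches` helper and the batch size are hypothetical, and the actual SQL execution is left out.

```python
def batches(ids: list, batch_size: int):
    """Yield consecutive slices of ids, each at most batch_size long."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size]


row_ids = list(range(10))
for chunk in batches(row_ids, batch_size=4):
    # Each chunk would be deleted or updated in its own transaction.
    print(chunk)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```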

Things to be considered

• Foreign keys can cause errors (bug).
• Maintain quorum.
• Check for application errors after commit.
• Keep an odd number of nodes.
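
The quorum and odd-node advice follows from simple majority arithmetic. The sketch below assumes every node carries the same weight; the function names are hypothetical.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """A partition keeps quorum only with a strict majority of nodes."""
    return reachable > cluster_size / 2


def tolerated_failures(cluster_size: int) -> int:
    """Largest number of nodes that can be lost while keeping quorum."""
    return (cluster_size - 1) // 2


for size in (2, 3, 4, 5):
    print(size, tolerated_failures(size))
# 2 0
# 3 1
# 4 1
# 5 2

print(has_quorum(2, 4))  # False: an even 2/2 split leaves no majority
print(has_quorum(2, 3))  # True
```

A fourth node adds no extra failure tolerance over three, and an even split leaves neither half with a majority, which is why odd cluster sizes are preferred.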

Image Courtesy

• Galera and Percona cluster documentation
• http://opentodo.net/2012/12/mysql-multi-master-replication-with-galera/

Thanks!!!

Email : mysqlsupport@mydbops.com
