Galera Cluster Introduction Pierre Mavro www.enovance.com August 1, 2014

DESCRIPTION

These slides cover Galera Cluster: how it works, and how to install and configure it for production use. They are an introduction to the basic concepts.

Summary

1 Introduction
2 Installation and Configuration
3 Initialization
4 Recover and Troubleshoot

Galera Cluster Introduction: Introduction

Plan

1 Introduction
- Galera: Features, benefits and limitations
- Use cases and Common architecture

What is Galera Cluster?

Galera Cluster provides high system uptime with no data loss and scalability for future growth. Galera is an open-source product and we offer high quality support to help customers increase service availability and lower total cost of ownership.

Galera Cluster is a synchronous multi-master cluster for MariaDB and MySQL. It needs at least 3 nodes to work and currently works exclusively with the InnoDB engine (others should come in the future).

http://www.codership.com
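
The three-node minimum follows from quorum arithmetic. The sketch below assumes equally weighted nodes and a strict-majority rule (a group of nodes keeps working only while it can still see more than half of the previous members). `has_quorum` is a hypothetical helper written for illustration, not a Galera command.

```shell
# Hypothetical helper: succeeds if $1 reachable nodes out of $2 total
# nodes form a strict majority (assumes every node has the same weight).
has_quorum() {
    reachable=$1
    total=$2
    # reachable > total / 2, written without fractions
    [ $((reachable * 2)) -gt "$total" ]
}

# What happens to clusters of different sizes when one node suddenly fails?
for total in 2 3 5; do
    survivors=$((total - 1))
    if has_quorum "$survivors" "$total"; then
        echo "$total nodes: still has quorum after losing one node"
    else
        echo "$total nodes: loses quorum after losing one node"
    fi
done
```

With two nodes, losing either one leaves the survivor without a majority, so three is the smallest cluster that tolerates a single failure.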

Features

Galera is a synchronous multi-master cluster with features such as:

- Synchronous replication
- Active-active multi-master topology
- Read and write to any cluster node
- Automatic membership control: failed nodes drop from the cluster
- Automatic node joining
- True parallel replication, on row level
- Direct client connections, native MySQL look & feel

Benefits

These features yield previously unseen benefits for a DBMS clustering solution:

- No slave lag
- No lost transactions
- Both read and write scalability
- Smaller client latencies

Limitations

To replicate a database correctly with Galera, some requirements and limitations apply:

- InnoDB engine only
- Primary keys on tables are required. Rows in tables without a primary key may appear in a different order on different nodes.
- Unsupported queries:
  - LOCK/UNLOCK TABLES
  - Lock functions (GET_LOCK(), RELEASE_LOCK()...)
- Query log cannot be directed to a table
- XA transactions are not supported, due to possible rollback on commit
- Can’t limit transaction size
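
The first two limitations can be audited before moving a database to Galera. The following sketch wraps a query against the standard `information_schema` tables in a shell function. The function name `galera_unfriendly_tables_query` and the list of excluded system schemas are assumptions made for illustration.

```shell
# Hypothetical helper: print a query that lists user tables which either
# have no primary key or do not use the InnoDB engine.
galera_unfriendly_tables_query() {
    cat <<'SQL'
SELECT t.table_schema, t.table_name, t.engine
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
  AND (c.constraint_name IS NULL OR t.engine <> 'InnoDB');
SQL
}

# Usage (assumes a local server and client credentials already configured):
#   galera_unfriendly_tables_query | mysql
```

An empty result means that every user table is InnoDB and has a primary key.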

Use cases

Galera replication works for a wide variety of use cases. Here are some common use cases we have identified in the open source community:

- Read Master: Traditional MySQL master-slave topology, but with Galera all "slave" nodes are capable masters at all times; it is just the application that treats them as slaves. Galera replication can guarantee zero slave lag for such installations and, thanks to parallel slave applying, much better throughput for the cluster.

- Write Availability: Distributing writes across the cluster harnesses the CPU power of the slave nodes to process client write transactions. Because replication is row based, only the changes made during a client transaction are replicated, and applying such a transaction in a slave applier is much faster than processing the original transaction. The cluster can therefore distribute heavy client transaction processing across many master nodes, which yields better overall write transaction throughput.

- WAN Clustering: Synchronous replication works fine over a WAN. There will be a delay proportional to the network round trip time (RTT), but it only affects the commit operation.

- Disaster Recovery: Disaster recovery is a sub-class of WAN replication. Here one data center is passive and only receives replication events, but does not process any client transactions. Such a remote data center is up to date at all times and no data loss can happen. During recovery, the spare site is simply nominated as primary and the application can continue as normal with a minimal failover delay.

- Latency Eraser: With a WAN replication topology, cluster nodes can be located close to clients, so all read and write operations are very fast over the local node connection. The RTT-related delay is experienced only at commit time, and even then it is generally acceptable to end users: what usually spoils the end-user experience is slow browsing response time, and read operations are as fast as they can possibly be.

Common Architecture

Here is a classic setup for a distributed solution:

- The load balancers located on each application server can scale to X Galera servers
- There is no single point of failure (SPOF)
- Client-server communication latencies are lower

Galera Cluster Introduction: Installation and Configuration

Plan

2 Installation and Configuration
- Repository and installation
- MariaDB Configuration
- Galera Configuration

Repository and installation

To install MariaDB and Galera Cluster, the simplest way is to use the official MariaDB repository:

Install dependencies
# aptitude install python-software-properties

Then add the repository with its key:

Adding repository
# apt-key adv --recv-keys --keyserver keyserver.ubuntu.com \
    0xcbcb082a1bb943db
# add-apt-repository 'deb http://mirrors.linsrv.net/mariadb/repo/10.0/debian wheezy main'

You’re now ready to install Galera Cluster:

Install dependencies
# aptitude update
# aptitude install mariadb-galera-server galera rsync openntpd

Notes

Openntpd is necessary to avoid replication problems. All servers must have their clocks synchronized!

MariaDB Configuration

Before starting the Galera configuration, take a look at the InnoDB configuration so that Galera Cluster works properly. If you have already tuned it, adapt these values to your configuration:

/etc/mysql/my.cnf
[mysqld]
innodb_buffer_pool_size = 256M
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
thread_concurrency = 64
innodb_thread_concurrency = 64
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table = 1
innodb_open_files = 400
innodb_io_capacity = 600
innodb_lock_wait_timeout = 60
innodb_flush_method = O_DIRECT
innodb_doublewrite = 0
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_restore_at_startup = 500

To apply your configuration, you have to remove the "ib_logfile*" files if you changed "innodb_log_file_size":

Restart MariaDB
# service mysql stop
# rm /var/lib/mysql/ib_logfile*
# service mysql start

Galera Configuration

For the Galera configuration, there is a dedicated file in the mysql folder. This part is the InnoDB configuration required to make Galera Cluster work properly.

Placing these InnoDB options in that file overrides the ones present in my.cnf. This avoids errors if my.cnf is changed later:

/etc/mysql/conf.d/mariadb.cnf
[mysqld]
binlog_format = ROW
innodb_autoinc_lock_mode = 2
innodb_flush_log_at_trx_commit = 2
innodb_locks_unsafe_for_binlog = 1
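
Since these four settings are easy to lose during later edits, a small check can confirm that they are still present. This is only a sketch: `missing_galera_settings` is a hypothetical helper, and it only looks for plain `key = value` lines in a single file (it does not resolve includes or command-line overrides).

```shell
# Hypothetical helper: print every required setting that is missing from
# (or has a different value in) the configuration file given as $1.
missing_galera_settings() {
    conf=$1
    for pair in "binlog_format=ROW" \
                "innodb_autoinc_lock_mode=2" \
                "innodb_flush_log_at_trx_commit=2" \
                "innodb_locks_unsafe_for_binlog=1"; do
        key=${pair%%=*}
        value=${pair#*=}
        # Accept optional spaces around "=" and at both ends of the line.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*${value}[[:space:]]*$" "$conf"; then
            echo "$key"
        fi
    done
}

# Usage:
#   missing_galera_settings /etc/mysql/conf.d/mariadb.cnf
```

No output means all four settings were found with the expected values.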

All configuration settings starting with "wsrep_*" belong to Galera Cluster.

/etc/mysql/conf.d/mariadb.cnf
[mysqld]
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = 'mariadb_cluster'
wsrep_node_name = node1
wsrep_node_address = "10.0.0.1"
wsrep_cluster_address = 'gcomm://10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4'
wsrep_retry_autocommit = 0
wsrep_sst_method = rsync
wsrep_provider_options = "gcache.size = 1G; gcache.name = /tmp/galera.cache"
#wsrep_replicate_myisam = 1
#wsrep_sst_receive_address = <x.x.x.x>
#wsrep_notify_cmd = "script.sh"

There are not many options, but they are very important.
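
Only the node name and node address differ from one server to the next; the cluster address is identical everywhere. As a sketch of how the per-node part could be generated, the hypothetical helper `wsrep_node_settings` below prints the three address-related lines for a given node. The node names and IP addresses are the example values from the configuration above.

```shell
# Hypothetical helper: print the node-specific wsrep settings.
#   $1 = node name, $2 = this node's IP, $3 = comma-separated IPs of all nodes
wsrep_node_settings() {
    printf 'wsrep_node_name = %s\n' "$1"
    printf 'wsrep_node_address = "%s"\n' "$2"
    printf "wsrep_cluster_address = 'gcomm://%s'\n" "$3"
}

# Settings for the second node of the example cluster
wsrep_node_settings node2 10.0.0.2 10.0.0.1,10.0.0.2,10.0.0.3,10.0.0.4
```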

- wsrep_cluster_name: sets the cluster name (needed if you have multiple Galera Clusters in the same subnet)
- wsrep_node_name: the node name (as a rule, use the server hostname)
- wsrep_node_address: the address of this node
- wsrep_cluster_address: the list of all cluster nodes
- wsrep_retry_autocommit: the number of times to retry an autocommit transaction that failed because of a cluster-wide conflict (0 disables retries)
- wsrep_sst_method: the method used to transfer a full copy of the data to a joining node
  - xtrabackup: a fast solution that minimises the blocking time of the source node (donor). In most cases this is the most appropriate solution.
  - rsync: the fastest solution, but it blocks the source node (donor) for longer (this can be problematic in a WAN architecture if bandwidth is the bottleneck)
  - mysqldump: the slowest solution (avoid it)

- wsrep_provider_options: provides other useful options
  - gcache.size: size of the Galera cache used for transfers between cluster nodes. Increase it under heavy usage
  - gcache.name: where to store this cache

Galera Cluster Introduction: Initialization

Plan

3 Initialization
- Initialize Galera Cluster
- Galera status

Initialize Galera Cluster

The first thing to do is to stop all your MariaDB instances and start only the first node, like this:

Initialize Galera Cluster
# service mysql start --wsrep_cluster_address='gcomm://'

Specifying an empty gcomm address initializes a new cluster.

DANGER

Never initialize a new cluster on a running one! You may lose data.

Now start all the other MariaDB services normally; they will join the cluster by themselves:

Start MariaDB
# service mysql start

Galera status

Consider this sample output:

Get Galera status
MariaDB [(none)]> show status like 'wsrep_%';
+----------------------------+-------------------------------------------+
| Variable_name              | Value                                     |
+----------------------------+-------------------------------------------+
| wsrep_local_send_queue_avg | 0.000000                                  |
| wsrep_local_recv_queue_avg | 0.000000                                  |
| wsrep_flow_control_paused  | 0.000000                                  |
| wsrep_local_state_comment  | Synced                                    |
| wsrep_incoming_addresses   | 10.0.0.1:3306,10.0.0.2:3306,10.0.0.3:3306 |
| wsrep_cluster_size         | 3                                         |
| wsrep_cluster_status       | Primary                                   |
| wsrep_connected            | ON                                        |
| wsrep_ready                | ON                                        |
+----------------------------+-------------------------------------------+

It shows the most important information about your Galera Cluster status.

- wsrep_local_send_queue_avg: average length of the send queue since the last status query. When the cluster experiences network throughput issues or replication throttling, this value is significantly bigger than 0.
- wsrep_local_recv_queue_avg: average length of the receive queue since the last status query. When this number is bigger than 0, the node can’t apply writesets as fast as they are received. This can be a sign that the node is overloaded, and it will cause replication throttling.
- wsrep_flow_control_paused: fraction of the time since the last status query during which replication was paused due to flow control.
- wsrep_local_state_comment: current node status. Possible values:
  - Joining (requesting/receiving State Transfer): the node is currently joining the cluster
  - Donor/Desynced: the node is the donor for a node joining the cluster
  - Joined: the node has joined the cluster
  - Synced: the node is synced with the cluster

- wsrep_incoming_addresses: comma-separated list of incoming node addresses in the cluster.
- wsrep_cluster_size: current number of nodes in the cluster.
- wsrep_cluster_status: status of the cluster component this node belongs to. Possible values:
  - Primary: the node belongs to the primary component (the part of the cluster that has quorum)
  - Non-primary: the node belongs to a component without quorum
  - Disconnected: the node is not connected to the cluster
- wsrep_connected: network connectivity for Galera replication
- wsrep_ready: whether the node is ready to handle SQL transactions
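
The variables described above lend themselves to a simple automated check. The sketch below reads tab-separated `name<TAB>value` lines on standard input, which is what the `mysql` client prints in batch mode, and reports anything that differs from the healthy sample output. `check_wsrep_status` is a hypothetical helper, and treating every state other than Synced as a problem is a simplifying assumption.

```shell
# Hypothetical helper: read "name<TAB>value" lines on stdin and report
# status values that differ from a healthy, synced node.
check_wsrep_status() {
    awk -F'\t' '
        { status[$1] = $2 }
        END {
            problems = 0
            if (status["wsrep_cluster_status"] != "Primary") {
                print "cluster status is not Primary"; problems++
            }
            if (status["wsrep_connected"] != "ON") {
                print "node is not connected"; problems++
            }
            if (status["wsrep_ready"] != "ON") {
                print "node is not ready"; problems++
            }
            if (status["wsrep_local_state_comment"] != "Synced") {
                print "node is not Synced"; problems++
            }
            if (problems == 0) print "OK"
            exit (problems > 0)
        }'
}

# Demonstration with values taken from the sample output
printf 'wsrep_cluster_status\tPrimary\nwsrep_connected\tON\nwsrep_ready\tON\nwsrep_local_state_comment\tSynced\n' \
    | check_wsrep_status

# Against a live server (assumes client credentials are already configured):
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_%'" | check_wsrep_status
```

The function exits with a non-zero status when it finds a problem, so it can be used directly in a monitoring script.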

Another way to check the status of Galera is to run this script (thanks to fridim):

Get Galera status
# galera-status

NODE STATUS
cluster status: Primary
cluster size: 3
Ready: ON
connected: ON
state comment: Synced
--------------------------------------------------------
REPLICATION HEALTH (The lower the better)
fraction replication pause: 0.000000
flow control sent: 0
local send queue average: 0.000000
local receive queue average: 0.004253
--------------------------------------------------------
CLUSTER INTEGRITY (should be the same on all nodes)
local state UUID: 05745e78-989f-11e2-0800-aa6f19ca749c
cluster conf ID: 1371

Galera Cluster Introduction: Recover and Troubleshoot

Plan

4 Recover and Troubleshoot
- High load traffic
- Full reboot

High load traffic

Under heavy traffic on your servers, wsrep_flow_control_paused can grow up to 1. This is generally due to an overload of outgoing traffic (wsrep_local_send_queue_avg) or incoming traffic (wsrep_local_recv_queue_avg).

If one of your Galera nodes takes a long time to answer the others and slows down your cluster, follow these steps:

- Look at the currently running queries to find out why the node is slow
- Look at the Galera logs to see what is happening and try to correct it manually
- Take this node out of the load balancer to stop incoming traffic and let it finish its operations more smoothly
- If all the nodes are too slow because of one node, consider restarting MariaDB on that node to return to a normal state.
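
As a first triage step, the three variables mentioned above can be compared automatically. The sketch below reads the same tab-separated status lines that the `mysql` client prints in batch mode. `diagnose_flow_control` is a hypothetical helper, and the default alert threshold of 0.1 (replication paused 10% of the time) is an arbitrary assumption to adjust for your workload.

```shell
# Hypothetical helper: read "name<TAB>value" status lines on stdin and say
# whether flow control is pausing replication, and which queue is larger.
#   $1 = alert threshold for wsrep_flow_control_paused (assumed default: 0.1)
diagnose_flow_control() {
    awk -F'\t' -v limit="${1:-0.1}" '
        { value[$1] = $2 + 0 }
        END {
            if (value["wsrep_flow_control_paused"] <= limit) {
                print "flow control OK"
            } else if (value["wsrep_local_recv_queue_avg"] >= value["wsrep_local_send_queue_avg"]) {
                print "paused: receive queue is the larger one (node applies writesets too slowly)"
            } else {
                print "paused: send queue is the larger one (outgoing replication is congested)"
            }
        }'
}

# Demonstration with values taken from the sample status output
printf 'wsrep_flow_control_paused\t0.000000\nwsrep_local_send_queue_avg\t0.000000\nwsrep_local_recv_queue_avg\t0.004253\n' \
    | diagnose_flow_control

# Against a live server (assumes client credentials are already configured):
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_%'" | diagnose_flow_control 0.2
```

Run on each node, this helps identify the slow node that the steps above should then be applied to.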

Full reboot

If for any reason you need to completely restart your Galera cluster, you need to start it as described above for the initialization.

That means you have to initialize one node:

Initialize Galera Cluster
# service mysql start --wsrep_cluster_address='gcomm://'

and start all the others normally:

Start MariaDB
# service mysql start
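
The procedure above does not say which node to initialize first. A common precaution, stated here as an assumption rather than as part of the original procedure, is to pick the node holding the most recent data. Galera records each node's last known position as a `seqno:` line in the `grastate.dat` file of the MariaDB data directory (`/var/lib/mysql` in the commands above). The helper names and the idea of copying the files to one place for comparison are hypothetical.

```shell
# Hypothetical helper: print the seqno recorded in a grastate.dat file ($1).
saved_seqno() {
    awk '$1 == "seqno:" { print $2 }' "$1"
}

# Hypothetical helper: given "name:path" pairs, print the name of the node
# whose grastate.dat copy holds the highest seqno.
most_advanced_node() {
    best_name=""
    best_seqno=""
    for pair in "$@"; do
        name=${pair%%:*}
        path=${pair#*:}
        seqno=$(saved_seqno "$path")
        if [ -z "$best_seqno" ] || [ "$seqno" -gt "$best_seqno" ]; then
            best_name=$name
            best_seqno=$seqno
        fi
    done
    echo "$best_name"
}

# Usage, after copying each node's /var/lib/mysql/grastate.dat locally:
#   most_advanced_node node1:./node1.dat node2:./node2.dat node3:./node3.dat
```

A seqno of -1 means the node did not shut down cleanly; in that case this simple comparison is not reliable and the position has to be recovered by other means.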
