37
2012 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN TechEvent April 2013 Galera Cluster Synchronous Multi-Master Replication for MySQL HA Ludovico CALDARA LS-IMS 27.04.2013 27.04.2013 Galera Cluster Synchronous Multi-Master Replication for MySQL HA 1

Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

Embed Size (px)

DESCRIPTION

Some slides I've presented at TechEvent 04.2013 and LinuxDay 2013

Citation preview

Page 1: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

TechEventApril 2013

Galera ClusterSynchronous Multi-Master Replication for MySQL HA

Ludovico CALDARA

LS-IMS

27.04.2013

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

1

Page 2: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

MySQL forks: which one is better?

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

2

� New forks

� Many new features

� Improved instrumentation

� New solutions for DEVs and DBAs

� Fast-paced competition between forks’ developers

� Recent evolutions in HA and scalability have made MySQL enterprise ready

MySQL Oracle MySQL

Percona Server

MariaDB

Drizzle

Page 3: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

There is no recipe that can satisfy all tastes

Percona Server MariaDB MySQL

Multi source replication NO YES (rel. 10) NO

NoSQL integration YES (cassandra) YES (cassandra) YES (memcached)

Virtual Columns NO YES NO

Improved diagnostics YES NO NO

Online DDL NO YES YES

Galera Cluster YES YES YES (codership patch)

Many many others YES/NO YES/NO YES/NO

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

3

Page 4: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Your real requirements will let you choose… Need HA?

• How will react your customer if there is an important loss of service?

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

4

Page 5: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

MySQL Multi-Master Replication• Complex and not reliable• Concurrent writes lead to logical corruption

Native MySQL Replication• Doesn’t scale writes• Complex to promote slaves

Old-school solutions have weaknesses

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

5

DRBD Replication• Standby is offline, doesn’t scale at all• Poor performance

NDB

NDB

NDB

MySQL Cluster• Very complex• It’s not InnoDB!

Page 6: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Continuent Tungsten Replicator• Similar to Golden Gate• Heterogeneous databases• Provides complex topologies• Asynchronous• Conflicts are complex to resolve• Complex to maintain• Not free

New school solutions: 3rd parties are playing a decisive role

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

6

ORACLE

MYSQL

MYSQL

ORACLEMYSQL

Galera Cluster Replication• Transparent Multi-Master easy to mantain• (Virtually) Synchronous• It’s InnoDB (only InnoDB)• Great and easy scalability• Optimistic locking (side effects)• At least 3 nodes for good HA

Page 7: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Multi-Master and virtually synchronous: it’s transparent

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

7

R/W R/W R/W R/W R/W

Page 8: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Cluster implementation - Ingredients

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

8

• One or more standalone servers (either physical or virtual)

• Linux (other operating systems are not yet available)

• “Permissive” Firewall between nodes

• Codership’s Galera Library package

• A package of your choice:

• Percona XtraDB Cluster

• MariaDB Galera Cluster

• MySQL with wsrep patch (patched by Codership)

Page 9: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Cluster implementation - Variables

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

9

• Each server’s my.cnf must contain:

• wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x

• wsrep_provider=/usr/lib64/libgalera_smm.so

• binlog_format=ROW

• default_storage_engine=InnoDB

• innodb_autoinc_lock_mode=2

• innodb_locks_unsafe_for_binlog=1 #disables gap locking

Page 10: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Cluster implementation – Start the cluster

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

10

mysqld_safe --wsrep_cluster_address=gcomm:// &

[…]130220 17:56:46 [Note] WSREP: Starting new group from scratch : […]

The empty gcomm:// address starts the node as the first of the cluster

NEVER USE IT TO JOIN AN EXISTING CLUSTER

Page 11: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Cluster implementation – Adding nodes to the cluster

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

11

mysqld_safe \

--wsrep_cluster_address=gcomm://host1,host2… &

[…]130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…)130220 18:01:56 [Note] WSREP: State transfer required: […]

The address should be already present in the my.cnf!

Page 12: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Server State Transfer

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

12

• The joiner asks for a SST

R/W R/W R/W

Page 13: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Server State Transfer

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

13

• The joiner asks for a SST

R/W R/W

• The cluster chooses a donor, the donor is taken offline

DONOR

Page 14: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Server State Transfer

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

14

• The joiner asks for a SST

R/W R/W R/W

• The cluster chooses a donor, the donor is taken offline

DONOR

• The donor is backed up

• The donor comes online again and the joiner is loaded

Page 15: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Server State Transfer

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

15

• The joiner asks for a SST

R/W R/W R/W

• The cluster chooses a donor, the donor is taken offline

DONOR

• The donor is backed up

• The donor comes online again and the joiner is loaded

• The joiner replays the missing transactions and joins the cluster

R/W

Page 16: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Server State Transfer

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

16

• The joiner asks for a SST

R/W R/W R/W

• The cluster chooses a donor, the donor is taken offline

• The donor is backed up

• The donor comes online again and the joiner is loaded

• The joiner replays the missing transactions and joins the cluster

R/W• The cluster can also do

Incremental State Transfers (IST)

Page 17: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Split-Brain

• The majority of nodes wins

• Complete loss of network: all nodes go offline

• The offline nodes will respond:mysql> select * from emp;

ERROR 1047 (08S01): Unknown command

• Galera arbitrator (garbd) can join the cluster and count as a member in split brain resolution.

• NEW: Galera 2.4 intruduces weighted quorum

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

17

gararbitrator

Page 18: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

WAN

Example 1: Arbitrator in Trivadis Swiss

… sorry for German/Austrian attenders ☺

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

18

arbitrator

ZURICH

BASEL

BERN

LAUSANNE

• If the WAN connection is lost, Zurich survives

• If the Zurich site is lost, the cluster will be off lined

Page 19: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

WAN

Example 2: Arbitrator in Trivadis Swiss

… sorry for German/Austrian attenders ☺

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

19

arbitrator

ZURICH

BASEL

BERN

LAUSANNE

• If the Zurich site is lost, the other sites survive

• If the WAN connection is lost, the cluster will be off lined

Page 20: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

What does “Virtually synchronous” mean? In brief:

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

20

Write

Page 21: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

What does “Virtually synchronous” mean? In brief:

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

21

Write

Commit

WS

Page 22: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

What does “Virtually synchronous” mean? In brief:

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

22

Write

Commit

WS WS WS

Page 23: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

What does “Virtually synchronous” mean? In brief:

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

23

Write

Commit

WS WS

Commit OK

Page 24: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

What does “Virtually synchronous” mean? In brief:

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

24

Write

Commit

WS WS

Commit OK

• Writes are as fast as if they were local

• Commits take just the time of a network roundtrip: if acceptable then the cluster can be spread geographically

Page 25: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Optimistic locking leads to side effects

mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

25

Page 26: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Optimistic locking leads to side effects

mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

26

mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0

Page 27: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Optimistic locking leads to side effects

mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

27

mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> commit ;

WS

Query OK , 0 rows affected (0.01 sec

WS

Page 28: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Optimistic locking leads to side effects

mysql> update emp set salary=‘peanuts’ where name=‘Caldara' ;Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

28

mysql> update emp set salary=‘one billion' where name=‘Cald ara';Query OK , 1 row affected (0.03 sec)Rows matched: 1 Changed: 1 Warnings: 0mysql> commit ;

mysql> commit ;ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Query OK , 0 rows affected (0.01 sec

WS

mysql> select salary from emp where name=‘Caldara’;+-------------+| salary |+-------------+| one billion |+-------------+

Page 29: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Conclusions on optimistic locking…

• Cluster-wise, the first that broadcasts its commit wins (it’s Galera…)

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

29

• The application should not have hotspots...

• … or it should retry the transaction after the deadlock occurs…

• … or, for each database, you can elegy one node as the master

• Locally, the first that acquires the lock wins (it’s InnoDB…)

Page 30: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

About performance

• Commit performance loss is between 5% and 10% plus the network RTT

• Write workloads scale to up to 8 nodes

• >8 nodes: it scales reads, not writes

• Many benchmarks show that Galera overcomes NDB with few nodes

• NDB scales out more with many nodes thanks to data sharding

• Benchmarks on internet are not always reliable… test the performance of YOUR application

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

30

Page 31: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

How to migrate

• Converts all your tables to InnoDB

• Double-check that all tables have primary keys

• Think about potential problems caused by triggers (if you have any)

• Create a new empty Galera Cluster

• Setup MySQL native replication between the old database and the Galera cluster

• Once all is aligned, direct your clients on the new cluster

• Setup the old node to join the cluster

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

31

NATIVEREPLICATION

JOIN

Page 32: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Load balancing

• HAProxy is the most used solution so far

• Codership is actively developing his own load balancer: Galera Load Balancer (glbd)

• Several balancing modes: round robin, custom, least connected, …

• Automatically drains disconnected nodes

• New nodes can be added with a single tcpcall

• Release 1.0 (now rc1) will support watchdog and automatic discover of nodes composing the cluster

• Other methods possible (e.g. java connector properties, HW load balancer)

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

32

Page 33: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Conclusions on Galera Cluster

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

33

• At least 3 nodes to have good HA

• Optimistic locking (side effects)

• Explicit locking doesn’t work

• Only InnoDB is replicated

• Primary keys are mandatory

• Not yet available for MySQL 5.6

• Linux only

• Multi-master

• Easy to add/remove nodes

• Multi-master

• Shared-nothing

• Great performances and scalability

• «Virtually» synchronous

• It uses InnoDB!!

• Conflict prevention

• Split-brain (no inconsistencies)

• Easy to add/remove nodes

Page 34: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Links

http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-in-stuttgart-2013

http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf

http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-a-node-crash-in-percona-xtradb-cluster/

http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster

http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster

http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-and-unexpected-deadlocks/

http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/

http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/

http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html

http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf

http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

34

Page 35: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

Little demo?

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

35

Page 36: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

?

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

36

Page 37: Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

2012 © Trivadis

BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

THANK YOU.Trivadis SA

Ludovico CaldaraSenior Consultant

[email protected]

27.04.2013Galera Cluster Synchronous Multi-Master Replication for MySQL HA

37