
Page 1: Advanced Percona XtraDB Cluster in a nutshell... la suite

Advanced Percona XtraDB Cluster in a nutshell... la suite

Hands-on tutorial for advanced users!

1

Page 3: Advanced Percona XtraDB Cluster in a nutshell... la suite

Table of Contents

- Setting Up Environment
- Bootstrapping
- Certification Errors
- Replication Failures
- Galera Cache
- Flow Control
- Replication Throughput
- WAN Replication
- Consistent Reads
- Backups
- Load Balancers
- Challenges

3

Page 4: Advanced Percona XtraDB Cluster in a nutshell... la suite

Are you ready? Setting Up The Environment

4

Page 5: Advanced Percona XtraDB Cluster in a nutshell... la suite

Setting Up The Environment

Fetch a USB Stick, Install VirtualBox & Copy the 3 images

5

Page 6: Advanced Percona XtraDB Cluster in a nutshell... la suite

Testing The Environment

- Start all 3 VirtualBox images
- ssh/putty to:
  - pxc1: ssh root@localhost -p 8821
  - pxc2: ssh root@localhost -p 8822
  - pxc3: ssh root@localhost -p 8823
- root password is vagrant
- HAProxy is running on pxc1 (http://localhost:8881/)
- Verify ssh between nodes
- Open 2 ssh sessions to every node

6

Page 7: Advanced Percona XtraDB Cluster in a nutshell... la suite

Attention - Hands On!

When you see the icon in the bottom right, there is an exercise that you should do!

7

Page 8: Advanced Percona XtraDB Cluster in a nutshell... la suite

Easy, no? Bootstrap A Cluster

8

Page 9: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC

You all should know this already...

# service mysql bootstrap-pxc
# /etc/init.d/mysql bootstrap-pxc
# /etc/init.d/mysql start --wsrep-new-cluster

or with systemd environments like CentOS 7:

# systemctl start mysql@bootstrap.service

Today we are using 32-bit CentOS 6.

For Emily, just to be complete....

pxc1# service mysql bootstrap-pxc
pxc2# service mysql start
pxc3# service mysql start

9

Page 10: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC

- Bootstrapping a node gives it permission to form a new cluster
- Bootstrapping should NOT happen automatically without a system with split-brain protection that can coordinate it. Usually this is done manually
- The bootstrapped node is the source of truth for all nodes going forward

10

Page 11: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC
Recap On IST/SST

- IST: Incremental State Transfer
  - Only transfers the missing transactions
- SST: State Snapshot Transfer
  - Snapshots the whole database and transfers it, using:
    - Percona XtraBackup
    - rsync
    - mysqldump
- One node of the cluster is the DONOR
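The donor's choice between IST and SST can be reduced to a simple rule: IST is only possible when every writeset the joiner is missing is still in the donor's gcache. A toy sketch of that decision (the function and names here are ours, not Galera's actual API):

```python
# Toy model of the IST-vs-SST decision (illustrative only; the real
# logic lives inside Galera). A joiner can receive IST only if every
# writeset it is missing is still present in the donor's gcache.

def choose_transfer(joiner_seqno, gcache_oldest_seqno):
    """Return 'IST' if the joiner only misses writesets still held in
    the donor's gcache, otherwise 'SST' (full snapshot)."""
    if joiner_seqno < 0:            # seqno -1 means no usable local state
        return "SST"
    if joiner_seqno + 1 >= gcache_oldest_seqno:
        return "IST"                # all missing writesets are cached
    return "SST"

print(choose_transfer(1000, 900))  # IST: gcache still holds seqno 900+
print(choose_transfer(100, 900))   # SST: writesets 101..899 already purged
print(choose_transfer(-1, 900))    # SST: node has no state at all
```

This is also why gcache sizing (covered later) matters: the larger the gcache, the longer a node can be away and still rejoin with IST.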

11

Page 12: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC
With Stop/Start Of MySQL

When you need to start a new cluster from scratch, you decide which node to start with and you bootstrap it:

# /etc/init.d/mysql start --wsrep-new-cluster

That node becomes the cluster's source of truth (SSTs for all new nodes)

12

Page 13: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC
Without Restarting MySQL

When a cluster is already partitioned and you want to bring it up again:

- 1 or more nodes need to be in Non-Primary state.
- Choose the node that is newest and can be enabled (to work with the application)
- To bootstrap online:
  mysql> set global wsrep_provider_options="pc.bootstrap=true";
- Be sure there is NO OTHER PRIMARY partition or there will be a split brain!!

13

Page 14: Advanced Percona XtraDB Cluster in a nutshell... la suite

Bootstrapping PXC
Without Restarting MySQL

Use Case:

- Only 1 of the 3 nodes is available and the other 2 nodes crashed, causing node 1 to go Non-Primary.
- In Multi Datacenter environments:
  - DC1 has 2 nodes, DC2 has 1 node.
  - If DC1 dies, the single node in DC2 will go Non-Primary. To activate the secondary DC, a bootstrap is necessary.

14

Page 15: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Cleanly Shutdown Cluster

- Run the application (run_app.sh haproxy-all) on pxc1
- One by one, stop mysql on all 3 nodes

How can you know which node to bootstrap?

15

Page 16: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Cleanly Shutdown Cluster

- Run the application (run_app.sh haproxy-all) on pxc1
- One by one, stop mysql on all 3 nodes

How can you know which node to bootstrap?

Solution

# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 3759f5c0-56f6-11e5-ad87-afbd92f4dcd2
seqno: 1933471
cert_index:

Bootstrap the node with the highest seqno and start the other nodes.
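Comparing the seqno values across nodes can be scripted; a minimal sketch (the helper below is hypothetical, the grastate.dat format is the one shown above):

```python
# Hypothetical helper: given each node's grastate.dat contents, pick
# the node with the highest seqno as the bootstrap candidate.

def pick_bootstrap_node(grastates):
    """grastates: dict mapping node name -> grastate.dat text."""
    def seqno(text):
        for line in text.splitlines():
            if line.startswith("seqno:"):
                return int(line.split(":", 1)[1])
        return -1  # no seqno line: treat as unusable state
    return max(grastates, key=lambda node: seqno(grastates[node]))

states = {
    "pxc1": "version: 2.1\nuuid: 3759f5c0\nseqno: 1933471\ncert_index:\n",
    "pxc2": "version: 2.1\nuuid: 3759f5c0\nseqno: 1933469\ncert_index:\n",
    "pxc3": "version: 2.1\nuuid: 3759f5c0\nseqno: 1933470\ncert_index:\n",
}
print(pick_bootstrap_node(states))  # pxc1
```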

16

Page 17: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Unclean Stopped Cluster

- Run the application (run_app.sh haproxy-all) on pxc1
- On all nodes at the same time run:
  # killall -9 mysqld mysqld_safe

How can you know which node has the latest commit?

17

Page 18: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Unclean Stopped Cluster

- Run the application (run_app.sh haproxy-all) on pxc1
- On all nodes at the same time run:
  # killall -9 mysqld mysqld_safe

How can you know which node has the latest commit?

Solution

# mysqld_safe --wsrep-recover
Logging to '/var/lib/mysql/error.log'.
Starting mysqld daemon with databases from /var/lib/mysql
WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.Lnv9dv'
WSREP: Recovered position 44e54b4b-5c69-11e5-83a3-8fc879cb495e:1719976
mysqld from pid file /var/lib/mysql/pxc1.pid ended
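The recovered position line can be parsed mechanically on every node to find the latest commit; a sketch (the regex and function are ours, the log format is the one shown above):

```python
import re

# Extract the uuid:seqno pair from `mysqld_safe --wsrep-recover` output.
# The colon after "position" is present in some versions and absent in
# others, so it is optional here.

def recovered_position(log_text):
    m = re.search(r"Recovered position:?\s+([0-9a-f-]{36}):(-?\d+)", log_text)
    return (m.group(1), int(m.group(2))) if m else None

log = "WSREP: Recovered position 44e54b4b-5c69-11e5-83a3-8fc879cb495e:1719976"
uuid, seqno = recovered_position(log)
print(uuid, seqno)  # 44e54b4b-5c69-11e5-83a3-8fc879cb495e 1719976
```

Run this against each node's recovery output and bootstrap the node with the highest seqno.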

18

Page 19: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Unclean Stopped Cluster

What methods can we use to bring back the cluster?

19

Page 20: Advanced Percona XtraDB Cluster in a nutshell... la suite

Recover Unclean Stopped Cluster

What methods can we use to bring back the cluster?

Solutions

- Bootstrap the most accurate server
- Since PXC 5.6.19-25.6 we have pc.recovery (enabled by default) that uses the information stored in gvwstate.dat. We can then just start MySQL on all 3 nodes at the same time:

# service mysql restart

20

Page 21: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

21

Page 22: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

Let's try: stop MySQL on pxc2, modify my.cnf and add foobar under the [mysqld] section.

Then start MySQL. Does it fail? Check /var/lib/mysql/error.log.

# /etc/init.d/mysql stop

# cat >> /etc/my.cnf << EOF
[mysqld]
foobar
EOF

# /etc/init.d/mysql start

22

Page 23: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

Fix the error (remove the foobar configuration) and restart MySQL.

Does it perform SST?

Check /var/lib/mysql/error.log. Why?

23

Page 24: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

Fix the error (remove the foobar configuration) and restart MySQL.

Does it perform SST?

Check /var/lib/mysql/error.log. Why?

SST is done, as we can see in the error.log:

[Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (93a81eed-57b2-11e5-8f5e-82e53aab8d35): 1 (Operation not permitted)
...
WSREP_SST: [INFO] Proceeding with SST (20150916 19:13:50.990)

24

Page 25: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

So how can we avoid SST?

It's easy, you need to hack /var/lib/mysql/grastate.dat.

Create the error again:

- Bring the node back in the cluster
- Add foobar to the configuration again
- Start MySQL

25

Page 26: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 93a81eed-57b2-11e5-8f5e-82e53aab8d35
seqno: 1300762
cert_index:

26

Page 27: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 93a81eed-57b2-11e5-8f5e-82e53aab8d35
seqno: 1300762
cert_index:

When it fails due to an error, grastate.dat is reset to:

# GALERA saved state
version: 2.1
uuid: 00000000-0000-0000-0000-000000000000
seqno: -1
cert_index:

27

Page 28: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

You then need to set the right uuid and seqno in grastate.dat.

Run mysqld_safe --wsrep-recover to find the values to set:

[root@pxc2 mysql]# mysqld_safe --wsrep-recover
...
2015-09-16 19:26:14 6133 [Note] WSREP: Recovered position: 93a81eed-57b2-11e5-8f5e-82e53aab8d35:1300762

28

Page 29: Advanced Percona XtraDB Cluster in a nutshell... la suite

Avoiding SST

When MySQL cannot start due to an error, such as a configuration error, an SST is always performed.

Create grastate.dat with info from wsrep-recover:

# GALERA saved state
version: 2.1
uuid: 93a81eed-57b2-11e5-8f5e-82e53aab8d35
seqno: 1300762
cert_index:

Start MySQL again and check /var/lib/mysql/error.log:

[root@pxc2 mysql]# /etc/init.d/mysql start
...
150916 19:27:53 mysqld_safe Assigning 93a81eed-57b2-11e5-8f5e-82e53aab8d35:1300762 to wsrep_start_position
...
WSREP_SST: [INFO] xtrabackup_ist received from donor: Running IST (20150916 19:34:05.545
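The grastate.dat repair can be scripted from the recovered position; a minimal sketch (the helper name and /tmp path are ours; in practice the file lives in the datadir and must be owned by mysql:mysql before starting MySQL):

```python
# Rebuild grastate.dat from the recovered uuid:seqno so the node comes
# back with IST instead of a full SST. Format follows the slides.

GRASTATE_TEMPLATE = """\
# GALERA saved state
version: 2.1
uuid: {uuid}
seqno: {seqno}
cert_index:
"""

def write_grastate(path, uuid, seqno):
    with open(path, "w") as f:
        f.write(GRASTATE_TEMPLATE.format(uuid=uuid, seqno=seqno))

# Writing to /tmp here for illustration; the real file is
# /var/lib/mysql/grastate.dat.
write_grastate("/tmp/grastate.dat",
               "93a81eed-57b2-11e5-8f5e-82e53aab8d35", 1300762)
print(open("/tmp/grastate.dat").read())
```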

29

Page 30: Advanced Percona XtraDB Cluster in a nutshell... la suite

When putting in production unprepared... Certification Errors

30

Page 31: Advanced Percona XtraDB Cluster in a nutshell... la suite

Certification

What it does:

- Determines if a writeset can be applied
  - Based on unapplied earlier transactions on the master
  - Such conflicts must come from other nodes
- Happens on every node, individually
- Deterministic
  - Results are not reported to other nodes in the cluster, as every node does certification and it is a deterministic process
- Pass: enter apply queue (commit success on master)
- Fail: drop transaction (or return deadlock on master)
- Serialized by GTID
- Cost based on # of keys or # of rows
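A toy model of the certification check (our own simplification: writesets are reduced to key sets, and a writeset fails when its keys overlap an earlier replicated-but-unapplied writeset):

```python
# Simplified certification: deterministic key-overlap test. Every node
# runs the same check on the same ordered writesets, so the outcome is
# identical cluster-wide without any extra messaging.

def certify(writeset_keys, unapplied_key_sets):
    for other in unapplied_key_sets:
        if writeset_keys & other:
            return False  # fail: drop (or deadlock error on the master)
    return True           # pass: writeset enters the apply queue

earlier = [{("test.deadlocks", 1)}]               # unapplied writeset on row 1
print(certify({("test.deadlocks", 1)}, earlier))  # False: same row, conflict
print(certify({("test.deadlocks", 2)}, earlier))  # True: disjoint keys
```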

31

Pages 32-38: Certification (a sequence of diagram-only slides; images not included in this export)

Page 39: Advanced Percona XtraDB Cluster in a nutshell... la suite

Conflict Detection

Local Certification Failure (lcf)
- Transaction fails certification
- Post-replication
- Deadlock/Transaction Rollback
- Status Counter: wsrep_local_cert_failures

Brute Force Abort (bfa) (Most Common)
- Deadlock/Transaction rolled back by applier threads
- Pre-commit
- Transaction Rollback
- Status Counter: wsrep_local_bf_aborts

39

Page 40: Advanced Percona XtraDB Cluster in a nutshell... la suite

Conflict Deadlock/Rollback

note: a Transaction Rollback can occur on any statement, including SELECT and COMMIT

Example:

pxc1 mysql> commit;ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

40

Page 41: Advanced Percona XtraDB Cluster in a nutshell... la suite

Multi-writer Conflict Types

Brute Force Abort (bfa)
- Transaction rolled back by applier threads
- Pre-commit
- Transaction Rollback can occur on any statement, including SELECT and COMMIT
- Status Counter: wsrep_local_bf_aborts

41

Pages 42-48: Brute Force Abort (bfa) (a sequence of diagram-only slides; images not included in this export)

Page 49: Advanced Percona XtraDB Cluster in a nutshell... la suite

Multi-writer Conflict Types

Local Certification Failure (lcf)
- Transaction fails certification
- Post-replication
- Deadlock on commit
- Status Counter: wsrep_local_cert_failures

49

Pages 50-61: Local Certification Failure (lcf) (a sequence of diagram-only slides; images not included in this export)

Page 62: Advanced Percona XtraDB Cluster in a nutshell... la suite

Conflict Detection: Exercises!

62

Page 63: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1

On pxc1, create a test table:

pxc1 mysql> CREATE TABLE test.deadlocks (
              i INT UNSIGNED NOT NULL PRIMARY KEY,
              j varchar(32),
              t datetime);
pxc1 mysql> INSERT INTO test.deadlocks VALUES (1, NULL, NULL);

Run myq_status on pxc1:

# myq_status wsrep
mycluster / pxc1 (idx: 1) / Galera 3.11(ra0189ab)
Cluster Node Repl Queue Ops Bytes Conflct Gcache Window Flow
cnf # Stat Laten Up Dn Up Dn Up Dn lcf bfa ist idx dst appl comm p_ms
12 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 1.8m 0 0 0
12 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 1.8m 0 0 0
12 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 1.8m 0 0 0

63

Page 64: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1

On pxc1:

pxc1 mysql> BEGIN;
pxc1 mysql> UPDATE test.deadlocks SET j='pxc1', t=now() WHERE i=1;

Before committing, go to pxc3:

pxc3 mysql> BEGIN;
pxc3 mysql> UPDATE test.deadlocks SET j='pxc3', t=now() WHERE i=1;
pxc3 mysql> COMMIT;

Now commit the transaction on pxc1:

pxc1 mysql> COMMIT;
pxc1 mysql> SELECT * FROM test.deadlocks;

64

Page 65: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1

It fails:

pxc1 mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

65

Page 66: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1

- Which commit succeeded?
- Is this a lcf or a bfa?
- How would you diagnose this error?

66

Page 67: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1

- Which commit succeeded? pxc3's, the first one that got into the cluster.
- Is this a lcf or a bfa? BFA
- How would you diagnose this error?

show global status like 'wsrep_local_bf%';
show global status like 'wsrep_local_cert%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| wsrep_local_bf_aborts     | 1     |
| wsrep_local_cert_failures | 0     |
+---------------------------+-------+

# myq_status wsrep
mycluster / pxc1 (idx: 1) / Galera 3.11(ra0189ab)
Wsrep Cluster Node Repl Queue Ops Bytes Conflct Gcache Window Flow
time P cnf # Stat Laten Up Dn Up Dn Up Dn lcf bfa ist idx dst appl comm p_ms
10:49:43 P 12 3 Sync 1.1ms 0 0 0 0 0.0 0.0 0 0 3 3
10:49:44 P 12 3 Sync 1.1ms 0 0 0 0 0.0 0.0 0 0 3 3
10:49:45 P 12 3 Sync 1.1ms 0 0 0 0 0.0 0.0 0 0 3 3
10:49:46 P 12 3 Sync 1.1ms 0 0 0 1 0.0 0.3K 0 1 4 3

67

Page 68: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1
Log Conflicts

pxc1 mysql> set global wsrep_log_conflicts=on;

*** Priority TRANSACTION:
TRANSACTION 7743569, ACTIVE 0 sec starting index read
MySQL thread id 2, OS thread handle 0x93e78b70, query id 1395484 System lock
*** Victim TRANSACTION:
TRANSACTION 7743568, ACTIVE 9 sec
MySQL thread id 89984, OS thread handle 0x82bb1b70, query id 1395461 localhost root cleaning up
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 80 page no 3 n bits 72 index PRIMARY of table test.deadlocks trx id 7743568 lock_mode X locks rec but not gap
2015-09-19 12:36:17 4285 [Note] WSREP: cluster conflict due to high priority abort for threads:
2015-09-19 12:36:17 4285 [Note] WSREP: Winning thread: THD: 2, mode: applier, state: executing, conflict: no conflict, seqno: 1824234 SQL: (null)
2015-09-19 12:36:17 4285 [Note] WSREP: Victim thread: THD: 89984, mode: local, state: idle, conflict: no conflict, seqno: -1 SQL: (null)

68

Page 69: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 1
Log Conflicts - Debug

pxc1 mysql> set global wsrep_debug=on;

[Note] WSREP: BF kill (1, seqno: 1824243), victim: (90473) trx: 7743601
[Note] WSREP: Aborting query: void
[Note] WSREP: kill IDLE for 7743601
[Note] WSREP: enqueuing trx abort for (90473)
[Note] WSREP: signaling aborter
[Note] WSREP: WSREP rollback thread wakes for signal
[Note] WSREP: client rollback due to BF abort for (90473), query: (null)
[Note] WSREP: WSREP rollbacker aborted thd: (90473 2649955184)
[Note] WSREP: Deadlock error for: (null)

69

Page 70: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 2

- rollback; all transactions on all mysql clients
- ensure SET GLOBAL wsrep_log_conflicts=on; on all nodes
- run myq_status wsrep on pxc1 and pxc2
- run run_app.sh lcf on pxc1 to reproduce a LCF
- check:
  - output of run_app.sh lcf
  - myq_status
  - /var/lib/mysql/error.log

70

Page 71: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 2

[root@pxc2 ~]# myq_status wsrep
mycluster / pxc2 (idx: 0) / Galera 3.12(r9921e73)
Wsrep Cluster Node Repl Queue Ops Bytes Conflct Gcache Window Flow
time P cnf # Stat Laten Up Dn Up Dn Up Dn lcf bfa ist idx dst appl comm p_ms
13:28:15 P 47 3 Sync 1.1ms 0 0 0 0 0.0 0.0 0 0 7433 101
13:28:16 P 47 3 Sync 1.1ms 0 409 0 4 0.0 1.1K 0 0 7436 5
13:28:17 P 47 3 Sync 1.1ms 0 947 0 0 0.0 0.0 0 0 7436 5
13:28:18 P 47 3 Sync 1.1ms 0 1470 0 0 0.0 0.0 0 0 7436 5
13:28:19 P 47 3 Sync 1.1ms 0 1892 0 0 0.0 0.0 0 0 7436 5
13:28:20 P 47 3 Sync 1.1ms 0 2555 0 0 0.0 0.0 0 0 7436 5
13:28:21 P 47 3 Sync 1.1ms 0 3274 0 0 0.0 0.0 0 0 7436 5
13:28:22 P 47 3 Sync 1.1ms 0 3945 0 0 0.0 0.0 0 0 7436 5
13:28:23 P 47 3 Sync 1.1ms 0 4663 0 0 0.0 0.0 0 0 7436 5
13:28:24 P 47 3 Sync 1.1ms 0 5400 0 0 0.0 0.0 0 0 7436 5
13:28:25 P 47 3 Sync 1.1ms 0 6096 0 0 0.0 0.0 0 0 7436 5
13:28:26 P 47 3 Sync 1.1ms 0 6839 0 0 0.0 0.0 0 0 7436 5
13:28:27 P 47 3 Sync 1.1ms 0 6872 0 0 0.0 0.0 0 0 7436 5
13:28:28 P 47 3 Sync 1.1ms 0 6872 0 0 0.0 0.0 0 0 7436 5
13:28:29 P 47 3 Sync 1.1ms 0 6872 0 0 0.0 0.0 0 0 7436 5
13:28:30 P 47 3 Sync 1.1ms 0 5778 1 1102 0.3K 0.3M 0 0 8537 5
13:28:31 P 47 3 Sync 1.1ms 0 978 0 4838 0.0 1.4M 0 0 13k 5
13:28:32 P 47 3 Sync 1.1ms 0 0 1 985 0.3K 0.3M 2 0 14k 5
13:28:33 P 47 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 14k 5
13:28:34 P 47 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 14k 5
13:28:35 P 47 3 Sync N/A 0 0 0 0 0.0 0.0 0 0 14k 5

71

Page 72: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts - 2

*** Priority TRANSACTION:
TRANSACTION 7787747, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
1 lock struct(s), heap size 312, 0 row lock(s)
MySQL thread id 1, OS thread handle 0x93e78b70, query id 301870 System lock
*** Victim TRANSACTION:
TRANSACTION 7787746, ACTIVE 0 sec
mysql tables in use 1, locked 1
2 lock struct(s), heap size 312, 1 row lock(s), undo log entries 1
MySQL thread id 2575, OS thread handle 0x82369b70, query id 286919 pxc1 192.168
update test.test set sec_col = 0 where id = 1
[Note] WSREP: Winning thread: THD: 1, mode: applier, state: executing, conflict: no conflict, seqno: 1846028, SQL: (null)
[Note] WSREP: Victim thread: THD: 2575, mode: local, state: committing, conflict: no conflict, seqno: -1, SQL: update test.test set sec_col = 0 where id = 1
[Note] WSREP: BF kill (1, seqno: 1846028), victim: (2575) trx: 7787746
[Note] WSREP: Aborting query: update test.test set sec_col = 0 where id = 1
[Note] WSREP: kill trx QUERY_COMMITTING for 7787746
[Note] WSREP: trx conflict for key (1,FLAT8)258634b1 d0506abd: source: 5cb369ab-5eca-11e5-8151-7afe8943c31a version: 3 local: 1 state: MUST_ABORT flags: 1 conn_id: 2575 trx_id: 7787746 seqnos (l: 21977, g: 1846030, s: 1846027, d: 1838545, ts: 161299135504641) <--X--> source: 9b376860-5e09-11e5-ac17-e6e46a2459ee version: 3 local: 0 state: APPLYING flags: 1 conn_id: 95747 trx_id: 7787229

72

Page 73: Advanced Percona XtraDB Cluster in a nutshell... la suite

Reproducing Conflicts
Summary

Conflicts are a concern when determining if PXC is a fit for the application:

- Long running transactions increase the chance of conflicts
- Heavy write workload on multiple nodes
- Large transactions increase the chance of conflicts
- Mark Callaghan's law: a given row can't be modified more often than 1/RTT times a second

These issues can usually be resolved by writing to 1 node only.
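The law quoted above is simple arithmetic: one conflict-free update of a given row per round trip, because each commit must replicate before the next write to the same row can certify. A quick illustration:

```python
# Mark Callaghan's law: a given row can't be modified more often than
# 1/RTT times a second in a multi-writer cluster.

def max_row_updates_per_sec(rtt_ms):
    return 1000.0 / rtt_ms

print(max_row_updates_per_sec(1))    # LAN, 1 ms RTT   -> 1000.0 updates/s
print(max_row_updates_per_sec(100))  # WAN, 100 ms RTT -> 10.0 updates/s
```

This is why WAN replication (covered later) makes hot rows dramatically hotter.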

73

Pages 74-76: Reproducing Conflicts - Summary (diagram-only slides; images not included in this export)

Page 77: Advanced Percona XtraDB Cluster in a nutshell... la suite

What the ...? Replication Failures

77

Page 78: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
When Do They Happen?

- When a Total Order Isolation (TOI) error happened
  - DDL error: CREATE TABLE, ALTER TABLE...
  - GRANT failed
- When there was a node inconsistency
  - Bug in Galera replication
  - Human Error, for example skipping the binary log (SQL_LOG_BIN=0) when doing writes

78

Page 79: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
What Happens?

At every error:
- A GRA_*.log file is created in the MySQL datadir
  [root@pxc1 ~]# ls -alsh1 /var/lib/mysql/GRA_*
  -rw-rw----. 1 mysql 89 Sep 15 10:26 /var/lib/mysql/GRA_1_127792.log
  -rw-rw----. 1 mysql 83 Sep 10 12:00 /var/lib/mysql/GRA_2_5.log
- A message is written to the error log /var/lib/mysql/error.log
- It's possible to decode them, they are binary logs
- They can be safely removed

79

Page 80: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Reading GRA Contents

Run the application only on pxc1, using only pxc1 as writer:

# run_app.sh pxc1

pxc1 mysql> create table test.nocolumns;

What do you get?

80

Page 81: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Reading GRA Contents

Run the application only on pxc1, using only pxc1 as writer:

# run_app.sh pxc1

pxc1 mysql> create table test.nocolumns;

What do you get?

ERROR 1113 (42000): A table must have at least 1 column

81

Page 82: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Reading GRA Contents

Run the application only on pxc1, using only pxc1 as writer:

# run_app.sh pxc1

pxc1 mysql> create table test.nocolumns;

What do you get?

ERROR 1113 (42000): A table must have at least 1 column

Error Log Other Nodes:

[ERROR] Slave SQL: Error 'A table must have at least 1 column' on query. Default database: ''. Query: 'create table test.nocolumns', Error_code: 1113
[Warning] WSREP: RBR event 1 Query apply warning: 1, 1881065
[Warning] WSREP: Ignoring error for TO isolated action: source: 9b376860-5e09-11e5-ac17-e6e46a2459ee version: 3 local: 0 state: APPLYING flags: 65 conn_id: 106500 trx_id: -1 seqnos (l: 57292, g: 1881065, s: 1881064, d: 1881064, ts: 76501555632582)

82

Page 83: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Making a GRA Header File

note: Binary log headers only differ between versions; they can be reused

Get a binary log without checksums:

pxc2 mysql> set global binlog_checksum=0;
pxc2 mysql> flush binary logs;
pxc2 mysql> show master status;
+-----------------+----------+--------------+------------------+-------------------+
| File            | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-----------------+----------+--------------+------------------+-------------------+
| pxc2-bin.000018 |   790333 |              |                  |                   |
+-----------------+----------+--------------+------------------+-------------------+

Create the GRA_header file from the new binary log:

dd if=/var/lib/mysql/pxc2-bin.000018 bs=120 count=1 \
   of=/root/GRA_header

83

Page 84: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Reading GRA Contents

Join the header and one GRA_*.log file (note the seqno of 1881065):

cat /root/GRA_header /var/lib/mysql/GRA_1_1881065.log \
   >> /root/GRA_1_1881065-bin.log

View the content with mysqlbinlog:

mysqlbinlog -vvv /root/GRA_1_1881065-bin.log

84

Page 85: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Node Consistency Compromised

Delete some data while skipping the binary log completely:

pxc2 mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)
pxc2 mysql> delete from sbtest.sbtest1 limit 100;
Query OK, 100 rows affected (0.00 sec)

Repeat the DELETE until pxc2 crashes...

85

Page 86: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Node Consistency Compromised

Error:

[ERROR] Slave SQL: Could not execute Update_rows event on table sbtest.sbtest1; Can't find record in 'sbtest1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 540, Error_code: 1032
[Warning] WSREP: RBR event 3 Update_rows apply warning: 120, 1890959
[Warning] WSREP: Failed to apply app buffer: seqno: 1890959, status: 1 at galera/src/trx_handle.cpp:apply():351
Retrying 2th time ... Retrying 4th time ...
[ERROR] WSREP: Failed to apply trx: source: 9b376860-5e09-11e5-ac17-e6e46a2459ee version: 3 local: 0 state: APPLYING flags: 1 conn_id: 117611 trx_id: 7877213 seqnos (l: 67341, g: 1890959, s: 1890958, d: 1890841, ts: 78933399926835)
[ERROR] WSREP: Failed to apply trx 1890959 4 times
[ERROR] WSREP: Node consistency compromized, aborting...
...
[Note] WSREP: /usr/sbin/mysqld: Terminated.

86

Page 87: Advanced Percona XtraDB Cluster in a nutshell... la suite

Replication Failure
Node Consistency Compromised

[root@pxc2 ~]# cat /root/GRA_header /var/lib/mysql/GRA_1_1890959.log | \
   mysqlbinlog -vvv -
BINLOG '...MDQwNjUyMjMwMTQ=...'/*!*/;
### UPDATE sbtest.sbtest1
### WHERE
###   @1=3528 /* INT meta=0 nullable=0 is_null=0 */
###   @2=4395 /* INT meta=0 nullable=0 is_null=0 */
###   @3='01945529982-83991536409-94055999891-11150850160-46682230772-19159811582-78798392206-09488391775-93806504421-87794774637' /* STRING(120) meta=65144 nullable=0 is_null=0 */
###   @4='92814455222-06024456935-25380449439-64345775537-04065223014' /* STRING(60) meta=65084 nullable=0 is_null=0 */
### SET
###   @1=3528 /* INT meta=0 nullable=0 is_null=0 */
###   @2=4396 /* INT meta=0 nullable=0 is_null=0 */
###   @3='01945529982-83991536409-94055999891-11150850160-46682230772-19159811582-78798392206-09488391775-93806504421-87794774637' /* STRING(120) meta=65144 nullable=0 is_null=0 */
###   @4='92814455222-06024456935-25380449439-64345775537-04065223014' /* STRING(60) meta=65084 nullable=0 is_null=0 */
# at 660
#150919 15:56:36 server id 1 end_log_pos 595 Table_map: sbtest.sbtest1 mapped to number 70
# at 715
#150919 15:56:36 server id 1 end_log_pos 1005 Update_rows: table id 70 flags: STMT_END_F
...

87

Page 88: Advanced Percona XtraDB Cluster in a nutshell... la suite

Gcache... what's that? Galera Cache

88

Page 89: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache

All nodes contain a cache of recent writesets, used to perform IST:

- used to store the writesets in circular-buffer style

89

Page 90: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache

All nodes contain a cache of recent writesets, used to perform IST:

- used to store the writesets in circular-buffer style
- preallocated file with a specific size, configurable:
  wsrep_provider_options = "gcache.size=1G"
- default size is 128M

90

Page 91: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache

All nodes contain a cache of recent writesets, used to perform IST:

- used to store the writesets in circular-buffer style
- preallocated file with a specific size, configurable:
  wsrep_provider_options = "gcache.size=1G"
- default size is 128M
- Galera Cache is mmapped (I/O buffered to memory)
  - So the OS might swap (set vm.swappiness to 10)
  - use fincore-linux or dbsake fincore to see how much of the file is cached in memory

91

Page 92: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache

All nodes contain a cache of recent writesets, used to perform IST:

- used to store the writesets in circular-buffer style
- preallocated file with a specific size, configurable:
  wsrep_provider_options = "gcache.size=1G"
- default size is 128M
- Galera Cache is mmapped (I/O buffered to memory)
  - So the OS might swap (set vm.swappiness to 10)
  - use fincore-linux or dbsake fincore to see how much of the file is cached in memory
  - status counter wsrep_local_cached_downto to find the last seqno in the gcache
  - wsrep_gcache_pool_size shows the size of the page pool and/or dynamic memory allocated for gcache (since PXC 5.6.24)

92

Page 93: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache
Calculating Optimal Size

It would be great if we could hold 1 hour of changes in the Galera cache for IST.

How large does the Galera cache need to be?

93

Page 94: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache
Calculating Optimal Size

It would be great if we could hold 1 hour of changes in the Galera cache for IST.

How large does the Galera cache need to be?

We can calculate how many writes happen over time:

- wsrep_replicated_bytes: writesets sent to other nodes
- wsrep_received_bytes: writesets received from other nodes

94

Page 95: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache
Calculating Optimal Size

It would be great if we could hold 1 hour of changes in the Galera cache for IST.

How large does the Galera cache need to be?

We can calculate how many writes happen over time:

- wsrep_replicated_bytes: writesets sent to other nodes
- wsrep_received_bytes: writesets received from other nodes

SHOW GLOBAL STATUS LIKE 'wsrep_%d_bytes';
SELECT SLEEP(60);
SHOW GLOBAL STATUS LIKE 'wsrep_%d_bytes';

Sum up both replicated and received for each sample and subtract.
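The same arithmetic in script form (a sketch; the counter names are the real wsrep status variables, but the sampling and sizing target are up to you):

```python
# Extrapolate gcache churn to MB/hour from two counter samples taken
# interval_s seconds apart: sum replicated + received bytes, subtract,
# then scale to one hour.

def gcache_mb_per_hour(before, after, interval_s=60):
    delta_bytes = sum(after.values()) - sum(before.values())
    return delta_bytes / 1024 / 1024 * (3600 / interval_s)

before = {"wsrep_replicated_bytes": 10_000_000,
          "wsrep_received_bytes":   40_000_000}
after  = {"wsrep_replicated_bytes": 12_000_000,
          "wsrep_received_bytes":   43_000_000}
print(round(gcache_mb_per_hour(before, after)))  # 286
```

With ~286 MB of writesets per hour, a gcache.size of around 300M would cover one hour of IST headroom.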

95

Page 96: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache
Calculating Optimal Size

Easier to do is:

SELECT ROUND(SUM(bytes)/1024/1024*60) AS megabytes_per_hour
FROM (
  SELECT SUM(VARIABLE_VALUE) * -1 AS bytes
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME IN ('wsrep_received_bytes', 'wsrep_replicated_bytes')
  UNION ALL
  SELECT sleep(60) AS bytes
  UNION ALL
  SELECT SUM(VARIABLE_VALUE) AS bytes
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME IN ('wsrep_received_bytes', 'wsrep_replicated_bytes')
) AS COUNTED;
+--------------------+
| megabytes_per_hour |
+--------------------+
|                302 |
+--------------------+
1 row in set (1 min 0.00 sec)

96

Page 97: Advanced Percona XtraDB Cluster in a nutshell... la suite

Galera Cache: How Much Filesystem Cache Is Used?
Check the galera.cache file's memory usage using dbsake:

dbsake fincore /var/lib/mysql/galera.cache

/var/lib/mysql/galera.cache: total_pages=32769 cached=448 percent=1.37

97

Page 98: Advanced Percona XtraDB Cluster in a nutshell... la suite

Hey! Wait for me!
Flow Control

98

Page 99: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: Avoiding nodes drifting behind (like slave lag)

Any node in the cluster can ask the other nodes to pause writes if it lags behind too much.
Triggered by wsrep_local_recv_queue exceeding a node's gcs.fc_limit
Can cause all writes on all nodes in the entire cluster to be stalled.

99

Page 100: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control

(Pages 100 through 119 are image-only slides stepping through a flow control animation; they contain no text.)

100-119
Page 120: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: Status Counters

wsrep_flow_control_paused_ns: total nanoseconds the cluster was stalled, counted since the node started
wsrep_flow_control_recv: number of flow control messages received from other nodes
wsrep_flow_control_sent: number of flow control messages sent to other nodes
(wsrep_flow_control_paused: only use in Galera 2; % of the time the cluster was stalled since the last SHOW GLOBAL STATUS)
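Since wsrep_flow_control_paused_ns is cumulative, the useful number is the delta between two samples divided by the sampling interval. A minimal sketch with hypothetical sample values (in practice you would read them from SHOW GLOBAL STATUS 60 seconds apart):

```shell
# Hypothetical samples of wsrep_flow_control_paused_ns, taken 60s apart (assumptions).
ns_before=1200000000   # first sample: cumulative ns paused
ns_after=4200000000    # second sample, 60 seconds later
interval_s=60
# Fraction of the interval the cluster spent paused by flow control.
paused_pct=$(awk -v a="$ns_before" -v b="$ns_after" -v t="$interval_s" \
  'BEGIN { printf "%.1f", (b - a) / (t * 1e9) * 100 }')
echo "cluster paused ${paused_pct}% of the last ${interval_s}s"
```

Anything more than a few percent sustained usually means one node cannot keep up with the apply rate.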

120

Page 121: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: Observing Flow Control

Run the application (run_app.sh)
Run myq_status wsrep on all nodes.
Take a read lock on pxc3 and observe its effect on the cluster:
FLUSH TABLES WITH READ LOCK

121

Page 122: Advanced Percona XtraDB Cluster in a nutshell... la suite

pxc1

run_app.sh

all nodes:

myq_status wsrep

Flow Control: Observing Flow Control

Run the application (run_app.sh)
Run myq_status wsrep on all nodes.
Take a read lock on pxc3 and observe its effect on the cluster:
FLUSH TABLES WITH READ LOCK

pxc3 mysql> flush tables with read lock;
...wait until flow control kicks in...
pxc3 mysql> unlock tables;

122

Page 123: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: Increase the Limit
Increase the flow control limit on pxc3 to 20000 and perform the same exercise as before.

123

Page 124: Advanced Percona XtraDB Cluster in a nutshell... la suite

pxc1

run_app.sh

all nodes:

myq_status wsrep

Flow Control: Increase the Limit
Increase the flow control limit on pxc3 to 20000 and perform the same exercise as before.

pxc3 mysql> set global wsrep_provider_options="gcs.fc_limit=20000";

pxc3 mysql> flush tables with read lock;
...wait until flow control kicks in...
pxc3 mysql> unlock tables;

124

Page 125: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: Increase the Limit
What do you see?

A node can lag behind more before sending flow control messages.
This can be controlled per node.

Is there another alternative?

125

Page 126: Advanced Percona XtraDB Cluster in a nutshell... la suite

Flow Control: DESYNC Mode

It's possible to let a node fall behind the flow control limit.
This is done by setting wsrep_desync=ON

Try the same exercises but enable DESYNC on pxc3.

126

Page 127: Advanced Percona XtraDB Cluster in a nutshell... la suite

pxc1

run_app.sh

all nodes:

myq_status wsrep

Flow Control: DESYNC Mode

It's possible to let a node fall behind the flow control limit.
This is done by setting wsrep_desync=ON

Try the same exercises but enable DESYNC on pxc3.

pxc3 mysql> set global wsrep_provider_options="gcs.fc_limit=16";pxc3 mysql> set global wsrep_desync=on;

Don't forget when done:

pxc3 mysql> unlock tables;

127

Page 128: Advanced Percona XtraDB Cluster in a nutshell... la suite

How much more can we handle?
Max Replication Throughput

128

Page 129: Advanced Percona XtraDB Cluster in a nutshell... la suite

Max Replication Throughput
We can measure the write throughput of a node/cluster:

Put a node in wsrep_desync=on to avoid flow control messages being sent
Lock replication with FLUSH TABLES WITH READ LOCK
Wait and build up a queue for a certain amount of time
Unlock replication again with UNLOCK TABLES
Measure how fast it syncs up again
Compare with the normal workload

129

Page 130: Advanced Percona XtraDB Cluster in a nutshell... la suite

Max Replication Throughput: Measure
On pxc2 run:

show global status like 'wsrep_last_committed';
select sleep(60);
show global status like 'wsrep_last_committed';

One Liner:

SELECT ROUND(SUM(trx)/60) AS transactions_per_second
FROM (
  SELECT VARIABLE_VALUE * -1 AS trx
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME = 'wsrep_last_committed'
  UNION ALL
  SELECT SLEEP(60) AS trx
  UNION ALL
  SELECT VARIABLE_VALUE AS trx
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME = 'wsrep_last_committed'
) AS COUNTED;
+-------------------------+
| transactions_per_second |
+-------------------------+
|                     185 |
+-------------------------+

130

Page 131: Advanced Percona XtraDB Cluster in a nutshell... la suite

Max Replication Throughput: Measure
The following stored function is already installed:

USE test;
DROP FUNCTION IF EXISTS galeraWaitUntilEmptyRecvQueue;
DELIMITER $$
CREATE DEFINER=root@localhost FUNCTION galeraWaitUntilEmptyRecvQueue()
RETURNS INT UNSIGNED READS SQL DATA
BEGIN
  DECLARE queue INT UNSIGNED;
  DECLARE starttime TIMESTAMP;
  DECLARE blackhole INT UNSIGNED;
  SET starttime = SYSDATE();
  SELECT VARIABLE_VALUE AS trx INTO queue
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME = 'wsrep_local_recv_queue';
  WHILE queue > 1 DO /* we allow the queue to be 1 */
    SELECT VARIABLE_VALUE AS trx INTO queue
      FROM information_schema.GLOBAL_STATUS
     WHERE VARIABLE_NAME = 'wsrep_local_recv_queue';
    SELECT SLEEP(1) INTO blackhole;
  END WHILE;
  RETURN SYSDATE() - starttime;
END$$

131

Page 132: Advanced Percona XtraDB Cluster in a nutshell... la suite

Max Replication Throughput: Measure

SET GLOBAL wsrep_desync=on;
FLUSH TABLES WITH READ LOCK;
...wait until the queue rises to be quite high, about 20000...
UNLOCK TABLES;
USE test;
SELECT SUM(trx) AS transactions, SUM(duration) AS time,
       IF(SUM(duration) < 5,
          'DID NOT TAKE LONG ENOUGH TO BE ACCURATE',
          ROUND(SUM(trx)/SUM(duration))) AS transactions_per_second
FROM (
  SELECT VARIABLE_VALUE * -1 AS trx, NULL AS duration
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME = 'wsrep_last_committed'
  UNION ALL
  SELECT NULL AS trx, galeraWaitUntilEmptyRecvQueue() AS duration
  UNION ALL
  SELECT VARIABLE_VALUE AS trx, NULL AS duration
    FROM information_schema.GLOBAL_STATUS
   WHERE VARIABLE_NAME = 'wsrep_last_committed'
) AS COUNTED;
+--------------+------+-------------------------+
| transactions | time | transactions_per_second |
+--------------+------+-------------------------+
|        17764 |   11 |                    1615 |
+--------------+------+-------------------------+

132

Page 133: Advanced Percona XtraDB Cluster in a nutshell... la suite

Max Replication Throughput: Measure

Normal workload: 185 tps
During catch-up: 1615 tps
Utilization: 185/1615 = 11.5% of capacity
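The capacity arithmetic above can be scripted so you can repeat the measurement; the two tps figures are the hypothetical values measured in this exercise:

```shell
normal_tps=185     # steady-state workload (measured earlier)
catchup_tps=1615   # maximum apply rate observed during catch-up
# Percentage of the cluster's replication capacity the normal workload uses.
used_pct=$(awk -v n="$normal_tps" -v c="$catchup_tps" \
  'BEGIN { printf "%.1f", n / c * 100 }')
echo "cluster is running at ${used_pct}% of replication capacity"
```

The remaining headroom is what protects you from flow control under load spikes.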

133

Page 134: Advanced Percona XtraDB Cluster in a nutshell... la suite

A wired world
Networking

134

Page 135: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: With Synchronous Replication, It Matters

Network issues cause cluster issues much faster than with asynchronous replication.

Network partitioning
Nodes joining/leaving
These can cause clusters to go Non-Primary, no longer accepting any reads or writes.

Latency has an impact on response time:
at COMMIT of a transaction
depending on the wsrep_sync_wait setting, for other statements too

135

Page 136: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Status Variables
pxc2 mysql> show global status like 'wsrep_evs_repl_latency';
+------------------------+-------------------------------------------------+
| Variable_name          | Value                                           |
+------------------------+-------------------------------------------------+
| wsrep_evs_repl_latency | 0.000745194/0.00175792/0.00832816/0.00184453/16 |
+------------------------+-------------------------------------------------+

The reporting interval is reset with evs.stats_report_period (default PT1M)

# myq_status wsrep_latency
mycluster / pxc2 (idx: 2) / Galera 3.12(r9921e73)
Wsrep    Cluster  Node    Ops        Latencies
    time P cnf #  Stat Up  Dn Size   Min   Avg   Max    Dev
22:55:48 P  53 3  Sync  0  65    9 681µs 1307µs 4192µs 1032µs
22:55:49 P  53 3  Sync  0  52   10 681µs 1274µs 4192µs  984µs
22:55:50 P  53 3  Sync  0  47   10 681µs 1274µs 4192µs  984µs
22:55:51 P  53 3  Sync  0  61   11 681µs 1234µs 4192µs  947µs
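The raw wsrep_evs_repl_latency value is slash-separated: min/avg/max/stddev/sample-count, in seconds. A minimal sketch parsing the sample value shown above (the value itself is just the example from this slide):

```shell
# Sample value from SHOW GLOBAL STATUS LIKE 'wsrep_evs_repl_latency' (assumption).
latency="0.000745194/0.00175792/0.00832816/0.00184453/16"
# Fields: min / avg / max / stddev / number of samples, all in seconds.
IFS=/ read -r min avg max stddev samples <<EOF
$latency
EOF
avg_ms=$(awk -v a="$avg" 'BEGIN { printf "%.2f", a * 1000 }')
echo "avg replication latency: ${avg_ms} ms over ${samples} samples"
```

This is handy when myq_status is not installed on a node.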

136

Page 137: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Latency
On pxc1, start the 'application':

# myq_status wsrep_latency
mycluster / pxc2 (idx: 2) / Galera 3.12(r9921e73)
Wsrep    Cluster  Node    Ops       Latencies
    time P cnf #  Stat Up  Dn Size  Min   Avg   Max   Dev
23:02:44 P  53 3  Sync  0  48    7 777µs 1236µs 2126µs 434µs
23:02:45 P  53 3  Sync  0  47    7 777µs 1236µs 2126µs 434µs
23:02:46 P  53 3  Sync  0  58    7 777µs 1236µs 2126µs 434µs

run_app.sh pxc1[1125s] tps: 51.05, reads: 687.72, writes: 204.21, response time: 10.54ms (95%), errors: [1126s] tps: 33.98, reads: 475.77, writes: 135.94, response time: 15.07ms (95%), errors: [1127s] tps: 42.01, reads: 588.19, writes: 168.05, response time: 12.79ms (95%), errors:

# myq_status wsrepWsrep Cluster Node Repl Queue Ops Bytes Conflct Gcache Window Flow time P cnf # Stat Laten Up Dn Up Dn Up Dn lcf bfa ist idx dst appl comm p_ms23:02:44 P 53 3 Sync 1.2ms 0 0 0 49 0.0 80K 0 0 77k 178 23:02:45 P 53 3 Sync 1.2ms 0 0 0 43 0.0 70K 0 0 77k 176 23:02:46 P 53 3 Sync 1.2ms 0 0 0 55 0.0 90K 0 0 77k 164

137

Page 138: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Impact on Latency
Change from a LAN setup into a cluster across 2 datacenters:

last_node_to_dc2.sh enable

138

Page 139: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Impact on Latency
last_node_to_dc2.sh enable

What can we observe in the cluster after running this command?

139

Page 140: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Impact on Latency
last_node_to_dc2.sh enable

What can we observe in the cluster after running this command?

myq_status wsrep_latency is up 200ms
run_app.sh throughput is a lot lower
run_app.sh response time is a lot higher

mycluster / pxc3 (idx: 0) / Galera 3.12(r9921e73)Wsrep Cluster Node Ops Latencies time P cnf # Stat Up Dn Size Min Avg Max Dev23:23:34 P 53 3 Sync 0 14 6 201ms 202ms 206ms 2073µs23:23:35 P 53 3 Sync 0 16 6 201ms 202ms 206ms 2073µs23:23:36 P 53 3 Sync 0 14 6 201ms 202ms 206ms 2073µs23:23:37 P 53 3 Sync 0 14 6 201ms 202ms 206ms 2073µs23:23:38 P 53 3 Sync 0 14 6 201ms 202ms 206ms 2073µs

140

Page 141: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Impact on Latency
Why is that?

141

Page 142: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Impact on Latency
Why is that?

Don't forget this is synchronous replication: the writeset is replicated synchronously.

The writeset is delivered to all nodes in the cluster at transaction commit, and all nodes acknowledge it.

Generates a GLOBAL ORDER for that transaction (GTID)
Cost is ~roundtrip latency for COMMIT to the furthest node
GTIDs are serialized, but many writesets can be replicating in parallel
Remember Mark Callaghan's law: a given row can't be modified more often than 1/RTT times per second

142

Page 143: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Configuration
In a WAN, don't forget to use higher timeouts and send windows:

evs.user_send_window=2 ~> 256
evs.send_window=4 ~> 512
evs.keepalive_period=PT1S ~> PT1S
evs.suspect_timeout=PT5S ~> PT15S
evs.inactive_timeout=PT15S ~> PT45S
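All of these options go into a single wsrep_provider_options string (repeating the option in my.cnf would override earlier values, not merge them). A minimal sketch writing the WAN tunings above to a hypothetical config fragment (the /tmp path is only for illustration; the concrete values should be tuned to your RTT):

```shell
# Write the WAN-tuned provider options into a config fragment (illustrative path).
cat > /tmp/my.cnf.wan <<'EOF'
[mysqld]
wsrep_provider_options="evs.user_send_window=256;evs.send_window=512;evs.keepalive_period=PT1S;evs.suspect_timeout=PT15S;evs.inactive_timeout=PT45S"
EOF
grep send_window /tmp/my.cnf.wan
```

Note that wsrep_provider_options uses semicolons to separate the individual Galera options.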

Don't forget to disable the WAN:

last_node_to_dc2.sh disable

143

Page 144: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Configuration - Bandwidth
How can we reduce the bandwidth used between datacenters?

144

Page 145: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: WAN Configuration - Bandwidth
How can we reduce the bandwidth used between datacenters?

Use segments (gmcast.segment) to reduce network traffic between datacenters
Use binlog_row_image=minimal to reduce binary log size
repl.key_format=FLAT8, which is already the smallest by default

145

Page 146: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication Without Segments
Here we have a cluster spread across 2 datacenters

146

Page 147: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication Without Segments
A transaction executed on node1

147

Page 148: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication Without Segments
A transaction executed on node1 will be sent to all other nodes

148

Page 149: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication Without Segments
As writes are accepted everywhere, every node communicates with all other nodes, including arbitrator nodes

149

Page 150: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication With Segments

Galera 3.0 introduced the segment concept
Replication traffic is minimized between segments
Donor selection prefers the local segment

150

Page 151: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication With Segments
Transactions are only sent once to other segments

151

Page 152: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication With Segments
Transactions do not always go through the same nodes; all nodes still need to be able to connect to each other.

152

Page 153: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication With Segments

Run run_app.sh on pxc1
In another terminal on pxc1, run speedometer:
speedometer -r eth1 -t eth1 -l -m 524288

Change the segment on pxc2 and pxc3 in /etc/my.cnf and restart MySQL (this is not dynamic):
wsrep_provider_options='gmcast.segment=2'

Check the bandwidth usage again.

How do you explain this?

153

Page 154: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Replication With Segments
Transmit bandwidth usage drops a lot (roughly halved)

154

Page 155: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Binlog Row Image Format

On pxc1 run speedometer again:
speedometer -r eth1 -t eth1 -l -m 262144

On pxc1, set binlog_row_image=minimal:
pxc1 mysql> SET GLOBAL binlog_row_image=minimal;

Check the bandwidth usage

155


Page 157: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Not Completely Synchronous

Applying transactions is asynchronous
By default, reads on different nodes might show stale data.
In practice, flow control prevents nodes from lagging too far behind, reducing stale data.
Read consistency can be configured: we can enforce that a read sees the latest committed data, cluster-wide.

What if we absolutely need consistency?

157

Page 158: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Not Completely Synchronous
What if we absolutely need consistency?

Since PXC 5.6.20-27.7:
SET <session|global> wsrep_sync_wait=[1|2|4];

1 indicates a check on READ statements, including SELECT, SHOW, BEGIN/START TRANSACTION.
2 indicates a check on UPDATE and DELETE statements.
4 indicates a check on INSERT and REPLACE statements.

Before that, use: SET <session|global> wsrep_causal_reads=[0|1];
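The wsrep_sync_wait values are a bitmask, so the checks can be combined. A minimal sketch computing the combined value (the variable names here are just labels for the three flags):

```shell
# wsrep_sync_wait flags: they are bits and can be OR-ed together.
reads=1            # READ statements (SELECT, SHOW, BEGIN/START TRANSACTION)
updates_deletes=2  # UPDATE and DELETE
inserts_replaces=4 # INSERT and REPLACE
all=$(( reads | updates_deletes | inserts_replaces ))
echo "SET SESSION wsrep_sync_wait=${all};  -- checks on all statement types"
```

For example, wsrep_sync_wait=3 enforces the causality check on reads plus updates/deletes.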

158

Page 159: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Consistent Reads & Latency
How does enabling wsrep_sync_wait consistent reads affect WAN environments?

Stop the application (run_app.sh)
Move the last node to DC2:
last_node_to_dc2.sh enable

On pxc1, run:
pxc1 mysql> select * from sbtest.sbtest1 where id = 4;
...
1 row in set (0.00 sec)

159

Page 160: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Consistent Reads & Latency
Now change the causality check to ensure that READ statements are in sync, and perform the same SELECT:

pxc1 mysql> SET SESSION wsrep_sync_wait=1;
pxc1 mysql> select * from sbtest.sbtest1 where id = 4;

What do you see?

160

Page 161: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Consistent Reads & Latency
Now change the causality check to ensure that READ statements are in sync, and perform the same SELECT:

pxc1 mysql> SET SESSION wsrep_sync_wait=1;
pxc1 mysql> select * from sbtest.sbtest1 where id = 4;

What do you see?

...1 row in set (0.20 sec)

161

Page 162: Advanced Percona XtraDB Cluster in a nutshell... la suite

Networking: Consistent Reads & Latency
Now change the causality check to ensure that READ statements are in sync, and perform the same SELECT:

pxc1 mysql> SET SESSION wsrep_sync_wait=1;
pxc1 mysql> select * from sbtest.sbtest1 where id = 4;

What do you see?

...1 row in set (0.20 sec)

Put back pxc3 on dc1:

last_node_to_dc2.sh disable

162

Page 163: Advanced Percona XtraDB Cluster in a nutshell... la suite

Save My Data
Backups

163

Page 164: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Full with Percona XtraBackup

Feature-rich online physical backups
Since PXC 5.6.21-25.8, there is LOCK TABLES FOR BACKUP

No FLUSH TABLES WITH READ LOCK anymore
Locks only DDL and MyISAM, leaves InnoDB fully unlocked
No more need to put the backup node in DESYNC to avoid flow control

164

Page 165: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Full with Percona XtraBackup

Feature-rich online physical backups
Since PXC 5.6.21-25.8, there is LOCK TABLES FOR BACKUP

No FLUSH TABLES WITH READ LOCK anymore
Locks only DDL and MyISAM, leaves InnoDB fully unlocked
No more need to put the backup node in DESYNC to avoid flow control

Point-in-Time Recovery: Binary Logs
It's also recommended to save the binary logs to perform point-in-time recovery
With mysqlbinlog 5.6, it's possible to stream them to another 'backup' host

165

Page 166: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Full Backup
On pxc1, run the application:

run_app.sh pxc1

On pxc3, take a full backup with Percona Xtrabackup

166

Page 167: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Full Backup
On pxc1, run the application:

run_app.sh pxc1

On pxc3, take a full backup with Percona Xtrabackup

# innobackupex --galera-info --no-timestamp /root/backups/
xtrabackup version 2.2.12 based on MySQL server 5.6.24 Linux (i686) (revision id: )
[01] Copying ./ibdata1 to /root/backups/ibdata1
[01]        ...done
[01] Copying ./sbtest/sbtest1.ibd to /root/backups/sbtest/sbtest1.ibd
...
150920 08:31:01 innobackupex: Executing LOCK TABLES FOR BACKUP...
...
150920 08:31:01 innobackupex: Executing LOCK BINLOG FOR BACKUP...
...
150920 08:31:01 innobackupex: All tables unlocked
innobackupex: MySQL binlog position: filename 'pxc3-bin.000001', position 3133515
150920 08:31:01 innobackupex: completed OK!

167

Page 168: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Full Backup
Apply the logs and get the seqno:

# innobackupex --apply-log /root/backups/

# cat /root/backups/xtrabackup_galera_infob55685a3-5f70-11e5-87f8-2f86c54ca425:1945

# cat /root/backups/xtrabackup_binlog_info
pxc3-bin.000001 3133515

We now have a full backup ready to be used.

168

Page 169: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Stream Binary Logs
Now set up mysqlbinlog to stream the binlogs into /root/binlogs.

As a prerequisite, ensure the following is configured:

log_slave_updates
server-id=__ID__

169

Page 170: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Stream Binary Logs
Now set up mysqlbinlog to stream the binlogs into /root/binlogs.

As a prerequisite, ensure the following is configured:

log_slave_updates
server-id=__ID__

Get mysqlbinlog running:

# mkdir /root/binlogs
# mysql -BN -e "show binary logs" | head -n1 | cut -f1
pxc3-bin.000001
# mysqlbinlog --read-from-remote-server --host=127.0.0.1 \
    --raw --stop-never --result-file=/root/binlogs/ pxc3-bin.000001 &

170

Page 171: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
On pxc2 we update a record:

pxc2 mysql> update sbtest.sbtest1 set pad = "PLAM2015" where id = 999;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

171

Page 172: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
On pxc2 we update a record:

pxc2 mysql> update sbtest.sbtest1 set pad = "PLAM2015" where id = 999;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0

And now it's time to break things: on pxc2, TRUNCATE the sbtest1 table.

pxc2 mysql> truncate table sbtest.sbtest1;
Query OK, 0 rows affected (0.06 sec)

172

Page 173: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
BROKEN! What now?

173

Page 174: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
BROKEN! What now?

Let's stop MySQL on all nodes and restore from backup.

service mysql stop

Restore the backup on pxc3:

[root@pxc3 ~]# rm -rf /var/lib/mysql/*
[root@pxc3 ~]# innobackupex --copy-back /root/backups/
[root@pxc3 ~]# chown mysql. -R /var/lib/mysql/

174

Page 175: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
BROKEN! What now?

Let's stop MySQL on all nodes and restore from backup.

service mysql stop

Restore the backup on pxc3:

[root@pxc3 ~]# rm -rf /var/lib/mysql/*
[root@pxc3 ~]# innobackupex --copy-back /root/backups/
[root@pxc3 ~]# chown mysql. -R /var/lib/mysql/

On pxc3, we bootstrap a completely new cluster:

[root@pxc3 ~]# service mysql bootstrap-pxc

175

Page 176: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
The full backup is restored; now we need to do point-in-time recovery.

Find the position of the "event" that caused the problems

176

Page 177: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
The full backup is restored; now we need to do point-in-time recovery.

Find the position of the "event" that caused the problems

We know the sbtest.sbtest1 table got truncated. Let's find that statement:

[root@pxc3 ~]# mysqlbinlog /root/binlogs/pxc3-bin.* | grep -i truncate -B10
OC05MzAyOTU2MDQzOC0xNzU5MDQyMTM1NS02MDYyOTQ1OTk1MC0wODY4ODc0NTg2NTCjjIc='/*!*/;
# at 13961536
#150920  8:33:15 server id 1  end_log_pos 13961567 CRC32 0xc97eb41f  Xid = 8667
COMMIT/*!*/;
# at 13961567
#150920  8:33:15 server id 2  end_log_pos 13961659 CRC32 0x491d7ff8  Query  thread_id=100  exec_time=0  error_code=0
SET TIMESTAMP=1442737995/*!*/;
SET @@session.sql_mode=1073741824/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
truncate table sbtest.sbtest1

177

Page 178: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
We need to recover up to the TRUNCATE TABLE, which was at position 13961567

We can replay the binary log(s) from the last position we backed up

178

Page 179: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
We need to recover up to the TRUNCATE TABLE, which was at position 13961567

We can replay the binary log(s) from the last position we backed up

# cat /var/lib/mysql/xtrabackup_info | grep binlog
binlog_pos = filename 'pxc3-bin.000001', position 3133515

179

Page 180: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
We need to recover up to the TRUNCATE TABLE, which was at position 13961567

We can replay the binary log(s) from the last position we backed up

# cat /var/lib/mysql/xtrabackup_info | grep binlog
binlog_pos = filename 'pxc3-bin.000001', position 3133515

Note: if the binary logs were not streamed from the backup server (it can happen), you need to find the position from the Xid, which is the Galera seqno:

#150920 8:33:15 server id 1 end_log_pos 13961567 CRC32 0xc97eb41f Xid = 8667COMMIT/*!*/;# at 13961567

180

Page 181: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
Let's replay it now:

# mysqlbinlog /root/binlogs/pxc3-bin.000001 \ --start-position=3133515 --stop-position=13961567 | mysql

181

Page 182: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
Let's replay it now:

# mysqlbinlog /root/binlogs/pxc3-bin.000001 \ --start-position=3133515 --stop-position=13961567 | mysql

Let's Verify:

pxc3 mysql> select id, pad from sbtest.sbtest1 where id =999;+-----+----------+| id | pad |+-----+----------+| 999 | PLAM2015 |+-----+----------+1 row in set (0.00 sec)

182

Page 183: Advanced Percona XtraDB Cluster in a nutshell... la suite

Backups: Point-in-Time Recovery
Let's replay it now:

# mysqlbinlog /root/binlogs/pxc3-bin.000001 \ --start-position=3133515 --stop-position=13961567 | mysql

Let's Verify:

pxc3 mysql> select id, pad from sbtest.sbtest1 where id =999;+-----+----------+| id | pad |+-----+----------+| 999 | PLAM2015 |+-----+----------+1 row in set (0.00 sec)

You can now restart the other nodes and they will perform SST.

183

Page 184: Advanced Percona XtraDB Cluster in a nutshell... la suite

Spread the load
Load Balancers

184

Page 185: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers
With PXC a load balancer is commonly used:

Layer 4:
Lots of choice
Usually HAProxy (most common)

Layer 7:
MariaDB MaxScale
ScaleArc (proprietary)
ProxySQL
mysql-proxy (beta)

185

Page 186: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers
Usually with Galera, people use a load balancer to route the MySQL requests from the application to a node

Redirect writes to another node when problems happen
Mostly 1 node for writes, the others for reads

Layer 4: 1 TCP port for writes, 1 TCP port for reads
Layer 7: automatic (challenging)

186

Page 187: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy
On pxc1, we have HAProxy configured like this when listening on port 3308:

## active-passive
listen 3308-active-passive-writes 0.0.0.0:3308
    mode tcp
    balance leastconn
    option httpchk

    server pxc1 pxc1:3306 check port 8000 inter 1000 rise 3 fall 3
    server pxc2 pxc2:3306 check port 8000 inter 1000 rise 3 fall 3 backup
    server pxc3 pxc3:3306 check port 8000 inter 1000 rise 3 fall 3 backup

187

Page 188: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy
On pxc2 and pxc3, we connect to the load balancer and run a SELECT:

mysql -h pxc1 -P 3308 -utest -ptest -e "select @@wsrep_node_name, sleep(10)"

And on pxc1, while the previous command is running, we check the processlist:

pxc1 mysql> SELECT PROCESSLIST_ID AS id, PROCESSLIST_USER AS user,
                   PROCESSLIST_HOST AS host, PROCESSLIST_INFO
              FROM performance_schema.threads
             WHERE PROCESSLIST_INFO LIKE 'select @% sleep%';
+------+------+------+-------------------------------------+
| id   | user | host | PROCESSLIST_INFO                    |
+------+------+------+-------------------------------------+
|  294 | test | pxc1 | select @@wsrep_node_name, sleep(10) |
|  297 | test | pxc1 | select @@wsrep_node_name, sleep(10) |
+------+------+------+-------------------------------------+

188

Page 189: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy
On pxc2 and pxc3, we connect to the load balancer and run a SELECT:

mysql -h pxc1 -P 3308 -utest -ptest -e "select @@wsrep_node_name, sleep(10)"

And on pxc1, while the previous command is running, we check the processlist:

pxc1 mysql> SELECT PROCESSLIST_ID AS id, PROCESSLIST_USER AS user,
                   PROCESSLIST_HOST AS host, PROCESSLIST_INFO
              FROM performance_schema.threads
             WHERE PROCESSLIST_INFO LIKE 'select @% sleep%';
+------+------+------+-------------------------------------+
| id   | user | host | PROCESSLIST_INFO                    |
+------+------+------+-------------------------------------+
|  294 | test | pxc1 | select @@wsrep_node_name, sleep(10) |
|  297 | test | pxc1 | select @@wsrep_node_name, sleep(10) |
+------+------+------+-------------------------------------+

What do you notice?

189

Page 190: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Since Percona XtraDB Cluster 5.6.25-73.1 we support the proxy protocol! (Almost released)

190

Page 191: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Since Percona XtraDB Cluster 5.6.25-73.1 we support the proxy protocol! (Almost released)

Let's enable this in my.cnf on all 3 nodes:

[mysqld]
...
proxy_protocol_networks=*
...

191

Page 192: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Since Percona XtraDB Cluster 5.6.25-73.1 we support the proxy protocol! (Almost released)

Let's enable this in my.cnf on all 3 nodes:

[mysqld]
...
proxy_protocol_networks=*
...

Restart them one by one:

[root@pxc1 ~]# /etc/init.d/mysql restart
...
[root@pxc2 ~]# /etc/init.d/mysql restart
...
[root@pxc3 ~]# /etc/init.d/mysql restart

192

Page 193: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
On pxc1, we have HAProxy configured like this, listening on port 3310 to support the proxy protocol:

listen 3310-active-passive-writes 0.0.0.0:3310
    mode tcp
    balance roundrobin
    option httpchk
    server pxc1 pxc1:3306 send-proxy-v2 check port 8000 inter 1000 rise 3 fall 3
    server pxc2 pxc2:3306 send-proxy-v2 check port 8000 inter 1000 backup
    server pxc3 pxc3:3306 send-proxy-v2 check port 8000 inter 1000 backup

And restart HAProxy:

service haproxy restart

193

Page 194: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
On pxc2 and pxc3, we connect to the load balancer (using the new port) and run a SELECT:

mysql -h pxc1 -P 3310 -utest -ptest -e "select @@wsrep_node_name, sleep(10)"

And on pxc1, while the previous command is running, we check the processlist:

pxc1 mysql> SELECT PROCESSLIST_ID AS id, PROCESSLIST_USER AS user,
                   PROCESSLIST_HOST AS host, PROCESSLIST_INFO
              FROM performance_schema.threads
             WHERE PROCESSLIST_INFO LIKE 'select @% sleep%';
+------+------+------+-------------------------------------+
| id   | user | host | PROCESSLIST_INFO                    |
+------+------+------+-------------------------------------+
|   75 | test | pxc2 | select @@wsrep_node_name, sleep(10) |
|   76 | test | pxc3 | select @@wsrep_node_name, sleep(10) |
+------+------+------+-------------------------------------+

194

Page 195: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Try to connect from pxc1 to pxc1, not using a load balancer:

pxc1 # mysql -h pxc1 -P 3306

What happens?

195

Page 196: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Try to connect from pxc1 to pxc1, not using a load balancer:

pxc1 # mysql -h pxc1 -P 3306

What happens?

You can't connect to mysql anymore.

When proxy_protocol_networks is enabled, connections that don't send the proxy protocol header are refused!

196

Page 197: Advanced Percona XtraDB Cluster in a nutshell... la suite

Load Balancers: HAProxy & Proxy Protocol
Try to connect from pxc1 to pxc1, not using a load balancer:

pxc1 # mysql -h pxc1 -P 3306

What happens?

You can't connect to mysql anymore.

When proxy_protocol_networks is enabled, connections that don't send the proxy protocol header are refused!

Let's remove proxy_protocol_networks again and restart all nodes before continuing.

197