Synchronous Replication for MySQL

Kenny Gryp <[email protected]>, 15 05 2014
Agenda

• Default asynchronous MySQL replication
• Percona XtraDB Cluster:
  • Introduction / Features / Load Balancing
• Use Cases:
  • High Availability / WAN Replication / Read Scaling
• Limitations
Percona

• Percona is the oldest and largest independent MySQL Support, Consulting, Remote DBA, Training, and Software Development company, with a global, 24x7 staff of over 100 serving more than 2,000 customers in 50+ countries since 2006!
• Our contributions to the MySQL community include:
  • Percona Server, Percona XtraDB Cluster
  • Percona XtraBackup: online backup
  • Percona Toolkit, Percona Playback, ...
  • books, and research published on the MySQL Performance Blog
MySQL Replication

If your HA is based on MySQL replication, you may be playing a dangerous game!
Traditional Replication Approach

Server 1 ("master") → replication stream → Server 2 ("slave")
MySQL Replication

• Common Topologies:
  • Master-Master (only 1 active master)
  • 1 or more layers of replication
• Slaves can be used for reads:
  • asynchronous, stale data is the rule
  • data loss possible (*semi-sync)
MySQL Replication

• Non-trivial to operate:
  • external monitoring
  • scripts for failover
  • adding a node == restoring a backup
  • much better in MySQL 5.6: GTIDs
Data Centric

DATA ↔ Server 1, Server 2, Server 3, ... Server N
Percona XtraDB Cluster

• All nodes have a full copy of the data
• Every node is equal
• No central management, no SPOF
Understanding Galera

The cluster can be seen as a meeting.
PXC (Galera cluster) is a meeting

• A running cluster is a meeting in progress, identified by a UUID: bfb912e5-f560-11e2-0800-1eefab05e57d
• Nodes join and leave the meeting; as long as at least one member remains, it is still the same meeting, with the same UUID
• When the last node leaves: ???
• Starting up again opens a new meeting, with a new UUID: 4fd8824d-ad5b-11e2-0800-73d6929be5cf

New meeting!
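Not on the slides, but the "meeting" is directly visible through standard Galera status variables on any node:

    mysql> SHOW GLOBAL STATUS LIKE 'wsrep_cluster%';
    -- wsrep_cluster_conf_id:    incremented on every membership change
    -- wsrep_cluster_size:       current number of nodes ("attendees")
    -- wsrep_cluster_state_uuid: the cluster UUID shown above
    -- wsrep_cluster_status:     Primary when this node is in the quorum component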
CAP Theorem

• MySQL (asynchronous) replication chooses:
  • Availability
  • Partition Tolerance
• Percona XtraDB Cluster chooses:
  • Consistency
  • Availability

(diagram: the CAP triangle of Consistency, Availability and Partition Tolerance; a system can only guarantee two)
What is Percona XtraDB Cluster?

• Percona Server
• + WSREP patches
• + Galera library
• + Utilities (init, SST and cluster check scripts)
Percona Server

• A free, open source MySQL alternative which offers breakthrough performance, scalability, features, and instrumentation. Self-tuning algorithms and support for extremely high-performance hardware make it the clear choice for organisations that demand excellent performance and reliability from their MySQL database server.
WSREP and Galera

• The wsrep API (WriteSet REPlication) is a generic replication plugin interface for databases
• Galera is a wsrep provider that implements multi-master, synchronous replication
• Other implementation: MariaDB Galera Cluster
What is Percona XtraDB Cluster?

• Full compatibility with existing systems
• Minimal effort to migrate
• Minimal effort to return back to MySQL
Features

• Synchronous Replication
• Multi-Master
• Parallel Applying
• Quorum Based
• Certification / Optimistic Locking
• Automatic Node Provisioning
(Virtual) Synchronous Replication

• Writesets (transactions) are replicated to all available nodes on COMMIT (and queued on each)
• Writesets are individually "certified" on every node, deterministically: a writeset is either committed on all nodes or on no node at all (no 2PC)
• Queued writesets are applied on each node independently and asynchronously
• Flow Control avoids too much 'lag'
(Virtual) Synchronous Replication

• Reads can return old data
• Flow Control (by default 16 trx) limits the lag
• wsrep_causal_reads can be enabled to ensure fully synchronous reads (see the sketch below)
• Latency: writes are fast; communication with the other nodes only happens at COMMIT
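A minimal sketch of enabling causal reads for one session (the variable of the PXC 5.5/5.6 era covered here; the table and column names are made up for illustration):

    mysql> SET SESSION wsrep_causal_reads = ON;
    -- each read now waits until the local replication queue has drained,
    -- so it sees every writeset the cluster committed before the read
    mysql> SELECT balance FROM accounts WHERE id = 14;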
(figure slides: Stale Reads; Latency)
Multi-Master Replication

• You can write to any node in your cluster*
• Writes are ordered inside the cluster
Parallel Replication

• Standard MySQL: the application writes with N threads, replication applies with 1 thread
  (MySQL 5.6: max 1 thread per schema)
Parallel Replication

• PXC / Galera: the application writes with N threads, replication applies with M threads
  (wsrep_slave_threads; see the sketch below)
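A minimal my.cnf sketch; the value is an assumption and should be tuned per workload:

    [mysqld]
    # number of parallel applier threads; a common starting point is
    # around the number of CPU cores (8 here is only illustrative)
    wsrep_slave_threads = 8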
Quorum Based

• If a node does not see more than 50% of the total number of nodes, reads and writes are not accepted
• Split brain is prevented
• This requires at least 3 nodes to be effective
  • a node can be an arbitrator (garbd), joining the group communication but not running any MySQL (see the sketch below)
• Can be disabled (but be warned!)
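A sketch of running the arbitrator; the addresses and cluster name are assumptions, and the group name must match the cluster's wsrep_cluster_name:

    # join the group communication as a quorum member only, holding no data
    garbd --address gcomm://192.168.70.2:4567,192.168.70.3:4567 \
          --group my_pxc_cluster \
          --daemon

(The "can be disabled" warning refers to provider options such as pc.ignore_quorum; leave those alone unless you fully understand the consequences.)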
Quorum Based

• Loss of connectivity: on a network problem, the side that no longer sees more than 50% of the nodes does not accept reads & writes
Quorum Based

• 4 nodes, default quorum configuration: a network problem splitting the cluster 2/2 leaves neither half with more than 50%, so 0 nodes have quorum
(figure slide: Certification)

Optimistic Locking
• Communication with the other nodes of the cluster only happens during COMMIT; this affects locking behavior
• Optimistic locking is used:
  • InnoDB locking happens locally on each node
  • during COMMIT/certification, conflicts with the other nodes surface as deadlocks
Optimistic Locking

• Some characteristics:
  • COMMITs and even SELECTs can fail on deadlock
  • might require application changes: not all applications handle this properly (a minimal retry sketch follows after the locking examples below)
Traditional InnoDB Locking

(both transactions on system 1)

Transaction 1: BEGIN; UPDATE t ... WHERE id=14; ...; COMMIT
Transaction 2: BEGIN; UPDATE t ... WHERE id=14
               → blocks, waits until trx 1 COMMITs
InnoDB Locking with PXC

Transaction 1 (system 1): BEGIN; UPDATE t ... WHERE id=14; ...; COMMIT
Transaction 2 (system 2): BEGIN; UPDATE t ... WHERE id=14; ...
                          the local UPDATE does not block, but the next
                          DML, COMMIT or SELECT fails due to the row conflict:

ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
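What "handle this properly" means in practice: the application must replay the whole transaction, not just the failed statement. A minimal sketch (the table t comes from the example above; the column name is made up):

    BEGIN;
    UPDATE t SET col = col + 1 WHERE id = 14;
    COMMIT;
    -- if any statement above fails with ERROR 1213, issue ROLLBACK and
    -- re-run the transaction from BEGIN; a short, bounded retry loop in
    -- the application is the usual pattern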
Automatic Node Provisioning

• When a node joins the cluster:
  • the data is automatically copied
  • when finished, the new node is automatically ready and accepting connections
• 2 different types of joining:
  • SST (State Snapshot Transfer): full copy of the data
  • IST (Incremental State Transfer): send only the missing writesets (if available)
State Transfer Summary

• New node → full data: SST
• Node disconnected for a long time → full data: SST
• Node disconnected for a short time → incremental: IST
Snapshot State Transfer

• mysqldump: for small databases
• rsync: faster, but the donor is disconnected for the duration of the copy
• XtraBackup: slower, but the donor stays available
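A minimal my.cnf sketch tying this together; the paths, addresses, cluster name and credentials are assumptions:

    [mysqld]
    wsrep_provider        = /usr/lib/libgalera_smm.so
    wsrep_cluster_name    = my_pxc_cluster
    wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3,192.168.70.4
    # SST method, as compared above: mysqldump, rsync or xtrabackup(-v2)
    wsrep_sst_method      = xtrabackup-v2
    # MySQL user the donor uses to take the backup
    wsrep_sst_auth        = sstuser:s3cret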
Incremental State Transfer

• For a node that was in the cluster before:
  • disconnected for maintenance
  • node crashed
Automatic Node Provisioning

• While the existing nodes keep taking writes, a new node joins and the data is copied to it via SST or IST
• When ready, the new node starts accepting writes like any other node
PXC with a Load Balancer

• PXC is often integrated with a load balancer
  • node health can be checked using clustercheck or pyclustercheck
• The load balancer can be:
  • a dedicated layer
  • integrated at the application layer
  • integrated at the database layer
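A minimal HAProxy sketch (the names and addresses are assumptions); port 9200 is where clustercheck, typically wrapped by xinetd, answers HTTP health checks:

    listen pxc_cluster
        bind 0.0.0.0:3306
        mode tcp
        balance leastconn
        option httpchk                       # ask clustercheck, not just a TCP connect
        server node1 192.168.70.2:3306 check port 9200
        server node2 192.168.70.3:3306 check port 9200
        server node3 192.168.70.4:3306 check port 9200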
Dedicated shared HAProxy

application server 1 / 2 / 3 → HAPROXY → PXC node 1 / 2 / 3

• A single shared HAProxy layer routes all application traffic to the cluster
• While a node is acting as SST donor, it can be kept out of rotation by configuring clustercheck with available_when_donor=0
HAProxy on application side

application server 1 (HAPROXY) / application server 2 (HAPROXY) / application server 3 (HAPROXY) → PXC node 1 / 2 / 3

• Alternatively, each application server runs its own HAProxy instance
Use Cases

• High Availability
• WAN Replication
• Read Scaling
High Availability

• Each node is the same (no master-slave)
• Consistency ensured, no data loss
• Quorum avoids split-brain
• Cluster issues are handled immediately:
  • no 'failover' necessary
  • no external scripts, no SPOF
WAN Replication

• No impact on reads
• No impact within a trx: communication only happens during COMMIT (or after every statement if autocommit=1)
• Use higher timeouts and send windows (see the sketch below)
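A sketch of WAN-friendlier provider options; the option names are real Galera settings, but the values here are assumptions to tune for your link:

    [mysqld]
    # longer failure-detection timeouts and larger send windows for high-latency links
    wsrep_provider_options = "evs.keepalive_period=PT3S; evs.suspect_timeout=PT30S; evs.inactive_timeout=PT1M; evs.install_timeout=PT1M; evs.send_window=512; evs.user_send_window=512"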
WAN Replication - Latency

• Beware of increased latency:
  • within Europe (EC2): COMMIT takes 0.005100 sec
  • Europe <-> Japan (EC2): COMMIT takes 0.275642 sec
WAN Replication with MySQL Asynchronous Replication

• You can mix both types of replication
• A good option on a slow WAN link
• Requires more nodes

Cluster Segmentation - WAN

(diagram: the cluster divided into per-datacenter segments of MySQL nodes, connected over the WAN; see the sketch below)
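Galera 3 can make datacenter membership explicit so that inter-site replication traffic is relayed through one node per segment; a minimal sketch (the segment numbers are arbitrary labels):

    # on every node in datacenter A
    wsrep_provider_options = "gmcast.segment=0"
    # on every node in datacenter B
    wsrep_provider_options = "gmcast.segment=1"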
Read Scaling

• No read/write splitting necessary in the application
• Write to all nodes(*)
Limitations

• Supports only InnoDB tables
  • MyISAM support will most likely stay in alpha
• The weakest node limits write performance
• All tables must have a Primary Key! (a query to find offenders follows)
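A sketch of finding the offending tables via information_schema:

    SELECT t.table_schema, t.table_name
    FROM information_schema.tables t
    LEFT JOIN information_schema.table_constraints c
           ON c.table_schema = t.table_schema
          AND c.table_name   = t.table_name
          AND c.constraint_type = 'PRIMARY KEY'
    WHERE t.table_type = 'BASE TABLE'
      AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
      AND c.constraint_type IS NULL;   -- tables without a primary key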
Limitations

• Large transactions are not recommended if you write to all nodes simultaneously
• Long-running transactions
• Workloads with a hotspot (frequently writing to the same rows across multiple nodes)
  • Solution: write to only 1 node (one way is sketched below)
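One way to implement the single-writer pattern is a separate HAProxy listener in which all but one node are marked backup; a sketch (the names and addresses are assumptions):

    listen pxc_writes
        bind 0.0.0.0:3307
        mode tcp
        option httpchk
        server node1 192.168.70.2:3306 check port 9200
        server node2 192.168.70.3:3306 check port 9200 backup   # used only if node1 is down
        server node3 192.168.70.4:3306 check port 9200 backup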
Credits

• The WSREP patches and the Galera library are developed by Codership Oy: http://www.codership.com
• Percona & Codership will present at Percona Live UK 2014, Nov 11-12: http://www.percona.com/live/london-2013/
Summary

• Default asynchronous MySQL Replication
• Percona XtraDB Cluster:
  • Introduction / Features / Load Balancing
• Use Cases:
  • High Availability / WAN Replication / Read Scaling
• Limitations
Resources

• Percona XtraDB Cluster website: http://www.percona.com/software/percona-xtradb-cluster/
• Codership website: http://galeracluster.com
• PXC articles on the MySQL Performance Blog: http://www.mysqlperformanceblog.com/category/percona-xtradb-cluster/
• Test it now using Vagrant!
  https://github.com/jayjanssen/vagrant-percona
  https://github.com/lefred/percona-cluster
  https://github.com/percona/xtradb-cluster-tutorial/tree/v2
Questions?

Kenny Gryp