Percona Live 2014 presentation https://www.percona.com/live/mysql-conference-2014/sessions/fast-incremental-backups-percona-server-and-percona-xtrabackup
Fast Incremental Backups with Percona Server and Percona XtraBackup
Laurynas Biveinis
Agenda
• Incremental XtraBackup: performance
• Incremental XtraBackup with bitmaps: performance
• The cost of the feature
• INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
• Implementation
– Bitmap file format
– New server thread
Incremental XtraBackup: Performance
[Figure: backup time (0% to 100%) plotted against delta size (0.10% to 100.00%)]
• Does time to backup depend on the % of changed data?
Incremental XtraBackup: How Data Page Copying Works
[Diagram: MySQL. table.ibd holds pages with LSN = 950, 960, 960, 1002, 1003, 940, 1010. Base backup LSN = 1000. Every page is read and checked for LSN > 1000; only the pages that pass are written to table.ibd.delta.]
• Can we avoid reading the old pages?
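The copying loop on this slide can be sketched in Python as follows. This is a minimal illustration, not XtraBackup's actual code: `Page`, `full_scan_delta`, and the page ids (list positions) are hypothetical names, and only the LSN values and the base backup LSN of 1000 come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    page_id: int  # hypothetical id: position of the page in the file
    lsn: int      # LSN of the last modification, as stored on the page


def full_scan_delta(pages, base_lsn):
    """Read every page and keep the ones modified after the base backup."""
    reads = 0
    delta = []
    for page in pages:
        reads += 1               # every page costs one read
        if page.lsn > base_lsn:  # the "LSN > 1000?" check
            delta.append(page)   # goes into table.ibd.delta
    return delta, reads


# Page LSNs from the slide, in file order.
TABLE_IBD = [Page(i, lsn) for i, lsn in
             enumerate([950, 960, 960, 1002, 1003, 940, 1010])]

delta, reads = full_scan_delta(TABLE_IBD, base_lsn=1000)
print(reads, [p.lsn for p in delta])
```

All seven pages are read even though only three end up in the delta, which is why backup time barely depends on the delta size.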
Incremental XtraBackup: Can We Avoid Reading the Old Pages?
• http://bit.ly/FBIncBackup
• How do we know which pages to read then?
• Two ways to get the modification LSN of a page:
– It is written on the page, - or -
– We can figure it out from the redo log
• The log is circular, so the server must save this information before it is overwritten
Changed Page Tracking
• Server:
– --innodb-track-changed-pages=TRUE
– Documentation at http://bit.ly/psbmpdoc
– 5.1/5.5/5.6
• XtraBackup:
– Zero configuration!
Incremental XtraBackup with Changed Page Tracking
[Diagram: Percona Server. table.ibd holds the same pages with LSN = 950, 960, 960, 1002, 1003, 940, 1010. Base backup LSN = 1000. The server reports "Changed pages between LSNs 980 and 1020: 1002, 1003, 1010...", so only those three pages are read, checked for LSN > 1000, and written to table.ibd.delta.]
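A minimal Python sketch of the bitmap-guided copy, under stated assumptions: `bitmap_guided_delta` and the page ids (positions in the file) are hypothetical, and the set of changed page ids stands in for whatever the server's bitmap files report. The LSN check is kept because the tracked LSN range (980 to 1020 on the slide) can be wider than the range the backup needs.

```python
def bitmap_guided_delta(page_lsns, changed_page_ids, base_lsn):
    """Read only the pages the bitmap marks as changed.

    page_lsns maps a hypothetical page id to the LSN stored on that page.
    """
    reads = 0
    delta = []
    for page_id in sorted(changed_page_ids):
        lsn = page_lsns[page_id]  # one read per tracked page
        reads += 1
        if lsn > base_lsn:        # still filter on the page's own LSN
            delta.append((page_id, lsn))
    return delta, reads


# Page LSNs from the slide, keyed by position in the file.
PAGE_LSNS = dict(enumerate([950, 960, 960, 1002, 1003, 940, 1010]))
# Pages reported as changed (the ones at LSN 1002, 1003, 1010).
CHANGED = {3, 4, 6}

delta, reads = bitmap_guided_delta(PAGE_LSNS, CHANGED, base_lsn=1000)
print(reads, delta)
```

Here the number of reads equals the number of tracked pages rather than the size of the table.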
Incremental XtraBackup with Changed Page Tracking: Performance
[Figure: backup time (0% to 100%) plotted against delta size (0.00% to 100.00%) for two series, Full Scan and Bitmap]
Percona Server with Changed Page Tracking: Server Overhead
• Nothing is ever free!
– But the price may well be acceptable
• Potential overhead #1: extra disk space requirements
• Potential overhead #2: extra code running in the server
[Figure: Log and bitmap file size comparison. x-axis: bitmap file # (1 to 8); y-axis: log bytes / bitmap byte (0 to 800)]
• A good case: > 100 log bytes per bitmap byte
• A bad case: 3-15 log bytes per bitmap byte
• https://bugs.launchpad.net/bugs/1269547
– We are considering fix options
• Impact on TPS and response time:
– Couldn't find it
– If you ever do find it, report it to us and try --innodb_log_checksum_algorithm=crc32
● http://bit.ly/pslogcrc32
Bitmap File Naming & Sizing
• ib_modified_log_<seq>_<LSN>.xdb
– <seq>: 1, 2, 3, ...
– <LSN>: the server LSN at file creation time
• Rotated on
– Server start
– Reaching innodb_max_bitmap_file_size
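For scripting around these files, the naming scheme above can be parsed as in the following Python sketch. The names `BitmapFile`, `parse_bitmap_name`, and `BITMAP_NAME` are hypothetical helpers; only the `ib_modified_log_<seq>_<LSN>.xdb` pattern comes from the slide, and both fields are assumed to be plain decimal numbers.

```python
import re
from typing import NamedTuple, Optional

# Pattern from the slide: ib_modified_log_<seq>_<LSN>.xdb
BITMAP_NAME = re.compile(r"^ib_modified_log_(\d+)_(\d+)\.xdb$")


class BitmapFile(NamedTuple):
    seq: int        # file sequence number: 1, 2, 3, ...
    start_lsn: int  # server LSN when the file was created


def parse_bitmap_name(name: str) -> Optional[BitmapFile]:
    """Return the parsed fields, or None if the name is not a bitmap file."""
    match = BITMAP_NAME.match(name)
    if match is None:
        return None
    return BitmapFile(int(match.group(1)), int(match.group(2)))


names = ["ib_modified_log_2_10000.xdb", "ibdata1", "ib_modified_log_1_8192.xdb"]
# Keep only bitmap files, ordered by sequence number.
files = sorted(f for f in map(parse_bitmap_name, names) if f is not None)
print(files)
```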
Bitmap File Management
• PURGE CHANGED_PAGE_BITMAPS BEFORE <lsn>
– ib_modified_log_1_8192.xdb
– ib_modified_log_2_10000.xdb
– ib_modified_log_3_20000.xdb
– Full backup taken, LSN = 22000
– PURGE CHANGED_PAGE_BITMAPS BEFORE 22000;
– ib_modified_log_4_30000.xdb
– Incremental backup taken, LSN = 33000
– PURGE CHANGED_PAGE_BITMAPS BEFORE 33000;
INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
• Percona Server can read the bitmaps too
SHOW CREATE TABLE INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;
CREATE TABLE `INNODB_CHANGED_PAGES` (
  `space_id` int(11) unsigned NOT NULL DEFAULT '0',
  `page_id` int(11) unsigned NOT NULL DEFAULT '0',
  `start_lsn` bigint(21) unsigned NOT NULL DEFAULT '0',
  `end_lsn` bigint(21) unsigned NOT NULL DEFAULT '0'
)
• start_lsn and end_lsn are always at the checkpoint boundary
• Does not show the exact LSN of a change
• Does not show the number of changes for one page
• Does show the number of flushes for a page over the workload
SELECT * FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;
space_id  page_id  start_lsn  end_lsn
0         0        8204       38470
0         1        8204       38470
5         0        8204       38470
5         3        8204       38470
0         1        38471      50000
5         3        38471      50000
5         3        50001      60000
• Don't query like that in production!
– It will read all the bitmaps you have. Gigabytes, terabytes, ...
– Add WHERE start_lsn > X AND end_lsn < Y (index condition pushdown implemented for this case)
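To try the LSN-bounded query shape without a server, the following Python sketch loads the sample rows shown above into an in-memory SQLite table. SQLite and the `changed_pages` table are stand-ins chosen for illustration, not the Percona Server implementation, and `flushes_per_page` is a hypothetical helper.

```python
import sqlite3

# Sample rows from the slide: (space_id, page_id, start_lsn, end_lsn)
ROWS = [
    (0, 0, 8204, 38470), (0, 1, 8204, 38470),
    (5, 0, 8204, 38470), (5, 3, 8204, 38470),
    (0, 1, 38471, 50000), (5, 3, 38471, 50000),
    (5, 3, 50001, 60000),
]


def flushes_per_page(rows, min_start_lsn, max_end_lsn):
    """Count rows per page, restricted to an LSN range as the slide advises."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changed_pages "
        "(space_id INTEGER, page_id INTEGER, start_lsn INTEGER, end_lsn INTEGER)"
    )
    conn.executemany("INSERT INTO changed_pages VALUES (?, ?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT space_id, page_id, COUNT(*) AS number_of_flushes "
        "FROM changed_pages "
        "WHERE start_lsn > ? AND end_lsn < ? "
        "GROUP BY space_id, page_id "
        "ORDER BY number_of_flushes DESC, space_id, page_id",
        (min_start_lsn, max_end_lsn),
    )
    result = cursor.fetchall()
    conn.close()
    return result


print(flushes_per_page(ROWS, 8000, 50001))
```

Each row is one (page, checkpoint interval) pair, so the count per page is the number of intervals in which the page was flushed, not the number of individual changes.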
• Which tables are written to?
SELECT DISTINCT space_id
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES WHERE ...;
space_id
0
10

SELECT DISTINCT t1.space_id AS space_id, t2.schema AS db, t2.name AS tname
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES AS t1,
       INFORMATION_SCHEMA.INNODB_SYS_TABLES AS t2
  WHERE t1.space_id = t2.space AND t1.start_lsn >...
space_id  db    tname
0               SYS_FOREIGN
0               SYS_FOREIGN_COLS
10        test  foo
• What are the hottest tables?
SELECT space_id, COUNT(space_id) AS number_of_flushes
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  GROUP BY space_id
  ORDER BY number_of_flushes DESC;
space_id  number_of_flushes
0         65
10        5
11        4
• What are the hottest pages?
SELECT space_id, page_id, COUNT(page_id) AS number_of_flushes
  FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES
  GROUP BY space_id, page_id
  HAVING number_of_flushes > 2
  ORDER BY number_of_flushes DESC LIMIT 8;
space_id  page_id  number_of_flushes
0         5        3
0         7        3
0         0        2
0         11       2
10        3        2
0         1        2
0         12       2
0         2        2
• For complex queries, copy data first
CREATE TEMPORARY TABLE icp (
  space_id INT(11) NOT NULL,
  page_id INT(11) NOT NULL,
  start_lsn BIGINT(21) NOT NULL,
  end_lsn BIGINT(21) NOT NULL,
  INDEX page_id(space_id, page_id),
  INDEX start_lsn(start_lsn),
  INDEX end_lsn(end_lsn)
) ENGINE=InnoDB;
INSERT INTO icp SELECT * FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES WHERE start_lsn > 8000;
EXPLAIN SELECT DISTINCT space_id FROM INFORMATION_SCHEMA.INNODB_CHANGED_PAGES;
id  select_type  table                 type  possible_keys  key   key_len  ref   rows  Extra
1   SIMPLE       INNODB_CHANGED_PAGES  ALL   NULL           NULL  NULL     NULL  NULL  Using temporary

EXPLAIN SELECT DISTINCT space_id FROM icp;
id  select_type  table  type   possible_keys  key      key_len  ref   rows  Extra
1   SIMPLE       icp    index  NULL           page_id  8        NULL  74    Using index
Implementation: File Format
• A sequence of data pages, 4KB each; the number of pages varies per checkpoint
– e.g. data for the checkpoint at LSN 9000, then LSN 10000, then LSN 10500
• For each checkpoint: pages labelled (space, start page)
• Each page contains a bitmap for the next 32480 pages in the space, starting from the start page
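The page-to-bit arithmetic implied by this layout can be sketched in Python. Only the figure of 32480 tracked pages per bitmap page comes from the slide (32480 bits is 4060 bytes, which fits in a 4KB page with 36 bytes to spare). The rest is an assumption made for illustration: start pages aligned to multiples of 32480, least-significant-bit-first ordering, and an in-memory dict keyed by (space, start page). The real on-disk format may differ in all of these.

```python
PAGES_PER_BITMAP_PAGE = 32480                  # from the slide
BITMAP_BYTES = PAGES_PER_BITMAP_PAGE // 8      # 4060 bytes of bitmap


def locate(page_id):
    """Map a page id to (start page, byte offset, bit offset).

    Assumes start pages are multiples of PAGES_PER_BITMAP_PAGE.
    """
    start_page = page_id - page_id % PAGES_PER_BITMAP_PAGE
    bit_index = page_id - start_page
    return start_page, bit_index // 8, bit_index % 8


def mark_changed(bitmaps, space_id, page_id):
    """Set the bit for one page, creating its bitmap block on first use."""
    start_page, byte, bit = locate(page_id)
    block = bitmaps.setdefault((space_id, start_page), bytearray(BITMAP_BYTES))
    block[byte] |= 1 << bit


def is_changed(bitmaps, space_id, page_id):
    """Check whether the bit for one page is set."""
    start_page, byte, bit = locate(page_id)
    block = bitmaps.get((space_id, start_page))
    return block is not None and bool(block[byte] & (1 << bit))


bitmaps = {}
mark_changed(bitmaps, space_id=5, page_id=40000)
print(locate(40000), is_changed(bitmaps, 5, 40000), is_changed(bitmaps, 5, 3))
```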
Implementation: Server Side
• A new XtraDB thread
– 1. Wait for log checkpoint completed event
– 2. Read the log up to the checkpoint, write the bitmap
– 3. goto 1
• Little data sharing with the rest of XtraDB
– log_sys->mutex for:
● setting and getting LSNs;
● calculating log read offset from LSN.
• Little extra code for the query threads
– Unread log overwrite check
– Firing of the log checkpoint completed event
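The three-step loop of the tracking thread can be sketched in Python as a toy model, not the XtraDB implementation. The assumptions are loud ones: a `queue.Queue` stands in for the "log checkpoint completed" event, the redo log is a plain list of hypothetical `(lsn, space_id, page_id)` records, and the "bitmap" is a list of changed pages per checkpoint interval. All names are invented for this sketch.

```python
import queue
import threading


def tracker_loop(checkpoints, redo_log, bitmap_out):
    """Toy tracker: wait for a checkpoint, scan the log, emit changed pages."""
    tracked_lsn = 0
    while True:
        checkpoint_lsn = checkpoints.get()  # 1. wait for the checkpoint event
        if checkpoint_lsn is None:          # sentinel: shut the thread down
            return
        # 2. read the log up to the checkpoint and record the changed pages
        changed = sorted({(space, page) for lsn, space, page in redo_log
                          if tracked_lsn < lsn <= checkpoint_lsn})
        bitmap_out.append((tracked_lsn, checkpoint_lsn, changed))
        tracked_lsn = checkpoint_lsn        # 3. go back to step 1


def run_tracker(redo_log, checkpoint_lsns):
    """Run the loop in its own thread and feed it checkpoint events."""
    checkpoints = queue.Queue()
    bitmap_out = []
    worker = threading.Thread(target=tracker_loop,
                              args=(checkpoints, redo_log, bitmap_out))
    worker.start()
    for lsn in checkpoint_lsns:
        checkpoints.put(lsn)
    checkpoints.put(None)
    worker.join()
    return bitmap_out


LOG = [(100, 0, 1), (150, 5, 3), (220, 0, 1), (300, 5, 3)]
print(run_tracker(LOG, [200, 300]))
```

The only state shared with the producer is the queue, which mirrors the point above that the thread shares little data with the rest of the server.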
Implementation: Things We Had to Account For
• Maximum checkpoint age violation
– Destroys untracked log data
– We make an effort to avoid it, but in the end we allow the untracked log to be overwritten
– A responsive server matters more than fast backups
• Crash recovery
– Re-read the log if available
Conclusions
• Percona Server together with Percona XtraBackup:
– Enable faster incremental backups
– Enable more frequent incremental backups
– Do not hurt server operation, but the bitmaps now have to be managed
• New INFORMATION_SCHEMA table for gaining insight into data change patterns
• The feature is actually being used: http://bit.ly/psbmpbugs
• Thank you! Questions?